Bible Code Probabilities

Written by Lars Bobeck
2 May 2024

I no longer have the Bible Code text here, but for those who downloaded or printed the final version from March 25, 2015, I have some very late additional text. You can still find the final version on the Wayback Machine, search for
I have understood that the first term probability I use in the final matrix can be hard to follow. Two examples can clarify it. If the first term has 1000 expected and 1000 real occurrences, we get 1000 matrices in total. According to the Bonferroni theorem, the best matrix is then expected to show a probability 1000 times better than its real probability. The reasonable way to compensate for this is to use the text probability for the first term, compensated for the skip as described in the text. Note that my kind of probabilities should be called expected occurrences to be accepted by orthodox mathematicians; for small values they are almost the same as ordinary probabilities.

In the other case, with a small text probability and a single occurrence, we should be consistent and still use the skip compensated text probability for the first term in the matrix. If a term has a 10 times lower probability in the text, it must reasonably also have a 10 times lower probability in the matrix (for the same text skip). When the expected number of occurrences differs from the real number, the real number cannot alter the probability.

Note that the linearly falling curve I have written about can be crucial; I have not seen it accepted anywhere else. It determines the skip compensation above. To be reasonably complete, the text contains a number of less interesting details. My attempt to be verbally precise in the text has made parts of it a bit difficult to understand. I have assumed that we choose the matrix least probable to get by chance as our final matrix. If we don't want that matrix, we should search for another set of terms.
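The two numerical claims above can be checked with a short sketch. It is only an illustration under stated assumptions: the function names are made up for this example, the comparison between expected occurrences and ordinary probabilities assumes that occurrence counts are Poisson distributed (the text does not say so), and the single-matrix probability of one in a million is a hypothetical figure.

```python
import math

def prob_at_least_one(expected: float) -> float:
    """P(at least one occurrence), ASSUMING a Poisson count with this mean."""
    # 1 - exp(-expected), written with expm1 to stay accurate for small values
    return -math.expm1(-expected)

def best_of_n_bound(p_single: float, n_matrices: int) -> float:
    """Bonferroni-style upper bound on the chance that the best of
    n_matrices looks at least as good as p_single by chance."""
    return min(1.0, n_matrices * p_single)

# Expected occurrences next to the ordinary probability of at least one hit
for expected in (0.001, 0.01, 0.1, 1.0):
    print(expected, round(prob_at_least_one(expected), 6))

# 1000 matrices from 1000 first-term occurrences (hypothetical p_single)
print(best_of_n_bound(1e-6, 1000))
```

For small values the two columns nearly agree, and they drift apart as the expected number approaches 1. The last line shows the factor of 1000 that the first term probability is meant to compensate for.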

I will try to write a few lines about short text skip research. Some want it, although full text skip range research also includes short text skips. The first term probability will basically be the number of expected text occurrences in the text skip range we research in, not compensated for the text skip, since the linearly falling curve is almost horizontal for short text skips. The conversion from text to matrix probability for subsequent terms will basically be a text length conversion. The matrix text corresponds to the lowest part of the area below the almost horizontal curve, so expected occurrences then fall linearly with the text length. There are compensations for less interesting factors that can be made, but the statements above are the basics. There may be exceptions; I have only pondered the basics this time.
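A rough sketch of the two statements above, under assumptions that are not part of the text: a text length of 304,805 letters for the Koren Torah, a curve whose height at a given skip is proportional to the number of letters where a term can still start, and a hypothetical matrix of 3,000 letters with 2.0 expected text occurrences. The function names are invented for this illustration, and the plain proportional scaling is only one reading of "text length conversion".

```python
TEXT_LETTERS = 304_805  # ASSUMED letter count of the Koren Torah

def relative_height(skip: int, term_letters: int,
                    text_letters: int = TEXT_LETTERS) -> float:
    """Height of the linearly falling curve at this skip, relative to skip 0."""
    span = (term_letters - 1) * skip  # letters covered beyond the first one
    return max(0.0, (text_letters - span) / text_letters)

def matrix_expected(text_expected: float, matrix_letters: int,
                    text_letters: int = TEXT_LETTERS) -> float:
    """Plain text length conversion: scale expected occurrences by the
    share of the text that lies inside the matrix."""
    return text_expected * matrix_letters / text_letters

# How far the curve has fallen at skip 500 for a 4-letter term
drop = 1.0 - relative_height(500, 4)
print(f"curve drop at skip 500: {drop:.2%}")

# Hypothetical: 2.0 expected text occurrences, matrix of 3,000 letters
print(matrix_expected(2.0, 3_000))
```

Under these assumptions the curve loses less than half a percent of its height over the first 500 skips, which is why no skip compensation is applied in short text skip research.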

I can add that the R-value often used is almost always incorrect. We should always calculate the text probability for the text skip range we search in; the R-value calculation almost never does. Except at skip one, it gives the probability in a text skip range from minus to plus the text skip where we have the occurrence. The conversion from text to matrix R-value is basically correct only for short text skip search; proper conversion for full text skip range search is discussed in my long text from March 25, 2015. The R-value is known as a probability, but we can see above that it normally is not. But what is the R-value then?

The two R-value calculations together are probably consciously designed to favour short text skip occurrences and disfavour long text skip occurrences. Since the faults are very simple, this is the probable reason for them. The original researchers have a religious belief that short text skip findings are less probable to get by chance. They also think that they have proven it empirically. But if you expect to get your findings at short text skips, you will probably search more carefully for findings in that region. The R-value seems to adapt the probability figures to their personal religious belief. This is maybe very reasonable for the original researchers. The problem is that the R-value for a finding is presented as a probability, and that is almost never correct.

If we simplify only a little bit, the linearly falling curve proves that short text skip occurrences are in reality the most probable to get by chance. According to the R-value they are the least probable. The explanation is that letter sequences at longer text skips are increasingly cut off by the end of the text in both directions, so as the skip gets longer, more and more of them cannot form an occurrence. This means that the number of occurrences at our first 500 text skips is very much larger than at our last 500. You can simply test it yourself; use a 4-letter term to get enough occurrences. If you have 400 occurrences at the lowest 500 text skips, your expected occurrences are only 1 at the highest 500. And the latter skip region should have the highest probability for occurrences according to the R-value. I don't know how I got my first example a bit wrong, I know the nature of this curve. I have chosen 400 occurrences to get a fair comparison even if you have one or more unusual letters in your term, so that such terms can still be expected to have a number of occurrences in the intervals. In the Koren version of the Torah commonly used for Bible Code research, the longest text skip for 4-letter terms is 101 601. I should add that the R-value is defined as the base 10 logarithm of the expected occurrences, with reversed sign.
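The comparison between the lowest and the highest 500 text skips can be reproduced with a small counting sketch. It rests on assumptions not stated above: a Koren Torah length of 304,805 letters (which is consistent with the longest skip of 101 601 quoted for 4-letter terms), and expected occurrences at a skip that are proportional to the number of letters where the term can start without running past the end of the text. The function names are invented for this illustration. Counting both reading directions doubles both sums and leaves the ratio unchanged.

```python
import math

TEXT_LETTERS = 304_805  # ASSUMED letter count of the Koren Torah

def start_positions(skip: int, term_letters: int,
                    text_letters: int = TEXT_LETTERS) -> int:
    """Letters where a term can start at this skip and still fit in the text."""
    return max(0, text_letters - (term_letters - 1) * skip)

def max_skip(term_letters: int, text_letters: int = TEXT_LETTERS) -> int:
    """Longest skip at which the term still fits in the text."""
    return (text_letters - 1) // (term_letters - 1)

def positions_in_range(first: int, last: int, term_letters: int) -> int:
    """Total start positions over skips first..last (one reading direction)."""
    return sum(start_positions(d, term_letters) for d in range(first, last + 1))

def r_value(expected: float) -> float:
    """R-value as defined above: base 10 log of expected occurrences, sign reversed."""
    return -math.log10(expected)

top = max_skip(4)                              # longest skip for a 4-letter term
low = positions_in_range(1, 500, 4)            # lowest 500 skips
high = positions_in_range(top - 499, top, 4)   # highest 500 skips
expected_high = 400 * high / low               # scaled from 400 occurrences at the low end

print(top, low, high)
print(round(expected_high, 3), round(r_value(400), 2), round(r_value(expected_high), 3))
```

With these assumptions, 400 occurrences at the lowest 500 skips scale to just under 1 expected occurrence at the highest 500, in line with the figure given above.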