Bible Code Probabilities

Written by Lars Bobeck
14 July 2024

I no longer have the Bible Code text here, but for those who downloaded or printed the final version of March 25, 2015, I have some very late additional text. You can still find the final version on the Wayback Machine; search for https://larbob.se/BCP.html
First an explanation. Cropped matrix means that the term occurrences together are framed in a tight rectangle, to estimate proximity. I have understood that my first term probability, the one used in the final cropped matrix, can be hard to understand. Two examples can clarify it.

If we have 1000 expected and 1000 real occurrences of the first term, we will get 1000 matrices in total. According to the Bonferroni theorem, the best cropped matrix is then expected to have a probability 1000 times better than the real probability, if the search area is possible in all cases. The reasonable way to compensate for this is to use the text probability for the first term, compensated for the skip as told in the text. Notice that my kind of probabilities should be called expected occurrences to be accepted by orthodox mathematicians; for small values they are almost the same as ordinary probabilities.

For small text probabilities and one occurrence we should be consistent and still use the skip compensated text probability for the first term in the cropped matrix. If a term has a 10 times lower probability in the text, it must reasonably also have a 10 times lower probability in the cropped matrix (for the same text skip). When the expected occurrences differ from the real ones, the real number cannot alter the probability.

Notice that the linearly falling curve I have written about can be crucial, especially if the alternative is the R-value. I have not seen it accepted anywhere else. It determines the skip compensation above.

The text contains a number of less interesting details, to be reasonably complete. My attempt to be verbally precise in the text has made parts of it a bit difficult to understand. I have assumed that we choose, as our final matrix, the cropped matrix that is least probable to get by chance.
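The Bonferroni argument above, with 1000 first term occurrences, can be illustrated by a minimal sketch. It assumes the matrices are independent and that each one has the same small chance of reaching a given quality; the function name and the figure 1e-6 are only illustrative and are not taken from the text.

```python
def chance_over_many_matrices(p_single, n_matrices):
    """Chance that at least one of n_matrices reaches a result
    that a single matrix reaches with probability p_single.

    Returns (exact, bound): 'exact' assumes independent matrices
    (an assumption of this sketch), 'bound' is the Bonferroni
    upper limit n * p, capped at 1.
    """
    exact = 1.0 - (1.0 - p_single) ** n_matrices
    bound = min(1.0, n_matrices * p_single)
    return exact, bound


# Hypothetical figures: one matrix in a million is this good by chance,
# and the first term gives us 1000 matrices to choose the best from.
exact, bound = chance_over_many_matrices(1e-6, 1000)
print(f"exact: {exact:.6g}   Bonferroni bound: {bound:.6g}")
```

For small probabilities the two figures nearly coincide, which is why choosing the best of 1000 matrices makes a result look roughly 1000 times better than it is.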

I will try to write a few lines about short text skip research. Some want it, although full text skip range research also includes short text skips. The first term probability will basically be the number of expected text occurrences in the text skip range we search in. It is not compensated for the text skip, since the linearly falling curve is almost horizontal for short text skips. The conversion from text to cropped matrix probability for subsequent terms will basically be half the figure from text length conversion. The matrix text corresponds to the lowest triangular part of the area below the almost horizontal curve, so the expected occurrences fall to the average height of the triangle. As usual, my figures should not be considered precise.

I can add that the commonly used R-value is almost always incorrect. We should always calculate the text probability for the text skip range we search in; the R-value calculation almost never does. An example: for full skip range research it gives the probabilities in skip ranges from minus to plus the skips where the terms occur. The conversion from text to cropped matrix R-value is basically correct only for short text skip search; the proper conversion for full text skip range search is basically the square of the text length relation. The R-value is known as a probability, but we can see above that it normally is not. But what, then, is the R-value?
The two R-value calculations together are probably consciously designed to favour short text skip occurrences and disfavour long text skip occurrences. Since the faults are very simple, this is the probable reason for them. The original researchers have a personal religious belief that short text skip findings are less probable to get by chance. They also think that they have proven it empirically. But if you expect to get your findings at short text skips, you will probably search more carefully for findings in that region. The R-value seems to adapt the probability figures to their personal religious belief. This may be very reasonable for the original researchers. The problem is that the R-value for a finding is presented as a probability, and that is almost never correct.

If we simplify only a little bit, the linearly falling curve proves that short text skip occurrences are in reality the most probable to get by chance. According to the R-value they are the least probable. The explanation is that letter sequences at longer text skips are more and more cut off by the ends of the text in both directions, so as the skip gets longer, more and more of them cannot form an occurrence. This means that the number of occurrences at our first 500 text skips is very much bigger than at our last 500. You can simply test it yourself; use a 4-letter term to get enough occurrences. If you have 400 occurrences at the lowest 500 text skips, your expected occurrences at the highest 500 are only 1. And according to the R-value, the latter skip region should have the highest probability for occurrences. I have chosen 400 occurrences to get a fair comparison even if you have one or more unusual letters in your term, so that the term can still be expected to have a number of occurrences in the intervals. In the Koren version of the Torah commonly used for Bible Code research, the longest text skip for 4-letter terms is 101 601. The total number of letters in that version of the Torah is 304 805, and normally we have almost no skip to the first letter. I should add that the R-value is defined as the base 10 logarithm of the expected occurrences, with reversed sign.
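The figures in this paragraph can be checked with a short calculation. The sketch below assumes that the expected occurrences at a given skip are proportional to the number of possible start positions for the term at that skip, which is the text length minus three skips for a 4-letter term. It counts one skip direction only; counting both directions doubles both sums and leaves the ratio unchanged. The function names are only illustrative.

```python
TEXT_LENGTH = 304_805  # letters in the Koren Torah, as stated above


def max_skip(text_length, term_length):
    """Longest skip at which a term of term_length letters still fits."""
    return (text_length - 1) // (term_length - 1)


def start_positions(text_length, term_length, skip):
    """Possible start positions at one skip: the linearly falling curve."""
    return max(0, text_length - (term_length - 1) * skip)


def positions_in_range(text_length, term_length, first_skip, last_skip):
    """Sum of possible start positions over an inclusive skip range."""
    return sum(
        start_positions(text_length, term_length, skip)
        for skip in range(first_skip, last_skip + 1)
    )


longest = max_skip(TEXT_LENGTH, 4)
low = positions_in_range(TEXT_LENGTH, 4, 1, 500)
high = positions_in_range(TEXT_LENGTH, 4, longest - 499, longest)
ratio = low / high

print("longest skip for a 4-letter term:", longest)        # 101601
print("lowest 500 skips / highest 500 skips:", round(ratio))  # about 405
```

With these assumptions the ratio comes out at about 405 to 1, in line with the rounded figure of 400 occurrences against 1.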

Another R-value example is the conversion from text to cropped matrix R-value for subsequent terms. The R-value conversion is text length conversion. An ordinary 3-letter term with 4 million expected occurrences in the full text skip range gets a probability corresponding to about 20 000 occurrences in a 39 times 39 matrix, since the matrix text length is 0.5 percent of the text. This is obviously wrong. Long ago I both made a test and proved theoretically that the proper conversion is roughly square conversion, so in this example about 100 occurrences should be expected. This more detailed version of the text is probably the final one before I quit; to keep it, download or print it.
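The arithmetic of the 39 times 39 example can be written out as a short sketch. It assumes the text is the 304 805 letter Torah mentioned earlier, and it uses the definition of the R-value given above (base 10 logarithm of the expected occurrences, with reversed sign). The variable and function names are only illustrative.

```python
import math

TEXT_LENGTH = 304_805      # letters in the Torah text, as stated earlier
MATRIX_LETTERS = 39 * 39   # letters in a 39 times 39 matrix
EXPECTED_IN_TEXT = 4_000_000  # 3-letter term, full text skip range


def r_value(expected_occurrences):
    """R-value as defined above: -log10 of the expected occurrences."""
    return -math.log10(expected_occurrences)


fraction = MATRIX_LETTERS / TEXT_LENGTH      # about 0.005, i.e. 0.5 percent
linear = EXPECTED_IN_TEXT * fraction         # text length conversion
square = EXPECTED_IN_TEXT * fraction ** 2    # roughly square conversion

print(f"matrix share of the text: {fraction:.4%}")
print(f"text length conversion: {linear:.0f} occurrences, R = {r_value(linear):.2f}")
print(f"square conversion:      {square:.0f} occurrences, R = {r_value(square):.2f}")
```

The two conversions differ by a factor of about 200 in this example, which is a little more than two units of R-value.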