
Theology


Torah Codes

From: Chris Ho-Stuart <>
Newsgroups: aus.religion.christian
Subject: Equidistant Letter Sequences in Genesis
Date: 4 Nov 1998 01:21:05 GMT

In this essay, I will attempt to explain what Witztum, Rips and Rosenberg found in their now famous Statistical Science paper.[1] I will try to explain why their work is taken more seriously than that of other codes researchers, and why it is still not considered to support their conclusions.

What are the Torah Codes?

It is somewhat misleading to call the patterns discovered by Witztum, Rips, and Rosenberg (WRR) "codes".

The word "codes" was not used in the original WRR paper. The patterns they found were not in the form of coded information, unlike the patterns purportedly found by some other codes enthusiasts such as Michael Drosnin. The authors of the original WRR paper have spoken out strongly against the use of "codes" to predict future events, and I would guess that most readers here recognize that Drosnin's codes have no mathematical validity.

The methodology used by WRR is much more sophisticated, and does indeed have a sound mathematical basis. The paper is quite technical, but the concepts are not actually hard to grasp. This discussion is intended to explain the salient points of the dispute for interested readers with no mathematical background.

The idea of codes formed by taking letters spaced at regular intervals in text has been around for some time. Such coded words or phrases are called equidistant letter sequences (ELSs). Fascinating patterns can be, and have been, found with such codes in all kinds of texts, and particular attention is paid to related phrases or words which appear close to one another in the text.
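To make the idea concrete, here is a minimal sketch of an ELS search. It assumes the text has already been reduced to a plain string of letters with no spaces or punctuation. The function name find_els and the max_skip parameter are my own illustrative choices, not anything taken from WRR, whose search runs over the Hebrew text of Genesis.

```python
def find_els(text, word, max_skip):
    """Return (start, skip) pairs where `word` occurs as an ELS in `text`.

    A positive skip reads forwards through the text, a negative skip reads
    backwards. A skip of 1 is just ordinary reading.
    """
    hits = []
    n, k = len(text), len(word)
    if k < 2:
        return hits
    for skip in range(-max_skip, max_skip + 1):
        if skip == 0:
            continue
        for start in range(n):
            end = start + (k - 1) * skip
            if end < 0 or end >= n:
                continue  # the sequence would run off the text
            if all(text[start + i * skip] == word[i] for i in range(k)):
                hits.append((start, skip))
    return hits
```

For example, find_els("axcxxoxxdxxe", "code", 5) reports a single hit starting at index 2 with a skip of 3. In a long text with a generous max_skip, short words turn up as ELSs very often, which is why the mere existence of such patterns proves nothing.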

Where WRR stands out from other codes proponents is in the method they use to test whether associated words are closer than can be explained by chance alone.

WRR were the first to propose a mathematically based method for measuring the significance of such patterns. In the paper they apply their method to determine whether the names of famous Jewish rabbis were "close to" the dates of their birth or death.

In their paper they state how the lists of rabbis and dates were obtained. They define a single numeric quantity which captures the idea of how close the rabbis' coded names appear to their dates.

They then generate 999999 other lists by associating the names of one rabbi with the dates of another randomly chosen rabbi. With the one semantically correct list, this gives 1000000 lists altogether. All lists are then given a measure of closeness, and ranked accordingly. The experiment considers the rank of the one correct list in a race with the other 999999 random lists. Four such races were conducted, using different measures and name conventions.

The result is that the one correct list placed highly in the four races. Specifically, it placed 453, 5, 570 and 4 out of one million. The probability that a randomly chosen association of names to dates would place 4th or better in at least one of the four races is at most 0.000016.
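The race and the final figure can be sketched as follows. This is only an illustration under stated assumptions: toy_closeness is a stand-in of my own (it is not WRR's measure), the names race_rank and bonferroni are hypothetical, and the sketch simply shuffles dates among rabbis. The last function shows where 0.000016 comes from: four races, each with a 4 in 1,000,000 chance of a random list placing 4th or better.

```python
import random

def race_rank(names, dates, closeness, trials, seed=0):
    """Rank of the correct pairing among `trials` random re-pairings.

    Rank 1 means no random pairing scored strictly lower (closer).
    """
    rng = random.Random(seed)
    correct = closeness(names, dates)
    better = 0
    for _ in range(trials):
        shuffled = dates[:]
        rng.shuffle(shuffled)  # give each name some other rabbi's date
        if closeness(names, shuffled) < correct:
            better += 1
    return better + 1

def toy_closeness(names, dates):
    # Toy stand-in: treat names and dates as positions on a line;
    # a smaller total gap means "closer".
    return sum(abs(a - b) for a, b in zip(names, dates))

def bonferroni(best_rank, lists_per_race, races):
    """Upper bound on the chance that the best of `races` ranks is this good."""
    return races * best_rank / lists_per_race
```

Calling bonferroni(4, 1_000_000, 4) gives 0.000016, the figure quoted above. The multiplication by the number of races is the usual correction for having taken the best of several attempts.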

The authors conclude that the extent of the proximity of the ELSs for rabbi names with the ELSs of their dates of birth or death is not due to chance.

Though this is not explicit in WRR, the authors do suggest elsewhere that the proximity is due to a deliberate divine encoding of this knowledge at the time the Torah was written down.

What is the problem with the paper?

First, let's clarify what is not wrong with the paper. It does not contain mathematical errors. The conclusion -- that the performance of the correct list is not due to chance -- is solid.

A large number of criticisms have been made, but one in particular stands out from all the others, and that is the subjective nature of the list of names and dates used in the experiment.

It is important to note that this criticism does not constitute a charge of fraud. It is a straightforward question about the nature of the data, and it is central to the significance of the results of the paper.

The experiment involves lists of names and dates, and attempts to discover if dates are unusually close to the related names. The list was constructed by first identifying 32 famous rabbis. Then for each rabbi, a number of different names were given. The list of names is credited to Professor Havlin.

The question of subjectivity of the list is a question of how much freedom there was in choosing the names. Professor Havlin has recently made his criteria for choosing names public, and they do without any question involve a significant degree of arbitrary judgement. That is now a matter of public record.[2]

Furthermore, the extent of subjectivity in the choice of names was not something that referees of Statistical Science could be expected to judge. This is especially true since it is now a matter of public record that the published criteria for finding names were not actually the criteria used. The real criteria involved far more freedom than is apparent from the published paper.

WRR speaks only of a search of the "Responsa" database at Bar Ilan University. (This is a well known reference for debates on Jewish law.) In fact, a huge range of sources were used, and there was no protocol established prior to construction of the list for determining names. According to the authors of the paper, the names reflect the professional judgement of Professor Havlin.

The list uses a great many names, up to eleven for a single rabbi. Many other names could have been chosen, and so the lack of an established protocol for choosing the particular names makes the list, by definition, subjective.

A similar point applies to the dates of birth and death; a number of variant forms were used for writing these dates, up to six in the case of one rabbi. The choice of date forms was also a matter of judgement.

In choosing a list of names for the rabbis, there are a large number of cases where some judgement needs to be applied. If those choices were made differently, no correlation is measured at all. This has been verified by experiment.[3]

The implication of this is that the significance of the phenomenon identified in the paper applies as much to the construction of the list of names as it does to the construction of the text of Genesis.

That is, the correlation demonstrated in the paper is now known not to be a correlation between a body of text, and the dates of birth and death of rabbis. It is a correlation between two bodies of text. Since one of those bodies of text is due to the experimenters, the phenomenon found in the paper is not particularly interesting and certainly not scientifically or mathematically relevant.

This wraps up the case against the "codes". However, I do continue to look at a few other aspects of the experiment.

What is the alternative hypothesis?

The paper by Witztum, Rips and Rosenberg contains a very curious omission for a statistical paper: the absence of a clear hypothesis being tested by the experiment.

Usually a statistical experiment of this kind is intended to test a hypothesis. For example, we might hypothesize that the book of Genesis contains descriptions of the lives of various famous rabbis encoded by reading letters in a different order.

As a hypothesis, of course, this is far too vague. A "different order" is not something which can be tested; and indeed WRR considers a specific kind of order. They consider ELS patterns. There are a number of other such choices; they look for so-called minimum skip ELS patterns, by applying a weighting function that gives reduced influence to ELSs with longer skips. They also use an extraordinarily complex measure of closeness; certainly not one which would be naturally proposed by a statistician.

We do not need to delve into the actual mathematical methods used to weight minimality, or to measure closeness. The question is: what, exactly, are the actual measures really trying to capture? That is, what is the research hypothesis to be confirmed by the experiment? WRR is rather unclear on this important point.

Whatever effect is proposed as an alternative hypothesis, it should be capable of explaining the following points.

(1) The phenomenon is not a feature of the Torah, or of the Bible. It shows up only in Genesis, and is quite absent in the other four books of the Torah. Singling out Genesis for testing must thus be added to a long list of subjective decisions which further dilute the research hypothesis.

(2) The measure of closeness used by WRR is sensitive to fairly subtle changes in the relative locations of names and dates. Thus, although names and dates in the correct list are shown to be closer on average than in the permuted lists, it is still the case that nearly all the rabbis are closest of all to the wrong dates.

(3) The performance of the "correct" list ranks highly in a race with random permutations, but it does not win. That is, this is not a "code". There are a huge number of random permutations which perform better than the "correct" list, and so it is not possible to identify the correct list using the reported phenomenon.

The distance calculation.

WRR plainly state that the particular formulae used to calculate distance could be chosen differently. In their own words:

"We stress that our definition of distance is not unique. Although there are certain general principles (like minimizing the skip d) some of the details can be carried out in other ways. We feel that varying these details is unlikely to affect the results substantially. Be that as it may, we chose one particular definition, and have, throughout, used _only_ it, that is, the function c(w, w') described in appendix A.2 of the Appendix had been defined before any sample was chosen, and it underwent no changes."

This extract illustrates a couple of points which need to be kept in mind when evaluating the paper.

(1) There are some general principles which have been adopted, such as "minimum skip". The paper nowhere gives any research hypothesis that justifies this principle. They only state that they obtain a statistical anomaly when they focus attention on minimum skip ELSs. Since there is no clear hypothesis to justify minimum skip, this becomes yet another subjective choice; and the significance of the measured phenomenon is further diluted.

(2) The authors do make some testable predictions, of a kind which would not normally be tested in pre-publication review. They suggest that the anomalous result would be likely to persist with other distance measures. We now know from subsequent research that the authors were wrong. The effect reduces substantially with other distance measures tested, which suggests that whatever agency or cause is leading to the anomaly, it is also connected with the choice of the distance function.[4]

(3) The authors recognize the problem presented by subjectivity. They assert that the function was chosen before any sample was chosen, and it underwent no changes.

(4) As a matter of fact, choosing the function before the sample only prevents tuning of the function to the sample. It actually assists in tuning the sample to the function. What *should* have been done to verify true independence is either to have the function chosen by an independent person (since they concede that other functions could be chosen) and kept secret until after the data samples were chosen, or else to run the tests on a range of functions. The latter has now been done, as indicated above, establishing that the data and the function are in fact strongly correlated.

The particular highly complex distance measure used in WRR is a holdover from earlier research conducted in the eighties, in which they attempted to calculate probabilities directly. This earlier method was mathematical nonsense; the numbers calculated were not probabilities at all, and the method made many invalid independence assumptions. This has been pointed out by a number of statisticians.

This fact is implicitly conceded by the authors. In the WRR paper, appendix A.5, the numbers P1 to P4 are defined, and justified in the following terms (for example):

"If the c(w,w') were independent random variables [..] then P2 would be the probability that the product PIs(w,w') is as small as it is, or smaller. But as before, we do not use any such uniformity or independence assumptions. Like P1, the statistic P2 is calibrated in probability terms; but [...] one should think of it simply as an ordinal index that enables [comparisons of permutations]."

What is not stated here is that the original research *did* treat the numbers as probabilities! The suggestion of using comparisons with permutations was made by a referee, almost certainly to overcome this defect. However, the authors did not (despite Witztum's subsequent remarks to the contrary!) use the referee's suggestion directly. They continued to use their rather strange "probabilities" in the permutations race (as described above) and (unsurprisingly!) they obtained an exceptionally good result.

Had they actually used the referee's suggestion the result would have been orders of magnitude worse, and the paper would most likely have been rejected as not demonstrating the purported effect.
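The independence assumption in the quoted passage can be made concrete with a short sketch. If n numbers really were independent and uniformly distributed between 0 and 1, the probability that their product is at most x has a known closed form, and as I read the appendix, P2 is a function of this form applied to the product of the c(w,w') values. Treat that reading, and the names product_cdf and simulate, as my own assumptions rather than a quotation from the paper.

```python
import math
import random

def product_cdf(x, n):
    """P(U1 * ... * Un <= x) for n independent Uniform(0, 1) variables."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    t = -math.log(x)
    # Closed form: x * (1 + t + t^2/2! + ... + t^(n-1)/(n-1)!)
    return x * sum(t ** i / math.factorial(i) for i in range(n))

def simulate(x, n, trials, seed=0):
    """Monte Carlo estimate of the same probability, for comparison."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        prod = 1.0
        for _ in range(n):
            prod *= rng.random()
        if prod <= x:
            hits += 1
    return hits / trials
```

The two functions agree closely when the inputs really are independent and uniform. The formula says nothing when they are not, which is why a number computed this way from the c(w,w') values can only serve as an ordinal index and not as a probability.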

Some other curious effects of the actual measure used in WRR are worthy of note. It is quite brittle. The ranking of the "correct" list can be made many times worse by changing dates for a rabbi whose name does not even appear in Genesis as an ELS! Two of the rabbis in the list have no associated correct dates -- and yet have a strong effect on the result! That is, the measure used incorporates a significant component of noise.

Also the complexity of the measurement function conceals a number of fairly arbitrary choices intended to assist programming such a complex function. For example, (appendix A.2 of WRR) a cutoff is applied to obtain an expected number of ELSs for a word which was chosen to be 10. A number of other such choices can be found in appendix A.3. We do not need to know the mathematical significance of the number 10, or how it is used. All we need to know is that the results have since been tested using a range of other numbers: nearly always with a corresponding reduction in the published significance level. A paper giving the details of these experiments is in preparation by Brendan McKay and others.

The text used.

The authors selected a particular text of Genesis. There are in fact a number of texts which could have been used in its place.

It is important to note that these different texts are not analogous to the many vastly different versions of the Christian bible which are in common use. They differ only in a few words or spellings, usually. However, such subtle differences have a powerful effect on the patterns of ELSs studied.

Traditional Orthodox Jewish thought does consider the Torah divine as given. There is no question of introducing minor variations to make a "better" translation. The Torah is the Torah, and it should not be changed. However, this ideal of single Torah does not mean that Orthodox Jewish thought insists that there have been no changes. There is a long tradition in Jewish study of the Torah of focusing on textual problems involving extra words or letters; and there is no justification for claiming the Koren edition (used by WRR) as the one perfect edition. When the same experiment is run on other editions, the reported effect is always substantially reduced.

A comprehensive discussion on textual issues is provided by Professor Jeffrey Tigay at http://www.sas.upenn.edu/~jtigay/codetext.html

The question of fraud.

It should be emphasized that none of the points made above constitute an allegation of fraud.

A long string of coincidences has been identified, and it would be tempting to suggest that this means the authors carefully made each of these choices to assist the result. That does not follow, and this point should be explained.

First, consider the list of choices made, all of which have a significant effect on the given result.

(1) And by far the most important: the names chosen for the rabbis.

(2) The formats chosen for the dates.

(3) The distance function.

(4) Various tunable parameters of the chosen function.

(5) The choice of the book of Genesis rather than other books.

(6) The choice of a particular text for Genesis.

In each case, any attempt to make independent choices leads to a substantial reduction of the purported effect.

This does NOT mean that each of the choices was deliberately made in order to get a better result. In fact, that is almost certainly not what occurred.

We can conclude, however, that the published significance level is definitely highly subjective. Any proposed hypothesis which explains the result is not finding a code in the Torah. It is finding a code in the combination of Genesis, a list of names, a distance function, an edition, a set of tuning parameters, etc.

The most likely explanation is that the results are obtained because the names and dates were chosen after the other choices had been fixed, and that the choice of names was not independent of the other choices. All it takes to obtain the long list of coincidences listed is to have some tuning of the data to obtain a good result under the given experimental conditions.
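How little tuning is needed can be illustrated with a toy simulation. This is emphatically not a reconstruction of the WRR experiment or of any published replication: the scores are pure random noise, the function name tuned_rank is hypothetical, and "tuning" here just means keeping the best of a few equally arbitrary variants for each item.

```python
import random

def tuned_rank(n_items, n_variants, trials, seed=0):
    """Rank of a 'tuned' list in a race against random lists, in pure noise.

    Every variant of every item gets a random score with no real signal.
    The tuned list keeps the lowest-scoring variant of each item; its total
    is then ranked against `trials` lists built from single random scores.
    """
    rng = random.Random(seed)
    tuned = sum(min(rng.random() for _ in range(n_variants))
                for _ in range(n_items))
    better = 0
    for _ in range(trials):
        total = sum(rng.random() for _ in range(n_items))
        if total < tuned:
            better += 1
    return better + 1
```

With 32 items and only three variants to choose from for each, the tuned list will typically come at or very near the top of a race against thousands of untuned lists, even though there is no signal of any kind in the data. With a single variant per item (no freedom of choice) its rank is just an ordinary random draw.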

The paper does claim a high degree of independence in the choice of names, in that lists of names were prepared separately. (Although mathematically speaking, the non-independence of the lists of names and the testing functions has the status of a theorem.) However, the official account of the preparation of data is not entirely accurate, as has been shown by looking at lectures by Professor Rips in the mid eighties.[5] There appears to have been ample scope for some amount of information exchange which would invalidate the experiment.

Furthermore, the earlier reports of this "research" do not place the same high emphasis on independent generation of data and test conditions. It is a fairly clear case of investigators being misled by their own biases.

This does, of course, have implications for the competence of the investigators. But there is no reason to shy away from such a conclusion because it reflects badly on those concerned. One of the reasons for having independent review and investigation is to identify shoddy research.

Professor Rips' professional reputation has, I am quite sure, suffered a blow from which it will never fully recover. He should have recognized the sensitivity of the experiment to the text used for naming rabbis and giving the dates of death and birth. However, he turned over the task of preparing and testing the list to Witztum, and did not establish the kinds of controls needed to ensure independence.

The actual tuning of the sample was almost certainly not directly due to Rips, but to the first author of the paper: Doron Witztum.

The response.

WRR has almost no credible supporters. Mathematicians (excepting only Rips and Michelson, who are authors of "codes" papers) are unanimous on the subject. The vast majority of Orthodox Jewish Torah scholars consider it nonsense. No independent religious body has come out in support of this idea, and churches generally denounce it as misuse of scripture.

Three further articles are cited in my references from Jewish mathematicians who are concerned to show the errors in WRR.[6,7,8]

Why so much reaction? Vacuous mathematical papers get published from time to time without this level of response. It can't be that the response is from atheists determined to upset a dangerous proof of divine action -- the most vocal response is from those who DO consider the Torah to be divine.

And this explains the reaction. People who care about the Bible are the most vocal in refuting vacuous nonsense which, if left unchallenged, could only serve to bring the Bible into disrepute.

Conclusion

The paper describes, at first sight, an interesting phenomenon. The major defect -- subjectivity of the list of rabbis -- was not particularly apparent in the paper. Subsequent investigation has confirmed this defect beyond doubt. The phenomenon is not a feature of the text of Genesis. It is a feature of two texts -- Genesis, and the text of the data.

That is, the correlation is not with actual dates and rabbis. The correlation is with the way in which the experimenters chose to write down the dates and the rabbis.

The experiment does not set out sufficient controls on information exchange between the sample selection and the distance functions, and its conclusions are worthless.

References.

[1] Witztum, D., Rips, E. and Rosenberg, Y. "Equidistant Letter Sequences in the Book of Genesis" in Statistical Science, Vol 9, No 3, 1994, pp 429-438. Abridged version on-line at http://www.fortunecity.com/tattooine/delany/11/genesis.htm

[2] http://www.torahcodes.co.il/havlin.htm

[3] http://cs.anu.edu.au/~bdm/dilugim/report2.html

[4] Bar-Hillel, M., Bar-Natan, D. and McKay, B. "The Torah Codes: Puzzle and Solution" in Chance, Vol 11, No 2, 1998, pp 13-19. On-line at http://cs.anu.edu.au/~bdm/dilugim/Chance.pdf

[5] http://cs.anu.edu.au/~bdm/dilugim/ripslect/

[6] http://wopr.com/biblecodes/TheCase.htm

[7] http://www.ma.huji.ac.il/~kalai/bc.html

[8] http://cs.anu.edu.au/~bdm/dilugim/hasofer.html


