D. Smith, B. Buchanan
3/11/74

A note has recently appeared in this journal¹ entitled "Artificial Intelligence". This note criticized an earlier paper which used the same phrase in its title. Because we agree with this criticism, and because the term artificial intelligence is being used among chemists more and more frequently, we wish to distinguish statistical pattern recognition schemes from artificial intelligence programs.

Various statistical techniques subsumed by the term "pattern recognition" have been considered a sub-division of artificial intelligence in the past. The procedures for pattern recognition in the early 1960's were far less statistical and more oriented to semantic information processing than the pattern recognition work of the 1970's. More recently, the areas have diverged significantly because of fundamental differences in initial assumptions and computational procedures.

Although there is still no precise definition of artificial intelligence (AI), most workers in the area would agree that work in AI is characterized by its use of judgmental rules for reasoning about a problem. The judgmental rules, or heuristics, do not guarantee the solution to a problem. However, the heuristics do keep the reasoning steps of the program within the bounds of plausibility. That is, an AI program may not solve a problem, but its reasoning, if the judgmental rules are good, will not be considered totally irrelevant. A second dimension which may help to distinguish AI programs from others is that the problems of interest are generally more complex than one knows how to solve using a straightforward, algorithmic method.
Very typically, the problems are non-numerical, that is, they are not the kinds of problems one can solve with a set of simultaneous equations.

Computer programs with some degree of AI content are now being applied to chemical problems, for example, the analysis⁶ and synthesis⁷ of molecular structures. Even a cursory comparison of these reports with descriptions of applications of statistical procedures to chemical problems⁸,⁹ will reveal fundamental differences in methodology.

The statistical procedures, described variously as, for example, machine intelligence⁸ and pattern recognition⁹, can be valuable techniques if applied with discretion. The fundamental assumption is that there is some relationship between the experimental data and the property (for example, pharmacological activity) of interest. If this assumption is not correct, erroneous hypotheses may appear to be validated because of accidental clustering, as Clerc et al.¹ and Perrin³ᵃ have shown. In a sense, these statistical techniques can be called "[illegible]". Judgmental knowledge used routinely by chemists is not employed by these techniques. As long as theoretical reasons for clustering are lacking, interpretation of the results will be on a questionable footing. This is in sharp contrast to current AI programs, where the judgmental knowledge and the reasoning steps are well understood.

The assignment of the correct number of degrees of freedom poses some of the subtlest problems in statistical analysis. For example, consider the series of chemical names for the alkanes:

    odd        even
    methane    ethane
    propane    butane
    pentane    hexane
    heptane    octane

It will be noted that for these first eight examples, there prevails a perfect agreement between the parity of the name and of the molecular formula.
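The agreement noted above can be checked mechanically. The following is a minimal Python sketch. It assumes that "parity of the name" means whether the name has an odd or even number of letters, and that "parity of the molecular formula" means whether the carbon count n in C(n)H(2n+2) is odd or even. Both readings, and all function names, are assumptions made for illustration and are not stated in the text.

```python
# Illustrative check of the name/formula parity agreement for the first
# eight straight-chain alkanes. ASSUMPTIONS (not from the original note):
#   - "parity of the name"    = odd/even number of letters in the name
#   - "parity of the formula" = odd/even number of carbon atoms n
# The function names below are hypothetical helpers.

# Carbon count n -> name, for the eight alkanes listed in the text.
ALKANES = {
    1: "methane",
    2: "ethane",
    3: "propane",
    4: "butane",
    5: "pentane",
    6: "hexane",
    7: "heptane",
    8: "octane",
}


def parity(k: int) -> str:
    """Return 'odd' or 'even' for a non-negative integer."""
    return "odd" if k % 2 else "even"


def name_matches_formula(n_carbons: int, name: str) -> bool:
    """True when letter-count parity equals carbon-count parity."""
    return parity(len(name)) == parity(n_carbons)


def agreement_table(alkanes: dict) -> list:
    """Rows of (n, name, parity of name length, parity of n), sorted by n."""
    return [
        (n, name, parity(len(name)), parity(n))
        for n, name in sorted(alkanes.items())
    ]


if __name__ == "__main__":
    for n, name, name_parity, formula_parity in agreement_table(ALKANES):
        print(f"C{n:<2} {name:<8} name: {name_parity:<5} formula: {formula_parity}")
    agree = sum(name_matches_formula(n, name) for n, name in ALKANES.items())
    print(f"{agree} of {len(ALKANES)} names agree in parity with the formula")
```

Under the same assumed reading, the next homologue, nonane (nine carbons, six letters), would not agree, which is consistent with the text restricting the claim to the first eight examples.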
The statistical significance of this correlation is not to be defended, and its material and historical basis, if any, is a matter of linguistic rather than chemical theory. This "clustering" may, however, well be transmitted to thousands of derivatives whose names may then exhibit highly significant correlations with other properties. This may seem a trivial level of correction. However, more generally, one must keep in mind that the sample of compounds on which characteristic data are available is always highly selected to start with, and that conventional statistical methods may be unable to remove the variety of sources of confounding. On the other hand, pattern analysis may be a valuable approach to the furthering of speculations about functional signatures, which can then be subjected to further study for their possible theoretical significance.

References

1. J. T. Clerc, P. Naegeli, and J. Seibl, Chimia, 27.
2. K. H. Ting, R. C. T. Lee, G. W. A. Milne, M. Shapiro, and A. M. Guarino, Science, 180 (1973) 417.
3. For additional criticism and a rebuttal, see a) C. L. Perrin, Science, 183 (1974) 551; b) K. H. Ting, Science, 183 (1974) 552.
4. E. A. Feigenbaum and J. Feldman, eds., "Computers and Thought", McGraw-Hill, New York, 1963, pp. 235 ff.
5. ? Newell
6. D. H. Smith, B. G. Buchanan, R. S. Engelmore, A. M. Duffield, A. Yeo, E. A. Feigenbaum, J. Lederberg, and C. Djerassi, J. Amer. Chem. Soc., 94 (1972) 5962.
7. a) E. J. Corey and W. T. Wipke, Science, 166 (1969) 178; b) E. J. Corey, W. T. Wipke, R. D. Cramer III, and W. T. Howe, J. Amer. Chem. Soc., 94 (1972) 421; c) H. Gelernter, N. S. Sridharan, A. J. Hart, S. C. Yen, F. W. Fowler, and J. J. Shue, Fortschr. Chem. Forsch., 41 (1973) 113.
8. T. L. Isenhour and P. C. Jurs, Anal. Chem., 43 (1971) 20A (August).
9. B. R. Kowalski and C. F. Bender, J. Amer. Chem. Soc., 94 (1972) 5632.
10. B. R. Kowalski and C. F. Bender, J. Amer. Chem. Soc., 96 (1974) 916.