PROTEAN Project 5P41-RR00785-13

PI: Edward A. Feigenbaum and Bruce G. Buchanan
Agency: Boeing Computer Services Corporation
Grant identification number: W-266875
Total award period and amount: 2/1/85 - 1/31/85, $225,000 (direct and indirect)
Current award period and amount: 2/1/85 - 1/31/85, $225,000 (direct and indirect)
PROTEAN component is $11,400 (direct and indirect), or 5% of grant

Title: Knowledge-Based Systems Research
PI: Edward A. Feigenbaum
Agency: Defense Advanced Research Projects Agency
Grant identification number: N00039-86-0033
Total award period and amount: 10/1/85 - 9/30/88, $4,130,230 (in negotiation) (direct and indirect)
Current award period and amount: 10/1/85 - 9/30/86, $1,224,241 (direct and indirect)
PROTEAN component is $23,175, or 1.9% of grant total

II. INTERACTIONS WITH THE SUMEX-AIM RESOURCE

A. Medical Collaborations

Several members of Prof. Jardetzky's research group are involved in this research.

B. Interactions with other SUMEX-AIM projects

Members of the PROTEAN project visited Robert Langridge's laboratory at the University of California, San Francisco last year, and informal discussions with him and his group have continued this year.

C. Critique of Resource Management

The SUMEX staff has continued to be most cooperative in supporting PROTEAN research. The SUMEX computer facility is well maintained and managed for effective support of our work.

III. RESEARCH PLANS

A. Goals & Plans

Our long-range goal is to build an automatic interpretation system similar to CRYSALIS (which worked with x-ray crystallography data). In the shorter term, we are building interactive programs that aid in the interpretation of NMR data on small proteins. The current version of PROTEAN has five domain and five control knowledge sources that demonstrate the reasoning techniques described above.
These knowledge sources develop partial solutions that position multiple alpha helices, random coils, and beta structures at the Solid level and refine those helices using distance, surface, and volume constraints. The proposed research would expand PROTEAN to include knowledge sources that:

1. merge highly constrained partial solutions at the Solid level;
2. refine Solid level solutions in terms of the relative positions of constituent peptide units and side chains at the Superatom level;
3. further restrict the locations of peptide units and side chains relative to one another at the Superatom level;
4. propagate emergent constraints at the Superatom level back up to the Solid level to further restrict the relative positions of superordinate helices, beta sheets, and random coils;
5. refine Superatom level solutions at the Atom level;
6. further restrict the locations of atoms relative to one another;
7. propagate emergent constraints at the Atom level back up to the Superatom level to further restrict the relative positions of superordinate peptide units and side chains;
8. select instances of structures to be used as starting points for other kinds of refinement procedures, such as the solution of the Bloch equations, which define the NMR spectrum that can possibly arise from a given structure. These equations provide a very strong test of the correctness of our method, as well as providing an additional constraint on proposed structures;
9. position non-protein components with respect to partial solutions to proteins. Cofactors such as the heme group in myoglobin may be very constraining and lead to better structures;
10. develop efficient and effective control strategies for the solution of small, intermediate, and large molecules;
11. reason about mobility of structures when the data indicate that mobility is possible.
The research will also develop a set of control knowledge sources to guide PROTEAN's application of constraints to identify the family of legal protein conformations as efficiently as possible. We expect to improve the graphics interface to provide more functionality and options for viewing partial structures.

B. Justification for continued SUMEX use

We will continue to use SUMEX for developing parts of the program before integrating them with the whole system. We are using Interlisp to implement the Blackboard model and knowledge structures most flexibly and quickly.

C. Need for other computing resources

In this stage of development we need more computer cycles for geometric computations. We expect to upgrade the Silicon Graphics IRIS terminal to a workstation for more efficiency in the subprograms doing computational geometry and for more effective graphical display of the results of PROTEAN.

IV.A.5. RADIX Project

The RADIX Project: Deriving Medical Knowledge from Time-Oriented Clinical Databases

Robert L. Blum, M.D., Ph.D.
Department of Computer Science
Stanford University

Gio C. M. Wiederhold, Ph.D.
Departments of Computer Science and Medicine
Stanford University

I. SUMMARY OF RESEARCH PROGRAM

A. Technical Goals - Introduction

Medical and Computer Science Goals -- The long-range objectives of our project, called RADIX (formerly RX), are 1) to increase the validity of medical knowledge derived from large time-oriented databases containing routine, non-randomized clinical data, 2) to provide knowledgeable assistance to a research investigator in studying medical hypotheses on large databases, and 3) to fully automate the process of hypothesis generation and exploratory confirmation. For system development we have used a subset of the ARAMIS database.

Computerized clinical databases and automated medical records systems have been under development throughout the world for at least a decade. Among the earliest of these endeavors was the ARAMIS Project (American Rheumatism Association Medical Information System), under development since 1969 in the Stanford Department of Medicine. ARAMIS contains records of over 17,000 patients with a variety of rheumatologic diagnoses. Over 62,000 patient visits have been recorded, accounting for 50,000 patient-years of observation. The ARAMIS Project has now been generalized to include databases for many chronic diseases other than arthritis.

The fundamental objective of the ARAMIS Project and many other clinical database projects is to use the data that have been gathered by clinical observation in order to study the evolution and medical management of chronic diseases. Unfortunately, the process of reliably deriving knowledge has proven to be exceedingly difficult. Numerous problems arise stemming from the complexity of disease, therapy, and outcome definitions, from the complexity of causal relationships, from errors introduced by bias, and from frequently missing and outlying data. A major objective of the RADIX Project is to explore the utility of symbolic computational methods and knowledge-based techniques in solving some of these problems.

The RADIX computer program is designed to examine a time-oriented clinical database such as ARAMIS and to produce a set of (possibly) causal relationships. The algorithm exploits three properties of causal relationships: time precedence, correlation, and nonspuriousness. First, a Discovery Module uses lagged, nonparametric correlations to generate an ordered list of tentative relationships. Second, a Study Module uses a knowledge base (KB) of medicine and statistics to try to establish nonspuriousness by controlling for known confounders.

The principal innovations of RADIX are the Study Module and the KB. The Study Module takes a causal hypothesis obtained from the Discovery Module and produces a comprehensive study design, using knowledge from the KB. The study design is then executed by an on-line statistical package, and the results are automatically incorporated into the KB. Each new causal relationship is incorporated as a machine-readable record specifying its intensity, distribution across patients, functional form, clinical setting, validity, and evidence. In determining the confounders of a new hypothesis the Study Module uses previously "learned" causal relationships.

In creating a study design the Study Module follows accepted principles of epidemiological research. It determines study feasibility and study design: cross-sectional versus longitudinal. It uses the KB to determine the confounders of a given hypothesis, and it selects methods for controlling their influence: elimination of patient records, elimination of confounding time intervals, or statistical control. The Study Module then determines an appropriate statistical method, using knowledge stored as production rules. Most studies have used a longitudinal design involving a multiple regression model applied to individual patient records. Results across patients are combined using weights based on the precision of the estimated regression coefficient for each patient.

More recently, we have undertaken two new components of the RADIX program: a module for automated summarization of patient records, and a knowledge-based discovery module. The goal of the summarization program is to automatically create patient summaries of arbitrary and appropriate complexity as an aid for tasks such as clinical decision making, real-time patient monitoring, surveillance of quality of care, and eventually automated discovery.
This program builds on our experience with labelling patient records in RX, and is a natural extension of our work on the interface between AI and medical databases.

The goal of the knowledge-based discovery module is to overcome some of the limitations of the original, statistics-based, RX discovery module. In creating disease hypotheses, researchers make extensive use of notions of causation, mechanism of action, tempo, and quantitative sufficiency, as well as detailed knowledge of pathophysiology. We are seeking to automate this process of hypothesis formation by replicating selected discoveries in rheumatology using data from the ARAMIS database.

B. Medical Relevance and Collaboration

As a test bed for system development, our focus of attention has been on the records of patients with systemic lupus erythematosus (SLE) contained in the Stanford portion of the ARAMIS Data Bank. SLE is a chronic rheumatologic disease with a broad spectrum of manifestations. Occasionally the disease can cause profound renal failure and lead to an early death. With many perplexing diagnostic and therapeutic dilemmas, it is a disease of considerable medical interest.

In the future we anticipate possible collaborations with other project users of the TOD System such as the National Stroke Data Bank, the Northern California Oncology Group, and the Stanford Divisions of Oncology and of Radiation Therapy.

We believe that this research project is broadly applicable to the entire gamut of chronic diseases that constitute the bulk of morbidity and mortality in the United States. Consider five major diagnostic categories responsible for approximately two thirds of the two million deaths per year in the United States: myocardial infarction, stroke, cancer, hypertension, and diabetes. Therapy for each of these diagnoses is fraught with controversy concerning the balance of benefits versus costs.

1. Myocardial Infarction: Indications for and efficacy of coronary artery bypass graft vs. medical management alone. Indications for long-term antiarrhythmics ... long-term anticoagulants. Benefits of cholesterol-lowering diets, exercise, and so forth.
2. Stroke: Efficacy of long-term anti-platelet agents, long-term anticoagulation. Indications for revascularization.
3. Cancer: Relative efficacy of radiation therapy, chemotherapy, surgical excision - singly or in combination. Optimal frequency of screening procedures. Prophylactic therapy.
4. Hypertension: Indications for therapy. Efficacy versus adverse effects of chronic antihypertensive drugs. Role of various diagnostic tests such as renal arteriography in work-up.
5. Diabetes: Influence of insulin administration on microvascular complications. Role of oral hypoglycemics.

Despite the expenditure of billions of dollars over recent years for randomized controlled trials (RCT's) designed to answer these and other questions, answers have been slow in coming. RCT's are expensive in terms of funds and personnel. The therapeutic questions in clinical medicine are too numerous for each to be addressed by its own series of RCT's.

On the other hand, the data regularly gathered in patient records in the course of the normal performance of health care delivery are a rich and largely underutilized resource. The ease of accessibility and manipulation of these data afforded by computerized clinical databases holds out the possibility of a major new resource for acquiring knowledge on the evolution and therapy of chronic diseases.

The goal of the research that we are pursuing on SUMEX is to increase the reliability of knowledge derived from clinical data banks with the hope of providing a new tool for augmenting knowledge of diseases and therapies as a supplement to knowledge derived from formal prospective clinical trials.
Furthermore, the incorporation of knowledge from both clinical data banks and other sources into a uniform knowledge base should increase the ease of access by individual clinicians to this knowledge and thereby facilitate both the practice of medicine and the investigation of human disease processes.

The medical relevance of the automated summarization program is readily apparent. A practicing physician or medical researcher, faced with a patient chart, often with dozens of visits and scores of attributes, rarely has time to read the entire chart. He (or she) would like a succinct summary of the important events in that patient's record to assist in decision making. The objective of our current work is to implement an artificial-intelligence-based computer program that produces such summaries.

C. Highlights of Research Progress

C.1 April 1985 to April 1986

Our primary accomplishments in this period have been the following:

1) Design and implementation of a prototype automated summarization program.
2) Design of an intelligent medical hypothesis generator for the discovery module, partially implemented.
3) Identification and comparison of novel Exploratory Data Analysis (EDA) methods for use in automated discovery.
4) Installation of the KEE Knowledge Engineering Environment on the 1108's and training of project members in KEE.
5) Publication of papers on automated discovery and automated summarization, and presentation of results at medical conferences.
6) Training Post-Doctoral researchers, participants in RADIX, in methods of medical artificial intelligence research.

C.1.1 Design and implementation of a prototype automated summarization program

We have designed and implemented a prototype for the automated summarization program using KEE. The prototype labels and summarizes a small number of major events in time-oriented medical records of systemic lupus erythematosus patients.
The user interface provides an interactive, graphic representation of the record with active regions selectable by the user to display data supporting conclusions or to magnify a region showing greater detail of the patient record. The program uses a hypothetico-deductive algorithm. Disease states are evoked based on attributes with abnormal values, and the evoked disease frames are confirmed by matching their templates against the patient record using a temporal querying syntax. This produces likelihood ratios that are used for Bayesian updating of the evoked disease states. A knowledge base of definitions of medical objects and their causal relations, implemented in KEE, underlies the program. This work is described in Downs, 1986, noted in the publications section.

C.1.2 Design of an intelligent medical hypothesis generator for the Discovery Module

We have designed an intelligent medical hypothesis generator for the Discovery Module, which we are currently implementing. The design will evolve as we gain experience with the system. In contrast with the original RX Discovery Module, which relied on statistical methods, the new program relies on artificial intelligence methods. In this work we are interested in elaborating the cognitive mechanisms whereby disease hypotheses are formulated in terms of clinical events and the known inter-relationships among agents of disease and organ function. Clearly, in creating disease hypotheses researchers make extensive use of plausible notions of causation, mechanism of action, tempo, quantitative sufficiency, and so on.

Our overall methodology involves several stages: 1) analysis of important, previous discoveries from the
biomedical literature relevant to our patient database, 2) formulation of a theory of how these discoveries could have been made, 3) development of knowledge representation and reasoning mechanisms adequate to embody the theory in software, 4) computer simulation of the selected discoveries to evaluate and refine the implementation, 5) generalization of the program to other selected discoveries, and 6) operation of the program in a self-guided mode to seek previously unknown findings. This work is described in Walker, 1986, noted in the publications section.

C.1.3 Identification and comparison of novel Exploratory Data Analysis (EDA) methods for use in automated discovery

The field of Exploratory Data Analysis (EDA) is one of the roots of our work in automated discovery. Our work on statistical methods for hypothesis generation has sought to extend our earlier work on the RX Discovery Module, which was based on nonparametric correlations on all variables at several time lags. There are many cases in which data will show no correlation, but still have an interesting structure. For example, if the data fall into clusters, are non-linear, or follow a regular pattern, they may suggest an important relationship but have zero correlation. We have identified and compared a number of novel EDA methods for use in these situations, described in Walker, 1986, noted in the publications section.

C.1.4 Installation of the KEE Knowledge Engineering Environment on the 1108's and training of project members in KEE

We have been fortunate in obtaining use of the KEE Knowledge Engineering Environment, a very valuable tool for aiding development of artificial intelligence software. KEE, from IntelliCorp, costs approximately $35,000, but was supplied to us without charge. KEE is installed on our machines, and project members are now using KEE for program development.
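
The remark in C.1.3 that data can have zero correlation yet still carry an interesting structure can be illustrated with a small sketch. The data below are invented for illustration (a symmetric, U-shaped dependence of y on x), and the function name `pearson` and the folding transformation are assumptions of this example; they are not the EDA methods compared in Walker, 1986, nor the nonparametric correlations used by the RX Discovery Module.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: y depends only on the squared deviation of x from zero,
# a symmetric, non-linear relationship.
xs = list(range(-5, 6))
ys = [x * x for x in xs]

# A plain correlation sees no linear trend in the raw data.
r_raw = pearson(xs, ys)

# Folding x about zero (one simple exploratory re-expression) exposes the
# dependence that the raw correlation missed.
r_folded = pearson([abs(x) for x in xs], ys)

print(f"correlation of x with y:   {r_raw:.3f}")
print(f"correlation of |x| with y: {r_folded:.3f}")
```

In this toy case the raw correlation is essentially zero while the folded one is close to 1, which is the kind of situation in which a purely correlation-based screen would discard a relationship worth examining.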

C.1.5 Publication of papers on automated discovery and automated summarization, and presentation of results at medical conferences

In addition to the publications noted above, we have submitted and/or had accepted additional papers, noted in the section on publications, and presented results at numerous medical conferences.

C.1.6 Training Post-Doctoral researchers, participants in RADIX, in methods of medical artificial intelligence research

We have been training three post-doctoral researchers on the project during the current reporting year: Steven Downs, M.D., Isabelle de Zegher-Geets, M.D., and Donald Rucker, M.D. Steven Downs will complete a thesis as part of Stanford's Medical Information Sciences program this June; Rucker and de Zegher-Geets will undertake theses in the coming year.

C.2 Research in Progress

Our current research carries forward the work in automated summarization and automated discovery described above. Specifically, we are 1) implementing the intelligent discovery module, and evaluating and modifying its design as we get initial results, and 2) substantially expanding the prototype automated summarization module to be able to deal with a full patient record.

We continue to work on problems involved in the representation of medical knowledge, as part of developing the programs for summarization and discovery. These programs act both as test beds for the extant knowledge representation techniques and as forcing functions for the development of new techniques.

D. Publications

1. Blum, R.L. and Walker, M.G.: Lisp as an Environment for Software Design. Invited paper for Tenth Annual Symposium on Computers in Medical Care (SCAMC), Washington, D.C., October 1986.
2. Blum, R.L.: Computer-Assisted Design of Studies using Routine Clinical Data: Analyzing the Association of Prednisone and Cholesterol. (Accepted for publication in the Annals of Internal Medicine.)
3. Blum, Robert L. and Gio C.M. Wiederhold: Studying Hypotheses on a Time-Oriented Clinical Database: An Overview of the RX Project. In J.A. Reggia and S. Tuhrim: 'Computer Assisted Medical Decision-Making', Springer-Verlag, 1985, pp. 245-253.
4. Blum, R.L.: Two Stage Regression: Application to a Time-Oriented Clinical Database. Knowledge Systems Laboratory Technical Report, 1985.
5. Blum, R.L.: Modeling and encoding clinical causal relationships. Proceedings of SCAMC, Baltimore, MD, October, 1983.
6. Blum, R.L.: Representation of empirically derived causal relationships. IJCAI, Karlsruhe, West Germany, August, 1983.
7. Blum, R.L.: Machine representation of clinical causal relationships. MEDINFO 83, Amsterdam, August, 1983.
8. Blum, R.L.: Clinical decision making aboard the Starship Enterprise. Chairman's paper, Session on Artificial Intelligence and Clinical Decision Making, AAMSI, San Francisco, May, 1983.
9. Blum, R.L. and Wiederhold, G.: Studying hypotheses on a time-oriented database: An overview of the RX project. Proc. Sixth SCAMC, IEEE, Washington, D.C., October, 1982.
10. Blum, R.L.: Induction of causal relationships from a time-oriented clinical database: An overview of the RX project. Proc. AAAI, Pittsburgh, August, 1982.
11. Blum, R.L.: Automated induction of causal relationships from a time-oriented clinical database: The RX project. Proc. AMIA, San Francisco, 1982.
12. Blum, R.L.: Discovery and Representation of Causal Relationships from a Large Time-oriented Clinical Database: The RX Project. In D.A.B. Lindberg and P.L. Reichertz (Eds.), Lecture Notes in Medical Informatics, Springer-Verlag, 1982.
13. Blum, R.L.: Discovery, confirmation, and incorporation of causal relationships from a large time-oriented clinical database: The RX project. Computers and Biomed. Res. 15(2):164-187, April, 1982.
14. Blum, R.L.: Discovery and representation of causal relationships from a large time-oriented clinical database: The RX project (Ph.D. thesis). Computer Science and Biostatistics, Stanford University, 1982.
15. Blum, R.L.: Displaying clinical data from a time-oriented database. Computers in Biol. and Med. 11(4):197-210, 1981.
16. Blum, R.L.: Automating the study of clinical hypotheses on a time-oriented database: The RX project. Proc. MEDINFO 80, Tokyo, October, 1980, pp. 456-460. (Also STAN-CS-79-816)
17. Blum, R.L. and Wiederhold, G.: Inferring knowledge from clinical data banks utilizing techniques from artificial intelligence. Proc. Second SCAMC, IEEE, Washington, D.C., November, 1978.
18. Blum, R.L.: The RX project: A medical consultation system integrating clinical data banking and artificial intelligence methodologies. Stanford University Ph.D. thesis proposal, August, 1978.
19. Downs, S., Walker, M.G. and Blum, R.L.: Automated Summaries of Patient Medical Records. Accepted for Medinfo '86.
20. Kuhn, I., Wiederhold, G., Rodnick, J.E., Ramsey-Klee, D.M., Benett, S., Beck, D.D.: Automated Ambulatory Medical Record Systems in the U.S., to be published by Springer-Verlag, 1983, in Information Systems for Patient Care, B. Blum (ed.), Section II, Chapter 14.
21. Walker, M.G.: How Feasible is Automated Discovery? Knowledge Systems Laboratory Technical Report KSL 86-35. Submitted to IEEE Expert.
22. Walker, M.G., and Blum, R.L.: Towards Automated Discovery from Clinical Databases: The RADIX Project. Accepted for Medinfo '86.
23. Walker, M.G., Blum, R.L., and Fagan, L.M.: Minimycin: A Miniature Rule-Based System. M.D. Computing, Vol. 2, No. 4, 1985.
24. Walker, M.G., and Blum, R.L.: An Introduction to LISP. M.D. Computing, Vol. 2, No. 1, 1985.
25. Wiederhold, Gio, Robert L. Blum, and Michael Walker: An Integration of Knowledge and Data Representation. Proc. of Islamorada Workshop, Feb. 1985, Computer Corporation of America, Cambridge MA; Chapter 23 of 'On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies' (Brodie, Mylopoulos, and Schmidt, eds.), Springer-Verlag, June 1986.
26. Wiederhold, Gio: Knowledge versus Data. Chapter 9 of 'On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies' (Brodie, Mylopoulos, and Schmidt, eds.), Springer-Verlag, Feb. 1986.
27. Missikoff, Michele and Gio Wiederhold: Towards a Unified Approach for Expert and Database Systems. In 'Expert Database Systems', Larry Kerschberg (editor), Benjamin/Cummings, 1986, pp. 383-399; also in Proceedings of First Workshop on Expert Database Systems, Kiawah Island, South Carolina, Oct. 1984, vol. 1, pp. 186-206.
28. Wiederhold, Gio and Paul D. Clayton: Processing Biological Data in Real Time. M.D. Computing, Springer-Verlag, Vol. 2, No. 6, November 1985, pp. 16-25.
29. Wiederhold, Gio: Knowledge Bases. Future Generations Computer Systems, North-Holland, vol. 1, no. 4, April 1985, pp. 223-235.
30. Wiederhold, Gio: Disease Registers: Use of Databases to Generate New Medical Knowledge. In K. Abt, W. Giere and B. Leiber: 'Krankendaten, Krankheitsregister, Datenschutz', vol. 58, Medizinische Informatik und Statistik, Springer-Verlag, 1985, pp. 39-55.
31. Wiederhold, G.: Networking of Data Information. National Cancer Institute Workshop on the Role of Computers in Cancer Clinical Trials, National Institutes of Health, June 1983, pp. 113-119.
32. Wiederhold, G.: Database Design (in the Computer Science Series). McGraw-Hill Book Company, New York, NY, May 1977, 678 pp. Second edition, Jan. 1983, 768 pp.
33. Wiederhold, G.: In D.A.B. Lindberg and P.L. Reichertz (Eds.), Databases for Health Care, Lecture Notes in Medical Informatics, Springer-Verlag, 1981.
34. Wiederhold, G.: Database technology in health care. J. Medical Systems 5(3):175-196, 1981.

E.
Funding Support Status

1) Representation and Use of Causal Knowledge for Inference from Databases
Robert L. Blum, M.D., Ph.D.: Principal Investigator
National Science Foundation: IST 83-17858
Total award: $89,597 (direct + indirect)
Term: March 15, 1984 through March 14, 1986

2) Deriving Knowledge from Clinical Databases
Gio C. M. Wiederhold, Ph.D.: Principal Investigator
National Library of Medicine: LM-04334
Total award: $291,192 (direct)
Term: May 1, 1984 through November 30, 1986

II. INTERACTIONS WITH THE SUMEX-AIM RESOURCE

A. Collaborations

Once the RADIX program is developed, we would anticipate collaboration with some of the ARAMIS project sites in the further development of a knowledge base pertaining to the chronic arthritides. The ARAMIS Project at the Stanford Center for Information Technology is used by a number of institutions around the country via commercial leased lines to store and process their data. These institutions include the University of California School of Medicine, San Francisco and Los Angeles; The Phoenix Arthritis Center, Phoenix; The University of Cincinnati School of Medicine; The University of Pittsburgh School of Medicine; Kansas University; and The University of Saskatchewan. All of the rheumatologists at these sites have closely collaborated in the development of ARAMIS, and their interest in and use of the RADIX project is anticipated.

We hasten to mention that we do not expect SUMEX to support the active use of RADIX as an on-going service to this extensive network of arthritis centers, but we would like to be able to allow the national centers to participate in the development of the arthritis knowledge base and to test that knowledge base on their own clinical data banks.

B.
Interactions with Other SUMEX-AIM Projects

During the current reporting year we have had frequent interaction with members of other SUMEX projects; for example, to discuss theoretical issues in discovery and automated summarization, to work through practical programming issues, and to assist in training Medical Computer Science students in the use of KEE, Lisp workstations, and so on. The SUMEX community is an invaluable resource for providing such interaction.

C. Critique of Resource Management

The DEC System 20 continues to provide acceptable performance, but it is frequently heavily loaded at peak hours. The SUMEX resource management continues to be accessible and quite helpful.

III. RESEARCH PLANS

A. Project Goals and Plans

The overall goal of the RADIX Project is to develop a computerized medical information system capable of accurately extracting medical knowledge pertaining to the therapy and evolution of chronic diseases from a database consisting of a collection of stored patient records.

SHORT-TERM GOALS -- Our short-term goals focus on the two activities described earlier: implementation and further development of the intelligent discovery module, and substantial expansion of the automated summarization program to deal with an entire rheumatology patient record.

LONG-RANGE GOALS -- The long-range goals of the RADIX Project are 1) automatic discovery of knowledge in a large time-oriented database, and provision of assistance to a clinician who is interested in testing a specific hypothesis, and 2) development of techniques for automated summarization of patient records. We hope to make these programs sufficiently robust that they will work over a broad range of hypotheses and over a broad spectrum of patient records.

B.
Justification and Requirements for Continued Use of SUMEX

Computerized clinical data banks possess great potential as tools for assessing the efficacy of new diagnostic and therapeutic modalities, for monitoring the quality of health care delivery, and for support of basic medical research. Because of this potential, many clinical data banks have recently been developed throughout the United States. However, once the initial problems of data acquisition, storage, and retrieval have been dealt with, there remains a set of complex problems inherent in the task of accurately inferring medical knowledge from a collection of observations in patient records. These problems concern the complexity of disease and outcome definitions, the complexity of time relationships, potential biases in compared subsets, and missing and outlying data. The major problem of medical data banking is in the reliable inference of medical knowledge from primary observational data.

We see in the RADIX Project a method of solution to this problem through the utilization of knowledge engineering techniques from artificial intelligence. The RADIX Project, in providing this solution, will provide an important conceptual and technological link to a large community of medical research groups involved in the treatment and study of the chronic arthritides throughout the United States and Canada, who are presently using the ARAMIS Data Bank through the CIT facility via TELENET.

Beyond the arthritis centers which we have mentioned in this report, the TOD (Time-Oriented Data Base) User Group involves a broad range of university and community medical institutions involved in the treatment of cancer, stroke, cardiovascular disease, nephrologic disease, and others. Through the RADIX Project, the opportunity will be provided to foster national collaborations with these research groups and to provide a major arena in which to demonstrate the utility of artificial intelligence to clinical medicine.

C.
Recommendations for Resource Development

The on-going acquisition of personal work-station Lisp processors is a very positive step, as these provide an excellent environment for program development, and can serve as a vehicle for providing programs to collaborators at other sites. Continued acquisitions are very desirable.

We also would hope that the central SUMEX facility, the DEC 2060, would continue to be supported. We continue to make constant use of this machine for text-editing, document preparation, file and database handling, communications, and program demos.

Responses to Questions Regarding Resource Future

Q: What do you think the role of the SUMEX-AIM resource should be for the period after 7/86, e.g., continue like it is, discontinue support of the central machine, act as a communications crossroads, develop software for user community workstations, etc.

A: In our opinion, the SUMEX 2060 should continue to be supported. The machine continues to be of value to us for text-editing (TV edit and EMACS), for document preparation (SCRIBE), and for communications and mail. We also depend on it as a central, reliable facility for program demos, for manipulating large databases, and for maintaining central program files. It would be a real loss if it were discontinued.

Software for community work stations: Yes. Making good utility programs available to all users sounds like a good idea.

Q: Will you require continued access to the SUMEX-AIM 2060 and if so, for how long?

A: Yes. For the foreseeable future and for the above reasons.

Q: What would be the effect of imposing fees for using SUMEX resources (computing and communications) if NIH were to require this?

A: We would pay them. The 2060 is worth it to us. Of course, if the fees were high, we would consider alternatives.

Q: Do you have plans to move your work to another machine or workstation and if so, when and to what kind of system?
A: We are currently using two of the SUMEX Xerox 1108's for the development of our project. We will stay with these for the foreseeable future.

IV.B. National AIM Projects

The following group of projects is formally approved for access to the AIM aliquot of the SUMEX-AIM resource. Their access is based on review by the AIM Advisory Group and approval by the AIM Executive Committee. In addition to the progress reports presented here, abstracts for each project and its individual users are submitted on a separate Scientific Subproject Form.

IV.B.1. INTERNIST-I Project

CADUCEUS Project (INTERNIST-I)

This project is unfunded at the present time.

J. D. Myers, M.D.
University Professor Emeritus (Medicine)
University of Pittsburgh
1291 Scaife Hall
Pittsburgh, Pa., 15261

I. SUMMARY OF RESEARCH PROGRAM

A. Project rationale

The principal objective of this project is the development of a high-level computer diagnostic program in the broad field of internal medicine as an aid in the solution of complex and complicated diagnostic problems. To be effective, the program must be capable of multiple diagnoses (related or independent) in a given patient.

A major achievement of this research undertaking has been the design of a program called INTERNIST-I, along with an extensive medical knowledge base. This program has been used over the past decade to analyze many hundreds of difficult diagnostic problems in the field of internal medicine. These problem cases have included cases published in medical journals (particularly Case Records of the Massachusetts General Hospital, in the New England Journal of Medicine), CPCs, and unusual problems of patients in our Medical Center. In most instances, but by no means all, INTERNIST-I has performed at the level of the skilled internist, but the experience has highlighted several areas for improvement.

B.
Medical Relevance and Collaboration

The program inherently has direct and substantial medical relevance.

The development of the QUICK MEDICAL REFERENCE (QMR) under the leadership of Dr. Randolph A. Miller has allowed us to distribute the INTERNIST-I knowledge base in a modified format to over 20 other academic medical institutions. The knowledge base can thereby be used as an “electronic textbook” in medical education at all levels -- by medical students, residents and fellows, and faculty and staff physicians. This distribution is continuing to expand.

The INTERNIST-I program has been used in recent years to develop patient management problems for the American College of Physicians' Medical Knowledge Self-Assessment Program, and to develop patient management problems and test cases for the Part III Examination and the developing computerized testing program of the National Board of Medical Examiners.

C. Highlights of Research Progress

C.1 Accomplishments this past year

The group of us (Myers, Miller and Masarie), together with assigned residents in internal medicine, are continuing to expand the knowledge base and to incorporate the diagnostic consultative program into QMR. The computer program for the interrogative part of the diagnostic program is the main remaining task. An editor for the QMR knowledge base, as modified from the INTERNIST-I knowledge base, is nearing completion. The entire QMR program can be accommodated, maintained (in particular, edited), and operated on individual IBM PC-AT computers.

In the near future our group will be ready to incorporate into the QMR diagnostic consultant program the modifications and embellishments of the INTERNIST-I knowledge base, e.g. the use of “facets” of diseases or syndromes. This addition and modification is expected to improve the performance of the diagnostic consultant program.
The medical knowledge base has continued to grow both in the incorporation of new diseases and the modification of diseases already profiled so as to include recent advances in medical knowledge. Several dozen new diseases have been profiled during the past year.

C.2 Research in progress

There are four major components to the continuation of this research project:

1. The enlargement, continued updating, refinement and testing of the extensive medical knowledge base required for the operation of INTERNIST-I and the QMR modification.

2. Institution of field trials of QMR on the clinical services in internal medicine at the Health Center of the University of Pittsburgh.

3. Expansion of the clinical field trials to other university health centers which have expressed interest in working with the system.

4. Adaptation of the diagnostic program and data base of INTERNIST-I and the QMR modification to subserve educational purposes and the evaluation of clinical performance and competence.

Current activity is devoted mainly to the first of these, namely, the continued development of the medical knowledge base, and the implementation of the improved diagnostic consulting program.

D. List of relevant publications

1. Myers, J.D.: Educating future physicians: Something old, Something new. Ohio State Univ. Proceedings of Symposium, Medical Education in the 21st Century. 1985.

2. Myers, J.D.: The process of clinical diagnosis and its adaptation to the computer. In The Logic of Discovery and Diagnosis in Medicine, University of Pittsburgh Series in the Philosophy and History of Science, edited by Kenneth F. Schaffner, Univ. of California Press, pp. 155-180, 1985.

3. Masarie, F.E., Jr., Miller, R.A., First, M.B., Myers, J.D.: An Electronic Textbook of Medicine. Proceedings of Ninth Annual Symposium on Computer Applications in Medical Care. Baltimore, Maryland, November 1985.

4. Masarie, F.E., Jr., Myers, J.D.,
Miller, R.A.: INTERNIST-I PROPERTIES: Representing Common Sense and Good Medical Practice in a Computerized Medical Knowledge Base. Computers and Biomedical Research. 18: 458-479, October 1985.

5. Myers, J.D., Chairman. Medical Education in the Information Age. Proceedings of the Symposium on Medical Informatics. Association of American Medical Colleges, 1986.

E. Funding support

1. Clinical Decision Systems Research Resource
Harry E. Pople, Jr., Ph.D., Professor of Business
Jack D. Myers, M.D., University Professor Emeritus (Medicine)
University of Pittsburgh
Division of Research Resources, National Institutes of Health
5 R24 RR01101-08
07/01/80 - 03/31/86 - $1,658,347
07/01/84 - 09/30/85 - $354,211
09/30/85 - 03/31/86 - $50,690

2. CADUCEUS: A Computer-Based Diagnostic Consultant
Harry E. Pople, Jr., Ph.D., Professor of Business
Jack D. Myers, M.D., University Professor Emeritus (Medicine)
University of Pittsburgh
National Library of Medicine, National Institutes of Health
5 R01 LM03710-05
07/01/80 - 03/31/86 - $853,200
07/01/84 - 09/30/85 - $210,091
09/30/85 - 03/31/86 - $35,316

3. Diagnostic-Internist: A Computerized Medical Consultant
Randolph A. Miller, M.D., Associate Professor of Medicine
University of Pittsburgh, Department of Medicine
National Library of Medicine - Research Career Development Award
National Institutes of Health
1 K04 LM00084-01
09/30/85 - 09/29/90 - amounts to be determined annually
09/30/85 - 09/29/86 - $55,296

II. INTERACTIONS WITH THE SUMEX-AIM RESOURCE

A, B. Medical Collaborations and Program Dissemination Via SUMEX

INTERNIST-I and QMR remain in a stage of research and particularly development. As noted above, we are continuing to develop better computer programs to operate the diagnostic system, and the knowledge base cannot be used very effectively for collaborative purposes until it has reached a critical stage of completion.
These factors have stifled collaboration via SUMEX up to this point and will continue to do so for the next year or two. In the meanwhile, through the SUMEX community there continues to be an exchange of information and states of progress. Such interactions particularly take place at the annual AIM Workshop.

C. Critique of Resource Management

SUMEX has been an excellent resource for the development of INTERNIST-I. Our large program is handled efficiently, effectively and accurately. The staff at SUMEX have been uniformly supportive, cooperative, and innovative in connection with our project's needs.

III. RESEARCH PLANS

A. Project Goals and Plans

Continued effort to complete the medical knowledge base in internal medicine will be pursued, including the incorporation of newly described diseases and new or altered medical information on “old” diseases. The latter two activities have proven to be more formidable than originally conceived. Profiles of added diseases plus other information are first incorporated into the medical knowledge base at SUMEX before being transferred into our newer information structures for QMR. This sequence retains the operative capability of INTERNIST-I as a computerized “textbook of medicine” for educational purposes.

B. Justification and Requirements for Continued SUMEX Use

Our use of SUMEX will obviously decline with the adaptation of our programs to the IBM PC-AT. Nevertheless, the excellent facilities of SUMEX are expected to be used for certain developmental work. It is intended for the present to keep INTERNIST-I at SUMEX for comparative use as QMR is developed here.

Our best prediction is that our project will require continued access to the 2060 for the next year or two and we consider such access essential to the future development of our knowledge base. After that time, our work can probably be accomplished on our personal work stations.

C.
Needs and Plans for Other Computing Resources Beyond SUMEX-AIM

Our predictable needs in this area will be met by our newly acquired personal work stations.

IV.B.2. CLIPR - Hierarchical Models of Human Cognition

Hierarchical Models of Human Cognition (CLIPR Project)

Walter Kintsch and Peter G. Polson
University of Colorado
Boulder, Colorado

I. SUMMARY OF RESEARCH PROGRAM

A. Project Rationale

The two CLIPR projects have made progress during the last year. The prose comprehension project has completed one major project, and is designing a prose comprehension model that reflects state-of-the-art knowledge from psychology (van Dijk & Kintsch, 1983) and artificial intelligence.

During the last four years, Polson, in collaboration with Dr. David Kieras of the University of Michigan, has continued work on a project studying the psychological factors underlying device complexity and the difficulties that nontechnically trained individuals have in learning to use devices like word processors. They have developed formal representations of a user's knowledge of how to operate a device and of the user-device interface (Kieras & Polson, 1985) and have completed several experiments evaluating their theory (Polson & Kieras, 1984, 1985; Polson, Muncher, and Engelbeck, 1986).

B. Technical Goals

The CLIPR project consists of two subprojects. The first, the text comprehension project, is headed by Walter Kintsch and is a continuation of work on understanding of connected discourse that has been underway in Kintsch's laboratory for several years. The second, the device complexity project, is headed by Peter Polson in collaboration with David Kieras of the University of Michigan. They are studying the learning and problem solving processes involved in the utilization of devices like word processors or complex computer controlled medical instruments (Kieras & Polson, 1985).
The goal of the prose comprehension project is to develop a computer system capable of the meaningful processing of prose. This work has been generally guided by the prose comprehension model discussed by van Dijk & Kintsch (1983), although our programming efforts have identified necessary clarifications and modifications in that model (Kintsch & Greeno, 1985; Fletcher, 1985; Walker & Kintsch, 1985; Young, 1985). In general, this research has emphasized the importance of knowledge and knowledge-based processes in comprehension. We hope to be able to merge the substantial artificial intelligence research on these systems with psychological interpretations of prose comprehension, resulting in a computational model that is also psychologically respectable.

The goal of the device complexity project is to develop explicit models of the user-device interaction. They model the device as a nested automaton and the user as a production system. These models make explicit the kinds of knowledge that are required to operate different kinds of devices and the processing loads imposed by different implementations of a device.

C. Medical Relevance and Collaboration

The text comprehension project impacts indirectly on medicine, as the medical profession is no stranger to the problems of the information glut. Research on how people understand prose adds to the work on how computer systems might understand and summarize texts, and it identifies ways in which the readability of texts can be improved; medicine can only be helped by both. Development of a more thorough understanding of the various processes responsible for different types of learning problems in children and the corresponding development of a successful remediation strategy would also be facilitated by an explicit theory of the normal comprehension process.
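As a rough illustration of what it means to represent the user as a production system, the sketch below runs a tiny recognize-act loop in Python. It is not the CLIPR simulation, which the report says was written in INTERLISP; the rule names, the working-memory facts, and the "delete a word" task are all hypothetical, chosen only to show condition-action rules firing against a working memory.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Production:
    """One condition-action rule (all names here are hypothetical)."""
    name: str
    condition: FrozenSet[str]            # facts that must all be in working memory
    add: FrozenSet[str] = frozenset()    # facts asserted when the rule fires
    remove: FrozenSet[str] = frozenset() # facts retracted when the rule fires


def run(productions: List[Production],
        memory: Iterable[str],
        max_cycles: int = 20) -> Tuple[List[str], Set[str]]:
    """Fire the first matching production on each cycle until none matches."""
    wm = set(memory)
    trace: List[str] = []
    for _ in range(max_cycles):
        match = next((p for p in productions if p.condition <= wm), None)
        if match is None:
            break
        wm = (wm - match.remove) | match.add
        trace.append(match.name)
    return trace, wm


# Hypothetical method for deleting a word in a text editor.
PRODUCTIONS = [
    Production(
        name="move-cursor-to-word",
        condition=frozenset({"goal:delete-word", "cursor:elsewhere"}),
        add=frozenset({"cursor:at-word"}),
        remove=frozenset({"cursor:elsewhere"}),
    ),
    Production(
        name="press-delete-word-key",
        condition=frozenset({"goal:delete-word", "cursor:at-word"}),
        add=frozenset({"word:deleted"}),
        remove=frozenset({"goal:delete-word", "cursor:at-word"}),
    ),
]

trace, final_memory = run(PRODUCTIONS, {"goal:delete-word", "cursor:elsewhere"})
print(trace)
print(sorted(final_memory))
```

In a model of this kind, the number of rules (`len(PRODUCTIONS)`) and the number of recognize-act cycles (`len(trace)`) are the sort of counts that can be compared with observed learning and execution times.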
The device complexity project has two primary goals: the development of a cognitive theory of user-device interaction, including learning and performance models, and the development of a theoretically driven design process that will optimize the relationships between device functionality and ease of learning and other performance factors (Polson & Kieras, 1983, 1984; Polson, Muncher, and Engelbeck, 1986). The results of this project should be directly relevant to the design of complex, computer controlled medical equipment. They are currently using word processors to study user-device interactions, but principles underlying use of such devices should generalize to medical equipment.

Both the text comprehension project and the device complexity project involve the development of explicit models of complex cognitive processes; cognitive modeling is a stated goal of both SUMEX and research supported by NIMH.

Several other psychologists have either used or shown an interest in using an early version of the prose comprehension model, including Alan Lesgold of SUMEX's SCP project, who is exporting the system to the LRDC VAX. We have also worked with James Greeno -- another member of the SCP project -- on a project that will integrate this model with models of problem solving developed by Greeno and others at the University of California, Berkeley. Needless to say, all of this interaction has been greatly facilitated by the local and network-wide communication systems supported by SUMEX. The mail system, of course, has also enabled us to maintain professional contacts established at conferences and other meetings, and to share and discuss ideas with these contacts.

D. Progress Summary

The version of the prose comprehension model of 1978 (Kintsch & van Dijk, 1978), which originally was realized as a computer simulation by Miller & Kintsch (1980), has been extended in a major simulation program by Young (1985).
Unlike the earlier program, Young's model includes macroprocessing, which greatly extends the usefulness of the program. It is expected that this program will be widely useful in studies of prose where a detailed theoretical analysis is desired.

The general theory has been reformulated and expanded in van Dijk & Kintsch (1983). This research report of book length presents a general framework for a comprehensive theory of discourse processing. It has been applied to an interesting special case, the question of how children understand and solve word arithmetic problems, by Kintsch & Greeno (1985). A simulation for this model, using INTERLISP, has been supplied in Fletcher (1985).

The device complexity project is in its fourth year. They have developed an explicit model for the knowledge structures involved in the user-device interaction, and they are developing simulation programs. Their preliminary theoretical results are described in Kieras & Polson (1985). They have also completed several experiments evaluating the theory (Polson & Kieras, 1984, 1985; Polson, Muncher, and Engelbeck, 1986) and have shown that the number of productions predicts learning time and that the number of cycles and working memory operations predict execution time for a method.

E. List of Relevant Publications

1. Fletcher, R.C.: Understanding and solving word arithmetic problems: A computer simulation. Technical Report No. 135, Institute of Cognitive Science, Colorado, 1984.

2. Kieras, D.E. and Polson, P.G.: The formal analysis of user complexity. Int. J. Man-Machine Studies, 22, 365-394, 1985.

3. Kintsch, W. and van Dijk, T.A.: Toward a model of text comprehension and production. Psychological Rev. 85:363-394, 1978.

4. Kintsch, W. and Greeno, J.G.: Understanding and solving word arithmetic problems. Psychological Review, 1985, 92, 109-129.

5. Miller, J.R.
and Kintsch, W.: Readability and recall of short prose passages: A theoretical analysis. J. Experimental Psychology: Human Learning and Memory 6:335-354, 1980.

6. Polson, P.G. and Kieras, D.E.: Theoretical foundations of a design process guide for the minimization of user complexity. Working Paper No. 3, Project on User Complexity, Universities of Arizona and Colorado, June, 1983.

7. Polson, P.G. and Kieras, D.E.: A formal description of users' knowledge of how to operate a device and user complexity. Behavior Research Methods, Instruments, & Computers, 1984, 16, 249-255.

8. Polson, P.G. and Kieras, D.E.: A quantitative model of the learning and performance of text editing knowledge. In Borman, L. and Curtis, B. (Eds.) Proceedings of the CHI 1985 Conference on Human Factors in Computing. New York: Association for Computing Machinery. pp. 207-212, 1985.

9. Polson, P.G. and Jeffries, R.: Instruction in general problem solving skills: An analysis of four approaches. In Siegel, J., Chipman, S., and Glaser, R. (Eds.) Thinking and learning skills: Relating instructions to basic research: Vol. 1. Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 414-455.

10. Polson, P.G., Muncher, E., and Engelbeck, G.: Test of a common elements theory of transfer. In Mantei, M. and Orbeton, P. (Eds.) Proceedings of the CHI 1986 Conference on Human Factors in Computing. New York: Association for Computing Machinery. pp. 78-83, 1986.

11. Van Dijk, T.A. and Kintsch, W.: STRATEGIES OF DISCOURSE COMPREHENSION. Academic Press, New York, 1983.

12. Young, S.: A theory and simulation of macrostructure. Technical Report No. 134, Institute of Cognitive Science, Colorado, 1984.

13. Walker, H.W., Kintsch, W.: Automatic and strategic aspects of knowledge retrieval. Cognitive Science, 1985, 9, 261-283.

F. Funding Support Status

1. Text Comprehension and Memory
Walter Kintsch, Professor, University of Colorado
National Institute of Mental Health - 5 R01
MH15872-14-16
7/1/84 - 6/30/87: $145,500 (direct)
7/1/83 - 6/30/84: $56,501

2. Understanding and solving word arithmetic problems
Walter Kintsch, Professor, University of Colorado
National Science Foundation
8/1/83 - 7/31/86: $200,000

3. The Application of Cognitive Complexity Theory to the Design of User Interface Architectures
David Kieras, Associate Professor, University of Michigan
Peter G. Polson, Professor, University of Colorado
International Business Machines Corporation
1/1/85 - 12/31/86: $500,000 (direct+indirect)
1/1/86 - 12/31/86: $250,000 (direct+indirect)

II. INTERACTIONS WITH THE SUMEX-AIM RESOURCE

A. Sharing and Interactions with Other SUMEX-AIM Projects

Our primary interaction with the SUMEX community has been the work of the prose comprehension group with the AGE and UNITS projects at SUMEX. Feigenbaum and Nii have visited Colorado, and one of us (Miller) attended the AGE workshop at SUMEX. Both of these meetings have been very valuable in increasing our understanding of how our problems might best be solved by the various systems available at SUMEX. We also hope that our experiments with the AGE and UNITS packages have been helpful to the development of those projects.

We should also mention theoretical and experimental insights that we have received from Alan Lesgold and other members of the SUMEX SCP project. The initial comprehension model (Miller & Kintsch, 1980) has been used by Dr. Lesgold and other researchers at the University of Pittsburgh, as well as researchers at Carnegie-Mellon University, the University of Manitoba, Rockefeller University, and the University of Victoria.

B. Critique of Resource Management

The SUMEX-AIM resource is clearly suitable for the current and future needs of our project. We have found the staff of SUMEX to be cooperative and effective in dealing with special requirements and in responding to our questions.
The facilities for communication on the ARPANET have also facilitated collaborative work with investigators throughout the country.

III. RESEARCH PLANS

A. Long-Range Project Goals and Plans

The goal of the prose comprehension project is to develop a computer system capable of the meaningful processing of prose. This work has been generally guided by the prose comprehension model discussed by van Dijk & Kintsch (1983), although our programming efforts have identified necessary clarifications and modifications in that model (Kintsch & Greeno, 1985; Fletcher, 1985; Walker & Kintsch, 1985; Young, 1985). In general, this research has emphasized the importance of knowledge and knowledge-based processes in comprehension. We hope to be able to merge the substantial artificial intelligence research on these systems with psychological interpretations of prose comprehension, resulting in a computational model that is also psychologically respectable.

The primary goal of the device complexity project is the development of a theory of the processes and knowledge structures that are involved in the performance of routine cognitive skills making use of devices like word processors. We plan to model the user-device interaction by representing the user's processes and knowledge as a production system and the device as a nested automaton. We are also studying the role of mental models in learning how to use such devices.

B. Justification and Requirements for Continued SUMEX Use

Both the prose comprehension and the user-computer interaction projects have shifted their actual simulation work from SUMEX to systems at the University of Colorado and the University of Michigan. Both projects use Xerox 1108 systems and are continuing their work in INTERLISP. However, we consider our continued access to SUMEX critical for the successful continuation of these projects.
Access to SUMEX provides us with continued contact with the SUMEX community, which is especially critical for the prose comprehension project. Knowledge representation languages, e.g. UNITS, and other tools developed by SUMEX are critical for this project. Alternative sources of such software are typically unsatisfactory because the systems have only been developed for use on one project and are typically very poorly documented and less than completely debugged. We hope that our continued membership in the community will be offset by the input that we have provided and will continue to provide to various projects: our relationship has been symbiotic, and we look forward to its continuation.

Access to SUMEX's mail facilities is critical for the continued success of these projects. These facilities provide us with the means to interact with colleagues at other universities. Kintsch is currently collaborating with James Greeno, who is at the University of California at Berkeley, and Polson's long-term collaborator, David Kieras, is at the University of Michigan. In addition, our access to the Xerox 1108 (Dandelion) users' community is through SUMEX.

We currently use four computing systems: a VAX 11/780 and three Xerox 1108s, one of which is at the University of Michigan. The VAX is used primarily to collect experimental data designed to evaluate the simulation models and to do necessary statistical analysis.

C. Needs and Plans for Other Computational Resources

SUMEX meets two critical needs for us. The first is communication, which we discussed in the preceding paragraph. The second is technical advice and access to various knowledge representation languages like UNITS.

We envisage our future needs to be communication currently served by the SUMEX 2060 and technical advice and necessary software provided by the SUMEX staff.

D.
Recommendations for Future Community and Resource Development

Our future needs are for the SUMEX-AIM resource to act as a communications crossroad and to develop software and provide technical support for user community work stations. We have no preferences as to how such services are provided: either with a communication server on the network, or with the central machine like the current 2060.

We will continue to need access to the SUMEX-AIM 2060 in order to access communication networks and to interact with the SUMEX-AIM staff and community. If communications and access to the staff are provided through some other mechanism, then we would no longer need access to the 2060.

We would be willing to pay fees for using SUMEX communication resources if required by NIH. However, our willingness is price sensitive. Any charges over $1,000 a year would mean we should communicate with people directly by long-distance telephone.

IV.B.3. MENTOR Project

MENTOR Project

Stuart M. Speedie, Ph.D.
School of Pharmacy
University of Maryland

Terrence F. Blaschke, M.D.
Department of Medicine
Division of Clinical Pharmacology
Stanford University

I. SUMMARY OF RESEARCH PROGRAM

A. Project Rationale

The goal of the MENTOR (Medical EvaluatioN of Therapeutic ORders) project is to design and develop an expert system for monitoring drug therapy for hospitalized patients that will provide appropriate advice to physicians concerning the existence and management of adverse drug reactions. The computer as a record-keeping device is becoming increasingly common in hospital-based health care, but much of its potential remains unrealized. Furthermore, the information it holds is provided to the physician in the form of raw data, which is often difficult to interpret. The wealth of raw data may effectively hide important information about the patient from the physician.
This is particularly true with respect to adverse reactions to drugs, which can be detected only by simultaneous examination of several different types of data, including drug data, laboratory tests and clinical signs.

In order to detect and appropriately manage adverse drug reactions, sophisticated medical knowledge and problem solving are required. Expert systems offer the possibility of embedding this expertise in a computer system. Such a system could automatically gather the appropriate information from existing record-keeping systems and continually monitor for the occurrence of adverse drug reactions. Based on a knowledge base of relevant data, it could analyze incoming data and inform physicians when adverse reactions are likely to occur or when they have occurred.

The MENTOR project is an attempt to explore the problems associated with the development and implementation of such a system and to implement a prototype of a drug monitoring system in a hospital setting.

B. Medical Relevance and Collaboration

A number of independent studies have confirmed that the incidence of adverse reactions to drugs in hospitalized patients is significant and that they are for the most part preventable. Moreover, such statistics do not include instances of suboptimal drug therapy which may result in increased costs, extended length-of-stay, or ineffective therapy. Data in these areas are sparse, though medical care evaluations carried out as part of hospital quality assurance programs suggest that suboptimal therapy is common.

Other computer systems have been developed to influence physician decision making by monitoring patient data and providing feedback. However, most of these systems suffer from a significant structural shortcoming. This shortcoming involves the evaluation rules that are used to generate feedback. In all cases, these criteria consist of discrete,