Table of Contents

Part I. Title Page
Part II. Description of Research Progress
  II.A. Scientific Subprojects
    II.A.1. Collaborative Research and Service
    II.A.2. Core Research and Development
    II.A.3. Training
  II.B. Books, Papers and Abstracts
  II.C. Resource Summary Table
Part III. Narrative Description
  III.A. Summary of Research Progress
    III.A.1. Service
      III.A.1.a. Scientific Consulting - Class I, II and IV Support
      III.A.1.b. Scientific Case Studies Using BIONET
      III.A.1.c. KERMIT Lending Library
      III.A.1.d. Subscription Fee and Its Effects
      III.A.1.e. Reapplication Procedure
    III.A.2. Collaborative Research
    III.A.3. Core Research
      III.A.3.a. Hardware Text Searching Machines
      III.A.3.b. BIONET Satellite Program
    III.A.4. BIONET Training Program
      III.A.4.a. A Brief Review
      III.A.4.b. Some Lessons Learned
      III.A.4.c. A New Strategy
    III.A.5. Resource Facilities
      III.A.5.a. Computer Hardware and Telecommunication Networks
      III.A.5.b. Summary Statistics on Machine Use
      III.A.5.c. Computer Software - Core Library
      III.A.5.d. Computer Software - System Library
      III.A.5.e. Computer Software - Contributed Library
      III.A.5.f. Database Library
  III.B. Highlights
  III.C. Administrative Changes
    III.C.1. Facilities
    III.C.2. Personnel
  III.D. Resource Advisory Committee and Allocation of Resources
  III.E. Dissemination of Information on Resource’s Capabilities
    III.E.1. Community Interactions and Awareness
    III.E.2. Electronic Communications
      III.E.2.a. Bulletin Boards
      III.E.2.b. Bibliographies
  III.F. Suggestions and Comments
Appendix I. Letter to BIONET Scientists
Appendix II. Justification of $400 Fee
Appendix III. Letter on Class IV Access
Appendix IV. Reapplication Form for BIONET Access
Appendix V. Program for the Rutgers/Waksman Workshop
Appendix VI. Descriptions of the BIONET Satellite Program
Appendix VII. Text of Advertisement to Appear in Nucleic Acids Research
Appendix VIII. The BIONET Brochure Mailed to NIH Grantees

List of Figures

Figure III-1: Pieslice Allocation of the DEC-2060 Computer
Figure III-2: Actual Use of the DEC-2060 for the Month of October, 1985
Figure III-3: BIONET’s Percentage Use of the DEC-2060, 12/84 - 11/85
Figure III-4: BIONET’s Prime Time Use of the DEC-2060, 12/84 - 11/85
Figure III-5: BIONET’s Non-Prime Time Use of the DEC-2060, 12/84 - 11/85
Figure III-6: BIONET’s Total Use of the DEC-2060, 12/84 - 11/85
Figure III-7: Total Telenet and UNINET Network Use, 12/84 - 11/85

List of Tables

Table II-1: BIONET User Community
Table III-1: Summary of Distribution of Questions
Table III-2: Breakdown of Program Questions by Function
Table III-3: Breakdown of Program Use, by Category
Table III-4: BIONET Prime Time CPU Minutes
Table III-5: BIONET Prime Time Connect Hours
Table III-6: BIONET Non-Prime Time CPU Minutes
Table III-7: BIONET Non-Prime Time Connect Hours
Table III-8: BIONET Total CPU Minutes
Table III-9: BIONET Total Connect Hours

DEPARTMENT OF HEALTH AND HUMAN SERVICES
Public Health Service
National Institutes of Health
Division of Research Resources
Biomedical Research Technology Program
Annual Progress Report

PART I, TITLE PAGE

1. PHS AWARD NUMBER: 1U41RR01685-02
2. TITLE OF AWARD: BIONET, National Computer Resource for Molecular Biology
3. NAME OF RECIPIENT INSTITUTION: IntelliGenetics, a subsidiary of IntelliCorp
4. HEALTH PROFESSIONAL SCHOOL (If applicable): N/A
5. REPORTING PERIOD:
   5a. FROM (Month, Day, Year): 03-01-85
   5b. TO (Month, Day, Year): 02-28-86
6. PRINCIPAL INVESTIGATOR:
   6a. NAME: Ralph E. Kromer
   6b. TITLE: President, IntelliCorp
   6c. SIGNATURE:
7. DATE SIGNED (Month, Day, Year):
8. TELEPHONE (Include Area Code): [illegible]

Part II. Description of Research Progress

This section of the Report provides statistical information on use of the BIONET™ Resource. The period covered is 12/84 - 11/85.
Its individual sections have been prepared under guidelines discussed with BRTP staff.

In the one year since BIONET formally began accepting and approving applications for access, the BIONET community has expanded to 533 Principal Investigators (PI’s), representing about 1800 total investigators, whose classification is summarized in Table II-1. As shown in the Table, of the 537 accepted Class I PI’s, 27 chose to drop their membership when the subscription fee for telecommunication access was announced. To review, Class I PI’s represent the service component of the Resource, Class II the collaborative component, and Class III is reserved for those persons who are responsible for local computing facilities and who have agreed to support the local community of scientists accessing BIONET.

Table II-1: BIONET User Community

Class I PI’s          537
Declined              (27)
Total Class I         510
Class II PI’s          12
Class III PI’s          3
Nat. Advis. Comm.       8
Net Total             533

II.A. Scientific Subprojects

II.A.1. Collaborative Research and Service

We report Collaborative Research and Service first, because it represents the bulk of the Resource’s activities in this grant year. We have not reported the research abstracts, in the interest of saving space. These abstracts are maintained on our PC database of BIONET users and are available to any interested party.

We report the "Usage Factor" as both central processor time (cpu time), in minutes, and connect time, in hours, for each PI. These values are the sums of all usage by the PI and his or her group members ("Sub-I’s"). We report data only on those PI groups that have accessed the Resource. Of the 522 Class I and II PI’s, 361, representing about 1010 individual investigators, have logged on to the Resource. The data for Sub-I’s are maintained on our PC database and are available to any interested party.
We report neither Resource staff hours nor BRTP funds allocated for individual PI’s, because these cannot be allocated rationally across such a large user community. Summary information for all PI’s is given in Part II, Section C, the Resource Summary Table.

II.A.2. Core Research and Development

We report summary information for our two current Core Research projects. The BRTP funds allocated are calculated as the sum of individual salaries for the time spent, plus a percentage of the total Resource budget devoted to providing supporting services. The Resource’s primary commodity is cpu cycles made available to all investigators, including BIONET staff. Thus, the percentage used in the above calculation is the cpu time used by the indicated investigators divided by the total cpu time used on the system. The actual cost is this percentage times the cost basis for providing this commodity to local staff, i.e., the budget categories of Supplies, Computer Expenses, Software Licenses and Services, and Documentation. Using our anticipated expenditures in these categories through February 28, 1986, this basis is $265,500.
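The allocation rule described above can be written out as a short calculation. The following is a minimal sketch, not the Resource’s actual accounting procedure: the function name, argument names, and the sample salary and cpu figures are hypothetical, and only the $265,500 cost basis is taken from the text.

```python
# Sketch of the funds-allocation rule described in the text:
#   allocated = salaries + (investigator cpu / total cpu) * cost basis
# Only COST_BASIS comes from the report; all other values are illustrative.

COST_BASIS = 265_500.0  # Supplies, Computer Expenses, Software Licenses
                        # and Services, and Documentation (from the text)


def allocated_funds(salaries, investigator_cpu_min, total_cpu_min,
                    cost_basis=COST_BASIS):
    """Return salaries plus the cpu-proportional share of the cost basis."""
    if total_cpu_min <= 0:
        raise ValueError("total cpu time must be positive")
    if not 0 <= investigator_cpu_min <= total_cpu_min:
        raise ValueError("investigator cpu time must lie within the total")
    cpu_share = investigator_cpu_min / total_cpu_min
    return salaries + cpu_share * cost_basis


if __name__ == "__main__":
    # Hypothetical project: $10,000 in salaries, 500 of 100,000 cpu minutes.
    print(f"${allocated_funds(10_000.0, 500.0, 100_000.0):,.2f}")
```

Because the share is a ratio of cpu minutes, the same function applies to any category whose usage is reported in cpu minutes.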
[Summary form for Section II.A.2, Core Research and Development, covering the BIONET Satellite Program and the hardware text searching project. The form entries are not recoverable from the source scan.]

II.A.3. Training

We report summary information for our Training program. The sites at which BIONET provided some level of training are named here, and are discussed in more detail in Part III, the Narrative Description. The BRTP funds allocated include the salaries of the personnel involved, the estimated costs of the training program itself for this year ($5,500), and a percentage allocation of cost for cpu use, as discussed in the previous section.

[Summary form for Section II.A.3, the BIONET Training Program. The form entries are not recoverable from the source scan.]

II.B. Books, Papers and Abstracts

We report the publications by members of the BIONET scientific community on a version of the special form as generated by our local database management system. We report only the category of Collaborative Research and Service. There have been no publications in the Core Research program. The published materials used to support our Training program (the Introduction to BIONET, the BIONET Reference Manual and the BIONET Training Manual) have been described before and thus are not reported separately.

Part II, Section B
Award Number 1U41RR01685-02
INSTITUTION: IntelliGenetics
REPORT PERIOD: March 1, 1985 to February 28, 1986
COLLABORATIVE RESEARCH

* Vogt, M., Haggblom, C., Swift, S., Haas, M., Envelope gene and long terminal repeat determine the different biological properties of Rauscher, Friend, and Moloney Mink Cell Focus-inducing viruses, J. Virol.
55, 184-192, 1985
* Upton, C., McFadden, G., DNA Sequence Homology between the Terminal Inverted Repeats of Shope Fibroma Virus and an Endogenous Cellular Plasmid Species, Molecular and Cellular Biology, Jan. 1986
* Andersen, R.D., Birren, B.W., Taplitz, S.J., Herschman, H.R., The Rat Metallothionein-1 Structural Gene and Three Pseudogenes, One of Which Contains 5’-Regulatory Sequences, Molecular and Cellular Biology, 1985
* Andersen, R.D., Taplitz, S.J., Briston, G., Herschman, H.R., Rat Metallothionein Multigene Family, in Proceedings of the Second International Meeting on Metallothionein and Other Low Molecular Weight Metal-binding Proteins, Birkhauser Verlag, Boston, Aug. 21, 1985
Katzen, A.L., Kornberg, T.B., Bishop, J.M., Isolation of the proto-oncogene c-myb from Drosophila melanogaster, Cell 41: 449-456, June 1985
Simon, M., Drees, B., Kornberg, T., Bishop, J.M., The Nucleotide sequence and tissue-specific expression of Drosophila c-src, Cell, Oct 1985
* Lohe, Allan R., Brutlag, Douglas L., Multiplicity of Satellite DNA Sequences in Drosophila melanogaster, PNAS, 1985
* Calhoun, David, Bishop, David T., Bernstein, Harold S., et al., Fabry Disease: Isolation of a cDNA clone encoding human alpha-galactosidase A, PNAS, 1985
* Cooke, N.E., David, E.V., Serum Vitamin D Binding Protein is a Third Member of the Albumin and alpha-fetoprotein gene family, J. Clin. Investig., 1985
* Glaichenhaus, N., Leopold, P., Cuzin, F., et al., Changes in the expression of cellular genes in cells immortalized or transformed by polyoma virus, Cancer Cells, Cold Spring Harbor Laboratory, 1985
Biggs, J., Searles, L.L., Greenleaf, A.L., Structure of the Eukaryotic Transcription Apparatus: Features of the Gene for the Largest Subunit of Drosophila RNA Polymerase II, Cell 42: 611-621, Sept. 1985
* Hamori, Eugene, Novel DNA Sequence Representations, Nature 314: 585, 1985
* Nakauchi, H., Nolan, G.P., Herzenberg, L.A., et al., Molecular cloning of Lyt-2, a membrane glycoprotein marking a subset of mouse T lymphocytes: Molecular homology to its human counterpart, Leu-2/T8, and to immunoglobulin variable regions, PNAS USA 82: 5126-5130, 1985
* Hogness, D.S., et al., Regulation and Products of the Ubx Domain of the Bithorax Complex, Cold Spring Harbor Symposia Vol L, 1985
* Allison, L.A., Moyle, M., Shales, M., Ingels, C.J., Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases, Cell 42: 599-610, 1985
* James, D., Leffak, I.M., Polarity of Replication Through the Avian Alpha-Globin Locus, Mol. Cell. Biol., 1985
* Singer, P.A., Oshima, R.G., Molecular Cloning and Characterization of the Endo B Cytokeratin Expressed in the Preimplantation Mouse Embryos, J. Biol. Chem., 1986
* Knott, T.J., Rall, S.C., Scotts, J., et al., Human Apolipoprotein B. Structure of Carboxyl-Terminus Domains, Sites of Gene Expression, and Chromosomal Location, Science, 1985
* Regier, J.C., Pacholski, P., PNAS USA 82: 6035-6039, 1985
* Robinson, H.L., Miles, B.D., Avian leukosis virus-induced osteopetrosis is associated with the persistent synthesis of viral DNA, Virology 141: 130-143, 1985
* Miles, B.D., Robinson, H.L., High frequency transduction of c-erbB in avian leukosis virus-induced erythroblastosis, J. Virology 54: 295-305, 1985
* Shank, P.R., Schatz, L.M., Robinson, H.L., et al., Sequences in the gag-pol-5’ env region of avian leukosis viruses confer the ability to induce osteopetrosis, Virology 145: 94-104, 1985
* Robinson, H.L., Jensen, L., Coffin, J.M., Sequences outside of the LTR determine the lymphomogenic potential of Rous associated virus-1, J. Virology 55: 752-759, 1985
* Robinson, H.L., Avian leukosis viruses as vectors for the development of vaccines, Proceedings of 34th Annual National Breeders Roundtable: 64-92, 1985
* Robinson, H.L., Gagnon, G.C., Patterns of proviral insertion and deletion in avian leukosis virus induced lymphomas, J. Virology, 1985
* Miller, R.H., Robinson, W.S., Common Evolutionary Origin of Hepatitis B Virus and Retroviruses, PNAS, 1985
* Schumacher, M., Camp, S., Taylor, P., et al., Primary Structure of Torpedo californica Acetylcholinesterase Deduced from cDNA Sequence, Nature, 1985
* Timmerman, K.P., Tu, D., Complete sequences of IS3, NAR 13: 2127-2139, 1985
* Kemper, B., Molecular Biology of Parathyroid Hormone, Critical Reviews in BioChem, 1986
* Maratea, D., Young, K., Young, R., Deletion and Fusion Analysis of the Phi X174 E Lysis Gene, Gene, 1985
* Machida, C.A., Bestwick, R.K., Kabat, D., A Weakly Pathogenic Mutant of Rauscher Spleen Focus-Forming Virus Has Lost the Carboxyl-Terminal Membrane Anchor of Its Envelope Glycoprotein, J. Virol. 53: 990-993, 1985
* Machida, C.A., Bestwick, R.K., Boswell, B.A., Kabat, D., Role of a Membrane Glycoprotein in Friend Virus-Induced Erythroleukemia: Studies of Mutant and Revertant Viruses, Virology 144: 158-172, 1985
* Bestwick, R.K., Hankins, W.D., Kabat, D., Roles of Helper and Defective Retroviral Genomes in Murine Erythroleukemia: Studies of Spleen Focus-Forming Virus in the Absence of Helper, J. Virol. 56, 1985
Li, J.-P., Bestwick, R.K., Machida, C.A., Kabat, D., Role of a Membrane Glycoprotein in Friend Viral Erythroleukemia: Nucleotide Sequences of Non-Leukemogenic Mutant and Spontaneous Revertant Viruses, J. Virol. 57, 1986
* Gustafson, T.A., Markham, B.E., Morkin, E., Analysis of Thyroid Hormone Effects on Myosin Heavy Chain Gene Expression in Cardiac and Soleus Muscles Using a Novel Dot-Blot mRNA Assay, BBRC Vol. 130, No. 3: 1161-1167, 1985
* Morkin, E., Sheer, D., Gustafson, T.A., et al., Regulation of Cardiac Myosin Isoenzymes by Thyroid Hormone, UCLA Symposia on Mol., Vol. 20, 1985
Black, D.L., Chabot, B., Steitz, J.A., U2 as well as U1 Small Nuclear Ribonucleoproteins are Involved in Pre-Messenger RNA Splicing, Cell 42, 1985
* Smith, D.H., BIONET: National Computer Resource for Molecular Biology, Abstracts, Federation of American Societies for Experimental Biology, Apr 22-25, 1985, Anaheim, CA
* Smith, D.H., BIONET: National Computer Resource for Molecular Biology, Abstracts, International Congress on Computers and Biotechnology, Jan 30-31, 1986, Baltimore, MD
* Smith, D.H., Brutlag, D., Friedland, P., Kedes, L., BIONET: National Computer Resource for Molecular Biology, Nucleic Acids Research, 1986

II.C. Resource Summary Table

The Resource Summary Table includes the totals from the previous sections under Training and Core Research and Development. The totals for the Collaborative Research and Service categories were arrived at as follows. The Usage Factor, again in cpu minutes, represents the grand total of all use summarized on the previous forms, plus staff use not allocated to the above two categories. The BRTP funds allocated include the remainder of the direct costs estimated for this year (see Section III of the "Application for Continuation Grant"), i.e., those funds not allocated to the above two categories.

The category of Administration/Miscellaneous includes only the Usage Factor of BIONET’s share of the cpu time (minutes) for computer facility staff and DEC-2060 system overhead accounts (see Table III-8).
No funds for this cpu time or staff time are allocated; such funds are considered part of the support of the user community and are distributed on the basis of cpu time to the categories above, as described previously. The Funds Allocated include only the items of capital equipment purchased during the year.

The category of Down Time includes the sum of scheduled and unscheduled maintenance on the DEC-2060 computer. In the period 12/84 - 11/85, there was a total of 139 hours (8340 cpu minutes) of down time:

• 93 hours (5580 cpu minutes) were scheduled down time, including 69 hours of down time for the move of the computer over the weekend of August 17-18, 1985; the remainder was scheduled maintenance.
• 46 hours (2760 cpu minutes) of down time were due to unscheduled maintenance.

The down time reported on the Summary Table is BIONET’s 50% of the total, or 4,170 cpu minutes. Note that the unscheduled maintenance of 2760 cpu minutes is 0.5% of the total cpu time available for the year. Thus, apart from scheduled down time, the DEC-2060 system has been available more than 99% of the time, 24 hours a day, seven days a week. No funds have been allocated to this category.

[Resource Summary Table. The table entries are not recoverable from the source scan.]

[...] directory. We will continue working closely with Dr. Pearson as we extend DFASTP and produce the necessary documentation, but support for this version will be supplied by IntelliGenetics and BIONET.

D. Mount/Arizona. Dr. Mount originally proposed to make his PC software package available to the BIONET community through down-loading of software to PC’s. However, the slowness of file transfer programs such as Modem and KERMIT at 1200 baud makes the time required prohibitively long.
Recently, the Molecular Biology Computer Research Resource (MBCRR) at Dana Farber has begun floppy disk export of Mount’s software. A bulletin to that effect was posted on BIONET-NEWS and has subsequently been moved to the CONTRIBUTED-SOFTWARE bulletin board. Thus, we do not expect to continue with our earlier plans to distribute the software directly from BIONET, but will direct requests to the MBCRR. We have written a letter of support to Dr. Mount for his application for a molecular biology computing resource. We feel that close collaboration among resources is essential to avoid duplication of effort.

T. Smith/Dana Farber. Dr. Smith, Director of the Molecular Biology Computer Research Resource, has been granted Class II access to BIONET by courtesy. This was done to facilitate cooperation and collaboration between BIONET and the MBCRR. Dr. Smith uses the bulletin board system on BIONET to announce availability of new software and data on the MBCRR system. For example, the Workshop on Problems in Genetic Sequence Analysis, scheduled for August, 1986, was announced to the BIONET community this way. Recently, the MBCRR has contributed to BIONET a version of the NBRF protein database restructured into functional categories. For example, all DNA-binding proteins, all immunoglobulins, and all cytochromes are grouped in individual files, and the files are in the standard format for use in the Core Library programs. We are currently testing the database prior to releasing it to the BIONET community.

C. DeLisi/NIH. Dr. DeLisi has proposed contributing software for prediction of higher-order protein structures. Currently, the programs he feels are of most importance are still under development on his DEC-VAX facility.

G. Rose/Pennsylvania State. Dr. Rose has recently been accepted as a Class II collaborator and will be contributing software for protein secondary structure prediction.

C. Lawrence/NYS Dept. Health. Dr. Lawrence has recently been accepted and will be contributing software for statistical analysis of molecular biological data. He requires access to a library of statistical routines on BIONET, and the IMSL package of subroutines for statistics has been ordered for him and for other persons requiring access to these tools.

G. Stormo/Colorado. Dr. Stormo has recently been accepted and will be contributing software for quantitative sequence evaluation, analysis of binding sites, and sequence "landscapes" to display patterns of strings shared by two or more sequences. The last application represents another approach to solving the multiple sequence alignment problem.

W. Barker/NBRF, PIR. Dr. Barker is Director of the Protein Identification Resource (PIR), and has been given Class II status by courtesy. She represents our liaison with the PIR community.

W. Goad/Los Alamos. Dr. Goad heads the Los Alamos efforts related to the collection of nucleic acid sequence data for GenBank. He has been given Class II status by courtesy to foster communications with the GenBank Resource. He also collects sequences from BIONET submitted to him by electronic mail. He will contribute to BIONET programs for form-driven entry of sequences so that community members can submit their data in the correct format directly to GenBank.

R. Roberts/Cold Spring Harbor. Dr. Roberts is a member of our National Advisory Committee, so is grouped on the system in that category. However, he has spent a substantial amount of time working with BIONET on automated methods for updating his restriction enzyme database. Recently, he was able to transfer to us the latest version of this database in a format directly compatible with the Core Library software. Work remains to be done on automatic sending of messages about updates, and automatic logging of changes and testing of the new file, and we will assist him in completing these tasks. The goal is simple.
We want BIONET scientists to have access to the latest data on restriction enzymes, rather than having to wait many months for its appearance on-line. Separately, Dr. Roberts is supplying a file of commercially available enzymes, and we have already organized that into a form such that a user can programmatically select just those enzymes available from a selected supplier.

III.A.3. Core Research

Because of budgetary restrictions and the almost complete devotion of BIONET personnel and resources during the previous year to developing and consolidating the service, training, and collaborative components of the resource, Core Research has been limited to detailed planning of two major research goals for the next year of BIONET operations:

• Hardware Text Searching Machines. We are investigating specialized text searching hardware to optimize biological database searching;
• BIONET Satellite Program. We are investigating both hardware and software methods for the networking of BIONET with other regional, national, and international biologically related computational resources.

III.A.3.a. Hardware Text Searching Machines

A common operation on BIONET involves the searching of one of the major nucleic acid or protein sequence databases for specific patterns of nucleotides or amino acids. The Core Library of software has two programs that access these databases. The first is IFIND, which searches the database for sequence homologies using a specific query sequence. The second is QUEST, which is a sequence database search and retrieval program. QUEST uses a finite state machine that allows complex, often ambiguous patterns to be found in a database. Searches using either program against a large database such as the rapidly growing GenBank may require execution times ranging from cpu minutes to hours. Indeed, the growing use of batch jobs during nights and weekends (see Paragraph III.A.5.b) is a measure of the time required.
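To make the idea of an ambiguous pattern search concrete, the sketch below scans a small in-memory "database" for a pattern written with IUPAC nucleotide ambiguity codes. It is an illustration only and does not reproduce QUEST or its key syntax: the function names and sample sequences are hypothetical, and Python’s `re` module (a backtracking matcher rather than a strict finite state machine) stands in for the pattern-matching engine.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def compile_pattern(pattern):
    """Translate an ambiguous pattern into a compiled regular expression.

    The lookahead wrapper lets finditer report overlapping matches.
    """
    body = "".join(IUPAC[code] for code in pattern.upper())
    return re.compile("(?=(" + body + "))")


def search(database, pattern):
    """Return (entry name, 0-based offset, matched text) for every hit."""
    regex = compile_pattern(pattern)
    hits = []
    for name, sequence in database.items():
        for match in regex.finditer(sequence.upper()):
            hits.append((name, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical two-entry database; W matches A or T.
    demo = {"seq1": "GGTATAAATGG", "seq2": "CCCCGCGC"}
    for hit in search(demo, "TATAWAW"):
        print(hit)
```

Every entry is scanned in full for every query, which is why search time grows with database size and why the dedicated hardware discussed in this section is attractive.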
Such searches represent a major use of cpu time on BIONET. Anything that can be done to reduce this time is not only scientifically interesting, it is essential in freeing up time for other scientists to perform their computations. Recent hardware developments have led us to believe that we can vastly decrease the search time for complex patterns in QUEST. Such hardware may also increase the speed of the first phases of IFIND searches, and this application will be pursued after QUEST. One device, the Fast Data Finder (FDF) produced by TRW, Inc., can pass an entire database as one long character string through pattern matching hardware at a rate between 7 and 9 million characters per second. The databases are stored on a Fujitsu 2350 hard disk (474 Megabyte unformatted) driven by a Concept 21 disk controller which allows the formation of a very rapid data stream by interleaving data from several disk reading heads on the Fujitsu simultaneously. This multiplies the fundamental disk streaming rate from 1 to 1.5 megabytes per second per head, up to 7 to 9 megabytes per second, the limit of the FDF hardware. Transient rates above 10 megabytes per second are buffered in cache memory. The implications of these speeds are profound. For example, the GenBank database of nucleic acid sequences is now 12-14 Mbytes, including all comments. The FDF is capable of searching this database in 1.5 - 2 seconds. The pattern to be found is stored in a series of cells in the FDF, one character per cell, and the data stream is passed through this series of cells. As the stream is passed from cell to cell through the FDF it reports a hit on the target when the pattern in each cell matches. The minimum number of cells (we are proposing initially 1,000 cells with the ability to upgrade to 10,000 cells in one year) would allow a maximum target size of 1000 characters. Much of the standard QUEST search key syntax (strings, ranges, fixed and variable length don’t cares, Boolean relations etc.) 
is already built into the FDF hardware so that a straightforward translation of QUEST keys to FDF syntax is possible. We are proposing that TRW provide us with translations from our current pattern matching language into their syntax and also provide us with access to their Programmer Interface Language for interacting with the FDF. This will allow us to emulate QUEST in the easiest fashion.

The FDF has several advantages over the current QUEST program. First, the cells in the pattern matching hardware can be subdivided so as to search for several patterns simultaneously (a maximum of 248 patterns, each using a minimum of 24 cells even if the pattern itself is smaller than this). Second, the FDF allows up to seven mismatches within a defined character string within the pattern. These abilities to search for many patterns simultaneously and to permit mismatches in strings will allow the future development of DNA sequence alignment algorithms, including rapid searches for homologies that include indefinite insertion/deletion gaps in addition to mismatches. For this latter important application (after one year of use) we will need additional pattern matching cells, preferably near the hardware limit of 10,000. A further important property of the FDF is the ability to report regions of high density of specific sequence patterns. This has long been a major aim for QUEST development.

We intend to use our standard QUEST program on the BIONET DEC 2060 as the interface for users to prepare their search keys and to specify the database to be searched. We would also like the physical interface between the FDF (currently integrated with a SUN workstation) and the DEC 2060 to be flexible enough that the FDF could be driven by identical software running on a VAX or on a SUN, the two other major machines that run QUEST. The simplest solution would be for the FDF to receive patterns and return results via a SUN-based Ethernet connection.
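The two capabilities emphasized above, matching several patterns in one pass and tolerating a bounded number of mismatches, can be illustrated in software, together with the arithmetic behind the quoted search times. This is a plain-Python sketch under stated assumptions, not a model of the FDF hardware or its interface: the function names and sample data are hypothetical, and only the mismatch limit of seven, the 12-14 Mbyte database size, and the 7-9 megabyte per second streaming rate come from the text.

```python
def mismatch_hits(stream, patterns, max_mismatches=7):
    """Slide every pattern along the stream and report near matches.

    Returns (pattern name, 0-based offset, mismatch count) for each window
    whose Hamming distance to the pattern is at most max_mismatches.
    """
    hits = []
    for name, pattern in patterns.items():
        width = len(pattern)
        for offset in range(len(stream) - width + 1):
            window = stream[offset:offset + width]
            mismatches = sum(1 for a, b in zip(pattern, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((name, offset, mismatches))
    return hits


def scan_seconds(database_bytes, bytes_per_second):
    """Time to stream the whole database once past the matcher."""
    return database_bytes / bytes_per_second


if __name__ == "__main__":
    # Hypothetical stream and pattern, allowing a single mismatch.
    print(mismatch_hits("ACGTACGTTT", {"p": "ACGA"}, max_mismatches=1))
    # Figures from the text: 12-14 Mbytes streamed at 7-9 Mbytes per second.
    print(scan_seconds(14e6, 7e6), scan_seconds(12e6, 8e6))
```

The software loop does work proportional to stream length times total pattern length, whereas the hardware described in the text compares all cells against the passing stream at once, so its scan time depends only on database size and streaming rate.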
The eventual goal is for the QUEST program to recognize when a proposed search will take more than a few moments of elapsed time and then ship the request over the Ethernet to the FDF hardware. The results of the search will be passed back to the QUEST program, so that the scientist using QUEST need make no special provision for long searches.

III.A.3.b. BIONET Satellite Program

We have begun the BIONET Satellite program in earnest. This program has the goal of distributing the BIONET Resource among computers throughout the academic community, while at the same time establishing better communication links among BIONET, its Satellites and other computing resources in molecular biology. Descriptions of the program with a more detailed statement of goals and objectives can be found in Appendix VI. As can be seen from the Appendix, the actual software license is a business arrangement between the Satellite institution and IntelliGenetics. BIONET’s responsibility is to forge the communication links to ensure that scientists can communicate easily with one another.

We have previously described the initial, collaborative arrangements established with other resources in Subsection III.A.2. This is the first step toward the goal of linking the Resources. We currently have a Satellite established at the Salk Institute, and will soon establish two others, one at the US Department of Agriculture, the other at Fort Detrick (US Army RIID).

We are following two approaches to communication with other facilities: ARPANET, and a phone line based network that we are simply calling the BIONET Network for the moment.

ARPANET. BIONET has arranged Internet access to the ARPANET through a DARPA-funded project with IntelliCorp. In exchange for our assistance with the mechanics of the connection to ARPANET, BIONET will be able to make use of this connection for communications, especially electronic mail. DARPA approved the IntelliCorp connection in October, 1985.
We expect our connection to be operational in April, 1986, following the necessary lead times for the leased line provided by DARPA according to government procedures. Network services available through ARPANET include file transfer, mail, virtual terminal service, and others. Since there are mail gateways from the ARPANET to other communications networks, this connection will do much to expand BIONET’s reach. Most notably, mail interchange will be possible both with BITNET which includes EARN in Europe and with the NSF-originated CSNET. BITNET/EARN was undertaken collaboratively by a number of Universities with some help from IBM. Additionally, since the ARPANET uses the TCP/IP internetwork protocols, a great many other networks with gateways to ARPANET will be fully accessible as well. These include the MILNET and local area networks at many major universities and research centers around the US and even in some foreign countries. BIONET’s central DEC-2060 resource will need to be connected to an IntelliCorp local area network which will in turn be a part of the Internet which includes ARPANET. This will mean that we will have to license software for the TCP/IP protocols for use on the 2060, and obtain an Ethernet interface to the local area network. At the same time, the IntelliCorp DARPA contract is purchasing the necessary gateway which will connect the IntelliCorp network to the leased line provided to an ARPANET network node processor, or IMP. The bandwidth of the ARPANET connection will be 56 kilobits per second, which will of course be shared by BIONET with the IntelliCorp DARPA users. BIONET Network. As in the case of BITNET and CSNET, the ARPANET will form only a part of the communications backbone for the BIONET Network. The anticipated BIONET Network sites, or satellites, will vary in size and funding and an economical communications option is needed. We are currently examining options for hardware and software to provide this service. 
We anticipate that asynchronous dial-up modems will be used to provide the economical link. As CSNET-RELAY does in CSNET, the BIONET central DEC-2060 resource will serve as the relay host for communication between BIONET’s network sites. Most of the BIONET satellites are expected to be some model of the DEC VAX computer. BIONET has no-cost access to a MicroVAX II at IntelliGenetics and will develop the mechanism for mail exchange on this computer. We may wish to add a cache buffer memory to the DEC-20 front-end processor in order to increase the throughput possible for such communication.

III.A.4. BIONET Training Program

III.A.4.a. A Brief Review

The training program for BIONET has been severely restricted this year due to the budget cuts mentioned previously. However, we have been able to hold some training sessions and to demonstrate the use of BIONET at several national and regional meetings of molecular biologists. The presence of BIONET at meetings is not training in a formal sense, but there were many opportunities to answer specific questions and demonstrate use of BIONET for specific problems. These meetings also provide opportunities to inform potential BIONET applicants about the Resource. The following summarizes our previous activities and those planned prior to the end of the current grant year.

• FASEB Meeting. The Federation of American Societies for Experimental Biology meeting was held April 22-25, 1985, in Anaheim. At this meeting we made a formal presentation about the BIONET Resource, in a Workshop on International Genetic Sequence Resources. In addition, we participated in a booth, jointly sponsored by IntelliGenetics and BIONET.

• Rutgers/Waksman Institute Workshop. A workshop entitled INTRODUCTION TO BIONET: A National Computer Resource for Molecular Biology was held under the auspices of the Waksman Institute of Microbiology, at the Piscataway campus of Rutgers University, June 17-19, 1985.
There were two parts to the Workshop: a one-day lecture program on June 17, attended by 79 persons, followed by two additional days for 23 people, all of whom attended the first day. The program for this Workshop is shown in Appendix V. The two-day session allowed all attendees access to terminals connected to a DEC-2060 machine at Rutgers running the Core Library software and emulating the BIONET bulletin board and electronic mail systems. The reports from all attendees on their reactions to the training were extremely positive. All left feeling they knew much more about the use of computers in molecular biology in general, and the use of BIONET in particular. The most frequent negative comment was that there was too much material covered in the one-day session.

• NATURE Meeting. The NATURE meeting entitled Update in Molecular Biology was held October 7-9, 1985 in San Francisco. IntelliGenetics and BIONET jointly sponsored a booth at the show.

• BIOTECH ’85. The BIOTECH ’85 International Conference and Exhibition was held October 21-23, 1985 at the Washington Convention Center, Washington, DC. BIONET and IntelliGenetics jointly sponsored a booth at the exhibition.

• International Congress on Computers in Biotechnology. This congress will be held January 30-31, 1986 at the Baltimore Convention Center. A talk will be presented on the BIONET Resource in a session titled “Systems and Resources”. BIONET information will be available at the IntelliGenetics booth set up in conjunction with other, overlapping conferences sponsored this same week at the Convention Center.

• Miami Mid-Winter Symposia. BIONET will sponsor a booth at the Mid-Winter Symposia in Miami, February 3-7, 1986. We are arranging for two training sessions at the meeting, organized around new training materials discussed below.

III.A.4.b.
Some Lessons Learned

The trainings at Stanford late in the first year of our grant, the training at Rutgers/Waksman, and our experience in assisting the scientific community at trade shows and in our extensive scientific consulting all lead to the same conclusion. People, especially those unfamiliar with computers, get very little out of lectures on use of software. Without the ability to use a system under careful guidance, the amount of information transferred is only slightly above zero. There must be terminals and/or PC’s, at least one per two trainees; access to the BIONET software and communication facilities, if not the actual computer itself; and carefully chosen examples to illustrate use of both system and application software. Despite our efforts to write documentation for the new user, it is clear that available documentation and training manuals are useful only after a person has mastered some basic techniques.

III.A.4.c. A New Strategy

We are going to develop a new training program, built around examples of application of our software to problems described in the language of molecular biology. This will differ substantially from our current materials, which are focused on specific programs and what they will do, rather than on a specific problem and how to solve it. Our experience has shown us that the following kinds of topics would cover the questions asked most often (these examples are part of a bulletin that was sent to potential participants at Miami):

• BIONET: FACILITIES AND COMMUNICATIONS
    o What programs and features are available to BIONET users: descriptions of what each is typically used for and how you can access them
    o How to master UNINET
    o How to find important information on the bulletin boards
    o How to keep your directory within allocation
    o How to send electronic mail--including how to find out who else is on BIONET
    o How to make your backspace key work

• ENTERING AND EDITING DNA AND PROTEIN SEQUENCES
    o Using the screen-oriented editors (ESEQ); deciding what type of "terminal" you are for GENED; how to move the cursor in the editor
    o How and when to use ambiguity codes
    o Entering proteins by three-letter codes
    o Creating subsequences out of known sequences
    o Selecting and saving a sequence from the database for your own use

• GENERATING RESTRICTION MAPS--FINDING RESTRICTION ENZYME CUT SITES
    o Listing all or a subset of restriction enzyme cut sites of your sequence
    o Generating restriction maps from fragment size or mobility data
    o Generating restriction maps of a given sequence
    o Creating and using an individualized restriction enzyme list

• CONSTRUCTING VECTORS
    o Locating and using existing maps of common vectors
    o Cleavage and recombination of fragments
    o Generating a cloning vector restriction map
    o Excising fragments to customize recombinant plasmids
    o Testing directional cloning and insertional inactivation in cloning vectors

• ASSEMBLING SEQUENCES TO GENERATE A CONSENSUS SEQUENCE
    o Entering gel sequence information
    o Automatically merging together data from multiple gels
    o Editing consensus sequence--how to propagate changes through to constituent gels
    o Error checking and sequence comparison
    o Handling of both dideoxy and chemical sequencing data

• SEARCHING and ALIGNMENTS
    o How to find out if your sequence is in the database
    o Comparison of your sequence vs. the entire database
    o Comparison of your sequence vs.
taxonomic or some functionally similar partitions of the database
    o Explanation of indirect files
    o How to search for sequences with key words or literature references
    o What alignment methods are available, and which to use when

• OTHER COMMON ANALYSES
    o Searching for optimal regions to design probes
    o Reverse translation
    o Hydropathicity plots (and what each method’s graphs mean)
    o Secondary structure prediction
    o Calculating amino acid composition
    o Translation
    o Searching for dyad symmetries
    o Locating internal repeats
    o Calculating base composition

• FILE TRANSFER
    o How to get your PC to act like a terminal
    o How to get data to and from BIONET

III.A.5. Resource Facilities

There have been several changes in the management of and personnel assigned to the BIONET computer facilities. These changes are summarized in Section III.C, Administrative Changes. The present section is devoted to a description of the current facilities and summary statistics on use of the Resource. The statistics cover the twelve months since our last Annual Report, 12/84 - 11/85.

III.A.5.a. Computer Hardware and Telecommunication Networks

Hardware. The BIONET Central Resource Machine is a Digital Equipment Corporation 2060 computer. The configuration was augmented this year to include an additional RP07 disk drive. Rather than simply providing additional disk space, this drive allows us a fallback in the event of the failure of one of the primary RP07 drives. (This happened during the month of October, 1985, and the existence of the additional RP07 did in fact greatly reduce the necessary downtime.) The primary drives are combined into a single disk structure and must both be functional in order for the system to run. In addition, the third RP07 is used as an additional storage place for files which are not essential in a short-term fall-back operation.
The hardware configuration is as follows:

KL10-E Model R Processor:
    2 MF20/MG20 Memory controllers
    2 MW MG20 Memory
    .75 MW MF20 Memory
    MCA20 Cache Buffer Memory
    2 RH20 Massbus Channels

Console and Front End Processor:
    PDP-11/40 CPU, 32 KW 16 bit memory
    RX02 Dual floppy disk drives
    8 DH11 Terminal interfaces (8 * 16 TTY lines each = 128 lines)
    RH11 Massbus Channel
    LP20 Line printer interface

DN20 Front End Processor:
    PDP-11/34 CPU, 128 KW 16 bit memory
    DMR11 Network interface

Peripherals:
    3 RP07 disk drives, 111 MW each
    RP06 disk drive, 30 MW
    372 MW Total disk storage
    TU78 1600/6250-BPI tape drive
    LP26 600 LPM Line printer
    Imagen Imprint-8/300 Laser Printer

Disk space (data storage). Public structure (PS:) disk space use on the 2060 is dynamic. The following snapshot is representative of typical usage, and is taken from December 1985.

    Total disk space           433,000    (pages--222 million words)
    Overhead/Common           <148,000>   (Core, System and System Support Libraries)
    Swapping Space            < 25,000>
    File system Overhead      < 70,000>   (Directories and index pages)
    Available space            190,000
    BIONET Allocation           95,000    (Half of the available space)
    BIONET Usage 12/85        < 53,000>
    Unused space                42,000    (Available for BIONET growth)

Note that file system overhead varies greatly depending on the size of the files involved. Since BIONET users have many small files, BIONET growth may increase file system overhead, altering the above distribution.

Terminal Lines. Because the usage of a particular terminal line varies greatly, and because many BIONET users share a single line in succession, there was in the past an imbalance in the allocation to BIONET of terminal lines. However, with the departure of the IntelliCorp KSD users from the system (see Section III.C), additional terminal lines were freed for BIONET. These are not regularly needed by BIONET at this time, but may be used intermittently or for growth.
Current system terminal line distribution is as follows:

    Total lines                128
    Overhead                  < 10>   (Shared devices, BCRG staff)
    Available lines            118
    Allocated BIONET            59    (Half of the available lines)
    BIONET Users              < 18>   (Public Data Network, Local Dial-Ups)
    BIONET Staff              <  6>
    Unused lines                35    (Available for BIONET growth, temporary use for trainings, replacement of a bad line before it is repaired)

Public Data Network Connection. BIONET is accessed principally over the UNINET Public Data Network. An X.25 PAD (packet assembler/disassembler) is located on-site. This is known as the Host PAD, or HPAD. It provides individual terminal ports which are cross-connected to those on the DEC-20. The UNINET trunk line operates at 9600 baud synchronously, and the PAD converts this into up to 16 asynchronous ports whose speed is typically 1200 baud. A handshaking protocol is employed to smooth over bursts of data during the multiplexing.

UNINET was originally chosen as a replacement for Telenet because of its better response time and its lower cost. The lower cost was achieved through a very favorable fixed price per port arrangement that we negotiated with UNINET. Currently 12 UNINET host ports are used by BIONET, and usage is monitored carefully in the event more are needed. The ports are accessed in sequence, with those higher in the sequence not being used while any lower port is free. The number of connect hours per month drops off after the first 6 ports. The usage on these first 6 ports therefore represents many more sessions than does the usage of ports 7 through 12. Our monitoring of the port use has also revealed that it would be cheaper for BIONET to lease the higher-numbered ports on a use, or traffic, basis. We are currently leasing 8 ports at a fixed price and 4 on a traffic basis, and will change this distribution as required for the lowest possible cost.

We have been examining the replacement of the UNINET-supplied leased HPAD with a BIONET-owned HPAD.
The consideration is the savings of lease charges while maintaining adequate reliability. We plan to make this replacement before the end of the current grant year.

III.A.5.b. Summary Statistics on Machine Use

The cpu cycles of the DEC-2060 computer are allocated to the user community, including BIONET, by the system’s class scheduler. This scheduler is given the percentage of the machine to allocate to each class of users. Any cycles not consumed by a given class ("windfall") are available to the rest of the user community. This method was chosen so that cpu cycles not consumed by one segment of the community could be used by other segments if needed, i.e., no cpu cycles are wasted if someone needs them.

The current percentage allocations ("pieslices") are shown in Figure III-1. As summarized in the figure, BIONET Class I (and III and IV) are allocated 30% of the machine, and Class II and staff 10%. The 20% overhead (system overhead, batch and computer staff and operations) is allocated one-half to BIONET, for a total of 50%. These allocations remain the same as last year. However, there are substantial changes to the other classes of users for reasons discussed in Section III.C. Note that the BATCH class is assigned 1% of the system during prime time. In off-prime time, the percentage allocation is increased substantially in response to demands by the BIONET community.

The actual use of the machine by the BIONET community is now substantially greater than 50% of the total cpu cycles actually used. As an example, the percentage use of the machine for the month of October, 1985 is shown in Figure III-2. It is clear that BIONET is receiving more than its fair share of the cpu cycles. Note that BIONET scientists’ use of BATCH is charged to the individual accounts by the accounting program. Thus, extensive use of BATCH shows up in this pie chart as BIONET Class I (or II) use, rather than in the category BATCH Jobs.
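The windfall behavior of the class scheduler can be illustrated with a minimal sketch. This is not the DEC-2060 scheduler itself: the function name `allocate`, the rule that spare cycles are shared among still-hungry classes in proportion to their pieslices, and the demand figures are all assumptions made for illustration. Only the 30%, 10% and 20% pieslices come from the text; the remaining 40% is lumped into one hypothetical "other" class.

```python
def allocate(shares, demand, tol=1e-12):
    """Divide one machine among user classes with windfall.

    shares: pieslice of each class, as fractions summing to 1.
    demand: fraction of the machine each class would use if unconstrained.
    Assumption: unused cycles are offered to classes that still want more,
    in proportion to their pieslices. The report does not state this rule.
    """
    # Every class first receives its guaranteed slice, or less if it wants less.
    grant = {c: min(shares[c], demand[c]) for c in shares}
    spare = 1.0 - sum(grant.values())
    while spare > tol:
        hungry = [c for c in shares if demand[c] - grant[c] > tol]
        weight = sum(shares[c] for c in hungry)
        if not hungry or weight == 0:
            break  # nobody wants the windfall; the cycles go idle
        given = 0.0
        for c in hungry:
            extra = min(spare * shares[c] / weight, demand[c] - grant[c])
            grant[c] += extra
            given += extra
        spare -= given
    return grant


# Pieslices from the text; "other" is a hypothetical lump of the remaining classes.
shares = {"bionet_users": 0.30, "bionet_staff": 0.10, "overhead": 0.20, "other": 0.40}
# Hypothetical demand: BIONET users want 60% of the machine, the rest is lightly loaded.
demand = {"bionet_users": 0.60, "bionet_staff": 0.10, "overhead": 0.10, "other": 0.10}

if __name__ == "__main__":
    for name, fraction in allocate(shares, demand).items():
        print(f"{name:<14}{fraction:6.1%}")
```

Under these assumed demands the BIONET user class ends up with twice its pieslice, which is the kind of effect the usage figures in this section describe. When every class demands the whole machine, each simply receives its pieslice.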
The data for BIONET percentage of system use are plotted in histogram form in Figure III-3. This figure demonstrates that BIONET has consumed more than 50% of the total cpu cycles used (data on % of available are given below) on the 2060 since February, 1985, and is now consistently consuming 65 - 75% of the total cpu cycles used on the system.

In the following series of tables and figures, we provide further details on the actual use of the system by the BIONET community. Looking first at use of the system in prime time (8 AM - 8 PM, M-F, PST), data for cpu time and connect hours for the indicated segments of the community are given in Tables III-4 and III-5 by month, with totals. The cpu data in Table III-4 are also plotted in histogram form in Figure III-4. (The figures for the facilities group staff and overhead for November, 1985 are artificially low because the statistics were computed before Thanksgiving weekend, before the end-of-month operator totals were added in.)

There are several important facts that can be determined from these data. Looking first at cpu time, and given that there are about 12,000 cpu minutes (total cpu minus 20% for overhead) available in prime time in the average month for the entire system, BIONET (Users plus Staff) has been consuming well over 50% of available cycles. The category of BIONET Users (Classes I-III) competes for 30% of the machine. This class has consumed more than 30% of available cycles since March, 1985, and has thus been able to take advantage of considerable windfall.
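The comparison against available cycles can be reproduced with a short calculation. The sketch below is illustrative only: the monthly figures are the BIONET Users column of Table III-4, the 12,000-minute figure is the approximation quoted above, and the function names are hypothetical.

```python
# Prime-time cpu minutes for BIONET Users (except staff), copied from Table III-4.
USERS_PRIME_CPU = {
    "December": 769, "January": 2598, "February": 3368, "March": 4236,
    "April": 5169, "May": 6791, "June": 5004, "July": 5575,
    "August": 5132, "September": 4854, "October": 6476, "November": 6135,
}

AVAILABLE_PRIME = 12_000  # approximate prime-time cpu minutes available per month
USER_PIESLICE = 0.30      # share of the machine allocated to the BIONET Users class


def share_of_available(cpu_minutes, available=AVAILABLE_PRIME):
    """Fraction of the available prime-time cycles consumed."""
    return cpu_minutes / available


def months_with_windfall(usage, pieslice=USER_PIESLICE):
    """Months in which the class used more than its pieslice of available cycles."""
    return [month for month, cpu in usage.items()
            if share_of_available(cpu) > pieslice]


if __name__ == "__main__":
    for month, cpu in USERS_PRIME_CPU.items():
        print(f"{month:<10}{share_of_available(cpu):7.1%}")
    print("Above pieslice:", ", ".join(months_with_windfall(USERS_PRIME_CPU)))
```

With a 30% pieslice the threshold is 3,600 cpu minutes, so the first month above it is March and every later month in the table stays above it, in line with the statement above.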
[Figure III-1: Pieslice Allocation of the DEC-2060 Computer. Pie chart titled "System Class Scheduler"; legible slice labels: 7%, 12%, 30%.]

    0 - System Overhead and not-logged-in jobs
    1 - BIONET Class 1 users
    2 - BIONET Class 2 users and BIONET Staff
    3 - IntelliCorp Staff and Customers
    4 - IntelliGenetics Customers
    5 - Computer Staff and Operations
    6 - Batch jobs
    7 - IntelliGenetics Staff

[Figure III-2: Actual Use of the DEC-2060 for the Month of October, 1985. Pie chart titled "Actual Use October 1985 by class"; legible slice labels: 5.60%, 0.10%, 0.10%, 11.10%, 12.00%, 3.80%, 54.20%.]

    0 - System Overhead and not-logged-in jobs
    1 - BIONET Class 1 users
    2 - BIONET Class 2 users and BIONET Staff
    3 - IntelliCorp Staff and Customers
    4 - IntelliGenetics Customers
    5 - Computer Staff and Operations
    6 - Batch jobs
    7 - IntelliGenetics Staff

[Figure III-3: BIONET’s Percentage Use of the DEC-2060, 12/84 - 11/85. Histogram; vertical axis: BIONET Percentage of Total System Use.]

The total number of connect hours, prime time (Table III-5), for the category BIONET Users has remained in the range 1800 to 2200 since May, and the relationship between connect hours and cpu minutes remains relatively constant over those months.

The data for non-prime time (weekends and 8 PM - 8 AM M-F) are shown in Tables III-6 and III-7, and the data on cpu time are plotted in histogram form in Figure III-5. Particularly notable in these data are the dramatic increases in cpu time over the past year, especially in the last three months, due almost entirely to BIONET use. These increases are due primarily to the extensive use of overnight batch runs to perform time-consuming analyses involving database searches, using the IFIND homology and the QUEST database search and retrieval programs. Thus, the community has gravitated naturally toward off-hours use of these programs for such analyses.
Given that there are about 22,000 cpu minutes (total minus 20% overhead) available each month in non-prime time, BIONET (Users plus Staff) has recently been consuming more than 50% of the amount available. Given low use of the system by other classes in non-prime time, BIONET consumes most of the cpu cycles actually used during these times.

The data for total use of the Resource by BIONET are presented in Tables III-8 and III-9 and the total cpu time is summarized in Figure III-6. BIONET Users and Staff, since May of 1985, have consumed 40% or more of all the cpu cycles available on the system (total minus 20% overhead).

One important conclusion from all these data is that the Resource is rapidly approaching saturation. Certainly, during prime time, the system load is becoming a barrier to rapid computation. At this point, limitations on the number of access ports keep the load average under control by limiting the number of concurrent users. However, as we add additional telecommunication ports, we will quickly become limited by available cpu time.

Another important conclusion we have reached from these data concerns the effects of the subscription fees on use of the Resource. The total use by BIONET scientists (not including staff) increased steadily from November, 1984 through May, 1985. In the summer months of June through August, use leveled off, beginning before the subscription fee was announced; we attribute this to summer vacations more than to any effect of subscription fees. Beginning in September, 1985, use increased steadily again to a level substantially above that of the months prior to initiation of the fee.

Summary data for use of our telecommunications network are presented in Figure III-7 by month for the past 12 months’ use of the Telenet (until mid-July, 1985) and UNINET (beginning early July, 1985) networks. Three factors distort this Figure.
First, the value for July is artificially high because we were running the two networks simultaneously and performing extensive tests on UNINET. Second, we noticed

Table III-4: BIONET Prime Time CPU Minutes

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December               769             397             385            1551
January               2598            1054             579            4231
February              3368            1091             644            5103
March                 4236             571             473            5280
April                 5169             861             529            6559
May                   6791             776             515            8082
June                  5004             905             530            6439
July                  5575            1094             564            7233
August                5132            1248             508            6888
September             4854             798             509            6161
October               6476            1330             455            8261
November              6135             473              88            6696
TOTAL                56107           10598            5779           72484

Table III-5: BIONET Prime Time Connect Hours

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December               328             519            1218            2065
January                761            1164            1368            3293
February              1137             829            1340            3306
March                 1206             638             347            2191
April                 1452             764            1353            3569
May                   2177             737            1473            4387
June                  1908             577            1567            3690
July                  2291             916            1661            4643
August                1846             700            1374            3767
September             1777             606            1585            3810
October               2101             763            1688            4537
November              2187             689             156            3032
TOTAL                19171            8902           15130           42290
[Figure III-4: BIONET’s Prime Time Use of the DEC-2060, 12/84 - 11/85. Histogram titled "BIONET Usage during Prime Time in CPU minutes"; horizontal axis: month; vertical axis: cpu minutes, 0 to 9000; series: 50% of Computer, Total BIONET use, All BIONET users (except staff), BIONET staff, Staff and system overhead.]

Table III-6: BIONET Non-Prime Time CPU Minutes

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December               366              91             225             682
January               1673             128             826            2627
February              3848             357             159            4364
March                 4169              26             404            4599
April                 3386             356            1370            5112
May                   6777             206            1300            8283
June                  6567            1129            1415            9111
July                  6956             850            1613            9419
August                5396            1244            1238            7878
September             7056            1192             876            9124
October               9553            1407            1103           12063
November             12326             111              86           12523
TOTAL                68073            7097           10615           85785

Table III-7: BIONET Non-Prime Time Connect Hours

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December               117             159            1751            2027
January                420             145            1749            2314
February               562             208            1680            2450
March                  697             121             142             960
April                  601             149            1859            2609
May                    949             221            2002            3172
June                  1246             194            2210            3650
July                  1197             230            2519            3946
August                 887             192            1843            2922
September             1109             202            2590            3901
October               1213             190            2343            3746
November              1746             173             117            2036
TOTAL                10744            2184           20805           33733

[Figure III-5: BIONET’s Non-Prime Time Use of the DEC-2060, 12/84 - 11/85. Histogram titled "BIONET Usage during Non-Prime Time in CPU minutes"; horizontal axis: month; vertical axis: cpu minutes, 0 to 14000; series: 50% of Computer, Total BIONET use, All BIONET users (except staff), BIONET staff, Staff and system overhead.]

Table III-8:
BIONET Total CPU Minutes

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December              1136             489            1015            2640
January               4271            1182            1407            6860
February              7216            1449            1414           10079
March                 8405             597             877            9879
April                 8556            1217            1899           11672
May                  13568             982            1816           16366
June                 11571            2035            1946           15552
July                 12531            1945            2178           16654
August               10528            2492            1747           14767
September            11911            1990            1386           15287
October              16029            2737            1559           20325
November             18462             585             174           19221
TOTAL               124184           17700           17418          159302

Table III-9: BIONET Total Connect Hours

              BIONET Users    BIONET staff          BCRG &    Total BIONET
            (except staff)                 System Overhead             Use
December               445             678            2969            4092
January               1181            1309            3117            5607
February              1699            1037            3020            5756
March                 1903             759             489            3151
April                 2053             913            3212            6178
May                   3126             958            3475            7559
June                  3154             771            3777            7340
July                  3488            1146            4180            8589
August                2733             892            3217            6689
September             2886             808            4175            7711
October               3314             953            4031            8283
November              3933             862             273            5068
TOTAL                29915           11086           35935           76023

[Figure III-6: BIONET’s Total Use of the DEC-2060, 12/84 - 11/85. Histogram titled "Total BIONET Usage in CPU minutes"; horizontal axis: month; vertical axis: cpu minutes, 0 to 25000; series: 50% of Computer, Total BIONET use, All BIONET users (except staff), BIONET staff, Staff and system overhead.]

that many users were leaving their terminals after completing their work without logging off BIONET, thereby tying up the network port and preventing other users from accessing that port. Therefore, we implemented an “idle zapper” which monitors the cpu use for each BIONET job, sends a warning message after 10 minutes of cpu idle time, and detaches the job after 5 more minutes of idle time. These limits were chosen as a compromise based on comments on the idea from the user community. Thus, an idle job can tie up a port for no longer than 15 minutes.
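The timing rule of the idle zapper can be sketched as a small state machine. This is an illustration under stated assumptions, not the program that runs on the DEC-2060: the `Job` record, the `poll` function, the returned action names and the one-minute polling interval are all hypothetical. Only the 10-minute warning and the detach 5 minutes later come from the text.

```python
from dataclasses import dataclass

WARN_AFTER = 10    # minutes of cpu idle time before the warning message
DETACH_AFTER = 15  # the warning point plus 5 more idle minutes


@dataclass
class Job:
    last_cpu: float = 0.0   # cpu time seen at the previous poll
    idle_minutes: int = 0   # consecutive polls with no cpu use
    warned: bool = False
    attached: bool = True


def poll(job, cpu_now):
    """Check one job; assumed to be called once per minute.

    Returns one of "active", "idle", "warn", "detach" or "detached".
    """
    if not job.attached:
        return "detached"
    if cpu_now > job.last_cpu:
        # The job used cpu since the last poll, so the idle clock restarts.
        job.last_cpu = cpu_now
        job.idle_minutes = 0
        job.warned = False
        return "active"
    job.idle_minutes += 1
    if job.idle_minutes >= DETACH_AFTER:
        job.attached = False  # frees the network port; the job itself survives
        return "detach"
    if job.idle_minutes >= WARN_AFTER and not job.warned:
        job.warned = True
        return "warn"
    return "idle"


if __name__ == "__main__":
    job = Job()
    for minute in range(1, 17):
        print(minute, poll(job, cpu_now=0.0))
```

Any cpu use resets the idle count, so a job that responds to the warning by doing work is not detached.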
The job is still available to the user, who can reattach to it and continue from where he or she left off. The zapper has been very effective in freeing up network ports. Third, the data for October, 1985 are artificially low because of UNINET network problems, which have since been resolved.

III.A.5.c. Computer Software - Core Library

Through our license agreement with IntelliGenetics, we have provided all Core Library software releases to the community. There have been two major releases so far this grant year, and another will occur at the end of January, 1986. One important addition to the Core Library, the DIGITIZER program, was requested by Dr. Yanofsky of our National Advisory Committee. Until recently, access to the software to use a sonic digitizer for entry of gel data (restriction digests, sequencing ladders) has not been possible for BIONET scientists. We have made arrangements to modify the software license agreement with IntelliGenetics, and digitizer access is now possible. A bulletin to that effect has gone out to the community, and a small number of laboratories have purchased the necessary hardware to use DIGITIZER.

III.A.5.d. Computer Software - System Library

During the course of the year, the following additions have been made to the system support library described in last year’s report.

Communication. FINGER--Displays an information message or “plan” optionally provided by a user for other users, advising them of travel itinerary or other contact information, and also displays the date the user in question was last on the BIONET system. WHOIS--Directory lookup program for BIONET investigators. During the course of the year this utility was upgraded to have more generalized search capability and to permit searches of mixed case text. The WHOIS database of BIONET users was extended to include research titles for each PI, to enable other PI’s to identify investigators with similar research interests.
[Figure III-7: Total Telenet and UNINET Network Use, 12/84 - 11/85. Histogram titled "BIONET Network Use"; horizontal axis: month; vertical axis: Connect Hours, 0 to 2000; series: Prime Time, Non-Prime Time.]

File Search. FIND, DFIND--These utilities were added. They permit a concise display of matching text in a file, displaying respectively either a whole paragraph or a single line containing the match. Complicated search keys may be specified and the output can be directed to a file.

System utilities. Several of these were added: TTYINI (sets terminal parameters more accurately by default), CALC (numeric calculations), RTTY (terminal display of a file in reverse order), DIRED and CLEAN (display and hard copy directory management), MPW (suggests good random passwords to users), LASTN (displays last N lines of a file).

Programming Languages. C--We installed the KCC compiler for the C programming language from Stanford University together with a new version of the FAIL assembler.

III.A.5.e. Computer Software - Contributed Library

We have set up a special directory on BIONET, the directory, as a repository for contributed software and databases from the outside community. This directory is protected so that staff and BIONET investigators have access, but other users of the DEC-2060, for example customers of IntelliGenetics, do not. This was done so that persons who wish their software to be accessible only for not-for-profit research have a mechanism to do so. The software and databases that have been contributed are summarized in Subsection III.A.2, above.

III.A.5.f. Database Library

We maintain all major collections of biological sequence data on BIONET, including the GenBank and European Molecular Biology Laboratory (EMBL) nucleic acid sequences, and the Protein Identification Resource database of protein sequences. We also maintain the Cold Spring Harbor database of restriction enzymes (contributed by Dr.
Roberts, see Subsection III.A.2). We provide VectorBank™, from IntelliGenetics, for use in programs designed to model cloning experiments. These databases are updated immediately on receipt from the suppliers. Where necessary for use in Core Library programs, we reformat these databases; the original databases are also maintained in separate directories for programs designed to access those formats.

III.B. Highlights

The following are highlights of the BIONET Resource’s second year of operation.

• 560 PI’s have been granted access to the BIONET Resource, substantially exceeding our early estimates of the size and interest of the community. These scientists have already published over 50 scientific papers in which the Resource played an important role in obtaining the results of their investigations.

• Collaborative research projects have brought to the Resource both new computer software to complement what was available already, and new databases, bibliographies, and computer-readable textbooks. These augmentations of the Resource are widely used by BIONET scientists.

• Our investigations of special computer hardware for rapid searches of nucleic acid and protein sequence databases have led to identification of and writing of specifications for a machine to be used in conjunction with existing facilities. Such a device would dramatically improve such searches, thereby increasing the amount of computer time available to other investigators.

• Our BIONET Satellite program for establishing a loosely-linked network of computers has gotten off to a promising start. The communications and networking facilities of BIONET will be used to maintain electronic mail and bulletin boards accessible by all Satellite resources.

III.C. Administrative Changes

III.C.1. Facilities

In July, 1985, IntelliGenetics acquired control of the DEC-2060 and related facilities from IntelliCorp. Thus, all facilities used by BIONET are now totally under the management of IntelliGenetics.
50% of the machine is still devoted to BIONET as part of the Cooperative Agreement under which we are funded. As discussed previously in Paragraph III.A.5.b, BIONET actually consumes substantially more than 50% of the computer (cpu) time. As part of these changes, the Knowledge Systems Division (KSD) of IntelliCorp separately purchased several DEC-VAX systems and over the period of September and October, 1985, reduced its share of the 2060 from 20% to 4%. The pie chart describing the current percentage allocations of the system to each category was presented previously in Figure III-1.

III.C.2. Personnel

As part of the reorganization of the computer facility, several changes in personnel have taken place. R. David Roode joined IntelliGenetics in the role of Biotechnology Computer Facilities Manager. Mary Yardley became Operations Supervisor and Lauri Kanerva continues as Senior Computer Operator. Andrea Gorman and John Shelton assumed other roles in IntelliCorp. Other major changes in personnel were brought about by budget cuts as discussed at the beginning of this report. We are currently engaged in a search for a BIONET Scientist to assume the role of Ms. Ari Azhir, who has recently left the company. On November 15, Marcia Allen was added as a second Scientific Consultant to the BIONET staff. This was made possible by the improved financial condition of the Resource. This addition was chosen specifically to improve our user consultation services and to enable us to revitalize our training program.

III.D. Resource Advisory Committee and Allocation of Resources

Our methods of allocating resources (staff and computer time) are relatively simple. The DEC-2060 computer uses its windfall scheduler to allocate cpu time to the various categories of users and overhead, as described in detail in Subsection III.A.5. These cpu cycles are distributed on a first-come, first-served basis.
Because of the limited number of communication ports into the 2060 (currently 18, see Paragraph III.A.5.a), we have asked the community not to have more than one person per PI group using BIONET at the same time during prime time. The community has done an excellent job in complying with this policy.

In October, 1985, we doubled the disk space available to each PI’s group. This has eased substantially the problems several groups encountered in managing large sequencing projects or individual databases of sequences.

We have devoted most of our staff time to the Collaborative Research and Service components of the Resource. In the next year we will devote additional staff time to fostering collaborations and to our Core Research activities. This will be possible because the community is becoming more sophisticated in its use of BIONET and we have already augmented our staff in support of the Service component.

On August 21, 1985, we held a meeting with "local" members of the National Advisory Committee (see below), Tom Rindfleisch, Joshua Lederberg and Charles Yanofsky, to take advantage of Joshua Lederberg’s visit to the Bay Area. One question posed at that meeting was how fairly the Resource, in terms of computer cycles, was being distributed to the community. One answer has been obtained from summary statistics on the top 20 users of BIONET over the past twelve months. Our data indicate that 38% of the total cpu cycles were delivered to the top 20 users. This statistic must be interpreted in the light of two important facts. First, the top 20 users on a month-by-month basis show substantial differences, reflecting the nature of computing in the area (significant use of the computer is often followed by additional laboratory studies suggested in part by the computational results). Second, the needs of different PI’s differ substantially.
Those who require frequent access to the databases will use substantially more cpu time than other PI’s who may be doing restriction mapping or assembly of consensus sequences. The former group will use a disproportionate share of the Resource, and this fact is what has prompted us to seek an alternative solution by applying new hardware to serial search, as described previously under our Core Research program (Subsection III.A.3).

The last meeting of our full National Advisory Committee was held March 23, 1985. The next scheduled meeting is February 24, 1986. The current membership of the Committee is as follows:

• Professor Joshua Lederberg, MD, PhD. (Chair), President, The Rockefeller University.

• Dr. Saul Amarel, PhD., Director, Information Processing Techniques Office, Defense Advanced Research Projects Agency, Department of Defense.

• Professor Alan Maxam, PhD., Dana Farber Cancer Institute, Harvard Medical School, Harvard University.

• Dr. Richard J. Roberts, PhD., Senior Staff Investigator, Molecular Biology, Cold Spring Harbor Laboratory.

• Thomas Rindfleisch, MS, Director, Knowledge Systems Laboratory, Department of Computer Science, Stanford University.

• Professor Charles Yanofsky, PhD., Department of Biological Sciences, Stanford University.

• Professor Fotis Kafatos, PhD., Department of Cellular and Developmental Biology, Harvard University.

III.E. Dissemination of Information on Resource’s Capabilities

We discuss two areas related to dissemination of information about the Resource that we have pursued this grant year. The first is interactions with the scientific community through participation at conferences, advertising the availability of BIONET, and mailing information about the Resource to NIH grantees. The second is use of the electronic mail and bulletin board facilities of the Resource itself to keep the BIONET community aware of changes and improvements.

III.E.1.
Community Interactions and Awareness

We have used three methods this year to inform the community about BIONET and to solicit applications for access to the Resource. The first method has been participation at major conferences, where we have presented papers and/or have had booths at exhibitions. These efforts were summarized previously under Training, Subsection III.A.4. At these conferences, we have distributed the standard application packets to scientists, after demonstrating to them the capabilities of the Resource.

The second method is advertising. Due to our limited budget, we have placed only one advertisement this year, in the special computer issue of Nucleic Acids Research, which appears in January, 1986. The text of the ad is provided in Appendix VII.

The third method is a mass mailing to NIH grantees whose research areas were characterizable as related to molecular biology, biological chemistry, and so forth. This list was provided to us by Dr. Charles Coulter, BRTP/DRR/NIH, who obtained the list and associated mailing labels by searching the NIH database of information on research areas of its grantees. The keywords chosen for the search were purposefully general to capture as many potential investigators as possible. Thus, some of the investigators chosen are working in tangential areas, for example, classical genetics. There were approximately 4030 mailing labels from this list. A brochure describing BIONET was sent to each of these investigators on November 10, 1985. The brochure itself is shown in Appendix VIII. The brochure is two-sided, and folded in half along perforations. Persons wishing an application packet need only fill out the requested information, tear off the return half and send it back to us. So far we have received 315 requests for application forms. Many of these returns have also requested information on the BIONET Satellite program.

III.E.2.
Electronic Communications

The electronic communication facilities of BIONET provide another important way to disseminate information about the Resource. In addition, electronic mail and bulletin boards provide a mechanism for scientific and technical interchanges among members of the community. Information on the types of electronic mail communications with BIONET was summarized previously in discussion of the Service component of the Resource (see Subsection III.A.1). We have also established a second mechanism for sharing information electronically, on-line bibliographies. Bulletin boards and bibliographies are discussed in the next sections.

III.E.2.a. Bulletin Boards

The electronic bulletin boards are an important component of the BIONET Resource. They provide BIONET users with a facility for the exchange of data, laboratory techniques and ideas with others of like mind. For example, a laboratory just beginning a study of the conservation of DNA sequences might experience some frustrating technical problems. A message to the MOLECULAR-EVOLUTION bulletin board, describing the problem, will probably reach and be read by more than 1000 BIONET users, some of whom will have experienced similar problems and can offer solutions. Obviously, the users represent a wealth of knowledge. Communication is the key to accessing and disseminating that knowledge.

BIONET’s bulletin board system consists currently of 20 bulletin boards of varied topics. The topics were selected from user requests and from a survey of the most frequently asked questions. We have designed the system so that each BIONET user automatically receives messages that are of concern to all users, but can decide independently which other bulletin boards he or she would like to subscribe to. Subscribing to a bulletin board is an automated procedure which results in the automatic presentation of new messages upon logging onto the system.
However, all users have access to all bulletin boards, whether or not they are subscribers, through the electronic mail facility. The bulletin board topics are as follows:

BIBLIOGRAPHIES
    Instructions for using BIONET’s online bibliographies that have been contributed by members of the BIONET community.

BIONET-NEWS
    Information relevant to all BIONET users. All BIONET users automatically subscribe.

CONTRIBUTED-SOFTWARE
    Reviews/instructions for using the software that has been contributed to BIONET by BIONET users.

EMPLOYMENT
    Postings of job opportunities by BIONET users.

IMMUNOLOGY
    Information/inquiries relating to immunology.

LIBRARIES
    Requests/postings of availability for/of cDNA libraries or codon usage tables.

MOLECULAR-BIOLOGY-LAB-METHODS
    Information/inquiries relating to laboratory techniques.

MOLECULAR-EVOLUTION
    Information/inquiries relating to the study of evolutionary relationships of genes or proteins.

ONCOGENES
    Information/inquiries relating to oncogenes.

PC-COMMUNICATIONS
    Instructions/inquiries relating to using a PC to communicate and to transfer files to and from the BIONET computer.

PC-SOFTWARE
    Reviews/inquiries of software packages for any type of personal computer.

PLANT-MOLECULAR-BIOLOGY
    Information/inquiries relating to the study of plant genetics.

POLITICS
    Concerns/opinions which may or may not be related to research in molecular biology.

PROGRAM-APPLICATIONS
    Instructions/suggestions for using the BIONET programs for special applications or research projects.

PROTEIN-ENGINEERING
    Information/inquiries relating to protein engineering.

RESTRICTION-ENZYMES
    Information/inquiries relating to restriction enzymes.

STARTUP
    For new users, a quick introduction to BIONET including the most frequently asked questions from new users and their solutions.

TOPS20-HINTS
    Instructions/suggestions for managing your directories, copying files, running batch jobs or any other system facilities/commands.

VECTORS
    Information/inquiries relating to vectors and vector construction.

YEAST-GENETICS
    Information/inquiries relating to yeast genetics.

There are about 170 different messages on the bulletin boards, of which 18 are suggestions for lab techniques, 17 involve the trading of libraries, and 21 are reviews of software used for molecular biology research. Surprisingly enough, the most popular bulletin boards are the PC topics, and the least used bulletin board is ONCOGENES.

The number of messages reflects an under-utilization of this resource, which we will address with our more aggressive plan described below. However, interviews with BIONET users and feedback from the reapplication forms indicate that the number of messages on the bulletin boards does not entirely reflect the level of interaction among members of the BIONET community. In many instances, the bulletin boards have served as a catalyst for the dissemination of information across the community.

With limited staff, it was impossible to actively promote the bulletin board communities. However, with the implementation of subscription fees and the hiring of additional personnel, we have begun to select active members of the BIONET community to serve as bulletin board leaders. These scientists will submit and solicit articles, reviews and information from the community and post them on a bulletin board. They will receive special access privileges, and will monitor and update the messages. The bulletin board leaders will also post monthly updates on BIONET-NEWS describing the new messages sent to their bulletin board. This way, every member of the community will have the opportunity to remain informed about collaborations or new developments in molecular biology without having to read messages that may not be in his or her field. It will also eliminate the problem of having messages on multiple bulletin boards.
We feel that with the implementation of our more aggressive plan for bulletin board community leaders, the bulletin boards on BIONET will prove to be an even more important component for the dissemination of information among the molecular biology research community.

III.E.2.b. Bibliographies

In response to a suggestion from the National Advisory Committee, BIONET set up a procedure whereby BIONET users could contribute their personal bibliography files for the use of others. The various format-independent text searching programs available on BIONET make this feasible. The program FIND allows users to search for particular text words or other character patterns in single files and to print out specified amounts of text around the patterns. The program XSEARCH allows users to search many files for patterns, and the program QUEST permits the combined flexibility of both FIND and XSEARCH. The most important aspect of these tools is that they are general, context-independent searching methods so that bibliographic data in almost any format could be searched.

Users were informed of these text searching tools by a BBOARD message and two sample bibliographic files were announced by the BIONET co-investigators. They included the following files:

PS:
  CHROMOSOME.BIB.25     ; References on chromosome structure
  COMPUTER.BIB.30       ; Computer algorithms and methods
  DNA.BIB.5             ; DNA structure and topology
  DROSOPHILA.BIB.16     ; Drosophila molecular genetics
  GENETICS.BIB.17       ; Molecular and classical genetics
  METHODS.BIB.29        ; General laboratory methods
  RESTRICTION.BIB.16    ; Rich Roberts restriction enzyme refs
  TOPOISOMERASE.BIB.48  ; DNA topoisomerase references

PS:
  MUSCLE.BIB.1          ; References on muscle proteins and genes

These files were not greatly referenced, nor was there a lot of feedback from the community. One other user (Tom Broker) volunteered a complete bibliography of work performed on papovaviruses and we are helping him mount this extensive bibliographic database.
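The kind of format-independent search described above for FIND (locate a pattern in a file and print a specified amount of surrounding text) can be illustrated with a minimal sketch. This is not the FIND program itself, whose implementation and options are not given in this report; the function name `find_with_context`, its parameters, and the sample entries are hypothetical and chosen only for illustration.

```python
import re

def find_with_context(text, pattern, context=1):
    """Return (line_number, snippet) pairs for every line matching `pattern`.

    Each snippet holds the matching line plus `context` lines before and
    after it. The text is treated as plain lines, so no particular
    bibliographic format is assumed.
    """
    lines = text.splitlines()
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i + 1, "\n".join(lines[start:end])))
    return hits

# Hypothetical sample text; not taken from any BIONET bibliography file.
SAMPLE = """\
entry one: reference on chromosome structure
entry two: DNA topology and supercoiling
entry three: topoisomerase assay methods
entry four: general laboratory methods
"""

if __name__ == "__main__":
    for line_no, snippet in find_with_context(SAMPLE, r"topo", context=0):
        print(line_no, snippet)
```

Because the search works on raw lines and regular expressions, the same routine could in principle be pointed at files in any layout, which is the property the report emphasizes for these tools.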
The primary problem with this approach is that personal databases are just that: very personal, limited in scope and generally not kept up to date. When scientists want to access bibliographic information online, they usually go to more complete collections such as MEDLINE or DIALOG. What would be much more useful to the BIONET community would be access to full text copies of well known reference works, such as the bibliographies of Drosophila by Herskowitz, Genetic Maps by O’Brien, Genetic Variations of Drosophila melanogaster, etc. Most of these works are not prepared in computer readable form.

Genetic Variations of Drosophila melanogaster

Fortunately, Dr. Dan Lindsley did prepare this extensively used reference work in a computer readable form. It is fondly known as the Redbook and it has been the bible of Drosophila genetics and a primary research resource for 17 years. Recently Dr. Lindsley has undertaken to produce a new edition of this work. Moreover, both he and his publisher have agreed to make both editions of the Redbook available on BIONET for online access.

The advantage of having the book available online is that one can effectively cross index the entire book for any word that appears in the book. For example, using the text searching tools mentioned above, one can find all mutations that affect bristles as well as all suppressors of bristle mutants. This kind of cross indexing is not possible in any other way. Methods for finding all known mutants at any genetic map position or in any region of the polytene chromosome are also possible. While there are Appendices in the back of the Redbook for finding genes by location, online access allows one to have descriptions of all genes that are similar in 1) location or 2) function, or 3) that interact with each other.
We have made the Redbook available both in chapter form, identical with the chapters as they appear in the book itself, and in two large sections of the book including all the point mutations (in the file MUTANTS) and all the chromosome rearrangements (in the file OTHERS). These latter files aid in finding mutants using the FIND program. We have just received a tape containing initial chapters of the new version of the Redbook and we intend to make this available in the same way, with updates as the work is completed. Currently the new version is about 60% complete.

Cloned Segments of the Drosophila Genome

In addition to Genetic Variations of Drosophila melanogaster, Dr. John Merriam (UCLA) has compiled a list of all the molecular markers on the Drosophila chromosomes. This compilation includes all cloned segments of Drosophila DNA that have been mapped to specific genetic positions as well as all rearrangements whose break points have been cloned. These cloned segments are extremely useful to molecular biologists who wish to isolate specific genes from Drosophila using the walking procedure. Moreover, this compilation will eventually develop into a complete molecular map of the Drosophila genome to complement the genetic one. We are working with Dr. Merriam to provide this useful resource on BIONET.

Once both the new edition of the Genetic Variations of Drosophila melanogaster and the Cloned Segments of the Drosophila Genome are available, this will be written up for publication as letters to the editors of various molecular and genetic journals. A description of the databases will be submitted to Drosophila Information Service, where most geneticists currently go for this kind of information. Once these works are made available and properly announced we will evaluate their usefulness to the community as judged by the number of read accesses and by how many individuals access these databases.
We will also determine how many new users apply for BIONET use primarily to access these genetic databases. Currently WHOIS reports that there are 14 laboratories that mention the word Drosophila in their research title. There are probably many other laboratories that are concerned with Drosophila as a research organism but that do not have Drosophila in their research title (e.g., HOGNESS).

We can also consider recruiting other genetic databases which currently exist, or shortly will, in a computer readable form. Some examples would be the E. coli genetic map (BLATTNER) and the human genetic maps (RUDDLE and WHITE). These extensive genetic and restriction maps are a natural complement to the DNA and protein sequence information that BIONET now provides and we hope that these higher order rearrangements of genetic function will be as useful in the future as are the sequence databases at the present.

III.F. Suggestions and Comments

We have two suggestions that would dramatically improve the productivity of the BIONET Resource, and would increase its availability and utility to the scientific community.

The first suggestion concerns the relationship between the Resource and the NIH staff. We have, in general, received a great deal of support from staff at the Office of Grants Management and BRTP itself. However, we simply must have more warning regarding decisions at the federal level that affect our budget. The dislocations we experienced in the first months of this grant year were substantial, and much time was wasted on administrative, as opposed to scientific, problems. The NIH is in a delicate position, in that it cannot afford to alarm its awardees about potential budget cuts that might not take place. But the downside risks of this approach, in our opinion, are more dangerous. We know that there may be additional NIH budget problems in fiscal 1986.
We request that we be kept informed, even if information is tentative and subject to change, about any decisions that could affect our Year 3 and subsequent awards.

The second suggestion relates to the fact that there is an increasing number of computer resources for molecular biology and related areas funded through DRR alone. These resources make up a significant fraction of the BRTP/DRR budget, yet each has been funded through the traditional grant proposal and review process. What has resulted is a number of resources with complementary goals and facilities that have no means of communication with one another to easily exchange their stock in trade: electronic mail, new software and updated databases. Our program for BIONET Satellites is designed to take some steps toward network interconnections, but we have neither the budget nor the mandate to solve this problem alone. Obviously, a proposal can be generated for the necessary funds to build such a network, but it is not clear whether there would be sufficient new science for it to be successful. The networking problem has already been solved by others, so any proposal would be primarily technological. We suggest that DRR Staff and Council, who must already be cognizant of the situation, investigate and propose some future programmatic goals. We would be pleased to participate in this effort.

I. Letter to BIONET Scientists

BIONET™
National Computer Resource for Molecular Biology
c/o IntelliGenetics
124 University Avenue
Palo Alto, CA 94301

14 June 1985

***** IMPORTANT ANNOUNCEMENTS ABOUT BIONET *****

SUBSCRIPTION FEE
SWITCH to UNINET
BULLETIN BOARDS
BATCH JOBS

Dear BIONET Scientist:

I am writing this letter to bring you up to date on a number of changes that will be occurring in the near future on the BIONET Resource. By most measures, our first year of operation has been very successful.
We currently have more than 450 Principal Investigators, representing over 1400 individual scientists, who have received approval for access to BIONET. In late March, our National Advisory Committee met with us to review progress and set directions for the next year (we are on a March 1 to February 27 funding cycle). Some of the changes that will be occurring are an outgrowth of that meeting, especially with regards to funding problems, communications and the use of bulletin boards and electronic mail.

As most of you know from a bulletin on BIONET (you can review the bulletin by issuing the MM command BBOARD BIONET-NEWS and reading the bulletin entitled "Budget cuts for BIONET") budget cuts at the NIH have led to substantial reductions in our budget for this grant year. For the past three months, with the help of consultants and our Advisory Committee, we have been studying solutions to the problems posed by these cuts. To reiterate, we have had to reduce staff from 8.5 full time equivalents to about 5.5, and we project that our telecommunications budget will be consumed late this fall unless some prompt action is taken.

The following sections on Subscription Fee and Switch to UNINET represent our best solutions to these problems, solutions that will enable us to continue to provide you with high quality service and to build the Collaborative and Core Research aspects of BIONET. Most importantly, the changes mean that you will not have to pay the full cost of your telecommunications. This was crucial to us in order that those laboratories that are suffering from funding problems still be able to access BIONET.

BIONET c/o IntelliGenetics, Inc. 124 University Avenue, Palo Alto, California 94301-1675 Telephone (415) $24-GENE

Subscription Fee

There are several possible solutions to the problems of staff reductions and a substantial projected deficit in our telecommunication budget.
We have initially focused our attention on the latter problem, because without telecommunications access, there would be no Resource! Many solutions to this problem would have the effect of making access prohibitively expensive for those laboratories with limited resources; such laboratories are often those that need access to BIONET the most. Fortunately, we have found what seems to be a good compromise. With the concurrence of our National Advisory Committee, we are instituting a nominal subscription fee to BIONET, which, when combined with available funds, will make up the deficit. The fee will also allow us to expand our telecommunication services as the BIONET community grows, by increasing the number of network lines to the computer.

We regret having to impose a fee on such short notice. Our long range plans indicated that some type of access charge would have to be levied eventually, but we had not anticipated it happening this quickly. We received notice of our budget cuts on February 26, three days before the beginning of our second year. Since that time we have tried unsuccessfully to have the cuts rescinded, but the NIH, Congress and the OMB are still at odds over the 1985 NIH budget and no rapid solution appears to be forthcoming. We are also exploring other funding sources, but have no assurances at this time that any additional money will be available.

To cover our costs, the fee we must request is $400 per year per Class I Principal Investigator (PI). We hope that virtually every PI on BIONET will be able to afford this nominal expense. This fee should be received by us on or before August 1, 1985. Checks should be made payable to IntelliGenetics, marked clearly as BIONET Subscription Fee, and attached to the enclosed Subscription Fee form. Class I PI’s in Alaska, Hawaii, Canada and Mexico must pay the subscription fee because their telecommunications costs will be billed to BIONET by UNINET.
PI’s in other foreign countries, who will be billed separately by their network vendors, will NOT have to pay a subscription fee.

Switch to UNINET

We will soon be moving over to a new vendor for telecommunications services, UNINET. The primary reason for this change is that UNINET has offered very attractive pricing for use of BIONET. They will be charging us a fixed rate per month per communications line, independent of use. This charging scheme means that we can do much more precise planning for our current and future budgets. In addition, other organizations that have recently switched from Telenet, our current vendor, to UNINET report better network response and reliability, plus more rapid action on problems with the network. We will still pay careful attention to minimizing your need for long periods of connection to BIONET. In particular, we are working on simplifying BATCH submissions (see below) so that you can logout from BIONET, leaving the line available for another person to use. We have also implemented a procedure that will detach your job after a period of non-use (currently two hours) should you accidentally forget to logout your job.

This switch will take place on or about July 1; after that time Telenet will no longer be available for access to BIONET. You will receive a mailing from UNINET with complete instructions on use of the network prior to that time. You will be reminded by bulletins on the BIONET-NEWS bulletin board when you login about precise schedules.

Use of UNINET is even less complicated than Telenet. The primary difference will be in the telephone number that is dialed plus slightly different recognition characters to "wake up" the network. Depending on location, and because UNINET’s distribution of local nodes is different from Telenet’s, you may have to call a longer distance for access, or you may be lucky, and find a UNINET node nearer to you!
Bulletin Boards

The primary use for bulletin boards is the sending of messages to a group of people interested in specific topics. It is often much simpler to send a bulletin than to send electronic mail because for the latter you must maintain an accurate mailing list. The BIONET system keeps track of who wants to read which bulletin boards, thereby saving you considerable effort.

Our current system is based on the bulletin board topics you selected as part of your application to BIONET. These topics are automatically added to your LOGIN.CMD file as part of setup of your account on BIONET. Everyone has BIONET-NEWS in this file. You are notified automatically when you login that you have bulletins to read, and are given their titles and a choice to read them or to ignore them.

We have had some successful use of the bulletin boards, but also several disappointments. On the positive side, there have been several valuable exchanges among members of the community, and several of you have expressed an interest in being a community leader, albeit on topics more specific than the current, generic ones. On the negative side, many people are ignoring important bulletins by skipping over the reading of them on login to BIONET. Also, the current topics are too limited to inspire use of the bulletin boards. For these reasons, we are going to change our approach somewhat, to include the following steps:

• We will use a new list of topics, selected from your suggestions and current patterns of use of the bulletin boards. Periodically, we will add new topics to the list based on your suggestions. This will make the set of topics more useful to you and your colleagues.

• We will revise your existing LOGIN.CMD file to contain only the BIONET-NEWS bulletin board, and make it simple for you to add other topics at your discretion.

• We would like BIONET-NEWS to be read by EVERYONE, all the time.
We will continue to encourage you to post bulletins that are of wide community interest on BIONET-NEWS. To prevent the occasional user of BIONET from being overwhelmed by bulletins on this bulletin board, we will periodically move outdated bulletins to topical bulletin boards, or delete them, so that only a few important and timely bulletins remain to be read.

Attached to this letter you will find a revised list of bulletin boards that represents a start to meeting some of our goals, together with instructions on how to add one of the new topics to your LOGIN.CMD file so that you are notified about interesting subjects when you login.

BATCH Jobs

Many members of the community have asked how they can submit lengthy computations to the system and logout and go away, to return at some later time to review the results. That capability is available through the BATCH system on the BIONET computer. Attached to this letter are some simple instructions on how to SUBMIT such computations to run in BATCH overnight. We hope they are useful to you.

In conclusion, I want to thank you for your participation in what has been a successful experiment in making powerful computational facilities available to a very large number of scientists. Your enthusiastic contributions to this success are appreciated, and we at BIONET hope they will continue unabated despite the budgetary problems discussed previously.

Sincerely,

Dennis H. Smith
Resource Manager, BIONET

II. Justification of $400 Fee

The following message was sent by electronic mail to Dr. Rich Roberts, a member of BIONET’s National Advisory Committee, in response to questions from him on the justification for setting the fee at $400.
7-Aug-85 12:17:13-PDT,3352;000000000001
Mail-From: SMITH created at 7-Aug-85 12:16:59
Date: Wed 7 Aug 85 12:16:58-PDT
From: Dennis Smith
Subject: Re: charges
To: ROBERTS
cc: AMAREL, KAFATOS, LEDERBERG, MAXAM, RINDFLEISCH, YANOFSKY, SMITH
In-Reply-To: Message from "Richard J. Roberts" of Wed 7 Aug 85 06:34:44-PDT

Rich,

The figure was determined by taking a hard look at our budget to see what was needed and then to estimate, based on the profile of accepted PI vs. those who actually used the system, how many would sign up.

In brief - we need about $50,000 to meet telecommunications costs, i.e., to avoid going into the red. We desperately need another consultant and a scientist, the latter for core research, and that would cost 50-60K depending on obvious factors. Given a target of roughly $100K, we’d need 250 PI’s signed up at $400 each. Our estimates out of a group of 450 PI’s was that 175 - 200 was a good guess. The first 125 would pay our projected deficit and the remainder MIGHT allow us to hire another user consultant, our highest priority.

The message to the community soliciting comments yielded only a few comments, almost all of which were essentially, "Siggghh, but it’s worth it" and two complaints. Since the fee was announced, we have received nearly the 125 target of checks or requests for invoices, and 8 refusals. None of the eight was a significant previous user of the system. I have made it clear that we would not turn anyone away for lack of funds. There are two people in this situation, and I will not cut them off if they can’t raise the full amount. One was the type of person you described, who was mainly interested in communication. We give people at least 30 days after subscription is due, and a much longer period if they let us know about their problems, so they have at least 60 days and more if they need it.

So, those are the current facts.
An issue remaining from your message is treatment of those who wish only access to communication facilities, and that should certainly receive some discussion. There are several mechanical problems with this, but, more importantly, I have been discouraged by the community's response to the communication facilities. We have excellent use of electronic mail, but only sporadic use of bulletin boards. The reality of the last 8 months of reasonable use of the system is that lots of people are banging away at it to solve their problems, and use mail and links, and advising, and phone calls to generally very good purposes - they have some really interesting problems that involve sophisticated use of the programs, i.e., little chaff or nonsense. But it seems to be a cottage industry out there, and despite our efforts to simplify further the bulletin boards and our letter and messages to demonstrate their use and make it easy to add and promote new topics, only that segment of the community with previous "electronic" experience is making any use of those facilities.

Comments and suggestions welcomed.

Dennis

PS - I am trying to get in touch with Lederberg, who is now in CA, to schedule at least a local meeting of NAC members and the next full meeting. Do you have any constraints over the next few months?

III. Letter on Class IV Access

BIONET™
Important Notice on Reduced Subscription Fee for Communication

Dear BIONET Scientist:

I am writing to you because our records indicate that you either have not logged in to BIONET or that you have used the system but have not chosen to pay the subscription fee we had to implement late this summer. Your account still exists on the system and will remain there temporarily. It is not currently accessible by you pending further information about your intentions to remain a participating member of the BIONET community.

We know that the current funding situation is as serious for many of you as it has been for BIONET.
Many scientists have made special arrangements with us to maintain access to the system pending availability of new grant funding, institutional support and so forth. We want to be sure that you understand that we will not prevent access to BIONET simply due to lack of funding. If you have special circumstances and need access to BIONET to support your research, I encourage you to contact Mary Warner or me at (415) 965-5576 or -5577.

One of the goals of BIONET is to foster active communication and collaboration among molecular biologists. Therefore, we would like everyone to be able to access at least the electronic mail and bulletin board facilities, even if access to analysis programs is not desired. To that end, we have instituted a new class of BIONET membership, Class IV, which offers access to electronic mail, electronic bulletin boards and file transfer programs. These facilities will allow you to interact and exchange information with your colleagues on BIONET. Because such interaction does make use of our telecommunications facilities, we must also charge a subscription fee for access, but at the reduced rate of $100 per Principal Investigator per year. We have included a Subscription Fee Form which lists this option.

We very much hope that you will choose at least this method for participating in the BIONET community.

Sincerely,

Dennis H. Smith
Resource Manager, BIONET

IV. Reapplication Form for BIONET Access

BIONET™
26 September 1985

Dear BIONET Principal Investigator:

The National Institutes of Health requires, as part of our Annual Report, that we review the status of all BIONET subscribers each year. Thus, we need information from you on any changes from your original application with respect to institutional affiliation, address, funding status, sub-investigators, etc. Most importantly, we need a list of all publications in which BIONET played a role. We also ask that at this time you reaffirm your original agreement for access to BIONET.
Use the enclosed forms to:

• Update your title, affiliation, mailing address and/or phone number
• Update the list of your sub-investigators
• Note change in status of funding
• Provide a list of current publications resulting, in part, from the use of the BIONET Resource (Remember to cite the BIONET Grant # 1 U41 RR-01685-02 in all such publications.)
• Provide a brief description of how BIONET was used in your research.

We would also like to give you this opportunity to comment on the BIONET resource - what role it is playing in your research and any suggestions/requests for improvement.

Because we must prepare our Annual Report in December, we need you to return this re-application to us no later than November 1, 1985. Thank you for your cooperation.

Sincerely,

Mary Lou Warner
BIONET Administrator

BIONET RESOURCE
Reapplication
Fiscal 1986

Principal Investigator (full name and title):
Affiliation: Department, School and Institution (Changes Only):
Mailing Address (Changes Only):
Area code and phone number (Changes Only):

BIONET Agreement

As Principal Investigator of this grant to use the BIONET Resources, I agree to adhere to all conditions and restrictions for use of the BIONET Resource, as described in the document "The BIONET™ Resource, Description and Applications Form" and such further regulations as may be issued from time to time by the NIH or the NAC. The BIONET Resource will not be used for any commercial purpose which is not specifically identified to and approved by the NAC. Any pertinent change in sponsorship, continuity of grant support, or use made of BIONET will be reported promptly to the BIONET Resource Manager. I have also furnished a copy of this re-application to the responsible officer of my institution, whose signature appears below.

I also assume full responsibility for all users listed on this application form and will monitor their compliance with the conditions and restrictions for access to the BIONET Resource.
I will inform the BIONET Consultant (electronic mail address: BIONET), by electronic mail, immediately about any changes in this group of users, i.e., departure of an existing user or addition of new staff qualified to use the resource. I will inform new users of the above-mentioned conditions and restrictions.

Date:
Name of Official:
Signature of Principal Investigator:
Signature of Official:

P.I. Name:
BIONET Re-application, Page 2

Current Sub-investigators

If your group of sub-investigators has changed since your last correspondence, please note the changes below:

Name    Title    Phone number    Change

Funding Status

Please note any change in status of your funding, including Institution, Grant Number, title of grant, and duration of grant.

Current Publications

Please list current publications resulting, in part, from the use of the BIONET Resource (use standard bibliographic format). Remember to cite the BIONET Grant # 1 U41 RR-01685-02 in all such publications. A sample citation would be:

Computer resources used to carry out our studies were provided by the BIONET™ National Computer Resource for Molecular Biology, whose funding is provided by the Biomedical Research Technology Program, Division of Research Resources, National Institutes of Health, Grant # 1 U41 RR-01685-02.

P.I. Name:
BIONET Re-application, Page 3

Use of BIONET

Briefly describe how BIONET has been used in conjunction with your research:

COMMENTS

We invite your comments, suggestions and requests about the BIONET Resource. Which programs are the most useful to you - the least? Should the bulletin boards be broader in scope - more specific? Would you like more interaction with other users? What else would you like to see included in the BIONET Resource, for example, other computer programs? Would you like more information about the BIONET Satellite Resources?

V.
Program for the Rutgers/Waksman Workshop

INTRODUCTION TO BIONET™: A National Computer Resource for Molecular Biology
One-Day Workshop: June 17, 1985
Three-Day Workshop: June 17-19, 1985
The Waksman Institute of Microbiology
Rutgers, The State University of New Jersey, New Brunswick

ABOUT THE WORKSHOP LEADERS

Douglas Brutlag, Ph.D., Associate Professor of Biochemistry at Stanford University. Dr. Brutlag, coinvestigator of BIONET, is an authority on nucleic acid enzymology. His work has focused on the evolution of nucleic acid sequences and the assembly of chromosomes. Dr. Brutlag has also had a long-term interest in computer systems and programming and was a coinvestigator of the MOLGEN project at Stanford.

Laurence Kedes, M.D., Professor of Medicine at Stanford University School of Medicine. Dr. Kedes, coinvestigator of BIONET, was also a coinvestigator of the MOLGEN project and has been responsible for overseeing the development of several software projects in molecular biology. As a molecular and cellular geneticist, Dr. Kedes spends the majority of his research efforts investigating the nature of gene organization and regulation in animal cells.

Elaine Mansfield, Ph.D., Training Manager and Consulting Scientist to BIONET, IntelliGenetics, Inc. Dr. Mansfield has worked for IntelliGenetics, Inc.,
for two years in [illegible] Customer Training and Support. She has conducted [illegible] on-site trainings and written user manuals for several IntelliGenetics programs. Dr. Mansfield currently oversees the training programs and [illegible] support services for BIONET. She holds a Ph.D. in human biochemical genetics from the University of California at Berkeley.

PURPOSE OF THIS WORKSHOP

BIONET is a national computer resource sponsored by the N.I.H. and established to provide academic scientists with an interactive timesharing computer, up-to-date sequence databases and analysis programs, and powerful communication tools for rapidly exchanging information with colleagues. This one- or three-day workshop will provide participants with an understanding of the programs, database organization, and advanced computer resources available on BIONET. On the first day, the workshop leaders will use video projection systems to show actual on-line interaction with protein and DNA sequence analysis programs, electronic bulletin boards, and electronic mail facilities. A more in-depth presentation will be conducted as a two-day hands-on session for BIONET principal investigators or their representatives. Applications presented will include how to use the computer for efficiently managing DNA sequencing projects, sequence comparison methods, and optimal probe design. In addition,
the procedure of transferring data from a personal computer to the BIONET computer will be demonstrated.

PROGRAM CONTENT

• An overview of BIONET and the computer system
  - Resource organization and goals
  - Core program library, sequence databases, and development library
  - Introduction to electronic mail and bulletin boards: communication tools for collaboration
  - System commands and directory organization
• Restriction mapping tools
  - Restriction fragment length calculation
  - Construction of restriction maps from enzyme digests
  - Strategies for generating large restriction maps
• Simulation and design of recombinant DNA experiments
• Sequence entry, verification and editing
  - The Generic Editor
• Managing large DNA sequencing projects
  - Methods for eliminating vector sequences from your gels
  - Customization to chemical or dideoxy sequencing methods
  - Strategies for assembling multiple gels and generating reliable consensus sequences
• Sequence database organization and searching methods
  - Comparing NIH GenBank, EMBL, and NBRF databases
  - Rapid sequence alignment and similarity searches
  - Sequence retrieval using exact or ambiguous patterns
• Sequence analysis programs
  - Nucleic acid sequence analysis, comparison and manipulation
  - Amino acid sequence analysis, comparison and manipulation
• Sequence comparison methods
  - The algorithms, speed, precision and limitations
• Distributed processing: moving data to and from the BIONET computer

Schedule:
8:30 AM         Registration and coffee
9:00-noon       Morning session
1:30-4:30 PM    Afternoon session

LOCATION

Waksman Institute of Microbiology
Rutgers, The State University of New Jersey
Hoes Lane and Frelinghuysen Road
Piscataway,
New Jersey 08854-0759

WHO SHOULD ATTEND

This session should be of interest to all BIONET recipients. In addition, principal investigators involved in molecular biology research would benefit from learning about the resource. IntelliGenetics commercial customers and other molecular biologists are welcome to attend the one-day session. Attendance at the two-day hands-on session is designed primarily for BIONET recipients and may be limited to one representative per laboratory group.

WORKSHOP LEADERS

Douglas Brutlag, Ph.D., Department of Biochemistry, Stanford University
Larry Kedes, M.D., Department of Medicine, Stanford University Medical Center
Elaine Mansfield, Ph.D., BIONET, IntelliGenetics, Inc., Palo Alto, California

REGISTRATION AND FEE

The registration form and fee must be received in advance of the program date. This fee includes course admission, class materials, and beverage breaks. For additional information or to reserve a space, contact Selma Ginterman,
Director, Continuing Professional Education Program, at 201/932-4258 between 1:00-4:00 PM EST.

CONTINUING EDUCATION UNITS

CEUs are awarded to participants in this program. The CEU gives formal recognition to persons continuing their education and keeping up-to-date in their chosen field or profession.

ATTENDANCE

Attendance is limited for both workshops. The three-day session will be limited to 30 participants. Please register as early as possible.

Course Fee:
June 17, 1985       One-Day Workshop: $100
June 17-19, 1985    Three-Day Workshop: $450

Check workshop desired: [ ] One Day  [ ] Three Day

Position/Title:
Company/Institution:
Address:
City:            State:            Zip:
Telephone:

How did you learn about this course?
[ ] Direct Mail  [ ] From Supervisor  [ ] From Posted Material  [ ] Other

Other workshop topics I would be interested in:

Please keep me on the mailing list: [ ] Yes  [ ] No

Also add the following persons to the mailing list:
Name:
Company/Institution:
Address:
City:            State:            Zip:

Please make check payable to Rutgers, The State University of New Jersey. Return form with payment to the Continuing Professional Education Program, Waksman Institute of Microbiology, P.O. Box 759, Piscataway, New Jersey 08854-0759.

PLEASE POST

VI. Descriptions of the BIONET Satellite Program

I am writing because you have expressed interest in learning more about our new program for BIONET Satellites. As you undoubtedly realize, the community requests for access to BIONET are already pushing the resource to its limits during the middle of the day, and applications for access continue unabated. The response from the community of molecular biologists was expected, given our past experiences with the community's need for access to computer software. What was not expected was that the Resource would approach saturation after only six months of operation!

The issue was the topic for discussion at the March 23, 1985 meeting of BIONET's National Advisory Committee (NAC). At that meeting,
we proposed a program for BIONET Satellites that would essentially distribute the BIONET Resource to a number of additional sites. In other words, rather than trying to enlarge the existing central timesharing computer, we are trying to take advantage of a large installed base of computers accessible to molecular biologists. This approach to expanding BIONET reflects the changing emphasis in use of computers throughout the scientific community, from centralized to more distributed systems. The NAC was supportive of this approach because it offers the possibility for a rapidly growing network of computers and cooperating scientists.

At the same time, the NAC and the BIONET staff agreed that distribution of the Resource carries with it the danger of isolation of Satellites from the rest of the community. Therefore, an integral part of our proposal is to extend the collaboration and communication possible on BIONET to the Satellites. In this way, access to electronic mail and bulletin boards, and file transfers of programs and data, will maintain communication within the community.

In a separate letter from Dr. Michael Kelly, General Manager of IntelliGenetics, the BIONET Satellite program is described in more detail, including information on the requirements for full participation as a Satellite. I am hoping that you will become an integral part of BIONET's efforts to make available to the community the latest in analysis programs and sequence data.

Sincerely yours,

Dennis H. Smith, Ph.D.
Resource Manager, BIONET

IntelliGenetics, Inc.
An IntelliCorp Company

Dear BIONET Scientist:

The BIONET™ computing resource, a cooperative agreement between IntelliGenetics and the National Institutes of Health, is now in its second year of operation. The success of this program is now beginning to outdistance the available resources, with over 1200 investigators currently using the system.
In order that we may serve the molecular biology community with more of the capabilities of BIONET™, we have instituted a new and exciting program, Bionet Satellites.

Bionet Satellites are designed to provide all of the functionality of BIONET at the local level. Utilizing existing Digital Equipment VAX or 2060 computers on your site, all of the programs, bulletin boards and electronic mail functions of BIONET™ will be available. The existing service capability of Bionet Satellites is available to users now. It will reflect the three major goals of BIONET:

• To provide computational assistance in data analysis and problem solving to molecular biologists and researchers in related fields.
• To serve as a focus for development and sharing of software.
• To promote rapid sharing of information and collaboration among a national community of scientists.

Messages. Collaborative research and other community interactions depend on electronic mail and bulletin boards. Exchange of messages across a distributed network is required to maintain this important aspect of BIONET.

File Transfer. Rapid sharing of software and data among the Satellites and the central 2060 is required to ensure that investigators have access to the latest programs and DNA and protein sequence data. Up- and down-loading of files among computers on the network will make this possible.

In designing our plan for BIONET Satellites, we are paying special attention to communication. We have identified communication as an important Core Research project for BIONET. We have begun implementing both short and long range plans to accomplish this goal:

• Short Term. We will take advantage of existing communication software and our telecommunication network, augment it for our purposes, and use it for low bandwidth transfer of messages and files among the computers comprising a distributed BIONET.
This software will be made available to Satellites to allow relatively transparent communication with the 2060 and other Satellites.
• Long Term. Longer term, the major barrier to high bandwidth, automatic transfer of messages and files is the current lack of access of the BIONET community to existing networks (ARPANET, CSNET) that support such activities. In other words, the technology exists, but the community and many potential sites for Satellites do not have ready access to it. It is our goal to make that technology available to a distributed BIONET. We are currently exploring funding resources and existing networks that will allow us to meet that goal.

We also plan to link BIONET with other National Resources, including GenBank™, the Protein Identification Resource (PIR), the newly-established Molecular Biology Computer Research Resource at Dana Farber, and other national and international resources. For example, it would be possible to use the communications facilities described above to obtain new programs from Dana Farber, use them in concert with the latest GenBank data and complementary analysis software on BIONET or one of its Satellites, and forward resulting sequence data directly to GenBank or the PIR. Your participation as a BIONET Satellite in this enterprise will allow you to take immediate advantage of these facilities.

A Bionet Satellite can be installed on your site in a short period of time. The BIONET staff will provide training to assist your colleagues in using the core group of genetic engineering programs. Accessing this service only requires the purchase of a software license from IntelliGenetics. A special purchase program has been arranged to make it easy for academic institutions such as yours to join the Bionet Satellite program.
The cost of the license for this service is:

DEC MicroVAX II (DH-630Q4) - $20,000 per year for three years
DEC VAX 11/750 or larger - $20,000 per year for three years
DEC 2060 - $24,000 per year for three years

Assuming 50 users on the VAX or 60 users on a DEC 2060, the cost per user is $400 per year. This price includes the update of databases and upgrading the software at least six times a year. At the end of the three year period the software will be maintained by purchase of a maintenance agreement, which is currently $6,000 per year.

If you would like to join this extremely useful computing resource, please write to me.

Michael J. Kelly, Ph.D.
General Manager
IntelliGenetics, Inc.
1975 El Camino Real West
Mountain View, CA 94040-2216

We will give your application immediate attention and provide advice in setting up your Bionet Satellite. Thank you for your interest in this program. We look forward to providing you the ultimate in computing and communication for molecular biology.

Sincerely yours,

Michael J. Kelly, General Manager
IntelliGenetics, Inc.
October 1985

VII. Text of Advertisement to Appear in Nucleic Acids Research

IntelliGenetics INVITES YOU TO JOIN BIONET
AN N.I.H. COMPUTER RESOURCE FOR MOLECULAR BIOLOGY

OVER 1500 SCIENTISTS ALREADY BENEFIT FROM:

• ACCESS TO THE LATEST DATABASES OF SEQUENCES
• ACCESS TO COMPUTER PROGRAMS
  - TO ENTER AND ANALYZE SEQUENCES
  - TO COMPARE SEQUENCES TO DATABASES
  - TO HELP PLAN CLONING EXPERIMENTS
  - TO PROVIDE TOOLS FOR PROGRAM DEVELOPMENT
• ACCESS TO HUNDREDS OF YOUR COLLEAGUES AND THEIR RESEARCH RESULTS USING ELECTRONIC MAIL AND ELECTRONIC BULLETIN BOARDS

CONTACT US TODAY...

BIONET™
c/o IntelliGenetics, Inc.
An IntelliCorp Company
1975 El Camino Real West
Mountain View, California 94040-2216
Telephone (415) 965-5575

VIII.
The BIONET Brochure Mailed to NIH Grantees

JOIN US. AN N.I.H. COMPUTER RESOURCE FOR MOLECULAR BIOLOGY

BIONET™
c/o IntelliGenetics, Inc.
An IntelliCorp Company
1975 El Camino Real West
Mountain View, California 94040-2216
Telephone (415) 965-5575

Dear Scientist:

Your name is included in a list of National Institutes of Health grantees given us by the NIH grant management staff. This indicates that your research may be in MOLECULAR BIOLOGY or a related field involving ANALYSIS OF PROTEIN AND NUCLEIC ACID SEQUENCE DATA.

If you would benefit from:

• Access to the latest databases of sequences
• Access to computer programs
  - to enter and analyze sequences
  - to compare them to databases
  - to help plan cloning experiments
  - to provide tools for program development
• Access to hundreds of your colleagues and their research results via electronic mail and electronic bulletin board facilities

return this card for an application to the BIONET Resource.

BIONET™ is a national computer resource sponsored by the National Institutes of Health and established to provide scientists at non-profit institutions with an interactive timesharing computer, up-to-date sequence databases and analysis programs, and powerful communication tools for rapidly exchanging information with colleagues:

• The Core Library, consisting of nine programs that manipulate and analyze nucleic acid and protein sequence data, plus additional programs contributed by the BIONET community.
• The Database Library, containing existing databases of nucleic acid and protein sequences, including GenBank™, the European Molecular Biology Laboratory (EMBL) database, the National Biomedical Research Foundation (NBRF) library of protein sequences, VectorBank™, and Cold Spring Harbor Restriction Enzyme database.
• The System and Programming Support Library, providing tools for program development, including: programming languages (Fortran, C, Pascal, BASIC, MAINSAIL, and Interlisp) and system utility programs (MLAB, EMACS, TVEDIT and SCRIBE™); facilities for electronic mail and electronic bulletin boards; KERMIT and MODEM for file transfer; the UNINET™ telecommunications network.

The BIONET resource will admit researchers from academic and non-profit institutions who can demonstrate that they are supported by governmental, philanthropic, or unrestricted institutional funds and that their research can be assisted by Resource facilities. The BIONET staff will consider applications funded from proprietary or restricted sources, and make recommendations to its National Advisory Committee, which will make final decisions on all access to the Resource.

There is an annual subscription fee of $400 to cover telecommunication costs. If you would like more information about the BIONET Resource and an application form, please fill in and mail the attached card.

BIONET has also instituted a program of Satellite Resources, whereby investigators can run BIONET software on local DEC 20™, VAX™, or SUN™ computers. Check the box on the return card to obtain more information.

Dr., Mr., Ms.:
Position/Title:
Company/Institution:
Address:
City:            State:            ZIP:
Telephone:
(Please print clearly. This will be used as your mailing label.)

[ ] Check this box to obtain more information on BIONET Satellite Resources

PLACE STAMP HERE

BIONET™
c/o IntelliGenetics, Inc.
An IntelliCorp Company
1975 El Camino Real West
Mountain View, California 94040-2216