CELL-FREE PROTEIN SYNTHESIS AND THE GENETIC CODE

by J. Heinrich Matthaei and Marshall W. Nirenberg
National Institutes of Health, Bethesda, Md.

Introduction

Biochemistry is still largely in its analytical stage; people attempt to get exact knowledge of single reactions occurring in the intact cell. Therefore, they need simplified systems, containing if possible only the compounds involved in the reaction under investigation. However, since protein synthesis involves a whole variety of chemical reactions, biochemists had to start working with crude systems, including extracts from Escherichia coli (a bacterium living in our intestines). These extracts can be obtained by grinding the fresh cells with aluminum oxide, followed by extraction with buffer solution and removal of all remaining cells, protoplasts and debris by centrifugation. One may call these extracts a "suspension of ribosomes* in a solution of enzymes and nucleic acids." All smaller molecules are thoroughly removed by dialysis, so that the compounds necessary for protein synthesis can be added in defined amounts: 20 amino acids as the building blocks of proteins, ATP and GTP as energy sources for the reactions, and salts for a certain ionic environment. Mercaptoethanol serves as a stabilizer which allows extracts to be prepared and stored in the frozen state without loss of the activities required. The incorporation is measured by using C14-amino acids and counting radioactivity in the protein precipitated with trichloroacetic acid.

* Ultramicroscopic particles consisting of ribonucleic acids, very basic proteins and several enzyme-proteins.

Survey of processes involved in protein synthesis

All the major processes involved in protein synthesis seem to occur in this crude system (see fig. 1):

1.
The activation of the amino acids, including an energy transfer from ATP (Lipmann), a liberation of pyrophosphate (Keller and Hoagland) and esterification of the amino acid to its species of transfer-ribonucleic acid (RNA) (Holley, Lipmann, Zachau, Hoagland, and others).

2. The synthesis of RNA from 4 kinds of nucleoside-triphosphates, complementary in its bases to the deoxyribonucleic acid (DNA) strand of the gene (Hurwitz, Weiss, Stephens, and others). This messenger-RNA (postulated by Jacob and Monod) carries, by means of the sequence of its 4 species of nucleotides, the information from the gene to the factories of proteins; there, it determines the specific sequence of some 20 different amino acids within long polypeptide chains of up to several hundred units. The demonstration of an inhibition of amino acid incorporation into protein by a DNA-destroying enzyme (DNAase) by Tissières, Novelli, and Matthaei and Nirenberg indicated that "messenger-RNA" might be produced in E. coli extracts. Further experiments with C14-labeled nucleotides, done in collaboration with Dr. R. Roberts, have shown that a messenger-like RNA is only made in the absence of DNAase. However, the direct evidence, a production of specific "messengers" upon the addition of the DNA from individual genes, is not established at the moment.

3. The complex-formation between messenger-RNA and some stage in the development of a ribosome remains to be studied. Isotope-labeled synthetic messenger-RNA, like polyuridylic acid, which we found to be highly active in stimulating amino acid incorporation, will be very useful for such investigations on the fate of the "messenger".

4. The amino acid-charged transfer-RNA is assumed to recognize and bind to specific places on "messenger"-RNA by the classical base-pairing mechanism which was discovered in DNA by Watson and Crick (A pairs with U, G with C).
Thus, transfer-RNA carries, as an adaptor, specified amino acids to their proper places on messenger-RNA, so that the amino acids can be linked, in correct sequence, into protein.

5. Peptide bonds are formed between the amino acids that are sequentially arranged along the messenger-RNA strand. The steps involved are unknown and are being investigated in several laboratories. Here again, the use of polyuridylic acid, coding specifically for phenylalanine, is very helpful, since it allows an enormous reduction of the precursors required for a model synthesis of an extremely simple polypeptide chain.

The assay for "messenger"-RNA

In order to make all protein synthesis in our system dependent upon the addition of informational or "messenger"-RNA, we destroyed the DNA with DNAase and incubated the E. coli extracts until all of the endogenous "messenger"-RNA had been inactivated in some way. This treatment gave us the first "assay" system for "messenger"-RNA. This "assay-system" seemed to copy informational RNA extracted from any organism: from E. coli as well as from yeast, tobacco mosaic virus (TMV), or ascites tumors of mice. The synthetic RNA polyuridylic acid directed 100 times as much amino acid into protein as had ever been observed in a cell-free system. This polymer contains only uridylic acid, one of the 4 kinds of nucleotides found generally in ribonucleic acids, and leads to the polymerization of only one of the 20 protein amino acids, phenylalanine. We showed that transfer-RNA carries this amino acid towards polyphenylalanine synthesis. This discovery opened the way towards the deciphering of the genetic code. The further production and use of synthetic polynucleotides let us find out which groups of nucleotides direct the other amino acids, by means of their transfer-RNA adaptors, into their proper places in the specific proteins.
When we had enzymatically synthesized RNA containing two, three or four different species of nucleotides in a random sequence, we could determine which amino acids would be "coded" by certain combinations of nucleic acid bases: U, C, A or G. The results so far definitely established are seen in Table 1. Ochoa and collaborators have reported similar findings.

TABLE 1

Amino acid        Code word determined
Alanine           UCG...
Arginine          UCG...
Cysteine          UUG...
Glutamic acid     UAG...
Glycine           UGG...
Isoleucine        UUA...
Leucine           UUG..., [?]UG...
Lysine            UAA... (?)
Methionine        UAG...
Phenylalanine     UUU...
Proline           UCC...
Serine            UUC... + [?]UG...
Tryptophan        UGG...
Tyrosine          UUA...
Valine            UUG...

The number of nucleotides per coding unit, but not the sequence, was determined.

For this purpose, we expressed the amount of each amino acid incorporated in percent of the phenylalanine which was directed into protein by the same U-containing polynucleotide. We also calculated the probability of any triplet of nucleotides in percent of UUU, coding presumably for phenylalanine. This must be done on the basis of the quantitatively determined nucleotide composition of the polynucleotides used. So we could correlate the observed incorporation of any amino acid to the triplet with the best-fitting statistically expected frequency. These determinations were done with many polynucleotides of varied ratios between the composing nucleotides and eventually led to the same results. On the basis of these calculations, however, we could not decide whether the coding units might contain uridylic acid residues in addition to those specified. If the number of letters per code word were larger than three, a proportional increase in the individual number of words coding for the amino acids should be expected (= more degeneracy). The triplet code is still not only the simplest, but also the most likely and experimentally uncontradicted concept.
This conclusion comes from both biochemical and genetic (Crick) evidence. The use of different methods in synthesizing RNA of well-defined nucleotide sequences should allow us in the near future to get direct evidence for the number and sequence of nucleotides in the code words. Then, we might also find other coding units possibly existing in addition to the ones already determined for certain amino acids. The code could be more degenerate. The high proportion which U takes in the coding units determined thus far may disappear when polynucleotides other than random ones are used, which may possibly show U-less code words in addition. The present selection of partially known code words may be just the result of certain limitations inherent in the methods used.

Single-strandedness of coding units

Poly-A base-pairs and forms double- and triple-stranded helices with poly-U. In this manner it totally inactivates poly-U added to our E. coli system. Poly-C neither base-pairs with nor inactivates poly-U. This may be a model for "repression" of the synthesis of certain proteins on the level of RNA. Such repressions occur in cells, and if their mechanism is disturbed, uncontrolled synthesis of proteins might result.

Universality of the code?

These determinations of RNA coding units in a bacterial system would be more significant if different organisms used the same set of coding units, and if each unit were correlated in every one to the same amino acid. The genetic code would then be "universal". This universality is favored by already existing observations:

1. Mutations of the TMV-protein, studied by Wittmann in Tübingen and Tsugita and Fraenkel-Conrat in Berkeley, resulted after treatment of the TMV-RNA with nitrous acid. There occur only two mutagenic transitions of nucleic acid bases by oxidative deamination: U replaces C, or G replaces A, in one single place somewhere along the RNA-chain.
As a consequence, mutants occur in which single amino acids replace certain other amino acids found in TMV-protein of the wild type. 12 out of 14 different types of amino acid replacements can be explained by the base compositions determined with random polynucleotides in the E. coli system. Chance should have led to less than 33% agreement.

2. Another approach, taken first by Lipmann and v. Ehrenstein, is to use amino acid-charged transfer-RNA from E. coli and let it deliver its amino acids in a cell-free system from another organism, actually from rabbit reticulocytes. The amino acids became incorporated into the proper places in rabbit hemoglobin. Thus, at least part of the code seems to be universal.

Further experiments, being done in many laboratories at the moment, should finally answer this question. The general nature of these experiments is to put either messengers for the formation of a specific protein or amino acid-charged transfer-RNA into a cell-free system prepared from another organism. If the code is universal, transfer-RNA from one species and messenger from another have to fit for making correct amino acid sequences.

A fundamental concept, the transfer of inherited information being stored in certain nucleotide sequences for the ultimate translation into the broad variety of functioning proteins, has found its final experimental proof. Certain findings made during our work on the coding problem have promoted the research of many laboratories on various other processes involved in protein synthesis.

SUMMARY

1. The base composition of coding units for 15 amino acids was determined by means of random polynucleotides of different base composition.
2. The code is degenerate at least for the amino acids leucine and serine.
3. The informational part of RNA appeared to be single-stranded.
4. At least part of the code seems to be universal.
5. Important features of the code are still unknown: the total number and the sequence of nucleotides in the coding units, the amount of degeneracy, and the polarity and chemical nature of "starting points", from which the messages apparently (Crick) are read off.
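
The statistical step used to assign base compositions to amino acids (expressing the expected frequency of each triplet in a random polynucleotide in percent of UUU, then matching it to the observed incorporation in percent of phenylalanine) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the procedure actually used: it assumes a strictly random sequence in which the probability of a triplet is the product of the fractions of its bases, and the function names and the 5:1 U:G composition are hypothetical, chosen for illustration.

```python
from itertools import combinations_with_replacement
from math import prod


def triplet_percent_of_uuu(triplet, base_fractions):
    """Expected frequency of one triplet, in percent of the UUU frequency.

    Assumes a random polynucleotide: the probability of a triplet is the
    product of the mole fractions of its three bases.
    """
    p_triplet = prod(base_fractions[base] for base in triplet)
    p_uuu = base_fractions["U"] ** 3
    return 100.0 * p_triplet / p_uuu


def best_fitting_composition(observed_percent, base_fractions):
    """Triplet composition (order ignored) whose expected frequency lies
    closest to the observed incorporation, given in percent of phenylalanine."""
    candidates = combinations_with_replacement(sorted(base_fractions), 3)
    best = min(
        candidates,
        key=lambda t: abs(triplet_percent_of_uuu(t, base_fractions) - observed_percent),
    )
    return "".join(best)


# Hypothetical poly-UG with a U:G ratio of 5:1 (illustrative values only).
poly_ug = {"U": 5 / 6, "G": 1 / 6}

for triplet in ("UUU", "UUG", "UGG", "GGG"):
    print(triplet, round(triplet_percent_of_uuu(triplet, poly_ug), 1))

# An amino acid incorporated at about 18 percent of phenylalanine fits a
# composition of two U and one G best; the letters are returned in
# alphabetical order because only composition, not sequence, is inferred.
print(best_fitting_composition(18.0, poly_ug))
```

With this composition, a triplet containing two U and one G is expected at about 20 percent of UUU, and one containing one U and two G at about 4 percent. As in the text, such a fit yields only the base composition of a coding unit, never the order of its bases.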