ON THE CODING OF GENETIC INFORMATION

By M. W. Nirenberg, O. W. Jones, P. Leder, B. F. C. Clark, W. S. Sly and S. Pestka, National Institutes of Health, Bethesda, Md.

In Press, Cold Spring Harbor Symposium, 1963

The process of expressing genetic information by aligning amino acids in proper sequence during protein synthesis usually requires the RNA polymerase catalyzed synthesis of a strand of RNA complementary to DNA. Recent experiments reported at this Symposium and elsewhere suggest that mRNA (messenger RNA) synthesized in vivo is complementary to only one of the two strands of DNA (Robison and Guild, 1963; Marmur, 1963; Spiegelman, 1963; Wood and Berg, 1963). The mRNA becomes bound to ribosomes, perhaps forming a polysomal aggregate, and the amino acids may be carried to these sites and ordered in correct sequence by specific transfer RNA species. It is possible that mRNA codewords are read from a fixed point by nucleotide sequences in transfer RNA complementary to those in mRNA codewords. Thus coding errors during protein synthesis may be minimized by the requirement for correct recognition at three successive steps; that is, at the DNA-mRNA, mRNA-transfer RNA (or other intermediate), and amino acid-transfer RNA-activating enzyme levels. Little is known about the mechanisms which impart specificity at the last two steps.

Some Factors Influencing the Messenger Efficiency of Polynucleotides

The effects of base composition, catalytic ability, molecular weight and secondary structure upon the messenger activity of synthetic polynucleotides will be considered at this time.

The messenger activity of synthetic polynucleotides may be related to their molecular weight. In E. coli extracts, poly U containing more than 100 uridylic acid residues per chain has greater template activity than smaller chains (Matthaei et al, 1962), but oligo A fractions containing as few as 9-10 adenylic acid residues per chain recently have been found by Jones et al.
(1963) to direct polylysine synthesis. Also, in yeast extracts oligo U [illegible]. Since oligonucleotides may be degraded by nucleases more rapidly than polynucleotides and thus appear to be less efficient as templates for protein synthesis, RNA chain-length must be considered when comparing template activities of different RNA fractions.

Also, secondary structure of RNA greatly influences its messenger activity. When poly U is mixed with poly A, double- and triple-stranded helices are formed which are completely inactive in directing polyphenylalanine synthesis (Nirenberg and Matthaei, 1961). Oligo A also forms helices with poly U, and the extent of inhibition of polyphenylalanine synthesis can be correlated with oligo A chain-length and oligo A-poly U helix stability (Nirenberg et al, 1963). In addition, Singer et al (1963) have investigated a series of copolymers containing varying amounts of U and G and have found that guanine-rich polymers containing a high degree of ordered secondary structure (perhaps due to G-G interactions) also are inactive as templates for protein synthesis. These results suggest that RNA with a high proportion of helical structure may have little template activity for protein synthesis. Recent experiments have shown that poly U-poly A helices do not bind to ribosomes, and for this reason may be unable to direct protein synthesis (Cukier and Nirenberg, unpublished results). It also is possible that small, localized areas of ordered structure may serve as periods in protein synthesis.

It is difficult to compare directly the messenger efficiencies of different polynucleotide preparations because the efficiency is modified by molecular size and secondary structure.
However, if the average chain length and secondary structure of different RNA preparations are assumed to be approximately equal, the data of Table 1 suggest that nucleotide content may not influence greatly the overall template efficiency of mRNA. Poly U, poly UC, poly ACG and poly UACG contain 1, 8, 27, and 64 triplets respectively, and the preparations of these polynucleotides shown in Table 1 have been found to direct 1, 4, 9, and 10 amino acids, respectively, into protein. The essential point is that approximately the same total quantity of amino acids was directed into protein by each polynucleotide. Although these data must be interpreted with care because the same factor may not limit the incorporation rate of each amino acid, they suggest that the polynucleotide preparations may have approximately equal template efficiencies and that most nucleotide sequences may be able to code for amino acids. Although nonsense sequences may exist, thus far none have been demonstrated definitively.

Polynucleotides containing all base combinations now have been used to direct protein synthesis in E. coli extracts. A qualitative summary of these data is presented in Table 2. Only those polynucleotides containing the minimum bases necessary to direct an amino acid into protein are shown. For example, phenylalanine is directed into protein by poly U and other U containing polymers; however, since other bases are not required, phenylalanine is listed only under poly U.

Poly U, poly A, and poly C direct phenylalanine, lysine and proline, respectively, into protein. Polylysine synthesized in E. coli extracts under the direction of poly A has been found to contain 3-15 lysine residues per chain (Jones, Yaron, Sober, Heppel and Nirenberg, unpublished results). No messenger activity has been demonstrated for poly G (Matthaei et al, 1962), but the highly ordered structure of poly G might mask template activity.
However, a polynucleotide composed only of hypoxanthine (poly I), with less secondary structure than poly G, still has not been found to direct amino acids into protein. Since hypoxanthine can replace G in RNA codewords, the 2-amino group of G does not appear to be essential for coding amino acids (Basilio et al, 1962; Nirenberg and Jones, unpublished data).

Each polynucleotide composed of 2 different bases has 8 triplets, but no polymer has been found to direct more than 6 different amino acids into protein. Poly UC is unique in that it codes for only 4 amino acids, even though all UC triplets appear to function as codewords (see Coding Ratio section). It is important to note also that polynucleotides containing only two different bases direct with great specificity almost all amino acids into protein. These findings undoubtedly reflect basic molecular characteristics of both the recognition process and the general nature of the code.

The Coding Ratio

A series of poly AC and poly UC preparations with different proportions of bases were synthesized and their activities in stimulating cell-free amino acid incorporation into protein were determined. As shown in Fig. 1, poly AC directs the incorporation into protein of proline, histidine, threonine, asparagine, glutamine and lysine at linear rates for 15-20 minutes. Reactions were terminated after 10 minutes of incubation, while the rates of incorporation were still linear.

In Table 3 is presented an example of the data obtained for each of the five poly AC preparations tested. The theoretical proportions of the four doublet and eight triplet permutations expected in randomly-ordered poly AC (containing, by analysis, 47 percent A and 53 percent C) are shown in the first and second columns respectively. In the third and fourth columns are shown the µµmoles of each C14-amino acid directed into protein by this polymer.
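The theoretical doublet and triplet frequencies for a randomly-ordered polymer can be sketched as below. This is only an illustration of the arithmetic given in the legend of Table 3 (the product of the base fractions, times 100); the function name `codeword_frequencies` and the dictionary layout are assumptions introduced here, not part of the original work.

```python
from itertools import product

def codeword_frequencies(base_fractions, word_length):
    """Expected percent frequency of each ordered word in a random copolymer.

    Assumes bases are incorporated independently, so the frequency of a word
    is the product of the mole fractions of its bases.
    """
    freqs = {}
    for word in product(sorted(base_fractions), repeat=word_length):
        p = 1.0
        for base in word:
            p *= base_fractions[base]
        freqs["".join(word)] = 100.0 * p
    return freqs

# Poly AC preparation described in the text: 47 percent A, 53 percent C.
poly_ac = {"A": 0.47, "C": 0.53}
doublets = codeword_frequencies(poly_ac, 2)
triplets = codeword_frequencies(poly_ac, 3)

for word in sorted(doublets):
    print(f"{word}: {doublets[word]:.1f} percent")
for word in sorted(triplets):
    print(f"{word}: {triplets[word]:.1f} percent")
```

With these base fractions the sketch gives 10.4 percent for AAA and 24.9 percent for the doublet CA, the two values worked out in the legend of Table 3.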
A total of 1685 µµmoles of amino acids were directed into protein, and the relative proportions of each amino acid incorporated, in percent, are shown in the last column. The 4 doublet permutations do not contain enough specific information to code for the 6 amino acids incorporated, whereas the information content of the 8 triplet words is adequate. The percent incorporation of lysine, asparagine, glutamine and histidine agrees well with triplet codeword frequencies, but not with doublet frequencies. If all triplets were read, some amino acids would respond to 2 or more codewords, for 6 amino acids would then be coded by 8 words. In such cases, the sum of the triplet frequencies would have to be compared with the corresponding amino acid incorporation data. For example, if CAA and CCA both coded for one amino acid, the sum of their frequencies is 24.9 percent, which cannot be distinguished from the frequency of the doublet CA (also 24.9 percent). Therefore this experimental approach may allow determination of the coding ratio for some, but not all, amino acids.

Analysis of a series of polynucleotides with varying base-ratios permits comparisons to be made with greater accuracy. The expected statistical relationships between codeword frequency and polynucleotide base-ratio are presented graphically in Figs. 2 and 3. Theoretical frequencies in percent of doublet and triplet codewords are shown on the ordinate and the base-ratio is shown on the abscissa. Nucleotide sequence is arbitrary, and each curve represents only one of the three possible sequence permutations. As noted before, the sum of the frequencies of the triplets AAC and ACC equals the frequency of the doublet AC. Thus the AC curve represents either the doublet AC, or the sum of the two triplets AAC plus ACC. Also shown are the observed C14-amino acid incorporation data. Each point represents a different poly AC preparation with the indicated base-ratio. As shown in Fig.
2, the observed incorporation of C14-histidine agrees well with the theoretical frequency of the triplet ACC and differs markedly from both the AAC triplet and AC doublet curves. The data also demonstrate that the observed incorporations of both C14-asparagine and C14-glutamine agree well with the frequencies of AAC triplets. In contrast, the incorporation of C14-threonine is similar to the expected frequencies of either the doublet AC, or the two triplets, AAC plus ACC. Therefore, threonine appears to be coded either by a doublet or by two triplets, and it is not possible to differentiate between these alternatives on the basis of these data.

In Fig. 3 are presented the template activities of poly AC preparations for C14-proline and C14-lysine. The experimentally obtained incorporation data indicate that proline is coded either by the doublet CC or by the two triplets CCC and CCA. C14-lysine appears to be coded by the triplet AAA.

Amino Acid Incorporation Directed by Poly UC. The data of Fig. 4 show that proline is directed into protein either by the doublet CC or by the sum of the two triplets CCC and CCU. C14-phenylalanine appears to be coded either by the doublet UU or by the two triplets UUU and UUC. It is important to note that if the codewords corresponding to these amino acids are triplets, both CCC and CCU would code for proline and both UUU and UUC would code for phenylalanine.

In Fig. 5 are shown the poly UC-directed serine and leucine incorporation data. Both serine and leucine appear to be coded either by the doublet UC, or by the two triplets UUC and UCC. Coding of serine or of leucine by one, rather than 2 triplets is not indicated.
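The reason a doublet cannot be told apart from the sum of two triplets follows directly from random ordering. As a brief sketch, write $p_U$ and $p_C$ for the mole fractions of U and C in a poly UC preparation (symbols introduced here for illustration only, with $p_U + p_C = 1$):

```latex
\begin{align*}
f(\mathrm{UC}) &= p_U\,p_C \\
               &= p_U\,p_C\,(p_U + p_C) \\
               &= p_U^{2}\,p_C + p_U\,p_C^{2} \\
               &= f(\mathrm{UUC}) + f(\mathrm{UCC})
\end{align*}
```

The identity holds at every base-ratio, so the doublet curve and the summed-triplet curve coincide across a whole series of preparations; only a single triplet such as UUC or UCC gives a curve of different shape.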
It is important to note that if serine is coded by triplets, one triplet would have to contain 2 U residues and the other 2 C residues. Triplet words for leucine also would contain either 2 U or 2 C residues.

These experiments strongly suggest that histidine, asparagine, glutamine and lysine are coded by triplet words and that the RNA code cannot be composed only of doublets. Threonine, proline, phenylalanine, serine and leucine were found to be coded either by multiple triplets or by doublets. These data are summarized in Table 4. A mixed doublet-triplet code cannot be excluded on the basis of the available data; however, a uniform code containing only triplets would appear more probable.

The Current Codeword Dictionary. Assuming for the present that all amino acids are coded by triplets, current approximations of RNA codewords may be summarized as shown in Table 5. Nucleotide sequence is arbitrary. Fifty of the 64 possible triplets have been assigned. Almost all amino acids can be coded by polynucleotides containing 2 different bases. Since polynucleotides containing 3 bases direct protein synthesis as efficiently as polymers containing only 2 bases, it seems probable that most 3 base words are recognized.
Tentative assignments are given for such words. It seems clear that most amino acids are coded by multiple words. Furthermore, multiple words corresponding to one amino acid often differ in base composition by only 1 nucleotide. These observations also suggest that nucleotide sequences in multiple words often may be identical. A triplet code may be constructed wherein correct hydrogen bonding between 2 out of 3 nucleotide pairs may, in some cases, suffice for coding, or alternatively, a base at one position in the triplet sometimes may pair optionally and correctly with 2 or more bases. It should be noted that a triplet code of this type in some respects would bear a superficial resemblance to a doublet code and would be in accord with all of the data available.

The coding data obtained thus far clearly indicate that most nucleotide sequences can code for amino acids with great specificity. Weisblum et al (1962) have reported that multiple species of leucine transfer RNA recognize different codewords in synthetic polynucleotides; however, additional data presented at this symposium by Benzer and by von Ehrenstein and Gonano suggest that codeword specificity in directing leucine incorporation may be greater with synthetic polynucleotides than with natural mRNA. It is important to emphasize the possibility that randomly-ordered synthetic polynucleotides may test the cell's potential to recognize codewords, and that the entire potential may not be utilized in vivo, except perhaps during mutation. Thus mRNA synthesized by a cell may not contain as many codewords as randomly-ordered polynucleotides.
[illegible] or have similar structure. For example, phenylalanine, tyrosine and tryptophan are derived from shikimic acid, and isoleucine, valine and leucine are synthesized from [illegible]. RNA codewords corresponding to these amino acids [illegible]. Such comparisons suggest that a family of amino acids may [illegible] a family of codewords whose members contain similar bases. Although all amino acids [illegible] pattern, enough additional examples may be cited to warrant the suggestion that such relationships reflect either the evolutionary development of the code, or the recognition of nucleotides in codewords by amino acids. The latter has been proposed by Woese (1963) and also is discussed by Weinstein in this symposium.

OLIGODEOXYTHYMIDYLATE DIRECTED POLYLYSINE SYNTHESIS

The chemical synthesis of oligodeoxynucleotides by the method of Khorana and his associates (1961, 1962) and the demonstration of an oligodeoxynucleotide-dependent synthesis of polyribonucleotides, catalyzed by RNA polymerase (Furth et al, 1961; Stevens, 1961; Chamberlain and Berg, 1962; Falaschi et al, 1963), provided an opportunity to study their ability to stimulate cell-free amino acid incorporation. Since poly A serves as a template for polylysine synthesis (Gardner et al, 1962), oligo dT (oligodeoxythymidylate) has been used to direct poly A, and subsequent polylysine synthesis, as follows:

1) ATP  --(oligo dT, RNA polymerase)-->  Poly A + PP

2) Lysine  --(poly A, E. coli extracts, etc.)-->  Polylysine

In addition, natural DNA and poly U have been shown to direct polylysine synthesis.

Poly A was synthesized in RNA polymerase-oligo dT reaction mixtures (stage I) as described in the legend accompanying Fig. 6, and then components supporting amino acid incorporation into protein (stage II) as in the legend of Fig.
7, were added. After further incubation, incorporation of C14-lysine into polylysine was determined by precipitation with a TCA-tungstate solution (Gardner et al, 1962).

The data of Fig. 6 show that C14-AMP incorporation was dependent upon the addition of oligo dT13-14 (13-14 nucleotides per chain) to stage I reaction mixtures, and that C14-AMP incorporation was proportional to the [illegible].

The average chain length of the C14-product synthesized in the presence of oligo dT13-14 was determined by deproteinizing the reaction mixtures, removing the C14-ATP by paper chromatography, hydrolyzing the C14-product in 0.2 N KOH and separating the nucleotides by paper chromatography. The radioactivity of adenosine, adenosine-3'(2'),5'-diphosphate and adenosine-3'(2')-monophosphate was found to be 339, 308 and 22,900 counts per minute, respectively. Thus, oligo dT13-14 stimulated the synthesis of poly A of average chain length of approximately 70 pA (adenylate) residues. These data confirm similar results obtained by Furth et al (1961) and Falaschi et al (1963). Falaschi et al (1963) also demonstrated that oligo dT chains are not elongated by the addition of (pA) residues to the free 3'-hydroxyl ends of oligo dT chains, and have obtained evidence which suggests that oligodeoxynucleotides serve as templates rather than primers. Although our RNA polymerase preparations were purified 100-150 fold (Chamberlain and Berg, 1962), we have detected unprimed nucleotide incorporation under other conditions. Further enzyme purification will be necessary to determine unequivocally whether oligodeoxynucleotides function only as templates in this system.

After incubating stage I reaction mixtures at 37°, stage II components were added as described in the legend accompanying Fig. 7.
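The chain-length estimate from the alkaline hydrolysis data can be reproduced with a short end-group calculation. This is a sketch under stated assumptions, not the authors' documented procedure: it assumes the product is uniformly labeled and that each chain yields exactly one terminal adenosine and one adenosine diphosphate on hydrolysis. The function name `average_chain_length` is hypothetical.

```python
def average_chain_length(nucleoside_cpm, diphosphate_cpm, monophosphate_cpm):
    """Estimate average chain length from hydrolysis products (counts/minute).

    Assumes uniform labeling and one residue of each end-group type per chain,
    so chain length = total counts / end-group counts.
    Returns two estimates: one from the nucleoside end group and one from the
    nucleoside diphosphate end group.
    """
    total = nucleoside_cpm + diphosphate_cpm + monophosphate_cpm
    return total / nucleoside_cpm, total / diphosphate_cpm

# Counts per minute reported in the text for adenosine, adenosine
# diphosphate and adenosine monophosphate, respectively.
from_nucleoside, from_diphosphate = average_chain_length(339, 308, 22900)
print(f"from adenosine end group:   {from_nucleoside:.1f} residues")
print(f"from diphosphate end group: {from_diphosphate:.1f} residues")
```

The two estimates bracket the value of approximately 70 residues quoted in the text; which end group was actually used for the published figure is not stated in the source.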
No increase in C14-lysine incorporation was found in the absence of oligo dT, whereas the addition of 1.2 mµmoles of (pdT) residues in oligo dT13-14 stimulated C14-lysine incorporation at a [illegible]

[illegible] (Chamberlain and Berg, 1962) [illegible] DNA-dependent synthesis of poly A from ATP (in the absence of UTP, GTP and CTP). Under these conditions, poly A synthesized under the direction of calf thymus DNA stimulated C14-lysine incorporation.

Paper chromatography of polylysine, as described in the legend accompanying Fig. 8, permits separation of lysine peptides of different chain lengths. Peptides containing approximately eleven or more lysine residues remain at the origin, whereas the mobilities of smaller peptides are as follows: lysine > di- > tri- > tetra- > penta- > hexa- > hepta- > octa- > nona- > deca-lysine. As shown in Fig. 8, most of the C14-product synthesized in the presence of oligo dT13-14 remained at the origin after chromatography. Digestion with trypsin converted the C14-product almost quantitatively to peptides which migrated with free, di-, tri-, and tetra-lysine characteristics.

In separate experiments the C14-product was eluted from the origin, [illegible]. Both aliquots were chromatographed as before. One C14-spot having the characteristic mobility of free lysine was found following acid hydrolysis.
After digestion with trypsin, C14-products with the expected mobilities of free, di-, tri-, and tetra-lysine were found. In addition, the C14-polylysine which remained at the origin was shown to contain carboxyl-terminal C14-lysine residues by a hydrazinolysis method (Akabori et al, 1952).

Effect of Molecular Weight upon the Activity of Oligo dT. Falaschi et al (1963) demonstrated that oligo dT chains containing less than 4 residues

REFERENCES

AKABORI, S., K. OHNO, and K. NARITA. 1952. On the hydrazinolysis of proteins and peptides. Bull. Chem. Soc. Japan, 25: 214-228.
BASILIO, C., A. J. WAHBA, P. LENGYEL, J. F. SPEYER and S. OCHOA. 1962. Synthetic polynucleotides and the amino acid code, V. Proc. Nat. Acad. Sci. U.S., 48: 613-616.
CHAMBERLAIN, M. and P. BERG. 1962. DNA directed synthesis of RNA by an enzyme from E. coli. Proc. Nat. Acad. Sci. U.S., 48: 81-94.
FALASCHI, A., J. ADLER and H. G. KHORANA. 1963. Chemically synthesized deoxypolynucleotides as templates for RNA polymerase. J. Biol. Chem. (in press).
FURTH, J. J., J. HURWITZ and M. GOLDMANN. 1961. The directing role of DNA in RNA synthesis. Biochem. Biophys. Res. Comm., 4: 362-367.
GARDNER, R. S., A. J. WAHBA, C. BASILIO, R. S. MILLER, P. LENGYEL, and J. F. SPEYER. 1962. Synthetic polynucleotides and the amino acid code, VII. Proc. Nat. Acad. Sci. U.S., 48: 2087-2093.
JONES, O. W., E. E. TOWNSEND, H. A. SOBER and L. A. HEPPEL. 1963. Effect of chain length on the template activity of polyribonucleotides. Biochem. (in press).
KHORANA, H. G. and J. P. VIZSOLYI. 1961. Studies on polynucleotides VIII. J. Am. Chem. Soc., 83: 675-685.
KHORANA, H. G. and J. P. VIZSOLYI. 1962. Studies on polynucleotides XIX. J. Am. Chem. Soc., 84: 414-619.
KRAKOW, J. S. and S. OCHOA. 1963. RNA polymerase of Azotobacter vinelandii, I. Proc. Nat. Acad. Sci. U.S., 49: 88-94.
MARCUS, L., R. K. BRETTHAUER, R. M. BOCK and H. O. HALVORSON. 1963. The effect of poly U size on the incorporation of phenylalanine in the cell-free yeast system. Proc. Nat. Acad. Sci. U.S. (in press).
MARMUR, J. 1963. Biological and physical properties of bacillus bacteriophage DNA. This Symposium.
MATTHAEI, J. H., O. W. JONES, R. G. MARTIN, and M. W. NIRENBERG. 1962. Characteristics and composition of RNA coding units. Proc. Nat. Acad. Sci. U.S., 48: 666-677.
NIRENBERG, M. W. and J. H. MATTHAEI. 1961. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc. Nat. Acad. Sci. U.S., 47: 1588-1602.
NIRENBERG, M. W., J. H. MATTHAEI, O. W. JONES, R. G. MARTIN and S. H. BARONDES. 1963. Approximation of the genetic code via cell-free protein synthesis directed by template RNA. Fed. Proc., 22: 55-61.
ROBISON, M. and W. R. GUILD. 1963. Evidence for message reading from a unique strand of DNA. Fed. Proc., 22: 643.
SINGER, M. F., O. W. JONES and M. W. NIRENBERG. 1963. The effect of secondary structure on the template activity of polyribonucleotides. Proc. Nat. Acad. Sci. U.S., 49: 392-399.
SPIEGELMAN, S. 1963. Some properties of the genetic transcription mechanism. This Symposium.
STEVENS, A. 1961. Net formation of polyribonucleotides with base compositions analogous to DNA. J. Biol. Chem., 236: PC43-PC45.
WALEY, S. G. and J. WATSON. 1953. The action of trypsin on polylysine. Biochem. J., 55: 328-337.
WAHBA, A. J., R. S. GARDNER, C. BASILIO, R. S. MILLER, J. F. SPEYER and P. LENGYEL. 1963. Synthetic polynucleotides and the amino acid code, VIII. Proc. Nat. Acad. Sci. U.S., 49: 116-122.
WEISS, S. B. and T. NAKAMOTO. 1961. On the participation of DNA in RNA synthesis. Proc. Nat. Acad. Sci. U.S., 47: 694-697.
WEISS, S. B. 1963. Properties of DNA-dependent synthesized RNA. pp. [illegible]. Informational Macromolecules, New York, Academic Press.
WEISBLUM, B., S. BENZER and R. W. HOLLEY. 1962. A physical basis for degeneracy in the amino acid code. Proc. Nat. Acad. Sci. U.S., 48: 1449-1454.
WOESE, C. 1963. The Genetic Code. I.C.S.U. Reviews (in press).
WOOD, W. B. and P. BERG. 1963. Studies on the "messenger" activity of RNA synthesized with RNA polymerase. This Symposium.

TABLE 1. TEMPLATE ACTIVITIES OF 1, 2, 3 AND 4 BASE POLYNUCLEOTIDES

POLYNUCLEOTIDE                        U       UC      ACG     UACG
BASE RATIO (MOLES-PERCENT)      U    100      47       -       56
                                C     -       53      32       13
                                A     -        -      46        6
                                G     -        -      22       25
POSSIBLE TRIPLETS                     1        8      27       64
TOTAL C14-AMINO ACID
INCORPORATION (mµmoles)       3.2[illegible]  2.91    2.22    5.09

C14-AMINO ACIDS DIRECTED INTO PROTEIN (entries as scanned; column positions not legible): PHE, PHE, LYS, ILEU, PRO, [illegible], ALA, MET, LYS, SER, ARG, CYSH, ALA, PRO, SER, VAL, ARG, THR, [illegible], GLU-NH2, TRY, GLU-NH2, ASP-NH2, TYR, ASP-NH2, HIS, PHE, HIS, PRO, LEU, SER

Legend for Table 1. See text for details. Codeword nucleotide sequences are arbitrary.

TABLE 2. SUMMARY OF CODING DATA

POLY     U      A      C      G
         PHE    LYS    PRO    -

POLY     UA     UC     UG     AC        AG     CG     UAG
         TYR    LEU    LEU    HIS       ARG    ARG    [illegible]
         LEU    SER    VAL    ASP-NH2   GLU    ALA    ASP

Further entries (column positions not legible): ILEU, CYSH, GLU-NH2, GLU-NH2, SER, ASP-NH2, TRY, THR, ASP*†

Legend for Table 2. Only those polynucleotides containing the minimal number of bases necessary to stimulate an amino acid into protein are shown. Amino acids coded by homopolynucleotides are not listed again under randomly-ordered polynucleotides.
* Predicted
† Reported by Wahba et al (1963)

TABLE 3. COMPARISON BETWEEN AMINO ACIDS INCORPORATED INTO PROTEIN AND RNA CODEWORD FREQUENCIES

Theoretical codeword frequency in poly AC containing 47% A and 53% C:

DOUBLETS   percent      TRIPLETS   percent
AA          22.1        AAA         10.4
AC          24.9        AAC         11.7
CA          24.9        ACA         11.7
CC          28.1        CAA         11.7
                        ACC         13.2
                        CCA         13.2
                        CAC         13.2
                        CCC         14.9
Total      100.0        Total      100.0

C14-amino acids incorporated into protein (entries in brackets are as scanned and uncertain):

C14-AMINO ACID    µµMOLES INCORPORATED    PERCENT
LYSINE            [383]                   [20.8]
ASPARAGINE        [192]                   [13.6]
GLUTAMINE         [257]                   [6.3]
THREONINE         [illegible]             26.3
HISTIDINE         [illegible]             [illegible]
PROLINE           550                     32.6
Total             1685                    100.0

Legend for Table 3. Each reaction mixture contained the following components in a final volume of 0.25 ml (several concentrations are illegible): Tris, pH 7.8; 0.01 M magnesium acetate; mercaptoethanol; ATP; potassium phosphoenolpyruvate; 5 µg of crystalline phosphoenolpyruvate kinase (Calif. Corp. Biochem. Research); C14-amino acid; and E. coli extract protein (Nirenberg and Matthaei, 1961). Reaction mixtures were incubated at 37° for 10 minutes. Protein precipitation, washing, and counting were performed as described by Nirenberg and Matthaei (1961). The theoretical frequencies, in percent, of doublets and triplets in polynucleotides were calculated as follows: the frequency of the triplet AAA in this poly AC preparation would be .47 x .47 x .47 x 100 = 10.4 percent. The doublet frequency for CA would be .47 x .53 x 100 = 24.9 percent. The µµmoles of each amino acid incorporated in the absence of polynucleotide were: lysine, 30; asparagine, 54; glutamine, 49; threonine, 40; histidine, 26; proline, 36.

TABLE 4. SUMMARY OF CODING RATIO DATA

                      CODEWORD*
C14-AMINO ACID    TRIPLET                 DOUBLET
HISTIDINE         ACC                     -
ASPARAGINE        CAA                     -
GLUTAMINE         AAC                     -
LYSINE            AAA                     -
THREONINE         CCA + ACA          or   AC
PROLINE           CCC + CAC + CUC    or   CC
PHENYLALANINE     UUU + UCU          or   UU
SERINE            CUU + CCU          or   CU
LEUCINE           UUC + UCC          or   UC

* Nucleotide sequences are arbitrary.

TABLE 5. SUMMARY OF RNA CODEWORDS

Amino acids listed: ALANINE, ARGININE, ASPARAGINE, ASPARTIC ACID, CYSTEINE, GLUTAMIC ACID, GLUTAMINE, GLYCINE, HISTIDINE, ISOLEUCINE, LEUCINE, LYSINE, METHIONINE, PHENYLALANINE, PROLINE, SERINE, THREONINE, TRYPTOPHAN, TYROSINE, VALINE

Codeword entries (as scanned; not legible): OCT ese oAe Slee ACA GUA GUS GAA BAG GES ACC DAD BUS bbe GA Boy Ces oe CAS GN AUYy Tey ues! AGH AUA CoA} aay? AGA AGS Agy' GEA Gas &&S Guu ecu UL CSA uA! BM SORE FORDE 2 ACG! wect Acg' Gaa!' anc’ Ago! C38 BoE CCA BCG te; SGA

* Arbitrary nucleotide sequence.
† Probable

TABLE 6. RELATIONSHIP BETWEEN AMINO ACIDS OF SIMILAR METABOLIC ORIGIN OR STRUCTURE AND THEIR RNA CODEWORDS

Column headings: AROMATIC AMINO ACIDS; DICARBOXYLIC ACIDS AND AMIDES; ILEU, VAL, LEU FAMILY

Entries (as scanned; not legible): FUE vuU ASP~Ri, BAD TLE WUA DUG ALS UAA 2¥R DUA ASP UG VAL Ou RY UGG GLU AUG ERG UuG ASB BUA uC £06 UY Aat

TABLE 7. RELATION BETWEEN OLIGO dT CHAIN LENGTH AND [illegible]

ADDITIONS                                        C14-AMP          C14-LYSINE
                                                 INCORPORATION    INCORPORATION
                                                 (mµmoles)        (mµmoles)
None                                             0.4              0.075
[illegible] mµmoles oligo dT [chain length illegible]    1.5      0.237
[illegible] mµmoles oligo dT [chain length illegible]    6.8      0.337
[illegible] mµmoles oligo dT [chain length illegible]    13.9     0.406

Legend for Table 7. The components of the reaction mixtures are [illegible] in the legends accompanying Fig. 6 and 7.

LEGENDS FOR FIGURES

Fig. 1. Rate of incorporation of C14-amino acids into protein directed by poly AC (base ratio: A, 47 percent and C, 53 percent). Reaction mixture components are described in the legend of Table 3.

Figs. 2-5. [Legends illegible; each refers to the legend of Table 3.]

Fig. 6. Characteristics of C14-poly A synthesis in RNA polymerase (stage I) reaction mixtures. In the figure on the left, reaction mixtures were incubated at 37° for 15 minutes, then were deproteinized and washed with 3 percent TCA at 3°. The symbols in the figure on the right represent the following: [symbol illegible], minus polymer; [symbol illegible], plus 1.2 mµmoles of base residues in oligo dT13-14. Each stage I reaction mixture contained the following in a final volume of 0.125 ml (molar concentrations are largely illegible): Tris, pH 7.6; MgCl2; MnCl2; mercaptoethanol; C14-ATP, tetralithium salt (Schwarz BioResearch, Inc.); and 20 µg E. coli RNA polymerase protein (20 units (Chamberlain and Berg, 1962)).

Fig. 7. Characteristics of oligo dT13-14 directed synthesis of C14-polylysine. The symbols represent the following: [symbol illegible], minus oligo dT; [symbol illegible], plus 1.2 mµmoles (pdT) residues in oligo dT13-14; [symbol illegible], plus 2.4 mµmoles (pdT) residues in oligo dT13-14; [symbol illegible], plus 4.8 mµmoles (pdT) residues in oligo dT13-14. Components of stage I reaction mixtures are as noted in the legend of Fig. 6. Stage II reaction mixtures contained, in 0.25 ml (molar concentrations are largely illegible): Tris, pH 7.8; MgCl2; magnesium acetate; MnCl2; mercaptoethanol; ATP; [illegible]; potassium phosphoenolpyruvate; 5 µg crystalline phosphoenolpyruvate kinase (Calif. Corp. Biochem. Research); [illegible] each of 19 amino acids; C14-lysine (Nuclear-Chicago Corp.) with specific radioactivity of [illegible]; 20 µg RNA polymerase protein (20 units (Chamberlain and Berg, 1962)); and 1.1 mg E. coli extract protein (Nirenberg and Matthaei, 1961). Stage I reaction mixtures were incubated at 37° for 15 minutes before the addition of stage II components.

Fig. 8. Chromatographic analysis of the C14-polylysine synthesized in stage I plus stage II reaction mixtures under the direction of oligo dT13-14 before and after tryptic digestion. The symbols represent the following: [symbol illegible], minus oligo dT; [symbol illegible], plus oligo dT13-14; [symbol illegible], plus oligo dT13-14 followed by tryptic digestion. Reactions were carried out as described in the legends of Figures 6 and 7. After incubating stage II reaction mixtures at 37° for 30 minutes, the reactions were terminated by the addition of 2 ml of cold 10% TCA. The supernatant solution was extracted three times with equal volumes of ether, concentrated in vacuo, and the residue was chromatographed on paper in a solvent similar to that described by Waley and Watson (1953) containing pyridine/n-butyl alcohol/acetic acid/water = [illegible] for 56 hours.

[Figure 1: µµmoles C14-amino acid incorporated into protein versus time (minutes); curves for threonine, proline, asparagine, lysine, histidine and glutamine.]

[Figure 2: percent codewords in poly AC versus percent C in poly AC; theoretical frequency of RNA codewords and observed frequency of amino acid incorporation; curves labeled AC or AAC + ACC, AAC; glutamine points shown.]

[Figure 3: percent codewords in poly AC versus percent C in poly AC; theoretical and observed frequencies; curves labeled CC or CCC + CCA, CCC, AA or AAA + AAC, AAA.]

[Figure 4: percent codewords in poly UC versus percent C in poly UC; theoretical and observed frequencies; curves labeled CC or CCC + CCU, CCC, UU; phenylalanine points shown.]

[Figure 5: percent codewords in poly UC versus percent C in poly UC; theoretical and observed frequencies; curves labeled UUC, UCC.]

[Figure 6: mµmoles C14-AMP incorporated; right panel versus minutes, plus and minus oligo dT13-14.]

[Figure 7: µµmoles C14-lysine incorporated versus minutes; curves for 4.8, 2.4 and 1.2 mµmoles oligo dT13-14 and minus oligo dT.]

[Figure 8: counts/minute C14-lysine versus chromatographic position (inches) of lysine peptides (polylysine, longer oligopeptides, penta- and smaller); curves for plus oligo dT and plus oligo dT plus trypsin.]