THE MOLECULAR BASIS OF EVOLUTION

Chapter 8
GENES AS DETERMINANTS OF PROTEIN STRUCTURE

Studies on the biochemical effects of mutations have given strong support to the notion that individual genes are concerned with the biosynthesis of individual proteins. In a large number of instances it has been possible to attribute the absence or modification of an enzyme to a single gene mutation. The study of such biochemical lesions, not only in microorganisms like Neurospora and E. coli but in man and other higher organisms as well, has led to the concept of a “one gene-one enzyme” relationship, which we have already introduced in Chapter 2.

The term “gene” as used in this context has, until quite recently, been employed to convey the purely abstract concept of a unit of heredity. It represented a quantum of genetic information that in some way controlled the biosynthesis of a single protein or, in more cautious terms, of some “functional unit.” The recent advances in the biochemistry of the chromosome and of DNA, and in the mapping of genetic “fine structure” of the sort we have discussed in relation to bacteriophage, now make it possible to speculate about gene action in chemical terms instead of formal abstraction. The investigations of S. Benzer, G. Streisinger, M. Demerec and his collaborators, G. Pontecorvo, and many others have indicated that the idea of a one-dimensional array of “genes,” divisible by genetic recombination, may very likely be extended down to molecular dimensions. Their results suggest that the word “pseudoallelism” needed to be invented only because of the difficulties of demonstrating extremely rare recombinations in unfavorable biological material.
If we accept the generalities of the “one gene-one enzyme” concept, and if we are willing to go along with the present trend of opinion on the role of DNA as the basic determinant of heredity, we must seriously consider the conclusion that the information which governs details of protein structure is present in the chemical structure of the DNA molecule. It is an undeniable temptation to suggest further that a point mutation is really just a very localized change in the sequence or the three-dimensional relationships within a polynucleotide chain and that such a localized change might reflect itself in the sequence and folding of the protein concerned. In spite of the fact that many investigators properly accept the generality as a working hypothesis, such speculations are, at present, mostly fancy with little fact. The pathway from gene structure to phenotypic protein may be a long and tortuous one, and we cannot rule out such possible complications as the combined action of several genes in the synthesis of a single protein or the involvement of cytoplasmic hereditary factors which might modify, or even initiate, steps in a biosynthetic pathway.

If this hypothesis is an approximately correct one, however, we should, as N. Horowitz has pointed out, be able to demonstrate mutations that lead to qualitative as well as quantitative changes in enzymes and other proteins. It should be possible, for example, to show that various mutations within a given protein-determining region of the genetic material of an organism can lead to “mutant” forms of a biologically active protein which exhibit varying degrees of functional adequacy. Mutations affecting portions of protein structure that are essential for function should be lethal ones, whereas those affecting less essential regions might either be undetected or “leaky,” to use the genetic patois.
In spite of the fact that hundreds of examples have been found of gene-protein relationships, it has been possible to demonstrate a correlation between the mutation of a single gene and the chemical and physical properties of a homogeneous protein molecule in only a few instances. Many of these positive correlations have emerged from studies on proteins of higher organisms for the simple reason that protein samples of sufficient purity are easier to come by with red cells, milk, and plasma than with microorganisms. However, the advantages offered by microorganisms in respect to genetic mapping have been a tremendous stimulus to gene-minded protein chemists, and it is likely that many of the major advances in this area will be made on material from such sources. If, for example, the protein whose biosynthesis is under the control of that region of genetic material in T4 bacteriophage so elegantly mapped by Benzer (see Chapter 4) could be identified and isolated in pure form, it is clear that a direct test, in enormous detail, could be made for the existence of a correspondence between “cistron” and protein. Such detail could never be achieved with human proteins because extensive gene mapping in man is limited by his lengthy generation time and his eugenic mores.

TABLE 14
Alterations in Proteins Attributable to Mutations

Protein                                 Species             Demonstrated or Possible Effects of Mutation
Hemoglobin (1, 2)                       Man                 Composition and charge
                                        Sheep               Charge
                                        Mouse               Charge
β-Lactoglobulin (1, 2)                  Cattle              Charge
Haptoglobin (2)                         Man                 Charge
Pantothenate-synthesizing enzyme (1)    E. coli             Thermostability
Tyrosinase (1)                          Neurospora crassa   Thermostability
Glutamic acid dehydrogenase (1)         Neurospora crassa   Reversible heat activation

For more detailed reference see:
(1) N. Horowitz, Federation Proc., 15, 818 (1956).
(2) D. Steinberg and E. Mihalyi, Ann. Rev. Biochem., 26, 373 (1957).
A partial list of those proteins for which gene-linked modification has been demonstrated is presented in Table 14. With one exception, human hemoglobin, the difference between the normal protein and that obtained from the mutant has been in electrophoretic mobility, heat stability, and serological behavior. The net charge, stability, and serology of a protein are, of course, quite distinctive characteristics, and the proteins in Table 14 which have been studied in respect to these parameters can almost certainly be assumed to exist in forms whose differences are related to allelomorphic genes. Nevertheless, small organic molecules, tightly bound to proteins, can modify charge, and polysaccharides or other haptenic substances may influence antigenicity. For such reasons, the case of human hemoglobin is a particularly favorable one, since for this protein the electrophoretic and solubility differences between mutant forms are attributable to actual modifications in amino acid sequence.

In 1949, L. Pauling, H. A. Itano, S. J. Singer, and I. C. Wells¹ made the important observation that the hemoglobin of sickle-cell anemics is electrophoretically abnormal and that in individuals with sickle-cell trait (an asymptomatic condition) a mixture of the abnormal sickle-cell and the normal forms could be demonstrated. Extensive study of the familial relationships of sickle-cell anemia has indicated that this frequently fatal disease is inherited in a Mendelian fashion. By an analysis of the genetic relationships between sickle-cell anemia and sickle-cell trait, J. V. Neel established that the production of the abnormal hemoglobin was due to the presence of a single mutant gene. Genetically, the anemic may be characterized as homozygous for the sickling gene and the individual with the trait as heterozygous. The studies of Pauling and Itano and their collaborators, together with the discovery by H. Hörlein and G. Weber²
of a congenital methemoglobinemia involving an abnormal globin component, stimulated the search for other genetically linked aberrations in hemoglobin synthesis. At the present writing a dozen or more types of abnormal hemoglobins which may be detected by their unusual physical properties are known. In addition, there are a number of clinical situations in which detection depends on hematologic examination but no changes in the physical properties of hemoglobin have been observed. Such abnormal individuals have microcytic red cells or cells showing some other deviation from the normal morphology of erythrocytes. These instances of inhibition of synthesis of normal hemoglobin are collectively named thalassemia, and Allison has proposed, on the basis of the observation that the locus controlling the thalassemia effect does not appear to be allelomorphic with the normal hemoglobin gene, Hbᴬ, that the locus for thalassemia be designated Th. The normal gene at this locus would then be termed Thᴺ and the thalassemia allele, Thᵀ.

Examples of the clinical nomenclature and genotypic designations for a number of abnormalities involving the hemoglobin molecule are given in Table 15. This compilation is taken from the excellent review by Itano to which the reader is referred for more detailed information.
For our present purposes it is sufficient to recognize, first, that some of the various abnormal hemoglobins (Hbᴬ, Hbˢ, etc.) are under the control of a series of genes which seem to be allelic (they are, perhaps, pseudoallelic) and, second, that certain other abnormalities, inclusively termed thalassemias (Thᴺ, Thᵀ, etc.), involve genetic abnormalities for which no physical or chemical reflection in the structure of the hemoglobin molecule has been observed and which appear to be associated with genetic loci different from the Hbᴬ locus.

TABLE 15
The Human Hemoglobins*

Hemoglobin    Method of Detection
A             Normal adult
…             …
S             Electrophoresis, solubility, tactoid formation
C             Electrophoresis
D             Electrophoresis and solubility
E             Electrophoresis
G             Electrophoresis
H             Electrophoresis
I             Electrophoresis
J             Electrophoresis
M             Spectrophotometry

Nomenclature of Syndromes Associated with Abnormalities in Hemoglobin Metabolism

                                                     Genotype
Condition                                    Hb Locus      Th Locus
Homozygous
  Normal                                     HbᴬHbᴬ
  Sickle-cell anemia                         HbˢHbˢ
  Hemoglobin C disease                       HbᶜHbᶜ
  Thalassemia major                          HbᵀʰHbᵀʰ
  Thalassemia major                                        ThᵀThᵀ
Heterozygous
  Sickle-cell trait                          HbᴬHbˢ
  Hemoglobin C trait                         HbᴬHbᶜ
  Sickle-cell hemoglobin C disease           HbˢHbᶜ
  Thalassemia minor                          HbᴬHbᵀʰ
  Thalassemia minor                                        ThᴺThᵀ
  Sickle-cell thalassemia disease            HbˢHbᵀʰ
  Hemoglobin C thalassemia disease           HbᶜHbᵀʰ
Doubly heterozygous
  Sickle-cell thalassemia disease            HbᴬHbˢ        ThᴺThᵀ
  Hemoglobin C thalassemia disease           HbᴬHbᶜ        ThᴺThᵀ

* From a review by H. Itano, Advances in Protein Chemistry, volume 12 (C. B. Anfinsen, M. L. Anson, K. Bailey, and J. T. Edsall, editors), Academic Press, p. 215, 1957.

Let us now examine what chemical data we have.
The differences observed by Pauling, Itano, and their colleagues in the electrophoretic mobility of normal and sickle-cell hemoglobin might be ascribed to modifications in the amino acid sequence leading to the introduction or deletion of charged side-chain groups. On the other hand, such charge differences might be apparent only and could reflect the manner of folding of the polypeptide chains of the protein to expose or to mask charged groups in response to configurational change. A direct test of these hypotheses has been made by V. Ingram,³ who has examined the details of sequence in the molecule (Figure 78) by means of the sensitive “fingerprinting” technique described in the previous chapter. His investigations have made it extremely likely that both sickle-cell hemoglobin and hemoglobin C differ from normal hemoglobin in only a single amino acid residue. The affected portion of the protein is shown in Figure 79. A glutamic acid residue in Hbᴬ has been replaced with valine and lysine, respectively, in Hbˢ and Hbᶜ. The corresponding changes in net charge per mole (plus 2 for Hbˢ and plus 4 for Hbᶜ, with respect to Hbᴬ) agree with those to be expected from the electrophoretic measurements, and no evidence has been obtained for other changes in sequence in the rest of the molecular structure of the protein.

We have here, then, a direct test of the proposition that a mutation in a specific genetic locus causes a specific change in the covalent structure of the phenotypic protein related to this locus. Indeed, Ingram’s experiments are a test with a vengeance. Not only do the allelic Mendelian genes Hbᴬ, Hbˢ, and Hbᶜ have to do with a very restricted aspect of structure, but they all appear to be related to the same aspect, namely the sequence at one unique point.
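The charge arithmetic behind the figures "plus 2" and "plus 4" can be checked with a minimal sketch. It rests on two assumptions that are not spelled out in the passage: round side-chain charges near neutral pH (glutamic acid -1, lysine +1, valine 0), and two copies of the affected peptide in each hemoglobin molecule. The function name and the small charge table are illustrative only.

```python
# Approximate side-chain charges near neutral pH (illustrative round values,
# assumed for this sketch rather than taken from the text).
SIDE_CHAIN_CHARGE = {"Glu": -1, "Lys": +1, "Val": 0}

# Assumption: the affected peptide occurs twice in each hemoglobin molecule.
COPIES_PER_MOLECULE = 2


def net_charge_change(original, replacement, copies=COPIES_PER_MOLECULE):
    """Change in net charge per molecule when `original` is replaced."""
    per_chain = SIDE_CHAIN_CHARGE[replacement] - SIDE_CHAIN_CHARGE[original]
    return copies * per_chain


if __name__ == "__main__":
    for name, residue in (("Hb S", "Val"), ("Hb C", "Lys")):
        change = net_charge_change("Glu", residue)
        print(f"{name}: {change:+d} relative to Hb A")
```

Under these assumptions a single glutamic acid to valine replacement removes one negative charge per chain, and a replacement by lysine both removes a negative charge and adds a positive one, which is why the second shift is twice the first.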
If the sequence of nucleotides in the polynucleotide chain of DNA determines polypeptide sequence, how can we explain the fact that these three genetically segregatable loci all influence the same position in the polypeptide?

Figure 78. “Fingerprints” of the peptides produced by digestion of normal hemoglobin (a) and sickle-cell hemoglobin (b) with trypsin. The “fingerprints” were obtained by a combination of electrophoresis and chromatography, more or less as described in Figures 71 and 72. The encircled areas in the figure show where the fingerprints differ significantly. From V. M. Ingram, Nature, 180, 326 (1957).

A particularly intriguing possibility for explaining Ingram’s results comes from a consideration of the theoretical model of Watson and Crick for DNA structure. The obligatory pairing of heterocyclic bases in this structure has, as we have discussed earlier, been suggested as a basis for the accurate self-duplication of DNA strands. The specific sequences of the bases in the complementary strands of the double helix have also been viewed as a set of coded genetic information which might serve as the fundamental template for protein synthesis. The most popular code form has been one based on “triplets,” in which various sets of three nucleotides correspond to a specific amino acid. Employing this idea, we may arbitrarily translate the sequence of amino acids in hemoglobin that differs in the three mutant forms into a corresponding nucleotide code as shown in Figure 80. The replacement of a single nucleotide with another within the critical trinucleotide sequence would give us the required change in code. (The reader will obviously not take all this too seriously. The most improbable hypotheses in science have turned out to be true, however, and this one certainly deserves some serious consideration for its novelty and coherence.)

One very interesting question is raised by the existence of three mutant forms of hemoglobin differing from one another in respect to a single “locus.” Why, with some 300 amino acid residues in a hemoglobin monomer to choose from, has the accident of mutation occurred, and been perpetuated, in the same place three times? The phenomenon is qualitatively reminiscent of the results obtained by Benzer in his analysis of mutants in the rII region of bacteriophage T4, where he observed that, out of many hundreds of mutant colonies selected, a disproportionately great number involved mutation in the same genetic locus, whereas others were modified only rarely. The nonrandom distribution of affected loci, both in the bacteriophage case, for which we have a good deal of genetic information, and for human hemoglobin, for which we unfortunately have very little, might mean that only certain mutations are “permissible” and that the degree of permissibility is slight in most of the genetic material. We might equally well suggest, however, that some unsuspected peculiarities of DNA structure favor the modification of some lengths of nucleotide sequence more than others. Most probably, the mutant hemoglobin genes have been preserved because of the selective advantage they have conferred on the affected individuals. (Sickle-cell anemia, for example, is correlated with decreased susceptibility to clinical malaria.)

Hb A   ... ↑ His⁺.Val.Leu.Leu.Thr.Pro.Glu⁻.Glu⁻.Lys⁺ ↑ ...
Hb S   ... ↑ His⁺.Val.Leu.Leu.Thr.Pro.Val.Glu⁻.Lys⁺ ↑ ...
Hb C   ... ↑ His⁺.Val.Leu.Leu.Thr.Pro.Lys⁺ ↑ Glu⁻.Lys⁺ ↑ ...

Figure 79. The differences in amino acid sequence between normal hemoglobin, sickle-cell hemoglobin, and hemoglobin C. The arrows indicate the points of attack by trypsin which have led to the production of the peptide fragments shown in the figure.
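The suggestion that one nucleotide replacement within a triplet could account for each mutant form can be made concrete with a small sketch. The three triplets used here (GAG for glutamic acid, GTG for valine, AAG for lysine) are assumptions brought in for illustration; they are not taken from the text, whose own translation into a nucleotide code is explicitly arbitrary. The function names are likewise hypothetical.

```python
BASES = "ACGT"

# Assumed triplets, supplied only for illustration; the text itself treats
# any such assignment as arbitrary.
ASSUMED_TRIPLETS = {"Glu": "GAG", "Val": "GTG", "Lys": "AAG"}


def differing_positions(a, b):
    """Return the positions (0-based) at which two triplets differ."""
    if len(a) != len(b):
        raise ValueError("triplets must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def single_base_neighbours(triplet):
    """List every triplet reachable by replacing exactly one base."""
    return [
        triplet[:i] + base + triplet[i + 1:]
        for i in range(len(triplet))
        for base in BASES
        if base != triplet[i]
    ]


if __name__ == "__main__":
    normal = ASSUMED_TRIPLETS["Glu"]
    for residue in ("Val", "Lys"):
        mutant = ASSUMED_TRIPLETS[residue]
        positions = differing_positions(normal, mutant)
        print(f"Glu ({normal}) -> {residue} ({mutant}): "
              f"differs at position(s) {positions}")
    print(len(single_base_neighbours(normal)), "single-base neighbours")
```

With these assumed triplets each mutant lies one replacement away from the normal triplet, at two different positions. A triplet of four possible bases has only nine single-replacement neighbours, so on any triplet scheme the number of residues that one point change can introduce at a given site is small.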
[Diagram: paired triplets of DNA bases aligned with the corresponding amino acid residues, drawn for hemoglobin A, hemoglobin S, and hemoglobin C.]

Figure 80. A hypothetical scheme showing how the structure of deoxyribonucleic acid might be related to the structures of hemoglobin A (normal hemoglobin), hemoglobin S (sickle-cell hemoglobin), and hemoglobin C. The diagram suggests a correspondence between triplets of purine and pyrimidine bases and individual amino acid residues. A change in the base sequence corresponding to the third amino acid from the top of the drawing could conceivably lead to the changes in code required for the modifications in sequence shown in Figure 79. The reader should be very much aware of the completely speculative nature of this diagram.

In the absence of further chemical data of the sort available for the human hemoglobins, it may be of value to examine in more detail some of the research now in progress which can be expected to settle some of the problems that we have posed. Many groups of investigators are busily engaged in attempts to isolate and characterize particular proteins from organisms that differ in genotype by a single mutation. We have already referred to the studies of Horowitz and Fling (Chapter 2) on the tyrosinases of Neurospora mutants in which differences in heat stability and activation energies of thermal inactivation were demonstrated. Rigid purification of these tyrosinases, and subsequent study of their chemical structure, may well lead to another situation like that of the hemoglobins for which the direct chemical consequences of mutation can be shown.
Others of the protein systems under investigation, listed in Table 14, also promise to be extremely informative, particularly those involving easily isolated proteins like the β-lactoglobulins of milk. Because of their flexibility as regards genetic analysis, however, the bacteria and bacteriophages are, at present, receiving the most concerted attention. For example, no less than three laboratories are in the midst of the particular problem of determining the effects of mutation in the h region of bacteriophage T2 on the chemical nature of the phage particle.

The host range (h) region of the genetic material of bacteriophage T2 determines whether or not a phage particle will adsorb to a specific bacterial cell host. Thus the wild-type phage, T2h⁺, will adsorb to and infect E. coli of the B strain but not of the B/2 strain, whereas h mutants will attack both B and B/2. Wild-type particles (h⁺), when grown on a Petri dish containing agar in which is uniformly distributed a mixture of B and B/2 E. coli, will therefore lyse only the B cells, and a turbid plaque will be formed. On the other hand, h mutants will attack both B and B/2 and a completely clear plaque will result. (It is convenient, in what follows, to think of an h⁺ mutation as a “defect” in the “normal” h region of the genetic strand.) This difference in phenotypic behavior can be made the basis for a quantitative estimate of the proportion of h and h⁺ particles and has been applied by Streisinger and Franklin⁴ for the location of various h mutants along the linear genetic map in a manner much like that employed by Benzer for the mapping of r mutants in bacteriophage T4.

Since use of the technique of fine-structure mapping in bacteriophage will become more and more common, it will be instructive to examine briefly the general approach to the mapping of the h region as an additional example.
An h-type phage, given an arbitrary strain designation, was plated on mixed B and B/2 cells as described above, and the turbid plaques were chosen as examples of reversions to the h⁺ genotype. In this way there was obtained a series of mutants of the h⁺ variety. Fourteen h⁺ mutants having low reversion indices* were then crossed with each other as well as with the original h⁺ strain by mixed infection of E. coli B. The progeny were examined (by the “turbid-or-clear” plaque test) for the relative proportion of h phenotypes that had formed through recombination (Figure 81).

* All the h⁺ mutants were examined for their propensity to revert spontaneously to the h phenotype, and those having a high “reversion index” were discarded since such mutants would have introduced technical difficulties in subsequent studies of the ability of pairs of h⁺ mutants to yield h phenotypes by genetic recombination.

[Diagram: two doubly infected cells, A (h₁⁺ × h₂⁺) and B (h₁⁺ × h₃⁺), with the two possible arrangements of the three loci, (a) and (b), marked with the separations x and 3x.] The “defective” h⁺ loci are encircled. During the formation of progeny a process analogous to crossing over takes place. The probability that recombination of h₁ and h₂ will occur in the doubly infected cell, A, is four times greater than the probability that h₁ and h₃ will recombine in B. The relative positions of h₁⁺, h₂⁺, and h₃⁺ might then be either (a) h₁⁺-h₃⁺-h₂⁺, with h₃⁺ lying between the other two, or (b) h₃⁺-h₁⁺-h₂⁺, with h₃⁺ lying outside them. The correct order, h₁⁺-h₃⁺-h₂⁺, may be established by crossing mutant h₂⁺ with h₃⁺. Recombination here will correspond to 3x rather than 5x.

Figure 81. Establishing the relative separation and order of h⁺ loci by two-factor crosses.

[Diagram: the two possible outcomes of a three-factor cross.] Possibility 1: two h genotypes, r⁺h₁h₂ and r₂₂h₁h₂, with many more r⁺ since only one “crossover” is involved. Possibility 2: two h genotypes, r⁺h₂h₁ and r₂₂h₂h₁, with many more r₂₂. Thus, the order is r₂₂-h₁-h₂ if r⁺ are in excess, or r₂₂-h₂-h₁ if r₂₂ are in excess.

Figure 82. Establishing the absolute order of h⁺ loci by three-factor crosses (r₂₂h₁⁺ × r⁺h₂⁺).
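The two-factor reasoning of Figure 81 amounts to a short calculation, sketched below. The sketch assumes that recombination frequencies over such short distances add like lengths along a line, so that the pair of loci showing the largest frequency must be the two outer ones. The function names and the unit value x = 1 are illustrative choices, not part of the original analysis.

```python
def candidate_distances(d_ab, d_ac):
    """Possible b-c separations if a-b and a-c distances are additive.

    The smaller value applies when c lies on the same side of a as b,
    the larger when a lies between b and c.
    """
    return (abs(d_ab - d_ac), d_ab + d_ac)


def order_three_loci(freq):
    """Order three loci from their three pairwise recombination frequencies.

    `freq` maps a 2-tuple of locus names to a frequency. The pair with the
    largest frequency is taken as the outer pair; the remaining locus is
    placed between them.
    """
    loci = {name for pair in freq for name in pair}
    if len(loci) != 3 or len(freq) != 3:
        raise ValueError("expected three loci and three pairwise frequencies")
    outer = max(freq, key=freq.get)
    (middle,) = loci - set(outer)
    return (outer[0], middle, outer[1])


if __name__ == "__main__":
    x = 1.0
    # h1-h2 recombine four times as often as h1-h3.
    print(candidate_distances(4 * x, x))
    observed = {("h1", "h2"): 4 * x, ("h1", "h3"): x, ("h2", "h3"): 3 * x}
    print(order_three_loci(observed))
```

If the h₂ × h₃ cross had given 5x instead of 3x, the same function would place h₁ in the middle, which is the alternative arrangement that the third cross is designed to exclude.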
Each pair of h⁺ mutants was found to yield h recombinants with a characteristic and reproducible frequency. All these frequencies were low, however (less than 1 per cent), indicating that the h⁺ mutants examined all occurred within a region along the genetic map of less than two recombination units. (As discussed in Chapter 4, the total “length” of the genetic material in T2 phage may correspond to as much as 800 such units.)

Figure 83. Genetic map of h⁺ mutants. Taken from the studies of G. Streisinger and N. Franklin on the genetic determination of host range in bacteriophage T2, Cold Spring Harbor Symposia Quant. Biology, 21, 103 (1956).

Having established that all the various h⁺ mutations occurred within a small region of the map, it was necessary to determine the order in which they were arranged with respect to one another. This was done through the use of three-factor crosses, a procedure with a forbidding name but one that is perfectly straightforward when thought of in terms of a simple model (Figure 82). Crosses were made by mixed infection of strain B bacteria with bacteriophages containing, in addition to one of the h⁺ loci, either the wild type, r⁺, or the mutant, r₂₂ (which belongs to the so-called plaque-type mutants that were mapped by Benzer). The preparation of these doubly marked mutants requires a considerable amount of technical manipulation involving repeated back-crossing and selection, and we shall not attempt to describe the details of the chore.
Suffice it to say that strains of bacteriophage were obtained which permitted crosses of the type r₂₂h₁⁺h₂ × r⁺h₁h₂⁺ to be made, where the r₂₂ locus is situated 24 recombination units from the h region as determined by two-factor crosses. The relative location of two h⁺ loci, h₁⁺ and h₂⁺, with respect to the r₂₂ locus may be determined by the estimation of the proportion of the h mutants formed by recombination which is r⁺ in character. The applicability of this test becomes clear upon inspection of the schematic representation shown in Figure 82. If the order of loci is r₂₂-h₁-h₂ rather than r₂₂-h₂-h₁, a far greater proportion of h-type recombinants will be r⁺, since the incorporation of the segregated h alleles into one functional unit requires only one “crossover” in the first instance and two in the second. All of this deduction involves, of course, the assumptions we have mentioned earlier, including the reality of a linear arrangement of genetic loci in phage and the availability of a mechanism of crossover at least analogous to that generally invoked for recombination in the chromosomes of higher organisms. These assumptions are, operationally speaking, applicable in the present case. The order of h⁺ loci shown on the map in Figure 83, which was determined from the three-factor cross data, is compatible with the distances between the loci indicated by the preliminary two-factor cross experiments.

Before considering these genetic observations in terms of the hereditary control of phage chemistry, one further observation needs to be reviewed. This concerns the demonstration of the functional unity of the h region. Do all the h⁺ mutations in the “map” shown in Figure 83 belong to a single unit of function (a “cistron”), or is it possible that they are divided into more than one group and act cooperatively in the determination of host range specificity?
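The crossover-counting argument used in the three-factor crosses to order the h⁺ loci can be sketched as follows. The sketch assumes a strictly linear map and treats a recombinant as a strand that switches between the two parental strands at each crossover; the parent labels P1 and P2 and the function names are hypothetical conveniences.

```python
def crossovers(order, source):
    """Count switches between parental strands along a linear map.

    `order` lists the loci from one end of the map to the other;
    `source` says which parent supplies each locus in the recombinant.
    """
    parents = [source[locus] for locus in order]
    return sum(1 for left, right in zip(parents, parents[1:]) if left != right)


def r_allele_counts(order):
    """Crossovers needed for an h recombinant carrying each r allele.

    Cross assumed: parent P1 = r22 h1+ h2, parent P2 = r+ h1 h2+.
    An h recombinant must take the normal h1 region from P2 and the
    normal h2 region from P1.
    """
    needed = {"h1": "P2", "h2": "P1"}
    return {
        "r22": crossovers(order, {**needed, "r": "P1"}),
        "r+": crossovers(order, {**needed, "r": "P2"}),
    }


def favoured_r_allele(order):
    """The r allele that reaches an h recombinant with fewer crossovers."""
    counts = r_allele_counts(order)
    return min(counts, key=counts.get)


if __name__ == "__main__":
    for order in (("r", "h1", "h2"), ("r", "h2", "h1")):
        print(order, r_allele_counts(order), favoured_r_allele(order))
```

For the order r-h₁-h₂ the r⁺ class needs one crossover and the r₂₂ class two; for r-h₂-h₁ the counts are reversed. Whichever r allele is in excess among the h recombinants therefore indicates the order, on the assumption that double crossovers are rarer than single ones.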
A decision may be made by use of the cis-trans test which we have previously described in relation to the r mutants. Streisinger⁵ demonstrated that all the h⁺ mutants belong to a single functional unit by comparing the effectiveness of crosses of the cis type (h × h₁⁺h₂⁺) and of the trans type (h₁⁺ × h₂⁺) in producing h phenotypes (i.e., phage which adsorbs to both B and B/2 bacteria). If, in Figure 84, each of the indicated portions of the genetic strand acts separately, and they cooperatively produce a normal h phenotype, all the progeny of the trans mixed infection should have the h character. It was found, however, that only a very small proportion of the progeny were h in phenotype (about 3 per cent, of the order of that to be expected from crossover and other sequelae of recombination). In the case of the cis arrangement (Figure 84), a high percentage of h phenotypes was observed (about 60 per cent). (Enough phenotypically h material is presumably made by the all-h strand to confer this character on some of the h⁺ genomes as well as the h. That is, genetically h⁺ phage may have, associated with their protein coats, some h-type host-range protein.) It may be concluded, therefore, that only a functionally complete h region will suffice for the expression of the h character.

[Figure 84. Diagram of the trans and cis arrangements of h₁ and h₂. Trans: progeny of which only about 3 per cent are h phenotypes. Cis: progeny of which more than 60 per cent are h phenotypes.]

Figure 85. Electron photomicrograph of bacteriophage T2 adsorbed on cell walls of E. coli B. Some of the adsorbed virus particles have lost their DNA, presumably by injection into the bacterial cell (see arrows). This photograph was obtained through the kindness of Dr. Thomas F. Anderson of the Institute for Cancer Research, Philadelphia, Pa.
We are now in a position to consider the genetic map of the h region in terms of what it does for the bacteriophage particle. Our attention must, of course, be directed at that part of the chemistry (and morphology) of T2 which has to do with its adsorption to host cells. Phage particles attach to bacterial cells by the tips of their tails (Figure 85). The same sort of attachment occurs with phage “ghosts” prepared by suddenly exposing intact phage to an osmotic shock. Since the phage ghost is essentially all protein, except for traces of carbohydrate present in such small amounts that its functional importance is fairly unlikely, it may be concluded that the business of attachment involves a specific protein component. Further support for the protein nature of the adsorbing substance comes from the fact that the kinetics of inactivation of the adsorptive capacity by agents like urea are very similar to those of protein denaturation. It has also been observed that the blocking of amino groups in phage prevents attachment to bacteria.

We may approach the problem of isolating and characterizing the protein component responsible for host range specificity in two ways. First, we may proceed to isolate various fragments from disrupted phage particles. Such studies have been carried out by S. Brenner and his colleagues in the Cavendish Laboratory at the University of Cambridge. These investigators have concluded that the host-range function is carried in the slender fibers that are attached to, and wrapped around, the tail of the virus particle (Figure 86) and have prepared highly purified concentrates of free fibers for chemical study.

A second approach to the problem involves the fractionation of the total protein mixture making up phage ghosts in the same way that we would approach the isolation of an enzyme from a crude tissue extract.
Phage ghosts may be solubilized in a number of ways that should not cause modification in the covalent structure of the component proteins, and solutions prepared with such agents as urea and guanidine appear to be amenable to study by chromatographic, electrophoretic, and ultracentrifugal techniques. (See Figures 87 and 88, for example.)

Figure 86. Electron photomicrograph of T2 bacteriophage, disrupted by treatment with N-ethyl maleimide. This photograph was obtained through the kindness of Mrs. E. R. Kaufman and Dr. A. M. Katz of the National Institutes of Health, Bethesda, Maryland.

Figure 86 (continued). Left: Electron micrograph of H₂O₂-treated bacteriophage T2 showing the head, contracted sheath, core, and tail fibers attached to the base of the core (×300,000). Right: Purified sheaths, some on end and some lying flat (×300,000). The preparations were made by the negative-staining technique described by S. Brenner and R. W. Horne (Biochimica et Biophysica Acta, 1959, in press). These photographs were obtained through the kindness of … University of Cambridge, England, who took these pictures together with his colleagues … Champe, L. Barnett, and S. Benzer (Journal of Molecular Biology, 1959, in press).

Both the morphological and “chemical” attacks on the fractionation problem require a test for functional activity. Although not direct, a test has been devised based on the fact that the antigenicity of the T-even bacteriophages against rabbit antiphage antibody is controlled by the same genetic locus as that which determines the host range. Thus, Streisinger⁵ has shown that no measurable recombination occurs between the determinants of host range and the determinants of serotype. (The reader must be asked to assume the validity of this conclusion; he may, however, wish to read the elegant paper of Streisinger,⁵ in which the details and arguments are presented.) We may, then, hopefully assay any given protein fraction or morphological fraction for activity by estimating its ability to block the phage-neutralizing action of an inactivating antibody preparation.

[Chromatogram: optical density at 280 mµ and serum blocking power plotted against fraction number; elution with 0.01 M phosphate, followed by a gradient to 0.1 M phosphate and a gradient to 1.0 M phosphate.]

Figure 87. Partial purification of the protein component in ghosts of bacteriophage T2 which is responsible for serum blocking power (see text) and which presumably determines the host range of the phage. A preparation of “ghosts” was dissolved in ice-cold 5.2 M urea at pH 7.4 and chromatographed on a column of the cation exchanger XE-64. From W. J. Dreyer, A. Katz, and C. B. Anfinsen, Federation Proc., 17, 214 (1958).

The studies on the chemical consequences of mutation in the h region of bacteriophage have only just begun, and the problems of isolation must first be solved. It seems likely, however, that these investigations, as well as others concerned with other regions of the genetic map of phage, should ultimately enable us to make direct point-by-point comparisons of genetic changes and structural modifications in protein molecules. The particular power of the bacteriophage approach, and of similar studies on other microbial systems,* is the extreme discrimination of the genetic mapping for these “organisms.” If Benzer’s calculation of the length of the “recon” in terms of nucleotide units in the DNA chain proves to be reasonably correct, we may expect to be able to distinguish genetically between loci as closely packed as those determining the three hemoglobins investigated by Ingram. In the h region, for example, two of the h⁺ loci appear to be only 0.004 recombination units apart. Translated into Benzer’s nucleotide language, this distance would correspond to about the distance between adjacent nucleotide pairs. In spite of the wishful chemical thinking and genetic uncertainty involved in all these speculations, it must be very clear why the biochemist is willing to risk the gamble of time and effort required to test the general hypothesis. With luck, the answers might begin to clarify some of the most central problems of biology.

* For example, C. Levinthal and A. Garen, of the Massachusetts Institute of Technology, have recently begun the mapping of the “cistron” which controls the synthesis of an alkaline phosphatase in E. coli. This enzyme is synthesized in large quantities when phosphate is limiting in the culture medium. Organisms are grown on plates, containing medium of low phosphate concentration, which are then sprayed with nitrophenylphosphate. Alkaline phosphatase-containing cells cleave the phosphate ester to yield the yellow-colored nitrophenol. Bacterial colonies containing the enzyme thus become yellow, some mutants remain white, and certain mutants, having enzyme of intermediate activity, are weakly colored. The enzymes, active or inactive, may be isolated from the various strains and are at present being subjected to structural analysis.

[Chromatogram: optical density at 280 mµ plotted against fraction number (22 ml. per fraction); elution with 0.1 M phosphate, pH 6.5, then 0.2 M phosphate, pH 6.5, then 0.2 M phosphate, pH 6.5 + 0.2 M NaCl.]

Figure 88. The purification of lysozyme from lysates of E. coli on a column of the cation exchanger, XE-64. The small chromatographic peak at the far right of the chromatogram contains the lysozyme activity (the dotted curve). Lysozyme may also be isolated from bacteriophage ghosts. The starting material of choice, however, is an E. coli lysate. Enough lysozyme is presumably synthesized following infection of E. coli cells with phage to more than satisfy the needs for the formation of progeny. The excess enzyme within the infected bacterial cells is then released into the surrounding culture medium upon lysis. The enzyme emerging from the ion exchange column is purified several thousandfold over its concentration in the crude lysate. From unpublished experiments of Dr. W. Dreyer, National Heart Institute, Bethesda, Maryland.

REFERENCES

1. L. Pauling, H. A. Itano, S. J. Singer, and I. C. Wells, Science, 110, 543 (1949).
2. H. Hörlein and G. Weber, Deut. med. Wochschr., 73, 876 (1948).
3. V. M. Ingram, Nature, 180, 326 (1957).
4. G. Streisinger and N. C. Franklin, “Genetic Mechanisms: Structure and Function,” Cold Spring Harbor Symposia on Quant. Biol., 21, 103 (1956).
5. G. Streisinger, Virology, 2, 388 (1956).