Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 6280-6284, July 1992
Biochemistry

Murine Hox-1.11 homeobox gene structure and expression
(Hox-1.11 nucleotide sequence/embryonic development)

DONG-PING TAN*†, JACQUELINE FERRANTE*, ADIL NAZARALI*, XIAOPING SHAO*, CHRISTINE A. KOZAK‡, VICKY GUO*, AND MARSHALL NIRENBERG*

*Laboratory of Biochemical Genetics, National Heart, Lung, and Blood Institute, and ‡Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD

Contributed by Marshall Nirenberg, March 31, 1992

ABSTRACT The Hox-1.11 gene encodes a protein 372 amino acid residues long that contains a conserved pentapeptide, a homeodomain, and an acidic region. The amino acid sequence of the homeodomain of Hox-1.11 is identical to that of Hox-2.8, and the N-terminal and C-terminal regions of Hox-1.11 are similar to those of human HOX2H, which is the equivalent of murine Hox-2.8. The Hox-1.11 gene was shown to reside on murine chromosome 6, which contains the Hox-1 cluster of homeobox genes. One species of Hox-1.11 poly(A)+ RNA, approximately 1.7 kb long, was detected in mouse embryos; it is most abundant in 12-day-old embryos and progressively decreases during further embryonic development. The most anterior expression of Hox-1.11 poly(A)+ RNA in 12- to 14-day-old mouse embryos was shown by in situ hybridization to be in the mid and posterior hindbrain. Hox-1.11 poly(A)+ RNA also is expressed in the VII and VIII cranial ganglia, spinal cord, spinal ganglia, larynx, lungs, vertebrae, sternum, and intestine.

Mouse chromosomes contain four clusters of homeobox genes that are thought to have originated during evolution by successive duplications of an ancestral Antennapedia-Ultrabithorax (Antp-Ubx) cluster of homeobox genes (1, 2). Both the amino acid sequences of the homeodomains derived from these genes and the order of the genes within each cluster have been conserved during evolution (1, 2).
The order of the homeobox genes in a mammalian Antp-Ubx chromosomal cluster of genes is related to the most anterior site of expression of each gene in the embryo, which is successively displaced toward the posterior, starting with the second gene from the 3' end of the cluster and progressing toward the gene at the 5' end of the cluster (for a recent review, see ref. 3). However, the expression of many of the homeobox genes overlaps toward the posterior.

Some of the Drosophila homeobox genes in the Antp and Ubx clusters of genes (4) function as homeotic selector genes (5), which determine unique parts of the body. Homeotic selector genes also may be determinants of cell compartments; i.e., they may regulate genes that encode molecules that enable cells to mix only with cells in the same compartment.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Relatively little is known about the functions of Antp-Ubx clusters of homeobox genes in mammals. However, recent evidence suggests that a segmental pattern of rhombomeres is generated during the development of the vertebrate hindbrain. Motor neuron nuclei of branchiomotor nerves V, VII, and IX are produced by rhombomeres 2 and 3, 4 and 5, and 6 and 7, respectively. Furthermore, pairs of hindbrain segments match adjacent branchial arches (6). In addition, cells in the hindbrain of the developing chick embryo do not cross rhombomere boundaries (7), which suggests that rhombomeres correspond to cell compartments in the developing hindbrain.
Krumlauf and Boncinelli and their colleagues (8) have shown that anterior expression boundaries of some Hox genes correspond to rhombomere boundaries and have suggested that combinatorial sets of homeobox and other proteins that regulate genes may impart unique positional addresses to hindbrain rhombomeres and associated structures in the branchial region of the embryo. In this report we describe the nucleotide sequence§ and the expression of the murine Hox-1.11 gene during embryonic development.

METHODS AND MATERIALS

Hox-1.11 Clones. Part of the homeobox region of the murine Hox-1.11 gene (9) (nucleotide residues 1539-1658 in Fig. 2) was used as a template for the synthesis of a 32P-labeled RNA probe (≈1.4 × 10^? cpm per µg of RNA), which was used to screen an ICR Swiss mouse genomic DNA library in λGem-11 (Promega) for the Hox-1.11 gene. Hybridization was performed at 65°C in 1 M NaCl/50 mM Tris-HCl, pH 7.6/1% SDS containing 100 µg of yeast tRNA per ml and 3.3-4.4 × 10^5 cpm of 32P-labeled RNA per ml. Filters were washed (final wash) in 0.1× SSC (1× SSC = 0.15 M NaCl/0.015 M sodium citrate, pH 7) containing 0.1% SDS at 40°C for 30 min.

Escherichia coli C600 hfl cells (BNN 102) infected with an 11.5-day-old Swiss mouse embryo cDNA library in λgt10 (Clontech) were plated, and 2 × 10^6 plaques were screened with an 35S-labeled RNA probe (specific activity, 1.9 × 10^? cpm per µg of RNA) transcribed from a Hox-1.11 genomic DNA subclone (nucleotide residues 1494-2220 in Fig. 2). Hybridization was performed in 50% formamide containing 5× Denhardt's solution (1× = 0.02% polyvinylpyrrolidone/0.02% Ficoll/0.02% bovine serum albumin), 0.5% SDS, 175 µg of yeast tRNA per ml, and 4 × 10^? cpm of 35S-labeled RNA per ml at 42°C for 20 hr. The final wash was with 0.1× SSC/0.1% SDS at 40°C for 1 hr.

DNA Sequencing. Genomic DNA and cDNA fragments were subcloned into pBluescript II KS(+) (Stratagene).
The exonuclease III/mung bean nuclease unidirectional deletion method (10) was used to generate genomic DNA or cDNA subclones with overlapping deletions. Both strands of DNA were sequenced by the dideoxynucleotide chain-termination method (11) using Sequenase 2.0 (United States Biochemical) or Taq polymerase DNA sequencing kits and with M13 forward or reverse primers or specific primers. 7-Deaza-dGTP or dITP (United States Biochemical) was used to resolve compressions. An Applied Biosystems DNA sequencer and fluorescent primers or dideoxynucleotides also were used to determine DNA sequences. GCG computer programs were used for sequence analysis. We thank Marvin Shapiro for help in using the DNAdraw program (12) to make Figs. 2 and 3.

†To whom reprint requests should be addressed at: Laboratory of Biochemical Genetics, National Heart, Lung, and Blood Institute, National Institutes of Health, Building 36, Room 1C06, 9000 Rockville Pike, Bethesda, MD 20892.
§The nucleotide sequences of Hox-1.11 genomic DNA and cDNA have been deposited in the GenBank data base (accession nos. M93148 and M93292, respectively).

Northern Analysis. BALB/c mouse embryos 10, 12, 14, 16, or 18 days after fertilization were homogenized in 4 M guanidine thiocyanate/0.1 M Tris chloride. RNA was purified by ultracentrifugation through 5.7 M CsCl/10 mM EDTA (13). Poly(A)+ RNA was obtained by oligo(dT)-cellulose column chromatography (13) and then was fractionated by gel electrophoresis (1% agarose/formaldehyde gels; 10 µg of poly(A)+ RNA per lane). A 510-base-pair (bp) cDNA fragment starting from the 5' end of the cDNA clone without the homeobox (nucleotide residues 345-821 and 1462-1494 in Fig. 2) was labeled with [32P]dCTP (2.1 [...] 32P-labeled cDNA (nucleotide residues 345-821 and 1462-1494 in Fig. 2) per ml, 50% formamide, 6× SSPE, 1% SDS, and 133 µg of denatured, sheared herring sperm DNA per ml. The final wash of the filters at 60°C was with 0.1× SSC/1.5% SDS.
In Situ Hybridization. BALB/c mouse embryos 12.5 or 14 days after fertilization were separated from parental tissue and frozen as described by Dony and Gruss (15). Sections 10 µm thick were cut in a cryostat at -20°C and collected on slides coated with poly(L-lysine). Sections were fixed and hybridized by modification of the method described by Hogan et al. (16). 35S-labeled RNA probes without the homeobox (1-2 × 10^8 cpm/µg) were prepared from the 5' region of the cDNA (nucleotide residues 429-821 and 1462-1494 in Fig. 2) by incorporation of uridine 5'-[α-[35S]thio]triphosphate. Slides were washed with 2× SSC/1 mM DTT at 50°C, then with 0.2× SSC/1 mM DTT at 55°C, and finally with 0.2× SSC/1 mM DTT at 60°C (1 hr each wash).

RESULTS AND DISCUSSION

Two million recombinants from a murine genomic DNA library were screened for the Hox-1.11 homeobox gene by using 32P-labeled RNA synthesized from PCR-amplified, cloned mouse genomic DNA corresponding to the Hox-1.11 homeobox region described previously (9). Three Hox-1.11 clones, λT7, λ16, and λ33, were obtained. In addition, two million recombinants from an 11.5-day-old mouse embryo cDNA library were screened with a genomic DNA fragment corresponding to most of exon 2 of Hox-1.11 genomic DNA, and one Hox-1.11 cDNA clone, 1250 bp long, was obtained. The structure of the Hox-1.11 gene and partial restriction maps of Hox-1.11 genomic DNA and cDNA are shown in Fig. 1. The Hox-1.11 gene contains two exons separated by a small intron. Four thousand and forty nucleotide residues of Hox-1.11 genomic DNA and 1250 residues of cDNA were sequenced. The composite nucleotide sequence of Hox-1.11 genomic DNA and cDNA and the deduced amino acid sequence of the Hox-1.11 homeobox protein are shown in Fig. 2.

[Fig. 1 image: restriction maps of the genomic DNA clone and the cDNA clone, labeled SEQUENCED REGION, EXON 1, INTRON, EXON 2, YPWMK, and HOMEOBOX; the cDNA scale is marked at 1, 99, 477, 513, 692, and 1250 bp.]

FIG. 1. Partial restriction maps of Hox-1.11 clones λT7 genomic DNA in λGem-11 and λTc2 cDNA in λgt10. Exons 1 and 2 in cloned genomic DNA are represented by grey boxes and intron 1 by a narrow open box. The black box in exon 2 of cloned genomic DNA and cDNA corresponds to the homeobox. The location of the 4 kilobases (kb) of genomic DNA that was sequenced is shown above the genomic DNA restriction map. The boxed regions in cDNA represent the coding portion of the cDNA. The black box in exon 1 of the cDNA corresponds to the conserved pentapeptide core, Tyr-Pro-Trp-Met-Lys (YPWMK). Abbreviations for restriction enzymes are as follows: B, BamHI; H, HindIII; P, Pst I; R, EcoRI; S, Sac I; X, Xba I.

The Hox-1.11 gene contains a 1116-nucleotide-residue open reading frame that encodes a protein of 372 amino acid residues with a calculated Mr of 40,793 and a pI of 5.67. The first ATG in the open reading frame (nucleotide residues 443-445) is assumed to be the initiation codon for protein synthesis; however, only 6 of the 10 nucleotide residues in the putative translation initiation site match the Kozak consensus sequence for initiation of protein synthesis, GCC (A or G) CCATGG (17). The ATG codon is preceded by two adjacent in-frame termination codons 102 nucleotide residues upstream of the ATG. Exon 1 encodes 126 amino acid residues with a conserved hexapeptide core, Glu-Tyr-Pro-Trp-Met-Lys, found in some vertebrate homeobox proteins. Similar sequences also are found in homeotic homeobox proteins of Drosophila. Comparison of the nucleotide sequences of Hox-1.11 genomic DNA and cDNA revealed a 640-bp intron with an unusual 5' splice site, AG | GTCAGT.
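The open reading frame arithmetic and the Kozak comparison described above can be checked with a short script. This is a minimal sketch under stated assumptions, not the authors' analysis: the 10-residue window `AAGGCCATGA` is transcribed from nucleotide residues 437-446 of the Fig. 2 sequence as reproduced here and should be verified against the GenBank entry (M93148), and `KOZAK_CONSENSUS`, `HOX_1_11_SITE`, and `kozak_matches` are hypothetical names introduced only for illustration.

```python
# Open reading frame length reported in the text (nucleotide residues).
ORF_LENGTH_NT = 1116

# Each codon is 3 nucleotides, so the reading frame must divide evenly.
assert ORF_LENGTH_NT % 3 == 0
protein_length = ORF_LENGTH_NT // 3  # amino acid residues encoded

# Kozak consensus GCC(A or G)CCATGG: allowed bases at each of 10 positions.
KOZAK_CONSENSUS = ["G", "C", "C", "AG", "C", "C", "A", "T", "G", "G"]

# Assumption: window around the first ATG, transcribed from Fig. 2.
HOX_1_11_SITE = "AAGGCCATGA"


def kozak_matches(site: str) -> int:
    """Count positions of a 10-residue site that agree with the consensus."""
    if len(site) != len(KOZAK_CONSENSUS):
        raise ValueError("site must be 10 residues long")
    return sum(base in allowed for base, allowed in zip(site, KOZAK_CONSENSUS))


print(protein_length)                # amino acid residues in the protein
print(kozak_matches(HOX_1_11_SITE))  # matching positions out of 10
```

With the window as transcribed, the count agrees with the 6 of 10 matches stated in the text, and 1116/3 gives the 372-residue protein.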
Only 3% of approximately 400 vertebrate 5' splice sites examined contain C as the third nucleotide residue from the 5' terminus of the intron (18); however, the splice branch site and 3' splice site match the consensus sequences perfectly. Exon 2 encodes a homeodomain (amino acid residues 139-198), which is followed by a region with 9 acidic amino acid residues out of 12 residues (amino acid residues 214-225).

Four potential phosphorylation sites for protein kinase C, one for cAMP-dependent protein kinase A, and six for casein kinase II are shown in Fig. 2. Two putative protein kinase C phosphorylation sites are within the homeodomain: Ser-139, the first amino acid residue in the homeodomain, and Thr-179, the amino acid residue immediately before the putative third α-helix of the homeodomain, which is thought to be the major DNA binding site of the protein. The last amino acid residue in the homeodomain, Thr-198, is a possible phosphorylation site for cAMP-dependent protein kinase A. One putative protein kinase C phosphorylation site is within the acidic domain, and a casein kinase II site, Ser-227, is the second residue after the acidic domain. The possibility that the activity of Hox-1.11 protein as a regulator of gene expression is reversibly controlled by phosphorylation and dephosphorylation of serine or threonine residues deserves further study.

The termination codon and polyadenylylation signal (ATTAAA) are separated by only four nucleotide residues. ATTAAA was reported to serve as a polyadenylylation signal in 12% of 269 vertebrate cDNAs surveyed, and the polyadenylylation activity of ATTAAA was shown to be 77% of the activity found for AATAAA (19). Comparison of the nucleotide sequences of Hox-1.11 genomic DNA and cDNA showed that nucleotide residue 2234 functions as the polyadenylylation site. T/G clusters, which may play a role in the formation of poly(A)+ RNA (for review see ref. 20), were found downstream of the polyadenylylation site. Six additional polyadenylylation signals were found in the 3' untranslated region of the Hox-1.11 gene, but we do not know whether they function as alternative polyadenylylation signals.

Primer extension experiments (not shown here) revealed one major site for initiation of Hox-1.11 mRNA synthesis at nucleotide residue 135. Further work is needed to determine whether the AATAAA sequence 40 nucleotide residues upstream of the initiation site for mRNA synthesis functions as a TATA box. The first nucleotide residue of λTc2 cDNA corresponds to nucleotide residue 345 shown in Fig. 2; therefore, 211 nucleotide residues are missing from the 5' end of λTc2 Hox-1.11 cDNA.

The untranslated regions of the Hox-1.11 gene contain possible sites for proteins that are known to regulate gene expression, such as AP-1, AP-2, AP-4, SP-1, and HiNF-A (seven sequential ATTT direct repeats constitute two HiNF-A sites). The 3' untranslated region also contains 30 sequential T residues; two additional polypyrimidine regions; a (CA)n repeat, which is expected to assume the conformation of Z-DNA under appropriate conditions; and a region rich in G residues.

[Fig. 2 image: composite 4,040-residue nucleotide sequence with the deduced amino acid sequence beneath the coding regions; annotated features include AP1, AP2, AP4, SP1, and HiNF-A sites, the splice branch site, the poly(A) signal, the polyadenylation site, T/G clusters, and a Z-DNA region. The sequences are available in GenBank under accession nos. M93148 (genomic DNA) and M93292 (cDNA).]

FIG. 2. The nucleotide sequence and the deduced amino acid sequence in single-letter code of the murine Hox-1.11 gene. The nucleotide sequence is a composite obtained by sequencing 4,040 nucleotide residues of Hox-1.11 genomic DNA (clones λ16 and λT7) and 1,250 residues of clone λTc2 cDNA, which correspond to nucleotide residues 345-821 and 1462-2234. Numbers on the right correspond to deoxynucleotide or amino acid residues. The conserved hexapeptide, EYPWMK, and the homeobox are enclosed within boxes. Hox-1.11 mRNA synthesis is initiated at nucleotide residue 135. The black inverted triangles correspond to RNA splice sites. The splice branch recognition sequence near the end of the intron is underlined, and the branch site is shown as a boldface letter (nucleotide residue 1424). Putative sites for phosphorylation catalyzed by protein kinase C are shown as white S or T residues on black backgrounds; a black T shown on a grey background (T-198) is a putative phosphorylation site catalyzed by cAMP-dependent protein kinase A. S or T residues enclosed within open boxes are possible phosphorylation sites catalyzed by casein kinase II. The polyadenylylation signal sequence, ATTAAA (2206-2211), is underlined, and the polyadenylylation site (nucleotide residue 2234) is shown in boldface letters. T/G clusters on the downstream side of the polyadenylylation site and six additional polyadenylylation signal sequences are underlined. Some putative binding sites for proteins that regulate gene expression also are underlined. Two polypyrimidine regions and 30 sequential T residues also are underlined (30 T residues were found with λ16 DNA, 20 with clone λT7 DNA).

[Fig. 3 image: alignment of the murine Hox-1.11 and human HOX2H amino acid sequences.]

FIG. 3. The amino acid sequence in single-letter code of murine Hox-1.11 protein is compared with that of human HOX2H protein (24). The amino acid residues of the conserved hexapeptide and the homeodomain are underlined. White amino acid residues on a black background correspond to identical amino acid residues; black amino acid residues on a grey background represent groups of conservative amino acid replacements, which are as follows: S, T, G, A, P / L, M, I, V / E, D, Q, N / R, K, H / F, Y, W / and C.
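The conservative replacement groups listed in the Fig. 3 legend lend themselves to a simple pairwise classification of aligned residues. The sketch below is only an illustration of that bookkeeping: the function names `classify` and `summarize`, the use of `.` as a gap character, and the two short toy strings are assumptions made for this example, and the strings are not taken from the Hox-1.11 or HOX2H sequences.

```python
# Conservative replacement groups as listed in the Fig. 3 legend.
GROUPS = ["STGAP", "LMIV", "EDQN", "RKH", "FYW", "C"]

# Map each single-letter residue to the index of its group.
GROUP_OF = {residue: index for index, group in enumerate(GROUPS) for residue in group}


def classify(a: str, b: str) -> str:
    """Label one aligned residue pair as identical, conservative, or different."""
    if a == b:
        return "identical"
    group_a, group_b = GROUP_OF.get(a), GROUP_OF.get(b)
    if group_a is not None and group_a == group_b:
        return "conservative"
    return "different"


def summarize(seq1: str, seq2: str, gap: str = ".") -> dict:
    """Count pair classes over two aligned sequences, skipping gapped columns."""
    counts = {"identical": 0, "conservative": 0, "different": 0}
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # a column with a gap contributes to no class
        counts[classify(a, b)] += 1
    return counts


# Hypothetical toy alignment, for illustration only.
print(summarize("SKLF.C", "TKDY.C"))
```

Dividing each count by the number of compared positions gives percentages of the kind used to describe sequence similarity; how gapped columns enter the denominator is a choice this sketch leaves to the reader.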
The amino acid sequence of the murine Hox-1.11 homeodomain is identical to that of Hox-2.8 (21), which suggests that both homeobox proteins may bind to the same or similar nucleotide sequences in DNA. Hox-1.3 (22) and Hox-2.1 (23) also have homeodomains with identical amino acid sequences. A comparison of the amino acid sequences of murine Hox-1.11 protein and human HOX2H (24) protein (the equivalent of murine Hox-2.8) is shown in Fig. 3. The sequence of the human, rather than the murine, homeobox protein is shown because only the sequence of the homeobox region of murine Hox-2.8 has been reported thus far. The homeodomain of Hox-1.11 and HOX2H is the most highly conserved region of each protein; however, many identical amino acid residues or conservative amino acid replacements are present in the N-terminal region, near the conserved hexapeptide, and in the C-terminal region of each protein. Approximately 50% of the amino acid residues of the murine Hox-1.11 and human HOX2H proteins are identical, and an additional 18.6% are conservative amino acid replacements. These results suggest that the Hox-1.11 and Hox-2.8 genes originated by duplication of the same ancestral gene and then gradually diverged by mutation.

Murine homeobox gene clusters Hox-1, Hox-2, Hox-3, and Hox-4 are located on chromosomes 6, 11, 15, and 2, respectively. The chromosome that contains the Hox-1.11 gene was identified by Southern analysis of DNA preparations obtained from 17 Chinese hamster × mouse somatic hybrid cell lines that contain different sets of identified mouse chromosomes (Table 1). Genomic DNA from each hybrid cell line was incubated with HindIII, subjected to gel electrophoresis, transferred to a nylon membrane, and hybridized to a 32P-labeled Hox-1.11 cDNA fragment. The Hox-1.11 cDNA probe hybridized to a 4-kb murine HindIII DNA fragment and a 3-kb hamster DNA fragment.
DNA from 10 of the 17 hybrid cell lines examined contained the murine 4-kb fragment of the Hox-1.11 gene. Correlation of the murine chromosome content of each hybrid cell line with the results of Southern analysis showed that the Hox-1.11 gene resides on murine chromosome 6, which suggests that the Hox-1.11 gene is a member of the Hox-1 cluster of homeobox genes. The amino acid sequence similarity of Hox-1.11 and Hox-2.8 suggests that Hox-1.11 is the second gene from the 3' end of the Hox-1 cluster of genes located between Hox-1.5 and Hox-1.6, and that Hox-1.11, like Hox-2.8, is a member of the Drosophila proboscipedia (pb) subfamily of homeobox genes.

Table 1. Analysis of concordance between Hox-1.11 DNA hybridization to mouse DNA from 17 Chinese hamster × mouse somatic hybrid cell lines and the mouse chromosomes present in each cell line

Mouse             No. of hybrids*            %
chromosome    +/+   -/-   +/-   -/+    discordance
 1             8     5     2     2         24
 2             9     2     1     5         35
 3             4     4     4     2         43
 4             7     5     3     2         29
 5             2     6     8     1         53
 6            10     7     0     0          0
 7             8     1     2     6         47
 8             8     5     2     2         24
 9             5     5     4     2         38
10             1     6     9     1         59
11             0     7    10     0         59
12             6     1     3     5         53
13             7     5     2     2         25
14             1     6     8     1         56
15            10     0     0     6         38
16             6     6     2     1         20
17             9     1     1     6         42
18             9     4     1     1         13
19             8     4     2     3         29
 X             7     3     3     4         41

*Symbols represent the presence (+/) or absence (-/) of Hox-1.11 DNA probe hybridization to a 4-kb mouse DNA HindIII restriction fragment determined by Southern analysis and the presence (/+) or absence (/-) of the mouse chromosome that contains the Hox-1.11 gene. The number of discordant observations is the sum of the +/- and -/+ observations. The percent discordance is the number of discordant observations, divided by the total number of observations, multiplied by 100. The results show that the Hox-1.11 gene resides in mouse chromosome 6.

[Fig. 4 image: Northern blot; lanes are labeled by days of mouse embryo development: 10, 12, 14, 16, and 18.]

FIG. 4. Northern analysis of Hox-1.11 poly(A)+ RNA at different stages of mouse embryo development.
Each lane contains 10 µg of poly(A)+ RNA from mouse embryos 10, 12, 14, 16, or 18 days after fertilization as indicated. Poly(A)+ RNA was subjected to electrophoresis, transferred to a Nytran membrane, and hybridized to the 5' portion of Hox-1.11 32P-labeled cDNA (without the homeobox). The positions of 28S and 18S ribosomal RNA (4718 and 1874 nucleotide residues, respectively), a 1.6-kb RNA standard, and a diffuse band of Hox-1.11 poly(A)+ RNA approximately 1700 residues long are shown.

FIG. 5. In situ hybridization of a 416-nucleotide-residue Hox-1.11 35S-labeled RNA probe (without the homeobox) to Hox-1.11 poly(A)+ RNA in a parasagittal section of a mouse embryo 12.5 days after fertilization (A) and a transverse section of a 14-day-old mouse embryo (B), exposed to x-ray film for 5 and 12 days, respectively. MY, mid and posterior myelencephalon; SC, spinal cord; M, mesenchyme near the larynx; T, thymus; S, sternum; PV, prevertebrae; TE, telencephalon; and DI, diencephalon.

The expression of Hox-1.11 poly(A)+ RNA during mouse embryo development was determined by Northern analysis of poly(A)+ RNA from embryos at different stages of development (Fig. 4). The 32P-labeled Hox-1.11 λTc2 cDNA probe contained nucleotide residues 1-510 shown in Fig. 1 but not the homeobox. Only one diffuse band of Hox-1.11 poly(A)+ RNA, approximately 1.7 kb long, was detected. The abundance of Hox-1.11 poly(A)+ RNA is low in 10-day-old embryos, maximal in 12-day-old embryos, and then progressively decreases in 14-, 16-, and 18-day-old mouse embryos. The size of Hox-1.11 poly(A)+ RNA found by Northern analysis (1.7 kb) agrees well with the size of Hox-1.11 mRNA determined by nucleotide sequence analysis [1454 nucleotide residues without a poly(A) tail, and 1704 nucleotide residues assuming a poly(A) tail of 250 residues].
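Returning briefly to the chromosome assignment, the percent discordance defined in the footnote to Table 1 can be recomputed from the four hybrid counts in each row. The sketch below is illustrative only; the function name `percent_discordance` and the dictionary `TABLE_1_ROWS` are hypothetical, and only three rows of Table 1 (chromosomes 1, 6, and 11) are copied in as examples.

```python
def percent_discordance(both: int, neither: int, probe_only: int, chrom_only: int) -> float:
    """Percent of hybrids in which probe signal and chromosome presence disagree.

    Arguments follow the Table 1 column order: +/+, -/-, +/-, -/+.
    """
    total = both + neither + probe_only + chrom_only
    if total == 0:
        raise ValueError("no observations for this chromosome")
    discordant = probe_only + chrom_only  # sum of the +/- and -/+ columns
    return 100.0 * discordant / total


# Counts copied from three rows of Table 1 (order: +/+, -/-, +/-, -/+).
TABLE_1_ROWS = {
    "1": (8, 5, 2, 2),
    "6": (10, 7, 0, 0),
    "11": (0, 7, 10, 0),
}

for chromosome, counts in TABLE_1_ROWS.items():
    print(chromosome, round(percent_discordance(*counts)))
```

A chromosome whose presence matches the hybridization signal in every informative hybrid gives 0% discordance, which is the criterion used in Table 1 to assign the gene to chromosome 6.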
Tissues that contain Hox-1.11 poly(A)+ RNA in 12.5- and 14-day-old mouse embryos were identified by in situ hybridization (Fig. 5). A Hox-1.11 35S-labeled RNA probe without the homeobox was hybridized to Hox-1.11 poly(A)+ RNA in sagittal or transverse serial sections of mouse embryos, and sections were subjected to autoradiography. Hox-1.11 poly(A)+ RNA is expressed prominently in mid and posterior myelencephalon (but not in the pons), spinal cord, larynx, thymus, sternum, and vertebrae. In other sections not shown here, Hox-1.11 poly(A)+ RNA also was detected in the VII and VIII cranial ganglia, spinal ganglia, lungs, ribs, and intestine. These results agree with and extend a previous report that Hox-1.11 poly(A)+ RNA in mouse embryos is expressed in posterior hindbrain, cranial ganglia VII and VIII, and mesenchyme of the second and third branchial arches (8).

Nothing is known about the functions of Hox-1.11 protein. However, homologous recombination in embryonic stem cells and transgenic mouse technology have been used to obtain strains of mice with Hox-1.6 (25, 26) or Hox-1.5 (27) mutations. Homozygous loss-of-function mutants of the Hox-1.6 or Hox-1.5 genes have different phenotypes, but each phenotype consists of the absence of some tissues and anatomical defects in other tissues in the head and thorax. Therefore, Hox-1.6 and Hox-1.5 proteins are required for the development of different parts of the head and thorax. Hox-1.6, Hox-1.11, and Hox-1.5 genes are the first, second, and third genes, respectively, from the 3' end of the Hox-1 cluster of homeobox genes and are expressed in different, overlapping patterns in mouse embryos. Therefore, we expect that Hox-1.11 homeobox protein also may be required for the development of some tissues in the head and thorax, such as the medulla oblongata, some cranial ganglia or cranial nerves, and tissues that originate from or interact with the cephalic neural crest.

1. Duboule, D. & Dolle, P. (1989) EMBO J. 8, 1497-1505.
2. Graham, A., Papalopulu, N. & Krumlauf, R. (1989) Cell 57, 367-378.
3. McGinnis, W. & Krumlauf, R. (1992) Cell 68, 283-302.
4. Lewis, E. B. (1978) Nature (London) 276, 565-570.
5. Garcia-Bellido, A. (1977) Am. J. Zool. 17, 613-629.
6. Lumsden, A. & Keynes, R. (1989) Nature (London) 337, 424-428.
7. Fraser, S., Keynes, R. & Lumsden, A. (1990) Nature (London) 344, 431-435.
8. Hunt, P., Gulisano, M., Cook, M., Sham, M.-H., Faiella, A., Wilkinson, D., Boncinelli, E. & Krumlauf, R. (1991) Nature (London) 353, 861-864.
9. Nazarali, A., Kim, Y. & Nirenberg, M. (1992) Proc. Natl. Acad. Sci. USA 89, 2883-2887.
10. Guo, L. H., Yang, R. C. A. & Wu, R. (1983) Nucleic Acids Res. 11, 5521-5540.
11. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
12. Shapiro, M. (1990) Binary 2, 187-190.
13. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY), 2nd Ed.
14. Hoggan, M. D., Halden, N. F., Buckler, C. E. & Kozak, C. A. (1988) J. Virol. 62, 1055-1056.
15. Dony, C. & Gruss, P. (1987) EMBO J. 6, 2965-2975.
16. Hogan, B., Costantini, F. & Lacy, E. (1986) in Manipulating the Mouse Embryo (Cold Spring Harbor Lab., Cold Spring Harbor, NY), pp. 228-232.
17. Kozak, M. (1987) Nucleic Acids Res. 15, 8125-8148.
18. Padgett, R. A., Grabowski, P. J., Konarska, M. M., Seiler, S. & Sharp, P. A. (1986) Annu. Rev. Biochem. 55, 1119-1150.
19. Sheets, M. D., Ogg, S. C. & Wickens, M. P. (1990) Nucleic Acids Res. 18, 5799-5805.
20. Birnstiel, M. L., Busslinger, M. & Strub, K. (1985) Cell 41, 349-359.
21. Rubock, M. J., Larin, Z., Cook, M., Papalopulu, N., Krumlauf, R. & Lehrach, H. (1990) Proc. Natl. Acad. Sci. USA 87, 4751-4755.
22. Odenwald, W. F., Taylor, C. F., Palmer-Hill, F. J., Friedrich, V., Jr., Tani, M. & Lazzarini, R. A. (1987) Genes Dev. 1, 482-496.
23. Krumlauf, R., Holland, P. W. H., McVey, J. H. & Hogan, B. L. M. (1987) Development 99, 603-617.
24. Acampora, D., D'Esposito, M., Faiella, A., Pannese, M., Migliaccio, E., Morelli, F., Stornaiuolo, A., Nigro, V., Simeone, A. & Boncinelli, E. (1989) Nucleic Acids Res. 17, 10385-10402.
25. Lufkin, T., Dierich, A., LeMeur, M., Mark, M. & Chambon, P. (1991) Cell 66, 1105-1119.
26. Chisaka, O., Musci, T. S. & Capecchi, M. R. (1992) Nature (London) 355, 516-520.
27. Chisaka, O. & Capecchi, M. R. (1991) Nature (London) 350, 473-479.