Materials and Methods


Orthologous mouse cDNAs

Amino acid sequences from HC21 confirmed genes were used in a TBLASTN search of GenBank to identify orthologous mouse genes. These mouse genes were then used in a BLASTN search to identify corresponding ESTs. HC21 predicted genes and HC21 confirmed genes with no orthologous cDNA sequences in GenBank were compared directly to mouse dbEST sequences to identify putative orthologous EST sequences.


cDNA clones were acquired from four different sources: the IMAGE and BMAP collections (obtained from Research Genetics); the NIA 15K Mouse cDNA clone set, provided by the National Institute on Aging; and the NIID clone set, from the Japanese National Institute of Infectious Diseases. A few clones had previously been isolated in our laboratory (1, 2). Since the human ITSN gene encodes alternatively spliced transcripts with ubiquitous and brain-specific expression (3), we collected a cDNA specific for each form. All cDNA clones were sequenced at the 5’ and 3’ ends to confirm their identity.


As no mouse ESTs were available for C21orf13, C21orf29, C21orf57, C21orf68, C21orf83, C21orf95, Cldn17, Erg, Pred53, Prss7, Slc5a3 and Uromodulin-like, a different approach was taken. Comparative mapping of HC21 with MC10, MC16 and MC17 was used to identify these orthologs and, based on the most highly conserved regions, putative mRNA sequences were constructed in silico. RT-PCR on RNA from normal mouse tissues was then used to amplify gene-specific fragments, which were cloned into pCRII-TOPO (Invitrogen). The identity of each clone was confirmed by sequencing. Because Erg and its paralog Fli1 are highly similar at the DNA level, we performed ISH with probes for both genes to confirm that the staining obtained with the Erg probe was specific.


The availability of the Celera Genomics mouse genome sequence at TIGEM allowed us to confirm, by in silico mapping, that the identified ESTs represent portions of the mouse orthologs of the HC21 genes (4). For three genes, the mouse orthology is ambiguous. H2BFS, a histone H2B family S member, maps to 21q22, and a paralogous gene maps to 6p21. It is likely that the latter represents the ancestral copy from which the HC21 gene arose, since the mouse orthologous gene maps to MC13 in a region syntenic to 6p21. TPTE is the only gene that has been mapped to the short arm of HC21 (5). The human TPTE gene is present in multiple copies in the human genome, but the active copies are thought to lie on 21p, 13 and 15q. Its mouse ortholog was mapped to a region of MC8 that shows conserved synteny with human 13q14.2-q21. This region of the human genome contains a partial and highly diverged copy of TPTE that is likely to represent the ancestral copy from which the other copies of TPTE arose through duplication events (1). We did not attempt to isolate orthologs of C21orf31, C21orf34, C21orf35, C21orf61, C21orf69, PRED5, PRED6, PRED14, PRED15, or PRED38, as sufficiently large probes suitable for ISH could not be amplified. In addition, attempts to amplify fragments of Znf298 and Slc37a1 were unsuccessful.


Automated section in situ hybridization

Automated ISH was performed on sections of E14.5 NMRI embryos essentially as described (6). In this study we used a signal amplification step, based on enzyme-catalyzed reporter deposition, which results in an approximately 100-fold increase in sensitivity (7). To increase throughput, we used a Tecan Genesis platform equipped with two racks (2 x 48 hybridization chambers): pre-hybridization, hybridization and post-hybridization steps, as well as color detection reactions, were carried out automatically, allowing processing of as many as 500 sections per day. Image data were acquired on a compound microscope with a scanning stage. Individual images are stored as bitmap files accessible through our web page.


Whole mount in situ hybridization

Embryos were harvested at E9.5 or E10.5 from CD1 mice. Antisense riboprobes were prepared as above, and ISH experiments were performed as previously described (8).


For both whole mount and section ISH, the reproducibility of the results was established as follows. First, for approximately 25% of the genes, two alternative probes were used and the expression profiles were confirmed. Second, of the remaining genes, approximately 60% were hybridized at least twice, and in all cases the staining pattern was confirmed. Finally, if no staining was observed after multiple hybridizations, a second probe was selected and tested several times to confirm the negative result.


RT-PCR expression studies

Total RNA derived from 12 normal mouse adult tissues (brain, heart, kidney, thymus, liver, stomach, muscle, lung, testis, ovary, skin and eyes) and four developmental stages (E8.5, E9.5, E12.5 and E19) was extracted, reverse-transcribed and normalized as previously described (9). Prior to reverse-transcription, RNA samples were shown to be free of genomic DNA contamination by PCR using MLH1 intronic primers flanking exon 12 (5’ TGG TGT CTC TAG TTC TGG 3’ and 5’ CAT TGT TGT AGT AGC TCT GC 3’).
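
As a rough plausibility check on the two MLH1 primers quoted above, their approximate melting temperatures can be estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C). This is only a sketch: the Wallace rule is a generic rule of thumb for short oligonucleotides and is an assumption here, not a calculation described in this study, and the names PRIMER_A, PRIMER_B and wallace_tm are illustrative.

```python
# Primer sequences as quoted in the text (spaces are only visual grouping).
PRIMER_A = "TGG TGT CTC TAG TTC TGG"
PRIMER_B = "CAT TGT TGT AGT AGC TCT GC"


def wallace_tm(primer):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: a rule-of-thumb estimate for short oligonucleotides,
    not the method used by the authors.
    """
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("PRIMER_A", PRIMER_A), ("PRIMER_B", PRIMER_B)):
        length = len(primer.replace(" ", ""))
        print(name, length, "nt, estimated Tm:", wallace_tm(primer), "C")
```

Under this estimate both primers fall in the mid-to-high 50s (degrees C), which is in the range usually targeted for standard PCR primers.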


Primers for PCR were designed using the primer3 program. Typically, primers were designed within the ORF, 250 bp apart, a size >40% longer than the mean size of internal mammalian exons (10). We chose a single PCR rather than a nested-PCR approach to avoid false positive results due to illegitimate transcription (11). Similar amounts of the 20 cDNAs (final dilution 250x) were mixed with HotStarTaq Master Mix (Qiagen) and 4 ng/µl of each of the primers (Operon) with a BioMek 2000 robot (Beckman). The first ten cycles of PCR amplification were performed with the annealing temperature decreasing from 60 to 50°C, followed by 35 cycles at 50°C. Amplimers were separated on ‘Ready-to-Run’ precast gels (Pharmacia). Three pairs of primers (2%, 3/161) failed to amplify a fragment from any sample, due either to the paucity of the corresponding transcript or to unfavorable primer design.
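
The annealing program described above (ten cycles stepping down from 60 to 50°C, then 35 cycles at 50°C) can be written out as an explicit per-cycle schedule. The sketch below assumes an even decrement of 1°C per cycle, starting at 60°C and reaching 50°C at the first of the 35 hold cycles; the text does not state the exact step size, so this is one plausible reading, and the function name touchdown_schedule is illustrative.

```python
def touchdown_schedule(start=60.0, end=50.0, ramp_cycles=10, hold_cycles=35):
    """Return the annealing temperature (degrees C) for every PCR cycle.

    Assumption: the temperature drops by an equal step in each ramp cycle,
    so the ramp covers start, start - step, ... and the hold phase begins
    at the end temperature.
    """
    step = (start - end) / ramp_cycles
    ramp = [start - i * step for i in range(ramp_cycles)]
    hold = [end] * hold_cycles
    return ramp + hold


if __name__ == "__main__":
    schedule = touchdown_schedule()
    for cycle, temp in enumerate(schedule, start=1):
        print(f"cycle {cycle:2d}: anneal at {temp:.0f} C")
```

With the default parameters the program runs for 45 cycles in total, the first ten of which use the higher, more stringent annealing temperatures.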


Identification of co-expressing gene clusters

Clustering of expression patterns was tested using an adaptation of the method described by Tang and Lewontin (12), originally designed to identify hot- and cold-spots of variation in nucleotide and amino acid sequences. For a particular tissue, co-expression clusters were identified based on gene order from centromere to telomere. Distance between genes was not incorporated into the model, which makes the test more conservative from a biological point of view: genes that are closer together, and thus more likely to be co-regulated, are weighted equally in terms of their co-expression properties to genes that are far apart. For all tissues, genes considered to have very low expression (labeled +/-) were considered as negative in the analyses. TPTE, which maps to 21p, was omitted from the analyses. We also performed tests to compare the clustering between different tissues (13). Significance levels were derived by randomly permuting the hits along the sequence of genes 1000 times, as suggested in the original method.
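
The permutation step can be illustrated with a minimal sketch. It does not reproduce the Tang and Lewontin statistic; as a simplified stand-in it counts adjacent gene pairs that are both expressed, using gene order only (no physical distance), and compares the observed count with counts from 1000 random permutations of the hits. The function names, the choice of statistic and the example data are all assumptions made for illustration.

```python
import random


def adjacent_pairs(hits):
    """Count neighbouring gene pairs (in chromosomal order) that are both expressed."""
    return sum(1 for a, b in zip(hits, hits[1:]) if a and b)


def permutation_p_value(hits, n_perm=1000, seed=0):
    """Return (observed statistic, one-sided permutation p-value).

    hits: list of 0/1 values ordered from centromere to telomere, where 1
    means the gene is expressed in the tissue (very low '+/-' expression
    would be coded as 0, as in the text).
    Assumption: the adjacency count is a stand-in statistic, not the one
    used in the original method.
    """
    rng = random.Random(seed)
    observed = adjacent_pairs(hits)
    shuffled = list(hits)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the hits along the gene order
        if adjacent_pairs(shuffled) >= observed:
            at_least += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical tissue: six expressed genes in one contiguous block of 20.
    example = [1] * 6 + [0] * 14
    stat, p = permutation_p_value(example)
    print("adjacent co-expressed pairs:", stat, "p-value:", p)
```

Because only gene order enters the statistic, tightly packed and widely spaced neighbours count equally, which mirrors the conservative choice described above.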



1.         M. Guipponi et al., Hum Genet 109, 569 (2001).

2.         A. Reymond et al., Genomics 78, 46 (2001); M. Guipponi et al., submitted (2002).

3.         M. Guipponi et al., Genomics 53, 369 (1998); C. Pucharcos, C. Casas, M. Nadal, X. Estivill, S. de la Luna, Biochim Biophys Acta 1521, 1 (2001).

4.         M. T. Davisson et al., Genomics 78, 99-106 (2001).

5.         M. Hattori et al., Nature 405, 311 (2000).

6.         U. Herzig et al., Novartis Found Symp 239, 129 (2001).

7.         J. C. Adams, J Histochem Cytochem 40, 1457 (1992).

8.         E. M. Surace, B. Angeletti, A. Ballabio, V. Marigo, Invest Ophthalmol Vis Sci 41, 4333 (2000).

9.         J. Michaud et al., Genomics 68, 71-9 (2000).

10.        E. S. Lander et al., Nature 409, 860-921 (2001).

11.        J. C. Kaplan, A. Kahn, J. Chelly, Hum Mutat 1, 357-60 (1992).

12.        H. Tang, R. C. Lewontin, Genetics 153, 485-95 (1999).

13.        E. T. Dermitzakis, A. G. Clark, Mol Biol Evol 18, 557-62 (2001).