Whole-Genome Homology Maps



Whole-genome homology maps attempt to identify the evolutionary relationships between and within multiple genomes. The term "syntenic" is often used to describe regions of multiple genomes that are believed to have evolved from the same region in an ancestral genome. However, this use of the term has been criticized as incorrect (Passarge et al. 1999), so we use the terms "homologous", "orthologous", and "paralogous" instead. Ideally, given K genomes, we would like to identify all orthologous genomic regions, as well as the paralogous regions within each genome and within each hypothetical ancestral genome. Maps listing these relationships are extremely valuable to researchers performing comparative analyses of genomic sequence. Presented here is initial work on creating an orthology map for the human, mouse, and rat genomes.


Our basic strategy in building homology maps is to use exons that are orthologous in multiple genomes as map "anchors." Given K genomes, the steps in the map construction are as follows:


Lines in the map files are of the form:
 [Segment #] [Chrom] [Start] [End] [Strand] ...  
where the last four fields (Chrom, Start, End, Strand) are repeated for each genome in the map. The fields are tab-delimited. For a segment on the reverse strand ("-"), the start coordinate is greater than the end coordinate. Coordinates are 0-based and half-open: the larger of Start and End is one more than the coordinate of the last base included in the segment. Segments for which no orthologous region could be identified in one of the genomes have "NA" in the fields for that genome. The order of the genomes in each line follows the order of the genomes in the name of the map file.
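
The map line format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original distribution. The names Region and parse_map_line are hypothetical, the caller is assumed to supply the genome names in file-name order, a genome is treated as missing when its Chrom field is "NA", and the segment identifier is kept as a string because its exact type is not specified here.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """One genome's entry in a map line (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    strand: str

    def span(self) -> Tuple[int, int]:
        # On the reverse strand Start > End, so sort the pair to obtain
        # a forward-coordinate, 0-based, half-open interval [low, high).
        return min(self.start, self.end), max(self.start, self.end)

    def length(self) -> int:
        low, high = self.span()
        return high - low


def parse_map_line(
    line: str, genomes: List[str]
) -> Tuple[str, Dict[str, Optional[Region]]]:
    """Parse one tab-delimited map line.

    `genomes` lists the genome names in the order used by the map file.
    """
    fields = line.rstrip("\r\n").split("\t")
    expected = 1 + 4 * len(genomes)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")

    segment = fields[0]
    regions: Dict[str, Optional[Region]] = {}
    for i, genome in enumerate(genomes):
        chrom, start, end, strand = fields[1 + 4 * i : 5 + 4 * i]
        if chrom == "NA":
            # Assumption: "NA" in Chrom means no orthologous region.
            regions[genome] = None
        else:
            regions[genome] = Region(chrom, int(start), int(end), strand)
    return segment, regions
```

For a reverse-strand entry with Start 500 and End 350, span() gives the interval from 350 up to but not including 500, so the length is the difference between the two coordinates regardless of strand.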

Lines in the anchor files are of the form:

 [Segment #] [Genome1] [Chrom1] [Strand1] [Start1] [End1] [Genome2] [Chrom2] [Strand2] [Start2] [End2] 
Each line represents an anchor between two of the genomes (Genome1 and Genome2) within a given map segment (Segment #). The coordinate conventions are the same as in the map file; note that here Strand precedes Start and End, whereas in the map file it follows them.
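
An anchor line can be parsed in a similar way. This is again only a sketch under stated assumptions: AnchorEnd and parse_anchor_line are hypothetical names, fields are assumed to be separated by whitespace (the delimiter for anchor files is not stated above), and the genome and chromosome labels in the usage note are made-up sample values.

```python
from typing import List, NamedTuple, Tuple


class AnchorEnd(NamedTuple):
    """One side of an anchor (hypothetical helper type)."""
    genome: str
    chrom: str
    strand: str
    start: int
    end: int


def _build_end(fields: List[str]) -> AnchorEnd:
    # Field order in anchor files: Genome, Chrom, Strand, Start, End.
    genome, chrom, strand, start, end = fields
    return AnchorEnd(genome, chrom, strand, int(start), int(end))


def parse_anchor_line(line: str) -> Tuple[str, AnchorEnd, AnchorEnd]:
    """Parse one anchor line into (segment, first end, second end)."""
    # Assumption: fields are whitespace-separated and contain no spaces.
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    segment = fields[0]
    return segment, _build_end(fields[1:6]), _build_end(fields[6:11])
```

Calling parse_anchor_line on a line such as "7 human chr1 + 100 160 mouse chr4 - 500 440" yields the segment identifier together with one AnchorEnd per genome, which can then be grouped by segment to recover the anchors supporting each map entry.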