Cancer is a disease driven by mutations in the genome that alter the structure, function, and regulation of genes. These mutations range from single-letter changes in the DNA sequence to more drastic rearrangements, gains, or losses of large pieces of DNA. In some types of cancer, these large-scale alterations are directly implicated in pathogenesis and provide targets for diagnostics and therapeutics.
I will describe computational methods for reconstructing tumor genome architectures and analyzing rearrangements in tumor genomes at high resolution using a technique called End Sequence Profiling (ESP). These methods are inspired by techniques in comparative genomics and view a tumor genome as a rearranged version of the normal human genome. In this framework, both the human and tumor genomes are represented by permutations, and the problem is to find a parsimonious sequence of rearrangement operations that transforms one permutation into the other. I will also describe how computational analysis of ESP data suggests mechanisms that produce the complicated patterns of overlapping rearrangement and duplication events observed in some tumor genomes.
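To make the permutation framework concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that genomes are signed permutations of numbered blocks and that the only rearrangement operation is a signed reversal (an inversion). The function names are hypothetical, and the brute-force breadth-first search is a stand-in that is practical only for tiny examples; it is not the method used in this work, which must also handle other operations such as translocations and duplications.

```python
from collections import deque


def reverse_segment(perm, i, j):
    """Apply a signed reversal to perm[i..j] (inclusive): reverse order, flip signs."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def count_breakpoints(perm):
    """Count adjacent pairs (a, b) in the framed permutation with b - a != 1.

    The permutation is framed by 0 and n + 1, so the identity has no breakpoints.
    """
    framed = (0,) + tuple(perm) + (len(perm) + 1,)
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def shortest_reversal_scenario(start, target):
    """Breadth-first search for a shortest sequence of signed reversals.

    Exponential in the number of blocks; intended only for toy inputs.
    Returns the list of intermediate permutations from start to target.
    """
    start, target = tuple(start), tuple(target)
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            break
        n = len(cur)
        for i in range(n):
            for j in range(i, n):
                nxt = reverse_segment(cur, i, j)
                if nxt not in parent:
                    parent[nxt] = cur
                    queue.append(nxt)
    if target not in parent:
        raise ValueError("target is not a signed rearrangement of start")
    # Walk parent pointers back from the target to recover the scenario.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Toy example (made-up data): blocks 2 and 3 are inverted in the "tumor" genome.
normal = (1, 2, 3, 4)
tumor = (1, -3, -2, 4)
scenario = shortest_reversal_scenario(tumor, normal)
```

In this toy case the tumor permutation has two breakpoints and a single reversal restores the normal order, so `scenario` contains just the two permutations. Real ESP data involve far more blocks than exhaustive search can handle, which is why efficient rearrangement algorithms are needed.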
Another experimental technique, array comparative genomic hybridization (aCGH), has become indispensable for identifying duplicated and deleted segments of DNA in tumor genomes. ESP provides an effective complement to aCGH, and I will discuss how to combine data from both types of experiments using network flow techniques in order to obtain a comprehensive view of tumor genome architecture. I will demonstrate the application of these methods to ESP and aCGH data from breast cancers.
Finally, I will describe the implications of this work for the recently proposed Cancer Genome Atlas, a genome project for cancer.