Inferring Protein-Protein Interactions from Protein
Domain Combinations

Jesus Izaguirre, PhD,
Department of Computer Science and Engineering
University of Notre Dame

December 3, 2004, 12-1 pm, room 1210, Medical Sciences Center (1300 University Ave.)


A goal of contemporary proteome research is to elucidate the protein-protein interactions in the cell. Using currently available protein-protein interaction and domain data for S. cerevisiae, we introduce a novel method, Maximum Specificity Set Cover (MSSC), to predict protein-protein interactions. The method has two stages. First, we use a clustering measure to select high-quality protein-protein interactions that participate in topological motifs. Second, we use MSSC to assign probabilities to domain pairs. We also modify MSSC to allow for the possibility that more than one domain from each protein causes the protein-protein interaction. This approach allows us to predict previously unknown protein-protein interactions with sensitivity and specificity that clearly exceed those of other approaches. We find that the predicted interaction network preserves the characteristics of the initial web of protein-protein interactions. We also observe high levels of coexpression among putative interaction partners. Finally, we extend our method to infer protein-protein interactions in multicellular organisms for which interaction data are not currently available. Starting from predictions in yeast, we find a set of orthologous interactions in A. thaliana, C. elegans, D. melanogaster, M. musculus, and H. sapiens.
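
To make the second stage more concrete, here is a minimal Python sketch of a greedy, specificity-ranked set cover over domain pairs. It is an illustration only and not the speaker's MSSC implementation: the function names, the definition of specificity (the fraction of protein pairs containing a domain pair that are observed to interact), and the greedy selection rule are all assumptions made for this example. The sketch does not model the clustering-based filtering of the first stage or the multi-domain extension.

```python
from itertools import combinations, product


def domain_pairs(doms_a, doms_b):
    """All unordered domain pairs with one domain taken from each protein."""
    return {tuple(sorted((a, b))) for a, b in product(doms_a, doms_b)}


def greedy_specific_cover(protein_domains, interactions):
    """Hypothetical sketch: pick domain pairs that explain observed interactions.

    protein_domains: dict mapping protein name -> set of domain names
    interactions:    iterable of (protein, protein) pairs observed to interact
                     (self-interactions are ignored in this sketch)
    Returns (chosen domain pairs, specificity score per domain pair).
    """
    observed = {frozenset(pair) for pair in interactions}

    # For every domain pair, record every protein pair that contains it.
    covers = {}
    for p, q in combinations(sorted(protein_domains), 2):
        for dp in domain_pairs(protein_domains[p], protein_domains[q]):
            covers.setdefault(dp, set()).add(frozenset((p, q)))

    # Assumed score: share of containing protein pairs that actually interact.
    specificity = {
        dp: len(pairs & observed) / len(pairs) for dp, pairs in covers.items()
    }

    # Greedy cover: take domain pairs in order of decreasing specificity,
    # keeping only those that explain a not-yet-covered interaction.
    uncovered = set(observed)
    chosen = []
    for dp in sorted(specificity, key=lambda d: (-specificity[d], d)):
        if not uncovered:
            break
        if covers[dp] & uncovered:
            chosen.append(dp)
            uncovered -= covers[dp]
    return chosen, specificity
```

Under these assumptions, the scores of the chosen domain pairs could then be used to rank protein pairs that share those domains but have no recorded interaction.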

