Homework Assignment #2

Due Tuesday, 10/19, 11:59pm

Substitution matrix

BLOSUM-62 substitution matrix file. You can assume that any substitution matrix given to your program will be in the same format as this file.
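As a rough sketch of reading such a file, the snippet below assumes the common NCBI-style layout: optional comment lines starting with `#`, one header row of residue letters, then one row per residue consisting of its letter followed by integer scores. That layout is an assumption and should be checked against the actual file. The function name `parse_substitution_matrix` and the three-residue excerpt are illustrative only.

```python
def parse_substitution_matrix(lines):
    """Return a dict mapping (row_residue, column_residue) -> integer score.

    Assumed layout (not confirmed by the assignment): '#' comment lines,
    a header row of residue letters, then rows of '<residue> <scores...>'.
    """
    scores = {}
    columns = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if columns is None:
            columns = fields  # first data line is the column header
            continue
        row, values = fields[0], fields[1:]
        for col, value in zip(columns, values):
            scores[(row, col)] = int(value)
    return scores


# Toy three-residue excerpt in the assumed layout (illustrative only).
sample = """\
# toy excerpt
   A  R  N
A  4 -1 -2
R -1  5  0
N -2  0  6
""".splitlines()

matrix = parse_substitution_matrix(sample)
print(matrix[("A", "R")])
```

For a real file, the same function can be given an open file handle, for example `parse_substitution_matrix(open(path))`, where `path` is wherever you saved the matrix.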

Sample input/output

You may wish to test your program with the following inputs and outputs. Each output is the correct alignment under the BLOSUM-62 substitution matrix with g = -10 and s = -1.
  1. simple example: input, output
  2. calcium ion proteins: input, output
  3. yeast kinase proteins: input, output
Additionally, we will test your programs on several held-aside sequence pairs.
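One way to sanity-check your program against the sample outputs is to recompute the score of a finished alignment. The sketch below assumes that g is charged once when a gap opens and s once for every gap character, so a gap of length k scores g + k*s. This page does not state that convention, so confirm it against the assignment text before relying on it. The function name `score_alignment` and the two-residue score table are hypothetical.

```python
def score_alignment(row_a, row_b, scores, g=-10, s=-1):
    """Score two aligned rows of equal length, with '-' marking gap characters.

    Assumed convention (verify against the assignment): a gap of length k
    contributes g + k*s.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    gap_in_a = gap_in_b = False  # was the previous column a gap in that row?
    for x, y in zip(row_a, row_b):
        if x == "-":
            total += s + (0 if gap_in_a else g)
            gap_in_a, gap_in_b = True, False
        elif y == "-":
            total += s + (0 if gap_in_b else g)
            gap_in_a, gap_in_b = False, True
        else:
            total += scores[(x, y)]
            gap_in_a = gap_in_b = False
    return total


# Toy score table for illustration; use the parsed BLOSUM-62 matrix in practice.
toy_scores = {("A", "A"): 4, ("R", "R"): 5, ("A", "R"): -1, ("R", "A"): -1}

one_gap = score_alignment("AR-A", "ARRA", toy_scores)   # 4 + 5 + (-10 - 1) + 4
long_gap = score_alignment("A--A", "ARRA", toy_scores)  # 4 + (-10 - 2) + 4
print(one_gap, long_gap)
```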

Sequences for problem #4

In HW #1, you assembled the genomic region of Enterobacter cloacae subsp. cloacae ATCC 13047 that contained the gene lacI. A file containing the protein sequence encoded by this gene is below.

Your task is to locate this gene in fragments of three other genomes. Do this by using your AlignLocal program to locally align the E. cloacae protein sequence against all six possible translations of each genomic fragment given below. Use the BLOSUM-62 substitution matrix given above, g = -10, and s = -1. For each genome, identify which translation contains the orthologous protein sequence. In addition, use your alignments to guess which species is most closely related to E. cloacae.

Note: So that you can use the BLOSUM-62 matrix given above, these translations were modified by replacing all stop codons with the amino acid alanine.
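The translations are already provided, but for reference, the following is a minimal sketch of how six-frame translations with stop codons replaced by alanine could be produced. It assumes the standard genetic code and an unambiguous DNA input. The frame labels (`+1` to `-3`), the function names, and the choice to emit `X` for codons containing other characters are illustrative assumptions, not part of the assignment.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, stop_as="A"):
    """Translate complete codons only; stop codons become `stop_as`."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for unrecognized codons
        protein.append(stop_as if aa == "*" else aa)
    return "".join(protein)


def six_frame_translations(dna):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGAAATAG")
for label, protein in frames.items():
    print(label, protein)
```

Each of the six resulting strings could then be aligned against the E. cloacae protein, and the frame with the highest local alignment score is the natural candidate for the one containing the ortholog.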