Tracking changes in a multiple sequence alignment

20 Jan

I had a few free hours this weekend, so I hacked together a couple of scripts that, in theory, could help me visualize changes between subfamilies in a protein multiple sequence alignment. In essence, I took the alignment, chose a master sequence that corresponds to a known structure, removed all columns with gaps in the master sequence, and visualized fragments of the alignment (a sliding window of 15 sequences) with Weblogo, software for preparing sequence logos from alignments. In the video below you can see:

  • two boxes showing the same template structure (the second is just rotated); the size of each C-alpha atom corresponds to the overall conservation at that position; the first few residues do not have corresponding positions in the alignment
  • a sequence logo of the current alignment window
  • a sequence logo of the whole alignment, as a reference
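
The preprocessing described above (dropping every column that is a gap in the master sequence, then sliding a window of 15 sequences over the alignment) can be sketched roughly as follows. This is only an illustration, not the actual scripts: it assumes the alignment is already loaded as a list of equal-length strings, that gaps are written as `-` or `.`, and the function names are made up for the example. Each window would then be written out and passed to Weblogo.

```python
from typing import Iterator, List

# Assumption: gaps are encoded as '-' or '.' in the alignment.
GAP_CHARS = set("-.")


def drop_master_gap_columns(alignment: List[str], master_index: int = 0) -> List[str]:
    """Keep only the columns where the master sequence has a residue."""
    master = alignment[master_index]
    keep = [i for i, ch in enumerate(master) if ch not in GAP_CHARS]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def sliding_windows(alignment: List[str], size: int = 15, step: int = 1) -> Iterator[List[str]]:
    """Yield consecutive blocks of `size` sequences, moving `step` rows at a time.

    If the alignment has fewer than `size` sequences, a single window
    containing the whole alignment is yielded.
    """
    last_start = max(len(alignment) - size, 0)
    for start in range(0, last_start + 1, step):
        yield alignment[start:start + size]
```

With the master sequence as the first row, `drop_master_gap_columns` leaves every remaining column mapped to exactly one residue of the template structure, which is what makes it possible to paint per-column values onto the C-alpha atoms.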

There are several things I’m not yet happy with. First of all, the visualization of changes on the structure is hardly readable, even with a video of much higher quality (I should probably do it with Chimera’s “worm” representation). Second, I have no information about which species/proteins I’m looking at right now (another box with highlights on a species tree of the family?). Also, I should remove some redundancy from the alignment; sometimes the sliding window contains copies of the same protein. But overall it looks promising enough to convince me to spend a few more hours on this small project. However, I would probably do the final version with Processing.
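
One simple way to deal with the redundancy mentioned above is to drop sequences that are exact copies of an earlier one. The sketch below is an assumption about how that could look, not something already in the scripts: it takes hypothetical `(name, aligned_sequence)` pairs, compares sequences with gaps stripped and case ignored, and keeps the first occurrence. Near-identical sequences would need an identity threshold instead, which this does not attempt.

```python
from typing import List, Tuple


def remove_duplicate_sequences(records: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record for each distinct ungapped sequence, preserving order."""
    seen = set()
    unique = []
    for name, seq in records:
        # Compare on the residues only: strip gap characters and ignore case.
        key = seq.replace("-", "").replace(".", "").upper()
        if key not in seen:
            seen.add(key)
            unique.append((name, seq))
    return unique
```

Because the original order is preserved, the filtered list can go straight into the same sliding-window step.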



2 responses to “Tracking changes in a multiple sequence alignment”

  1. Allen Liu

    January 23, 2008 at 23:06

    This is a very COOL script you have going. I worked on a project on elucidating the Dicty kinome a while back and I remember utilizing the Weblogo tool. Although it had a couple of issues back then, I’m sure it has improved a lot. If you are processing in real-time, then it also seems to be a heck of a lot faster as well.

  2. freesci

    January 24, 2008 at 08:30

    Thanks Allen. As far as I know, Weblogo is being constantly improved, so it’s safe to assume that I used a completely different version than you did. Preparing a PNG file for an alignment of 500 sequences takes around 1 second with Weblogo, so it’s not much. For a window of 15 sequences it is almost instant. In my script it takes much more time to combine the individual images with ImageMagick than to prepare them. So in this sense it’s not real-time: creating this short video took around 15 minutes, but Weblogo was the least demanding piece of software involved.
