Category Archives: Visualization

Skyrails and STRING

Of course I couldn’t resist playing a little with Skyrails after I saw it on the Flowing Data blog. Skyrails is a graph visualization system designed with extensibility and great looks in mind. All menus can be programmed in an odd-looking but quite easy-to-learn language, which helps in writing an interface customized to particular data.

My quick attempt was to take some sample data from STRING, feed it into Skyrails and see whether the result makes any sense. I chose the first example from the STRING main page, the trpA protein from E. coli K12. The main graph on the trpA interactions page looks as follows:

The same graph in Skyrails:

Of course, Skyrails has a 3D representation and is fully interactive; with a little work one can filter out some of the connections, put images of structures in place of the green dots, and so on. In a static screenshot it doesn’t look as clear as STRING, because it wasn’t optimized for such use; in practice it’s much clearer. The video below shows the basic interactions with this dataset.
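Filtering connections can also be done before the data ever reaches the viewer. Here is a minimal sketch, assuming the interactions were saved as tab-separated lines of the form proteinA, proteinB, score. That column layout, the 0.7 threshold and the sample rows are assumptions for illustration; this is neither the exact STRING export nor the Skyrails input format.

```python
import csv
import io


def load_edges(handle, min_score=0.7):
    """Read (protein_a, protein_b, score) rows and keep confident edges."""
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip blank, short, and comment lines
        score = float(row[2])
        if score >= min_score:
            edges.append((row[0], row[1], score))
    return edges


# Made-up sample rows; real scores must come from your own STRING download.
sample = io.StringIO(
    "#node1\tnode2\tscore\n"
    "trpA\ttrpB\t0.99\n"
    "trpA\ttrpC\t0.95\n"
    "trpA\tyfgX\t0.41\n"
)
edges = load_edges(sample)
neighbours = sorted({b for a, b, _ in edges if a == "trpA"})
print(neighbours)
```

Raising or lowering min_score is a cheap way to thin out a crowded graph before loading it.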

Is it useful? At the moment, not really. It already has lots of features that more mature programs lack (completely programmable menus are a great idea), but using it is still crude and in some cases the flashy effects are distracting. However, it’s worth keeping an eye on Skyrails. First, development is pretty much guaranteed, as the author said he is starting a PhD on this project. Second, the basic roadmap includes features that, again, aren’t present anywhere else, like a client-server architecture (so you can talk to the Skyrails system from an external application; dynamic, time-aware visualization?). And third, it’s the coolest-looking visualization system I’ve found so far (will it make it into a movie, like Ben Fry’s Genome Valence did?).


Posted by on September 9, 2008 in Software, Visualization



Relaxing before weekend – PDB file and Panda3D

Software for visualizing molecules is in most cases very focused on its job and rarely allows for anything outside its scope (one of the exceptions is VMD: you can plot 3D surfaces using its graphics engine). Every couple of months I check the status of various 3D engines to see how well they are suited to molecular visualization. Recently I had another look at Panda3D, a free 3D engine that Disney uses for some of its games. As an exercise in Python, which I’m learning right now, I tried to import a PDB file into Panda3D and rotate it.

Panda3D doesn’t have native support for molecules; instead it supports its own egg format for models. Fortunately, there’s an egg exporter for Blender, so I imported a hemoglobin molecule in cartoon representation into Blender (the procedure is described at the bottom of this page) and then exported it in the Panda3D format. The rest was pure Python (and extensive copy/paste from tutorials found on the web). The following code loads the model from the hbg.egg file, sets up some lights and rotates the camera around it.

import direct.directbase.DirectStart
from direct.showbase.DirectObject import DirectObject
from pandac.PandaModules import *
from direct.task import Task
import math

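#Load the protein model and attach it to the scene graph so it gets drawn
protein = loader.loadModel("hbg")
protein.reparentTo(render)

#Set up lights; a light has no effect until setLight() enables it
light1 = AmbientLight('light1')
light1.setColor(VBase4(0.12, 0.12, 0.12, 1))
plnp = render.attachNewNode(light1)
render.setLight(plnp)

light2 = PointLight('pointlight')
plnp2 = render.attachNewNode(light2)
render.setLight(plnp2)

#Task to move the camera on a circle around the model
def SpinCameraTask(task):
    angledegrees = task.time * 6.0
    angleradians = angledegrees * (math.pi / 180.0)
    #Radius 20.0 for the sine term is an assumed value, chosen to match the cosine term
    base.camera.setPos(20.0*math.sin(angleradians), -20.0*math.cos(angleradians), 2)
    base.camera.setHpr(angledegrees, 0, 0)
    return Task.cont

taskMgr.add(SpinCameraTask, "SpinCameraTask")

#Start the main loop
run()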
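The camera task boils down to placing the camera on a circle and turning it to face the centre. The sketch below isolates that arithmetic so it can be checked without Panda3D installed. The function name and its defaults (radius 20, height 2, 6 degrees per second) are illustrative assumptions, not part of the Panda3D API.

```python
import math


def orbit_position(seconds, radius=20.0, height=2.0, degrees_per_second=6.0):
    """Return (x, y, z, heading) for a camera circling the origin."""
    heading = seconds * degrees_per_second
    angle = math.radians(heading)
    x = radius * math.sin(angle)
    y = -radius * math.cos(angle)
    return x, y, height, heading


for t in (0, 15, 30):
    x, y, z, h = orbit_position(t)
    print("t=%2ds  pos=(%6.2f, %6.2f, %4.2f)  heading=%5.1f" % (t, x, y, z, h))
```

At 6 degrees per second a full turn takes 60 seconds, and because the heading equals the orbit angle the camera keeps looking at the model.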


A not-so-impressive screenshot is shown at the top. It’s not rocket science or state-of-the-art visualization, but I’m positively surprised by how easy it is today to get such a thing up and running. The game industry is large and even proprietary engines are quite cheap (for non-commercial purposes one can get them for a few hundred dollars), so I expect quite a few scientific projects built on such platforms to appear soon. The SL engine will not be the last one used for this purpose.


Posted by on August 15, 2008 in Visualization



Visualization of internal repeats in proteins (or DNA)

There are a number of protein families that have internal repeats (TPR, Armadillo, ankyrin, etc.). I’m very interested in many of them, for reasons I will explain in another post. Assessing the arrangement of these repeats is straightforward in most cases: they tend to occur next to each other, with few or no insertions between them (finding them in the first place is a completely different story). However, there are proteins in which internal repeats are separated by other domains or repeats, which can result in a real mess (or, in scientific language, a mosaic-like architecture). When a couple of months ago I looked for a visualization method that would give me a quick overview of the internal structure of such proteins, I stumbled across The Shape of Song, a visualization method developed by Martin Wattenberg, a researcher at IBM. It fitted my requirements, so I implemented it with some help from Processing (and later added it to a protein analysis server that has a chance of being published next month). The resulting visualization is below:

Internal repeats in a protein

Repeats are colored according to repeat type and connected according to repeat family. In terms of the SCOP (Structural Classification of Proteins) hierarchy, colors represent class, while arcs connect superfamilies. The longer and more complicated the analysed sequence is, the more useful this approach seems to be; for short proteins, typical domain bubbles would work better.

People who are into genomic sequences may notice the similarity of this approach to Circos, developed by Martin Krzywinski (whose work I really admire, especially on HDTR). The idea behind both is pretty much the same, but I never thought about straightening that circle until I saw The Shape of Song. My thinking is sometimes dramatically schematic…
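The geometry of such an arc diagram is simple enough to sketch. The snippet below is an illustration only, not the Processing code behind the figure: it assumes repeats are given as (start, end, family) tuples in residue numbers and joins consecutive members of the same family with a semicircle. The function name and the sample protein are hypothetical.

```python
from collections import defaultdict


def repeat_arcs(repeats):
    """Connect consecutive repeats of the same family with semicircles.

    repeats: iterable of (start, end, family) in residue numbers.
    Returns a list of (family, centre, radius) tuples, one per arc.
    """
    by_family = defaultdict(list)
    for start, end, family in repeats:
        by_family[family].append((start + end) / 2.0)  # midpoint of the repeat

    arcs = []
    for family, midpoints in sorted(by_family.items()):
        midpoints.sort()
        for left, right in zip(midpoints, midpoints[1:]):
            arcs.append((family, (left + right) / 2.0, (right - left) / 2.0))
    return arcs


# Hypothetical protein: two TPR repeats split by an ankyrin pair.
repeats = [
    (10, 43, "TPR"),
    (60, 92, "ANK"),
    (95, 127, "ANK"),
    (150, 183, "TPR"),
]
for family, centre, radius in repeat_arcs(repeats):
    print(family, centre, radius)
```

Each tuple maps directly onto a drawing call: the centre sits on the sequence axis and the radius sets the arc height, so distant repeats of one family produce the tall arcs that stand out in a mosaic-like protein.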



CLANS – Java tool for cluster analysis of sequences

As frequent visitors to this blog have already noticed, I am a big fan of tools for data visualization. Today I would like to point you to a Java program called CLANS (CLuster ANalysis of Sequences), developed by my former colleague Tancred Frickey. CLANS runs (PSI)BLAST on your sequences, all vs all, and clusters them in 2D or 3D according to their similarity. This method allows for rapid classification of huge datasets and has an advantage over, let’s say, a phylogenetic tree: one can quickly assess the results of the clustering visually (I cannot imagine making any sense of a phylogenetic tree with 1500 branches, while the graphical output, as in the animation below, is pretty easy to read).

CLANS animation

The beauty of the idea behind CLANS is that you can apply this method to almost any dataset that can be translated into all-vs-all relations. The CLANS page has examples from protein clustering, microarray analysis and (the one I like the most) an image showing how the standard amino acids cluster in space according to BLOSUM62.
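To give a feeling for how all-vs-all relations can turn into a picture, here is a toy force-directed layout in plain Python. It is a generic sketch and not the algorithm CLANS actually uses: every pair of points repels, pairs with a similarity score also attract, and the constants are arbitrary assumptions.

```python
import math
import random


def layout(similarity, steps=500, attract=0.05, repel=0.05, seed=1):
    """Place items in 2D so that similar items end up close together.

    similarity: dict mapping (i, j) with i < j to a weight in [0, 1].
    Returns a list of [x, y] coordinates.
    """
    n = 1 + max(max(pair) for pair in similarity)
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]

    for _ in range(steps):
        move = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Attraction grows with similarity and distance;
                # repulsion acts on every pair and fades with distance.
                force = attract * similarity.get((i, j), 0.0) * dist - repel / dist
                force = max(-0.1, min(0.1, force))  # keep single steps small
                ux, uy = dx / dist, dy / dist
                move[i][0] += force * ux
                move[i][1] += force * uy
                move[j][0] -= force * ux
                move[j][1] -= force * uy
        for p, m in zip(pos, move):
            p[0] += m[0]
            p[1] += m[1]
    return pos


# Two made-up groups: items 0-2 resemble each other, as do items 3-5.
sims = {}
for group in ((0, 1, 2), (3, 4, 5)):
    for a in group:
        for b in group:
            if a < b:
                sims[(a, b)] = 0.9
coords = layout(sims)
for index, (x, y) in enumerate(coords):
    print(index, round(x, 2), round(y, 2))
```

Any score that can be written as a pairwise weight (a BLAST result, a correlation between expression profiles, a substitution matrix entry) can be dropped into the similarity dictionary.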



Tracking changes in a multiple sequence alignment

I had a few free hours this weekend, so I hacked together a couple of scripts that in theory could help me visualize changes between subfamilies in a protein multiple sequence alignment. In essence, I took the alignment, chose a master sequence that corresponds to a known structure, removed all columns with gaps in the master sequence, and visualized fragments of the alignment (a sliding window of 15 sequences) with Weblogo, software for preparing sequence logos from alignments. In the video below you can see:

  • two boxes showing the same template structure (the second is just rotated); the size of the C-alpha atoms corresponds to overall conservation at that position; the first few residues do not have corresponding positions in the alignment
  • sequence logo of the current alignment window
  • sequence logo of the whole alignment, as a reference
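The preprocessing described above can be sketched in a few lines of Python. This is a reconstruction of the idea rather than the original scripts: the function names are hypothetical, the alignment is assumed to be a list of equal-length strings with "-" for gaps, and the conservation score (share of the most common residue) is a simple stand-in, not what Weblogo computes.

```python
def strip_master_gaps(alignment, master_index=0, gap="-"):
    """Drop every column in which the master sequence has a gap."""
    master = alignment[master_index]
    keep = [i for i, residue in enumerate(master) if residue != gap]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def sliding_windows(alignment, size=15, step=1):
    """Yield consecutive blocks of `size` sequences from the alignment."""
    for start in range(0, len(alignment) - size + 1, step):
        yield alignment[start:start + size]


def conservation(alignment, gap="-"):
    """Fraction of sequences sharing the most common residue, per column."""
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        top = max(residues.count(r) for r in set(residues)) if residues else 0
        scores.append(top / float(len(column)))
    return scores


# Toy alignment; the first sequence plays the role of the master.
toy = [
    "MK-LV",
    "MKALV",
    "MR-LI",
    "-KGLV",
]
trimmed = strip_master_gaps(toy)
print(trimmed)
for window in sliding_windows(trimmed, size=2):
    print(window, conservation(window))
```

Each window could then be written to a file and passed to a logo generator, while the per-column scores could drive the atom sizes on the structure.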

There are several things I’m not yet happy with. First of all, the visualization of changes on the structure is hardly readable, even with video of much higher quality (I should probably do it with Chimera’s “worm” representation). Second, I have no information about which species/proteins I’m looking at at a given moment (another box with highlights on a species tree of the family?). Also, I should remove some redundancy from the alignment; sometimes the sliding window contains copies of the same protein. But overall it looks promising enough to convince me to spend a few more hours on this small project. However, I would probably do the final version with Processing.



Linux screencasting software

Just a short note today. If you are looking for screencasting software for your Linux box, I recommend two titles: recordMyDesktop and Wink.

The first one is a typical desktop activity recorder: you mark the capture area and that’s all. No fancy options, just a pure video stream from your screen. The video has very good quality (Theora and Vorbis codecs).

Wink is a screencaster oriented towards preparing interactive tutorials and presentations. You can record screen activity, but also pause the video and add text boxes with explanations or buttons waiting for user interaction (for example “Next” buttons). Output formats are SWF, standalone EXE (for Windows machines only), PDF, PostScript and HTML. There are no typical video files, which is not really a problem, as the frame rate of the recording is pretty low anyway. Another issue is that it apparently cannot properly record windows rendered with OpenGL (like molecular viewers): the window’s interior comes out black. Even with these limitations, I think Wink is better for preparing tutorials (for example on using some online bioinformatics service) than typical screencasting software.


Posted by on January 15, 2008 in Software, Visualization



Protein cartoons with Pymol

Here is a short tutorial on protein cartoons with Pymol. I picked hemoglobin as an example and focused only on the cartoon representation of the protein, so keep in mind that it does not explore all the options of this software. Also, since I’m blind to stereo images, I’m not sure whether all of the following tips make sense with a stereo representation of molecules.

  1. Change the protein representation to cartoons.
  2. Turn off depth cue (under “Display”) – unless you want to put an accent on some part of the protein this option is unnecessary, because it’s hard to get a 3D feeling from a 5 cm by 5 cm print
  3. Turn off specular reflections (under “Display”) – most likely the printer can show fewer colors than your screen and will render specular reflections as harsh white blobs
  4. Change the background color to white (as above) – that’s obvious; a black background is for viewing on screen
  5. Change the view to orthoscopic (as above) – maybe it’s a matter of personal taste, but the perspective view (default in Pymol) creates unnecessary distortions that, again, do not help shape perception on a small print
  6. Turn on the “fancy helices” option (“Settings/Cartoons”) – this renders helices with tubular edges like in Molscript (leave it off if you don’t like it)
  7. Turn on the “smooth loops” option (as above) – perceiving the arrangement of secondary structure elements becomes much easier
  8. Turn on the “highlight color” option (as above) – again, it’s a matter of personal taste; this option makes the internal surface of helices grey (you may change the color via the command line)
  9. Turn shadows off (“Settings/Rendering/Shadows”) – I feel that on a small print they only clutter the image

I also turn on the matte finish on the cartoons. While it doesn’t necessarily look better on screen, in print it helps to mask printing artefacts (like raster) when the image is viewed from a normal distance.

Then you can test these settings by clicking “Ray”. If you like the final image, save it, read its dimensions and multiply them by 3. Then type into the command-line box: ray multipliedX, multipliedY and press Enter.
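The same steps can be collected into a script instead of clicking through the menus. The sketch below only builds the text of such a script in plain Python. The setting names are my best match for the menu entries and should be treated as assumptions to check against your Pymol version; the matte finish is left out because I am not certain of its setting name, and the helper name and output file name are hypothetical.

```python
def print_settings_script(width, height, factor=3, output="cartoon.png"):
    """Return PyMOL commands for a print-ready cartoon image."""
    big_w, big_h = width * factor, height * factor
    lines = [
        "hide everything",
        "show cartoon",
        "set depth_cue, 0",
        "set specular, 0",
        "bg_color white",
        "set orthoscopic, 1",
        "set cartoon_fancy_helices, 1",
        "set cartoon_smooth_loops, 1",
        "set cartoon_highlight_color, grey50",
        "set ray_shadows, 0",
        # Final render at `factor` times the size of the test image
        "ray %d, %d" % (big_w, big_h),
        "png %s" % output,
    ]
    return "\n".join(lines)


# A 640 x 480 test render becomes a 1920 x 1440 final image.
print(print_settings_script(640, 480))
```

Saving the printed text to a file and running it from within Pymol should reproduce the settings in one go, which is handy when a whole series of figures has to look the same.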

Below I embedded a video showing more or less what I’ve just described.

Feel free to comment if you have any suggestions on improving this process.


Posted by on December 12, 2007 in Software, Visualization

