Category Archives: Software

Ubiquity – coding something useful in less than 20 minutes

Ubiquity is a new experimental Firefox extension that will (I’m sure of it) have an enormous impact on the way we use the browser. It lets you remix various services and extend the browser’s functionality very easily (if you don’t get the point of Ubiquity yet, I recommend watching the video that came with the official announcement; I needed to see it myself, because the description didn’t tell me much about how powerful it can be).

I haven’t had much time to play with it yet, but in a spare 20 minutes I tried to code a command that shows me the image of a structure from the PDB given its code and, on pressing Enter, takes me to its homepage. Surprisingly, it was very easy (and I’m not a JS coder). The source is pasted below.

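// The URL prefixes were lost from this listing; both variables are placeholders to fill in.
var PDB_PAGE_URL = "";   // prefix of a structure's homepage URL
var PDB_IMAGE_URL = "";  // prefix of the structure image URL

CmdUtils.CreateCommand({
  name: "pdb",
  description: "Goes to Protein Data Bank given PDB code.",
  icon: "",
  help: "You can specify the PDB code and pressing enter will take you to particular structure's homepage." +
    " If you type pdb code and press arrow down, you should see an image from PDB site.",

  takes: {"PDB code": noun_arb_text},

  // Pressing Enter opens the structure's homepage (the call below is a
  // reconstruction; the original line was lost).
  execute: function( directObj ) {
    var pdbcode = directObj.text;
    Utils.openUrlInBrowser( PDB_PAGE_URL + pdbcode );
  },

  // The preview pane shows the structure image for the typed code.
  preview: function( pblock, directObj ) {
    var pdbcode = directObj.text;

    pblock.innerHTML = "Preview of the structure:<br/>";
    pblock.innerHTML += "<img src=\"" + PDB_IMAGE_URL + pdbcode + "_bio_r_250.jpg\" />";
  }
});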

It could of course be improved by also using selected text, or by allowing a keyword search of the PDB (or basically any other biological database), but its current functionality suits me just fine. Ubiquity is not yet as stable a platform as Greasemonkey (or Chickenfoot), but it’s worth keeping an eye on. I’m sure that sooner or later we will read an article in a peer-reviewed journal describing Ubiquity commands for the life sciences :).
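One small improvement in the same spirit would be to check that the typed text looks like a PDB code before building the image tag, so the preview does not request an image for half-typed input. The sketch below is plain JavaScript, not part of the Ubiquity API: isPdbCode and previewHtml are hypothetical helper names, imageUrlPrefix is a parameter you would have to supply yourself, and the check assumes the classic four-character code format (a digit followed by three letters or digits).

```javascript
// Returns true if the text looks like a classic four-character PDB code:
// one digit followed by three letters or digits (surrounding spaces allowed).
function isPdbCode(text) {
  return /^\s*[0-9][A-Za-z0-9]{3}\s*$/.test(text);
}

// Builds the preview markup; imageUrlPrefix is a caller-supplied assumption.
function previewHtml(imageUrlPrefix, text) {
  if (!isPdbCode(text)) {
    return "Type a four-character PDB code to see a preview.";
  }
  var pdbcode = text.replace(/^\s+|\s+$/g, ""); // strip surrounding spaces
  return "Preview of the structure:<br/>" +
    "<img src=\"" + imageUrlPrefix + pdbcode + "_bio_r_250.jpg\" />";
}
```

Inside the preview function this could be used as pblock.innerHTML = previewHtml(prefix, directObj.text), with prefix being whatever image location you use.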


Posted on August 27, 2008 in Software



Configuring Torque and InterProScan


If by any chance you want to use InterProScan with the Torque Resource Manager (a queueing system based on the PBS project), it doesn’t work by default (InterProScan is tested with LSF, and configuration files are supplied for the original PBS and Sun Grid Engine). Fortunately, only two small changes to the InterProScan config files are needed to make it work. First, during iprscan configuration, choose PBS54 as your queueing system. Then, in the file pbs54.conf (IPRSCANHOME/conf), remove the “-d” switch from the following two lines:

asyncsub=qsub [%optqueue][%optresource] -d -o /dev/null -e /dev/null "[%toolcmd]"
syncsub=qsub [%optqueue][%optresource] -d -o /dev/null -e /dev/null -I "[%toolcmd]"
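For reference, with the switch removed the two lines should look roughly like this. Nothing else is changed, and the closing bracket of the last placeholder is written here as "[%toolcmd]" on the assumption that it should match the asyncsub line:

```
asyncsub=qsub [%optqueue][%optresource] -o /dev/null -e /dev/null "[%toolcmd]"
syncsub=qsub [%optqueue][%optresource] -o /dev/null -e /dev/null -I "[%toolcmd]"
```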

Assuming that the Torque binaries (qsub, qdel etc.) are available in the global PATH (on my machine they sit under /usr/local/bin), change the default shell in the environment file from #!/bin/sh to #!/bin/bash. You can also add other directories to the PATH in that file (I didn’t). Voilà. InterProScan jobs are now queued.


Posted on July 10, 2008 in bioinformatics, Software



Bug tracking systems in science

I’m not going to describe the painful process of correcting entries in biological databases or errors in publications when one is not the author – we all know how difficult and unrewarding it is. All major databases contain wrong entries – I see misannotated (or nonexistent) genes in GenBank, artificial domains in Pfam and poorly solved structures in the PDB. It’s even worse in publications, where across the whole spectrum of journals I see errors which in theory shouldn’t slip through peer review (this includes such prominent publishers as NPG).

One of the best ideas I have heard for addressing this issue is to build a bug tracking system (I would like to give credit to the author, but I cannot find the source; wasn’t it one of the biobloggers?). It’s simple and efficient. Something is wrong? File a bug report. The report would link to the original entry, would be available for aggregation (for example to track a report author’s activity), and could possibly be closed by somebody other than the database maintainers or authors if the report itself turns out to be wrong. Because it would be external to all databases, maybe it could grow to provide “community corrected” versions of these databases?
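To make the idea a bit more concrete, here is a minimal sketch of what a single report and a simple aggregation could look like. Everything in it is hypothetical: the field names, the function names and the status values are my own illustration, not an existing system.

```javascript
// Hypothetical record for one report; every field name is illustrative.
function makeReport(database, entryId, reporter, description) {
  return {
    database: database,       // which database the entry lives in
    entryId: entryId,         // link back to the original entry
    reporter: reporter,       // who filed the report
    description: description, // what is wrong with the entry
    status: "open",
    closedBy: null
  };
}

// Anyone, not only the database maintainers, may close a report.
function closeReport(report, closedBy) {
  report.status = "closed";
  report.closedBy = closedBy;
  return report;
}

// Aggregate reports per reporter, for example to track reporter activity.
function countByReporter(reports) {
  var counts = {};
  reports.forEach(function (report) {
    counts[report.reporter] = (counts[report.reporter] || 0) + 1;
  });
  return counts;
}
```

Because each report carries the database name and the entry identifier, a “community corrected” view would in principle be a matter of joining open reports back onto the entries they point to.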

What do you think? How useful could such a system be?


Posted on April 18, 2008 in Comments, Community, Software



CLANS – Java tool for cluster analysis of sequences

As frequent visitors of this blog have already noticed, I am a big fan of tools for data visualization. Today I would like to point you to a Java program called CLANS (CLuster ANalysis of Sequences), developed by my former colleague Tancred Frickey. CLANS runs (PSI-)BLAST on your sequences, all vs all, and clusters them in 2D or 3D according to their similarity. This method allows rapid classification of huge datasets and has the advantage over, let’s say, a phylogenetic tree that one can quickly assess the results of the clustering visually (I cannot imagine making any sense of a phylogenetic tree with 1500 branches, while the graphical output, as in the animation below, is pretty easy to read).

CLANS animation

The beauty of the idea behind CLANS is that you can apply this method to almost any dataset that can be translated into all-vs-all relations. The CLANS page has examples from protein clustering, microarray analysis and (the one I like the most) an image showing how the standard amino acids cluster in space according to BLOSUM62.
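To illustrate how all-vs-all relations can be turned into a picture, here is a toy force-directed layout in JavaScript. It is a generic sketch, not the CLANS code: it assumes you already have a symmetric matrix of similarity scores between 0 and 1 (however you derived them), it works in 2D only, and the constants 0.1 and 0.05 are arbitrary choices of mine. Every pair of points repels, and similar pairs additionally attract, so similar items end up close together.

```javascript
// Toy 2D force-directed layout driven by a symmetric similarity matrix.
function layout(similarity, iterations) {
  var n = similarity.length;
  var pos = [];
  // Deterministic start: points evenly spaced on a unit circle.
  for (var i = 0; i < n; i++) {
    var angle = 2 * Math.PI * i / n;
    pos.push([Math.cos(angle), Math.sin(angle)]);
  }
  for (var step = 0; step < iterations; step++) {
    var move = pos.map(function () { return [0, 0]; });
    for (var a = 0; a < n; a++) {
      for (var b = a + 1; b < n; b++) {
        var dx = pos[b][0] - pos[a][0];
        var dy = pos[b][1] - pos[a][1];
        var dist = Math.sqrt(dx * dx + dy * dy) || 1e-9;
        // Positive force pulls the pair together, negative pushes it apart.
        var force = similarity[a][b] * dist - 0.1 / dist;
        var fx = force * dx / dist;
        var fy = force * dy / dist;
        move[a][0] += fx; move[a][1] += fy;
        move[b][0] -= fx; move[b][1] -= fy;
      }
    }
    // Apply a small fraction of the accumulated force to each point.
    for (var k = 0; k < n; k++) {
      pos[k][0] += 0.05 * move[k][0];
      pos[k][1] += 0.05 * move[k][1];
    }
  }
  return pos;
}

// Euclidean distance between two laid-out points.
function distance(p, q) {
  return Math.sqrt(Math.pow(p[0] - q[0], 2) + Math.pow(p[1] - q[1], 2));
}
```

For a matrix in which items 0 and 1 are similar to each other, items 2 and 3 are similar to each other, and nothing else is, calling layout(matrix, 200) should place each pair close together and the two pairs far apart.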



Tracking changes in a multiple sequence alignment

I had a few free hours this weekend, so I hacked together a couple of scripts that in theory could help me visualize changes between subfamilies in a protein multiple sequence alignment. In essence, I took the alignment, chose a master sequence that corresponds to a known structure, removed all columns with gaps in the master sequence, and visualized fragments of the alignment (a sliding window of 15 sequences) with WebLogo – software for preparing sequence logos from alignments. In the video below you can see:

  • two boxes showing the same template structure (the second is just rotated); the size of the C-alpha atoms corresponds to overall conservation at that position; the first few residues do not have corresponding positions in the alignment
  • sequence logo of the current alignment window
  • sequence logo of the whole alignment – as a reference
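The preprocessing described above (dropping the columns where the master sequence has a gap, then sliding a window over the sequences) can be sketched in a few lines of JavaScript. This is not my original script: the function names are made up for illustration, the alignment is assumed to be an array of equal-length strings with “-” as the gap character, and each window would then be passed on to the logo software separately.

```javascript
// Keep only the alignment columns where the master sequence has a residue.
function dropMasterGapColumns(alignment, masterIndex) {
  var master = alignment[masterIndex];
  var keep = [];
  for (var col = 0; col < master.length; col++) {
    if (master.charAt(col) !== "-") {
      keep.push(col);
    }
  }
  return alignment.map(function (seq) {
    return keep.map(function (col) { return seq.charAt(col); }).join("");
  });
}

// Return consecutive, overlapping blocks of `size` sequences.
function slidingWindows(alignment, size) {
  var windows = [];
  for (var start = 0; start + size <= alignment.length; start++) {
    windows.push(alignment.slice(start, start + size));
  }
  return windows;
}
```

With a window size of 15, slidingWindows returns one block per starting sequence, and each block shares 14 sequences with the next one, which is what makes the logos change gradually from frame to frame.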

There are several things I’m not yet happy with. First of all, the visualization of changes on the structure is hardly readable, even with video of much higher quality (I should probably do it with Chimera’s “worm” representation). Second, I have no information about which species/proteins I’m looking at right now (another box with highlights on a species tree of the family?). Also, I should remove some redundancy from the alignment; sometimes the sliding window contains copies of the same protein. But overall it looks promising enough to convince me to spend a few more hours on this small project. However, I would probably do the final version with Processing.



DNASIS SmartNote – online notebook for bioinformatics analysis

I recently found a video showing a new web-based application for scientists. This is DNASIS SmartNote – an online notebook for sequence analysis, project organisation and sharing results, thoughts and data with other users/collaborators.

This service is provided by MiraiBio, which is part of Hitachi Software. The company provides instruments and software for biological research.

As soon as I resolve the issues with obtaining a working SmartNote account (so far I cannot log in), I’ll post more about this service.


Posted on January 19, 2008 in bioinformatics, Services, Software



Linux screencasting software

Just a short note today. If you are looking for screencasting software for your Linux box, I recommend two titles: recordMyDesktop and Wink.

The first one is a typical desktop activity recorder – you mark the capture area and that’s all. No fancy options: just a pure video stream from your screen. The video has very good quality (Theora and Vorbis codecs).

Wink is a screencaster oriented towards preparing interactive tutorials and presentations. You can record screen activity, but also pause the video and add text boxes with explanations or buttons waiting for user interaction (for example “Next” buttons). Output formats are SWF, standalone EXE (for Windows machines only), PDF, PostScript and HTML. There are no typical video files, which is not really a problem, as the framerate of the recording is pretty low anyway. Another issue is that it apparently cannot properly record windows rendered with OpenGL (like molecular viewers) – the window’s interior comes out black. Even with these limitations I think Wink is better for preparing tutorials (for example on using some online bioinformatics service) than typical screencasting software.


Posted on January 15, 2008 in Software, Visualization

