Bioinformatics can mean developing new algorithms for biological data analysis. Scientists who write and release such software often face the problem of making the program portable. I see three clear solutions to that problem. First, one can spend a lot of time porting the source to other platforms (plus testing, fixing and yelling at incompatibilities). This is not easy even among Linux distributions (remember the broken HMMER binary packages in Debian and Ubuntu?), not to mention porting to OS X or Windows. What else can we do? The second solution is to build a web interface around the software. This is extremely popular and makes almost everyone’s life easier. However, there are two drawbacks: maintenance of the service (it costs money, and grant agencies are not willing to spend a dime on it) and batch access requests from some users (there’s always somebody who wants to feed 5 million sequences or 50 thousand structures into your software). The third solution to the portability problem addresses at least the second of these drawbacks: one can create a virtual machine with a proper environment for the software and release the two together. Yes, release the software together with its whole environment. And it’s not as difficult as it seems.
We already live with computing clouds, internet companies that do not own a single server, and virtual appliances that let you quickly install, let’s say, a blog server with WordPress without knowing anything about its software requirements. Virtual appliances, that is, complete virtual machines, can contain already configured software (the most trivial example would be LAMP: Linux, Apache, MySQL and PHP). So far I have found only one such appliance for bioinformatics: it’s called DNALinux Virtual Desktop Edition and contains, among others, BLAST, EMBOSS, PyMOL, BioPerl and Biopython. Since VMware Server is free (although registration is required), this makes a pretty nice alternative for those with Windows machines, as it allows running windowed Linux at roughly two-thirds of the speed of a native system. VMware software can create a virtual machine out of a working system, but I wouldn’t recommend that, as we usually have much more software installed than is needed to run our own programs. So creating a virtual appliance for, let’s say, BLAST would mean installing a fresh copy of our favourite Linux under VMware Server with nothing more than the necessary libraries, a copy of the BLAST executables and possibly a web interface. Voilà. Virtual appliance for BLAST, anybody?
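A stripped-down appliance is only useful if everything the program needs really is inside it. As a minimal sketch (not part of any existing appliance), the check below lists which required executables are missing from the PATH of a freshly installed guest system. The function name `missing_tools` and the tool list are assumptions made for illustration; `blastall` and `formatdb` stand in for whatever executables your own appliance is supposed to ship.

```python
import shutil


def missing_tools(required):
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical list: replace with the executables your appliance must contain.
    required = ["blastall", "formatdb"]
    missing = missing_tools(required)
    if missing:
        print("Missing from this environment:", ", ".join(missing))
    else:
        print("All required executables found.")
```

Running something like this inside the virtual machine before releasing it is a cheap way to confirm that the minimal install did not leave out a dependency.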
While it may seem a bit of an overkill at first, I don’t think it is in the long run. Porting the software to other operating systems is only part of the story; maintenance to keep it working with newer versions of the libraries is another. There are a lot of programs that have not been actively maintained for a long time. I have two quick examples that the virtual appliance approach would save from oblivion: PovChem (rendering of molecules, depends on some ancient libraries) and MACAW (it doesn’t work on anything but Mac OS 9, and the Windows version crashes the system). OK, MACAW may not be a fair example, as here we face legal issues with the operating system, but I believe any heavy software user has already lost count of how many times they gave up on trying some well-designed program because of its requirements.
Have a look and give it a try. I’m already running two operating systems side by side (goodbye, dual boot), and this is definitely the future for our desktops, which already have too much processing power. But honestly, I dream about the day when all possible bioinformatics algorithms and biological data will be available in some computing cloud, and running Taverna will be a good alternative to all-day data munging.
Wolfram Mathematica 6 – no New Kind of Science (yet)
Not so long ago Animesh Sharma pointed to a rather old interview with Stephen Wolfram about the book “A New Kind of Science” and asked whether its concepts concerning a biological framework had made their way into the Mathematica software.
I’ve just returned from the Poland Mathematica Conference, and I can answer that question: no, they didn’t. While there were people using Modelica and Mathematica to model some stochastic processes in cells, Mathematica itself does not provide much support for any sophisticated description of biological mechanisms. The implications of the concepts in A New Kind of Science looked very promising, so it’s a pity that we are not given the tools to verify them ourselves.
Posted by Pawel Szczesny on October 30, 2007 in bioinformatics, Comments, Software
Tags: bioinformatics, mathematica, wolfram