
“Startup weekends” in science

News about yet another “startup-weekend-like” event keeps reaching me more and more often. These events are not always about creating a company or a product. Sometimes they are about collaboratively coding a game or writing a novel – all in a very short time. In many cases it works amazingly well – being so tight on time forces people to be ultra-productive and to focus only on the important parts of the project. I envy people attending such meetings, not necessarily because of the possible outcomes, but because of the energetic atmosphere there.

Deepak wrote some time ago about “Bursty work” – the idea that work can be done by distributed teams focused around high-value projects, instead of teams gathered around a company or startup. That made me wonder whether we can join these two ideas in science: an ultra-productive, distributed team working on a time-constrained project.

Let’s assume that the average publication in the field of bioinformatics/computational biology takes six months of work by one scientist. It doesn’t really matter if it’s a new server, a database or a protein family annotation. So a team of four people should do the same work in about six weeks or faster (why faster? knowledge and skills are not distributed evenly, so someone else may code the necessary script faster than I would). If we increased the number of people involved even further, created a distraction-free environment and prepared enough coffee for everyone, the whole process could be done in a week. Even if the assumptions here are not really correct, I’m pretty sure that quite a number of valuable papers could be done this way in a week.
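As a back-of-the-envelope check of these numbers, here is a minimal Python sketch. It assumes that six months is roughly 26 working weeks and that the effort divides perfectly evenly among team members (the same optimistic assumption admitted above); the function names are made up for illustration.

```python
import math

PERSON_WEEKS_PER_PAPER = 26  # assumption: six months of one scientist is about 26 weeks


def weeks_needed(person_weeks: float, team_size: int) -> float:
    """Calendar weeks needed if the work splits evenly with no coordination cost."""
    return person_weeks / team_size


def team_size_for(person_weeks: float, deadline_weeks: float) -> int:
    """Smallest team that meets the deadline under the same idealised assumption."""
    return math.ceil(person_weeks / deadline_weeks)


print(weeks_needed(PERSON_WEEKS_PER_PAPER, 4))   # 6.5 weeks for a team of four
print(team_size_for(PERSON_WEEKS_PER_PAPER, 1))  # 26 people for a one-week sprint
```

Real teams usually pay a coordination cost that grows with size, so the one-week head count is best read as an optimistic lower bound.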

So what do you think? What about creating a platform that allows for:

  • creating a project that has a clear and appealing outcome (for example a publication, or at least a manuscript in Nature Precedings)
  • creating a project workspace with all necessary tools (wiki, chat, svn, etc. plus a small computational backend for testing)
  • creating a number of roles that need to be filled by people with certain skills
  • joining the project if the skills match the requirements
  • setting a clear deadline (for example, a countdown clock that forbids committing changes to the project after a certain amount of time, leaving the workspace read-only)
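To make the list above a bit more concrete, here is a rough Python sketch of how such a platform might model a project. Everything in it (the `Role` and `Project` classes, the `join` and `can_commit` methods) is hypothetical and only illustrates the skill matching and the deadline that turns the workspace read-only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Set


@dataclass
class Role:
    name: str
    required_skills: Set[str]
    filled_by: Optional[str] = None  # nobody has taken the role yet


@dataclass
class Project:
    goal: str
    deadline: datetime
    roles: List[Role] = field(default_factory=list)

    def join(self, person: str, skills: Set[str], role_name: str) -> bool:
        """Fill an open role only if the person has every required skill."""
        for role in self.roles:
            if role.name == role_name and role.filled_by is None:
                if role.required_skills <= skills:
                    role.filled_by = person
                    return True
        return False

    def can_commit(self, now: datetime) -> bool:
        """The workspace becomes read-only once the countdown hits zero."""
        return now < self.deadline


start = datetime(2008, 2, 1)
project = Project(
    goal="manuscript in Nature Precedings",
    deadline=start + timedelta(days=7),
    roles=[Role("scripter", {"python", "blast"})],
)
print(project.join("alice", {"python", "blast", "latex"}, "scripter"))
print(project.can_commit(start + timedelta(days=8)))
```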

I agree that science takes time, especially quality science. But on the other hand, I have a feeling that we waste a lot of time learning things by ourselves instead of learning from others, we waste time because the outcome is not well defined, and finally we waste time solving everything ourselves instead of bouncing ideas off other people (this is what collaboration is all about). So what about creating an artificial environment that forbids wasting time?

Utopian? Maybe. Naive? Most likely. Worth considering? I hope so. Let me know.

 
 


Jane – Journal/Author Name Estimator

Jane – Journal/Author Name Estimator is a new web-based application that can suggest potential reviewers or target journals for a manuscript based on its title and abstract. It was just published by Bioinformatics under Advance Access (but unfortunately it’s not an open access article). I have tested it on two of my upcoming publications and Jane performed well: I wasn’t surprised by most of the predicted names and journal titles. The topic I’m writing about in these papers is rather narrow, so don’t treat this as any kind of performance measure – test it yourself if you are interested.

I’m probably not going to use it the way the authors suggested – I consider this application a helpful literature research tool.

 

Posted on January 28, 2008 in bioinformatics, Papers, Research, Services

 


Visualization of internal repeats in proteins (or DNA)

There are a number of protein families that have internal repeats (like TPR, Armadillo, ankyrin etc.). I’m very interested in many of them for reasons I will explain in another post. Assessing the arrangement of these repeats is straightforward in the majority of cases – most of them tend to occur next to each other, with few or no insertions between them (finding them in the first place is a completely different story). However, there are proteins where internal repeats are separated by other domains or repeats, which can result in a real mess (or in scientific language: a mosaic-like architecture). When I looked a couple of months ago for a visualization method that would give me a quick overview of the internal structure of such proteins, I stumbled across The Shape of Song – a visualization method developed by Martin Wattenberg, a researcher at IBM. It fitted my requirements, so I implemented it with some help from Processing (and later added it to a protein analysis server that has a chance of being published next month). The resulting visualization is below:

Internal repeats in a protein

Repeats are colored according to repeat type and are connected according to repeat family. If you think about it in terms of the SCOP (Structural Classification of Proteins) hierarchy, colors represent class, while arcs connect superfamilies. The longer and more complicated the analysed sequence is, the more useful this approach seems to be; for short proteins, typical domain bubbles would work better.
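For readers who want to try the idea, below is a minimal Python sketch of the geometry. It is not the Processing code behind the image above. It assumes each repeat is given as start/end positions plus a family and a type, links consecutive repeats of the same family with a semicircular arc, and writes the result as SVG. The names `Repeat`, `family_arcs` and `to_svg` are invented for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Repeat(NamedTuple):
    start: int
    end: int
    family: str  # decides which repeats get connected by arcs
    rtype: str   # decides the fill color of the repeat box


def family_arcs(repeats):
    """Return (center, radius, family) for arcs joining consecutive same-family repeats."""
    by_family = defaultdict(list)
    for rep in sorted(repeats, key=lambda r: r.start):
        by_family[rep.family].append(rep)
    arcs = []
    for family, members in by_family.items():
        for left, right in zip(members, members[1:]):
            left_mid = (left.start + left.end) / 2
            right_mid = (right.start + right.end) / 2
            arcs.append(((left_mid + right_mid) / 2, (right_mid - left_mid) / 2, family))
    return arcs


def to_svg(repeats, length, colors):
    """Draw repeats as boxes on a baseline and arcs above it, one pixel per residue."""
    arcs = family_arcs(repeats)
    baseline = max((radius for _, radius, _ in arcs), default=0) + 10
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{length}" height="{baseline + 20}">']
    for center, radius, _family in arcs:
        x1, x2 = center - radius, center + radius
        parts.append(
            f'<path d="M {x1} {baseline} A {radius} {radius} 0 0 1 {x2} {baseline}" '
            f'fill="none" stroke="gray"/>'
        )
    for rep in repeats:
        fill = colors.get(rep.rtype, "black")
        parts.append(
            f'<rect x="{rep.start}" y="{baseline}" width="{rep.end - rep.start}" '
            f'height="10" fill="{fill}"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


example = [
    Repeat(10, 40, "TPR", "alpha"),
    Repeat(50, 80, "ANK", "alpha"),
    Repeat(100, 130, "TPR", "alpha"),
]
svg = to_svg(example, length=150, colors={"alpha": "steelblue"})
```

Saving `svg` to a file and opening it in a browser shows three boxes, with one arc spanning the ankyrin repeat to connect the two TPR repeats.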

People who are into genomic sequences may notice the similarity of this approach to Circos, developed by Martin Krzywinski (whose work I really admire, especially on HDTR). The idea behind both is pretty much the same, but I never thought about straightening that circle until I saw The Shape of Song. My thinking is sometimes dramatically schematic…

 
 


CLANS – Java tool for cluster analysis of sequences

As frequent visitors to this blog have already noticed, I am a big fan of data visualization tools. Today I would like to point you to a piece of Java software called CLANS (CLuster ANalysis of Sequences), developed by my former colleague Tancred Frickey. CLANS runs (PSI)BLAST on your sequences, all vs all, and clusters them in 2D or 3D according to their similarity. This method allows for rapid classification of huge datasets and has an advantage over, let’s say, a phylogenetic tree: one can quickly assess the results of the clustering visually (I cannot imagine making any sense of a phylogenetic tree with 1500 branches, while the graphical output, as in the animation below, is pretty easy to read).

CLANS animation

The beauty of the idea behind CLANS is that you can apply this method to almost any dataset that can be translated into all-vs-all relations. The CLANS page has examples from protein clustering, microarray analysis and (the one I like the most) an image showing how the standard amino acids cluster in space according to BLOSUM62.
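To give a feeling for how all-vs-all relations can be turned into a picture, here is a toy force-directed layout in Python. It is only a sketch of the general technique and not the algorithm CLANS itself uses: similar items attract each other, every pair repels slightly, and the function name, force constants and the pair-to-weight dictionary are all assumptions made for the example.

```python
import math
import random


def force_layout(similarity, n, steps=500, seed=0,
                 attract=0.05, repel=0.01, max_move=0.1):
    """Place n items in 2D so that similar pairs end up close together.

    similarity maps (i, j) index pairs to a non-negative weight; missing
    pairs count as unrelated and only feel the repulsive force.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        moves = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                weight = similarity.get((i, j), similarity.get((j, i), 0.0))
                # Positive force pulls the pair together, negative pushes it apart.
                force = attract * weight * dist - repel / dist
                fx, fy = force * dx / dist, force * dy / dist
                moves[i][0] += fx
                moves[i][1] += fy
                moves[j][0] -= fx
                moves[j][1] -= fy
        for point, move in zip(pos, moves):
            size = math.hypot(move[0], move[1])
            scale = min(1.0, max_move / size) if size > 0 else 0.0
            point[0] += move[0] * scale  # cap the step to keep the layout stable
            point[1] += move[1] * scale
    return pos


# Two pairs of similar sequences that are unrelated to each other.
positions = force_layout({(0, 1): 1.0, (2, 3): 1.0}, n=4)
```

In a real setting the weights would be derived from the pairwise comparison scores, and the same loop works unchanged in 3D by adding a third coordinate.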

 
 


Imaginary protein nanodevices #1

Simple nanodevice - coiled-coil and leucine-rich-repeat protein

This post starts a series devoted to imaginary nanodevices made of proteins. I’m going to play around with known protein structures to see if some of them can form an interesting arrangement. The basic requirement is a lack of obvious steric clashes at the level of the main chain trace. If that is fulfilled, I will assume there is a very slight chance that the particular arrangement is possible. However, in most cases I won’t bother working out how to recreate it in the lab, since I don’t feel competent enough. The whole series is more fiction than science, and my goal is mainly to stretch my imagination and the readers’.
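A crude version of that basic requirement can be written down in a few lines of Python. The sketch below counts pairs of main chain trace points (for example C-alpha atoms) from two chains that come closer than a cutoff. The 4.0 Angstrom cutoff and the function name are assumptions for illustration, not values used for the models in this series.

```python
import math


def count_trace_clashes(trace_a, trace_b, min_distance=4.0):
    """Count point pairs from two main chain traces closer than min_distance.

    Each trace is a list of (x, y, z) coordinates in Angstroms. The default
    cutoff is an arbitrary illustrative choice.
    """
    return sum(
        1
        for a in trace_a
        for b in trace_b
        if math.dist(a, b) < min_distance
    )


ring = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
far_helix = [(0.0, 10.0, 0.0), (3.8, 10.0, 0.0)]
near_helix = [(0.0, 2.0, 0.0)]

print(count_trace_clashes(ring, far_helix))   # no pair within the cutoff
print(count_trace_clashes(ring, near_helix))  # one pair within the cutoff
```

A count of zero says nothing about side chains, so it only rules out the most obvious problems.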

Let’s start with something simple. The structure depicted above is a dimer of a leucine-rich repeat (LRR) protein (PDB: 1A4Y, chains A and D) with a trimeric coiled-coil (my own model made with BeammotifCC) fitted in. The opening is wide enough to accommodate three helices without any problems. The picture below shows the main chain trace of the coiled-coil (in red) surrounded by the LRR dimer (all atoms, blue and sea green). As you can see, any coiled-coil made of amino acids with small side chains would not create any steric issues. In fact, the approximate size of the opening (~35 Angstroms) is much larger than the opening of the membrane anchor of trimeric autotransporter adhesins (a twelve-stranded beta-barrel, PDB: 2GR7), which also accommodates a trimeric coiled-coil. So why not use a beta-barrel instead of an LRR? Well, beta-barrels are hardly present outside membranes 🙂 .

Simple nanodevice - coiled-coil and leucine-rich-repeat protein

One can ask whether a single LRR protein could make a full ring. It looks possible from the structure of a single repeat (beta-turn-alpha) – interactions with the preceding and following repeats are virtually the same. However, the secondary structure elements of these repeats are not perfectly aligned with the axis of the opening. Their tilt forces consecutive repeats to form an imaginary spiral, not a circle (although the tilt does not seem to be large enough to actually allow spiral folding of a larger number of repeats – but that’s only my assumption; it would be worth checking).

So that’s it for now. If you feel that I’m reinventing the wheel or writing something completely silly, or if you have any suggestions, please feel free to discourage/encourage me in the comments.

 

Posted on January 21, 2008 in Imaginary nanodevice, Proteins, Structural biology

 


Tracking changes in a multiple sequence alignment

I had a few free hours this weekend, so I hacked together a couple of scripts that in theory could help me visualize changes between subfamilies in a protein multiple sequence alignment. In essence, I took the alignment, chose a master sequence that corresponds to a known structure, removed all columns with gaps in the master sequence, and visualized fragments of the alignment (a sliding window of 15 sequences) with Weblogo – software for preparing sequence logos from alignments. In the video below you can see:

  • two boxes showing the same template structure (the second is just rotated); the size of C-alpha atoms corresponds to overall conservation at that position; the first few residues do not have corresponding positions in the alignment
  • sequence logo of the current alignment window
  • sequence logo of the whole alignment – as a reference
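The preprocessing described before this list (dropping columns that are gaps in the master sequence, then sliding a window of 15 sequences over the alignment) could look roughly like the Python sketch below. It is not the original weekend script: the alignment is assumed to be a plain dictionary of equal-length aligned strings, and both function names are made up. Each window would then be written out and passed to a logo generator such as Weblogo.

```python
def drop_master_gap_columns(alignment, master_id, gap_chars="-."):
    """Keep only the alignment columns where the master sequence has a residue."""
    master = alignment[master_id]
    keep = [i for i, residue in enumerate(master) if residue not in gap_chars]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


def sliding_windows(names, size=15, step=1):
    """Yield consecutive groups of sequence names, `size` at a time."""
    last_start = max(len(names) - size, 0)
    for start in range(0, last_start + 1, step):
        yield names[start:start + size]


alignment = {
    "master": "AC-DE",
    "seq1": "ACQDE",
    "seq2": "A-QD-",
}
trimmed = drop_master_gap_columns(alignment, "master")
for window in sliding_windows(list(trimmed), size=2):
    subset = {name: trimmed[name] for name in window}
    print(subset)  # each subset would become one frame of the video
```

The order of sequences in the alignment determines what each window shows, so sorting them (for example by subfamily) before windowing matters more than the window size itself.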

There are several things I’m not yet happy with. First of all, the visualization of changes on the structure is hardly readable, even with a video of much higher quality (I should probably do it with Chimera’s “worm” representation). Second, I have no information about which species/proteins I’m looking at at any given moment (another box with highlights on a species tree of the family?). Also, I should remove some redundancy from the alignment; sometimes the sliding window contains copies of the same protein. But overall it looks promising enough to convince me to spend a few more hours on this small project. However, I will probably do the final version with Processing.

 
 


DNASIS SmartNote – online notebook for bioinformatics analysis

I recently found a video showing a new web-based application for scientists: DNASIS SmartNote – an online notebook for sequence analysis, project organisation and sharing results, thoughts and data with other users/collaborators.

This service is provided by MiraiBio, which belongs to the Hitachi Software group. The company provides instruments and software for biological research.

As soon as I resolve the issues with obtaining a working SmartNote account (so far I cannot log in), I’ll post more about this service.

 

Posted on January 19, 2008 in bioinformatics, Services, Software

 


Linux screencasting software

Just a short note today. If you are looking for screencasting software for your Linux box, I recommend two titles: recordMyDesktop and Wink.

The first one is a typical desktop activity recorder – you mark the capture area and that’s all. No fancy options: just a pure video stream from your screen. The video has very good quality (Theora and Vorbis codecs).

Wink is a screencaster oriented towards preparing interactive tutorials and presentations. You can record screen activity, but also pause the video and add text boxes with explanations or buttons that wait for user interaction (for example “Next” buttons). Output formats are: SWF, standalone EXE (for Windows machines only), PDF, PostScript and HTML. There are no typical video formats, which is not really a problem, as the frame rate of the recording is pretty low. Another issue is that it apparently cannot properly record windows rendered with OpenGL (like molecular viewers) – the window’s interior comes out black. Even with these limitations, I think Wink is better for preparing tutorials (for example on the usage of some online bioinformatics service) than typical screencasting software.

 

Posted on January 15, 2008 in Software, Visualization

 


Freelance freedom

This lovely comic is a work of N.C. Winters. New episodes are published every Monday at Freelance Switch.

Freelance freedom

 

Posted on January 14, 2008 in Career, Fun

 


Freelancing science in 2008

I was pretty busy for the last couple of weeks, which resulted in a massive number of unread items in Google Reader. The New Year arrived unnoticed but brought a lot of changes to my scientific life and to this blog. First of all, “freelancing science” became real – as of the first of January 2008 I’m no longer an employee and have no plans to be one soon. While I’m going to keep one of my academic affiliations, it’s no longer a formal agreement that binds me to a single place. This will allow me to jump with others into the pool called “open science”. If you wonder how I am going to make money, all I can say is that I’m wondering about that too :).

The second thing is that I plan to write more about bionanotechnology here, as I hope to merge in silico protein science with nanotech at some point. Molecular machines, multimeric complexes etc., here I come.

I also plan to explore the topic of molecular graphics even more, maybe in the form of a separate site. In the times of “Ice Age” (or any current animated movie), most modern scientific visualizations look like Windows 98 next to Apple’s Leopard.

So stay tuned and I wish you an exciting year of 2008. Mine is going to be exciting for sure.

 

Posted on January 8, 2008 in Career