Research

Problems in Life Science Broken Down with Grid Computing

Do you get impatient when your computer gets bogged down saving a document, loading a Web page, or burning a CD, and the processor won't do anything else except that task for a few seconds?

Imagine having to run a computer program that would tie up your computer's processor for hours, or even days. That's the dilemma for many scientists as computer models and calculation programs become more complicated.

UW health sciences researchers will be getting some relief in that area thanks to IBM, which in 2003 donated 56 of its Blade Servers, powerful, ultra-fast computers, to the UW. The Department of Computer Science and Engineering is partnering with the genome sciences, bioengineering, and biochemistry departments, as well as the Cell Systems Initiative, to put those servers to use for complicated calculations.

The servers have been connected in a grid to perform complex, parallel calculations typical for the life sciences, according to Martin Tompa, professor of computer science and engineering.

"The idea is to have some or all of the machines to do work in parallel," said Tompa.

Doing this means taking multi-part problems and breaking them down into separate portions, each of which can be calculated on a separate computer. Fed into a grid of fast computers, calculations that would take a single computer days to solve can be completed in hours.
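As a rough illustration of that divide-and-combine idea, here is a minimal Python sketch. It is not the software running on the UW grid: the function names are hypothetical, the "calculation" is a placeholder sum of squares, and a local thread pool merely stands in for the separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks nearly equal, contiguous pieces."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def solve_portion(portion):
    """Hypothetical stand-in for one independent piece of a calculation."""
    return sum(x * x for x in portion)


def solve_on_grid(items, n_workers):
    """Hand one portion to each worker, then combine the partial answers."""
    chunks = split_into_chunks(items, n_workers)
    # Assumption: threads play the role of grid machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(solve_portion, chunks))
    return sum(partial_results)


total = solve_on_grid(list(range(1000)), n_workers=56)
print(total)
```

The approach only works because no portion needs another portion's result; each worker can finish its piece without waiting on the others.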

For instance, Tompa's research consists of finding short sequences of DNA in a particular gene that appear in many different species. A typical search for a group of 20 base pairs in a 1,000-pair-long gene has to consider approximately 1,000 possible locations on, say, the human version of that gene. For each of those possibilities, there are nearly another 1,000 possible locations on the mouse gene, and so on through each additional species being searched.
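To make the way those possibilities multiply concrete, the following Python sketch tries every combination of one window per gene and keeps the most similar set. It is a brute-force toy under stated assumptions, not Tompa's actual method: the sequences are made up, the function names are hypothetical, and similarity is scored simply by counting mismatched letters.

```python
from itertools import combinations, product


def window_starts(gene_length, motif_length):
    """Number of positions where a window of motif_length can start."""
    return gene_length - motif_length + 1


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_shared_window(genes, motif_length):
    """Try one window per gene, in every combination; keep the best match."""
    ranges = [range(window_starts(len(g), motif_length)) for g in genes]
    best_score, best_starts = None, None
    for starts in product(*ranges):  # one start position per gene
        windows = [g[s:s + motif_length] for g, s in zip(genes, starts)]
        score = sum(mismatches(a, b) for a, b in combinations(windows, 2))
        if best_score is None or score < best_score:
            best_score, best_starts = score, starts
    return best_starts, best_score


# Made-up toy sequences standing in for the same gene in three species.
toy_genes = ["ACGTTGCA", "TTGCAACG", "GGTTGCAT"]
starts, score = best_shared_window(toy_genes, motif_length=5)
print(starts, score)

# A 20-pair window in a 1,000-pair gene has 981 start positions,
# which the article rounds to "approximately 1,000".
print(window_starts(1000, 20))
```

With three toy genes of eight letters each, the loop checks only 4 x 4 x 4 = 64 combinations. The same loop over full-length genes in several species is what produces the enormous counts described here.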

That means that such an investigation for five species would have 1,000 to the fifth power possibilities, and an equal number of calculations required to search those possibilities. That's one quadrillion, or 1,000 trillion, calculations, which would take a typical research computer days or weeks to finish. If those trillions of calculations are farmed out to 56 machines, however, the task is divided equally and the total computing time drops to a few hours.
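The arithmetic can be checked in a few lines of Python. The speed figure below, one billion calculations per second per machine, is purely an assumption chosen for illustration; the article does not state how fast the machines are.

```python
SPECIES = 5
POSITIONS_PER_GENE = 1_000
MACHINES = 56
# Assumption for illustration only: calculations per second per machine.
ASSUMED_RATE = 1_000_000_000

total_calculations = POSITIONS_PER_GENE ** SPECIES

single_machine_seconds = total_calculations / ASSUMED_RATE
grid_seconds = single_machine_seconds / MACHINES  # work split evenly

single_machine_days = single_machine_seconds / 86_400
grid_hours = grid_seconds / 3_600

print(f"Total calculations: {total_calculations:,}")
print(f"One machine: about {single_machine_days:.1f} days")
print(f"{MACHINES} machines: about {grid_hours:.1f} hours")
```

Under that assumed speed, one machine would need about 11.6 days, and 56 machines sharing the work evenly would need about five hours, in line with the "days or weeks" versus "a few hours" comparison.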

Not all problems can benefit from a computing grid, however. The grid saves time only on those calculations that have to be repeated many millions or billions of times, and that can be broken down into smaller, independent groups of problems.

Many types of life science research fit those requirements, such as Dr. David Baker's studies of the formation of protein structures. Baker is an associate professor of biochemistry and adjunct associate professor of genome sciences and of bioengineering. His research team uses computer programs to piece together short peptides, which combine to form complex folded proteins. That sort of analysis could benefit from grid computing, Tompa said.

"Often, when you're trying to find an overarching solution by piecing together lots of small solutions, you're going to have exponential growth in the number of computations," he explained. That's common throughout the life sciences, and a computer grid system will make doing the math for such research much quicker.

© 2003 - 2004 UW Medicine