The Center for Microbial Proteomics is housed in the Benjamin Hall Interdisciplinary Research Building, a joint public/private venture that opened in 2006. Before 2006, the group was affiliated primarily with the Department of Microbiology, having evolved out of the P. I.'s work in the Department of Medicinal Chemistry at the UW in the previous decade. The P. I. originally received his training in protein chemistry and mass spectrometry at the University of Nevada, Reno and the University of Virginia. Our primary research focus in recent years has been large-scale protein expression studies applied to problems in oral and environmental microbiology, most notably in the context of the collaborative relationships the group has maintained for many years with R. J. Lamont (now at the University of Louisville) and John A. Leigh at the University of Washington, as well as other researchers in the Puget Sound area, the University of Louisville, the University of Florida, Europe and Japan.
The group’s focus has shifted in recent years towards proteogenomic approaches to comprehensive differential gene and protein expression in microbial systems. Our central area of interest is proteome-wide protein expression analysis for the dental pathogen P. gingivalis and model biofilms containing this organism and other representative early, middle and late colonizers. As of this writing we are heavily involved in comprehensive proteomic studies of two model interaction partners for P. gingivalis, Streptococcus gordonii and Fusobacterium nucleatum. The broad goals of the work are to understand the subtle changes that take place at the molecular level as P. gingivalis transforms from a relatively benign organism to a more overt pathogen, and the roles of the various biofilm interaction partners in P. gingivalis pathogenicity. We often develop and test our proteomic analytical ideas with a more tractable system of environmental and energy conservation interest, Methanococcus maripaludis. Publications in both areas are listed elsewhere on the web page. Since 2009 we have branched out methodologically by using RNA-Seq technology alongside the more direct readouts of gene expression based on protein measurements. The proteogenomic focus of recent years has proven to involve a substantially different and more complex set of bioanalytical problems from those posed by using tandem mass spectrometry and protein chemistry to solve more limited problems of protein structure. As a consequence, our work has by necessity taken on a more computational focus, and much work remains to be done in developing and refining the advanced quantitative methods that are required for meta-proteomic and meta-transcriptomic studies of biofilms and biofilm-host interactions.
It is now feasible, although not quite routine, to assay complete proteomes expressed from small genomes, e.g. bacterial pathogens, with the expectation that the resulting datasets will be accurate, reproducible, and relatively complete. The advent of inexpensive high-throughput sequencing of genomic DNA and cDNA has allowed routine strain-specific re-sequencing of genomes and the sequencing of transcriptomes for individual experiments, capabilities not available until very recent years. The greater complexity of eukaryotic gene expression, e.g. host response, will require proteomic databases based on annotations that are cell line or tissue specific, and that take into account, as far as possible, the various forms of splicing that occur, along with temporal changes in protein structure that depend on maturation, timing of the cell cycle, etc. In addition, in order to comprehensively measure host response to P. gingivalis invasion and infection at the protein level, mass spectrometry throughput would have to be much higher than what is currently possible in the absence of parallel data streams from several instruments. We are very interested in scaling up to look comprehensively at host response. The logical next step in our work will involve a parallel array of several small and relatively inexpensive mass spectrometers feeding data simultaneously through a high-bandwidth connection to a supercomputing cluster. The details as to how to do this are now well understood. What remains to be done to make this vision a practical tool is essentially engineering. In anticipation of future developments we have worked out the parallel post-acquisition data handling issues using the UW's HYAK system (http://escience.washington.edu/content/hyak-0) as a test bed. We have an interest in the fundamental aspects of data analysis for quantitative large-scale proteomics and transcriptomics experiments, as evidenced by our recent publications.
Besides the whole cell and model oral community protein and mRNA expression projects of current interest, the laboratory has worked on more focused protein characterization studies. The methods and instrumentation development research in the laboratory has always been driven by biology, primarily studies in the field of microbial pathogenesis and environmental microbiology. To a large extent this has involved protein and peptide sequencing strategies using on-line microcapillary HPLC ESI (electrospray ionization) mass spectrometry with low energy CAD (collisionally-activated dissociation), 2D gels and other techniques now collectively referred to as "proteomics." However, protein identification and relative abundance calculations tend to be only one part of our work. With respect to large-scale proteogenomic and transcriptomic studies, improving and better understanding the fundamental quantitative analytical properties of peptide spectral counting and transcription RPKM (Reads Per Kilobase of exon model per Million mapped reads) data are also active areas of concentration.
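For readers unfamiliar with the RPKM measure mentioned above, the following is a minimal sketch of the standard definition, RPKM = 10^9 x (reads mapped to a gene) / (total mapped reads x gene length in bases). It is an illustration only and not the laboratory's own analysis code; the function name `rpkm` and the gene names and counts in the usage example are hypothetical. For a prokaryote, the "exon model" length is assumed here to be simply the length of the gene.

```python
def rpkm(read_counts, gene_lengths_bp):
    """Return Reads Per Kilobase per Million mapped reads for each gene.

    read_counts: dict mapping gene name -> number of reads mapped to that gene
    gene_lengths_bp: dict mapping gene name -> gene (exon model) length in bases
    Both inputs are illustrative; real pipelines derive them from alignments
    and genome annotations.
    """
    total_mapped = sum(read_counts.values())
    if total_mapped == 0:
        raise ValueError("no mapped reads; RPKM is undefined")

    result = {}
    for gene, count in read_counts.items():
        length = gene_lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"gene length must be positive: {gene}")
        # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
        result[gene] = 1e9 * count / (total_mapped * length)
    return result


if __name__ == "__main__":
    # Hypothetical counts and lengths, for illustration only.
    counts = {"geneA": 500, "geneB": 1500}
    lengths = {"geneA": 1000, "geneB": 3000}
    for gene, value in rpkm(counts, lengths).items():
        print(gene, value)
```

In this made-up example the longer gene collects three times as many reads but receives the same RPKM as the shorter one, which is the length normalization the measure is designed to provide.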