Enrich: Analysis of Protein Function by Enrichment and Depletion of Variants
Enrich is an analysis tool that uses high-throughput sequencing data to assess protein sequence-function relationships. Enrich takes FASTQ files as input, translates the reads, identifies unique protein sequences, and calculates an enrichment ratio between libraries for each sequence. Enrich can be run from the command line or in an interactive mode, and it can use paired-end read data. Each step of the pipeline can be run separately, or all steps can be run consecutively. Enrich can employ DRMAA to parallelize analyses in high-performance computing environments. For a description of Enrich input, function, and output, please see the following publications:
High-resolution mapping of protein sequence-function relationships. Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. Nat Methods. 2010 Sep;7(9):741-6. Epub 2010 Aug 15.
Deep mutational scanning: assessing protein function on a massive scale. Araya CL, Fowler DM. Trends Biotechnol. 2011 May 9. [Epub ahead of print]
Protein Functional Analysis by Enrichment and Depletion of Variants (Enrich). Fowler DM, Araya CL, Fields S. Manuscript in preparation (email D. Fowler for a copy).
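The core calculation described above (translate reads, count unique protein sequences, compare libraries) can be illustrated with a minimal sketch. This is not Enrich's own code: the function names translate, count_proteins, and enrichment_ratios are hypothetical, and the sketch assumes that an enrichment ratio is a variant's frequency in the selected library divided by its frequency in the input library. It also assumes reads are already in frame and skips FASTQ parsing, quality filtering, and paired-end handling.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA read; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def count_proteins(reads):
    """Count how often each unique protein sequence occurs in a library."""
    return Counter(translate(read) for read in reads)


def enrichment_ratios(input_counts, selected_counts):
    """Assumed definition: selected frequency / input frequency.

    Only sequences observed in the input library are reported, since
    the ratio is undefined when the input frequency is zero.
    """
    input_total = sum(input_counts.values())
    selected_total = sum(selected_counts.values())
    if input_total == 0 or selected_total == 0:
        return {}
    ratios = {}
    for protein, n_input in input_counts.items():
        n_selected = selected_counts.get(protein, 0)
        ratios[protein] = (n_selected / selected_total) / (n_input / input_total)
    return ratios


# Toy libraries (made-up reads, for illustration only).
input_reads = ["ATGGCT", "ATGGCT", "ATGGCC", "ATGTCT"]
selected_reads = ["ATGGCT", "ATGTCT", "ATGTCT", "ATGTCT"]

ratios = enrichment_ratios(
    count_proteins(input_reads), count_proteins(selected_reads)
)
for protein, ratio in sorted(ratios.items()):
    print(protein, round(ratio, 3))
```

In this toy example, synonymous reads (GCT and GCC both encode alanine) collapse onto the same protein sequence before counting, which is why the comparison is made at the protein level rather than the DNA level. A ratio above 1 indicates enrichment after selection and a ratio below 1 indicates depletion.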
Enrich is implemented in Python, and its core functions run without any dependencies. Enrich can be installed using easy_install, which is part of setuptools. Please see the documentation for instructions on how to install Enrich and the dependencies used for optional functions such as generating plots.
For support with Enrich, please write to Doug Fowler (dfowler at uw dot edu).