Noble Research Lab

Department of Genome Sciences
University of Washington

Our research group develops and applies computational techniques for modeling and understanding biological processes at the molecular level. Our research emphasizes the application of statistical and machine learning techniques, such as hidden Markov models and support vector machines. We apply these techniques to various types of biological data, including protein and DNA sequences, data from high-throughput genomic assays such as ChIP-seq and Hi-C, and tandem mass spectra. We are currently developing methods for analyzing shotgun proteomics data, for characterizing protein function, structure and interactions, and for understanding the structure and regulatory influence of chromatin.

A postdoctoral fellowship is available, focusing on the analysis of genome 3D architecture.

    Back row: Damon May, Charles Grant, Lindsay Pino, Andy Lin, Giancarlo Bonora, Tim Durham, Joe Janizek. Front row: Jie Liu, Dejun Lin, Kate Cook, Bill Noble. Seated: Jeff Howbert, Otis.


    Lab members

    • William Stafford Noble, Professor, Genome Sciences

    • Jeff Howbert, Research Scientist, Genome Sciences

    • Giancarlo Bonora, Postdoctoral fellow, Genome Sciences

    • Kate Cook, Postdoctoral fellow, Genome Sciences

    • Max Libbrecht, Postdoctoral fellow, Genome Sciences

    • Dejun Lin, Postdoctoral fellow, Genome Sciences

    • Jie Liu, Postdoctoral fellow, Genome Sciences

    • Gurkan Yardimci, Postdoctoral fellow, Genome Sciences

    • Timothy Durham, Ph.D. student, Department of Genome Sciences

    • Alex Hu, Ph.D. student, Department of Genome Sciences

    • Andy Lin, Ph.D. student, Department of Genome Sciences

    • Damon May, Ph.D. student, Department of Genome Sciences

    • Lindsay Pino, Ph.D. student, Department of Genome Sciences

    • Jacob Schreiber, Ph.D. student, Department of Computer Science and Engineering

    • Charles Grant, Senior programmer, Department of Genome Sciences

    • Kaipo Tamura, Software Engineer, Department of Genome Sciences

    • Alice Cheng, Software Engineer, Department of Genome Sciences

    • Dawn Counts, Administrative assistant, Genome Sciences

    Poster produced for display in the lobby of the Genome Sciences Department, May, 2012.



    All of the software listed below is available with source code at the URLs specified. When indicated, some of the software is augmented with a free web server. Dates indicate release dates of the software, with multiple years indicating multiple released versions.

    1. Meta-MEME is a motif-based hidden Markov model toolkit for modeling DNA and protein sequences. The Meta-MEME tools have been incorporated into the MEME Suite. 1998-2008.
    2. Family Pairwise Search is a protein homology detection algorithm that combines sequence similarity scores from a pairwise alignment algorithm such as Smith-Waterman or BLAST. Source code and a web server are available. 1999-2000.
    3. Gist implements the support vector machine learning algorithm for classification, as well as kernel principal components analysis. A web server based upon Gist is available. 1999-2006.
    4. matrix2png is a visualization tool for the display of matrix data. It is available for download or interactive web use. 2002-2006.
    5. Prism is a web interface to matrix2png that includes features specifically for visualizing microarray data. 2003.
    6. Rankprop uses diffusion across a network of protein similarities to identify remote homology relationships. Source code and a web server for searching the non-redundant protein database are available. 2004-2008.
    7. SVM-fold makes predictions of superfamily and fold level classifications of proteins based on the Structural Classification of Proteins hierarchy using the support vector machine learning algorithm. A web server is available. 2004-2007.
    8. ChargeCzar uses a support vector machine to discriminate between +2- and +3-charged tandem mass spectra, with the goal of reducing database search time by eliminating the need to search twice with each spectrum. 2005.
    9. BiblioSpec enables the identification of peptides from tandem mass spectra by searching against a database of previously identified spectra. 2006.
    10. HyFi identifies primer and microarray probe binding sites in genomic DNA. 2006.
    11. Percolator post-processes the results of a shotgun proteomics database search program, re-ranking peptide-spectrum matches so that the top of the list is enriched for correct matches. 2007-2008.
    12. HMMSeg performs wavelet smoothing and unsupervised HMM segmentation on genomic data sets. 2007.
    13. svmvia implements the full regularization path optimization algorithm for training a support vector machine. 2007.
    14. Ishtar designs PCR primers that target multiple loci. 2007.
    15. Pythia designs PCR primers from a thermodynamic point of view. 2007.
    16. Philius predicts protein transmembrane topology and signal peptides. 2008.
    17. Crux analyzes shotgun proteomics tandem mass spectra, associating peptides with observed spectra. 2008-2012.
    18. qvality performs nonparametric estimation of posterior error probabilities. 2008.
    19. Genomedata provides efficient storage of multiple tracks of numeric data anchored to a genome. 2010.
    20. Segway performs simultaneous segmentation and clustering of genomic signal data such as those from ChIP-seq and DNase-seq, finding recurring patterns in these data. 2010-2012.
    21. Segtools provides exploratory data analysis on genomic segmentations. 2010-2011.
    22. Fido uses a probability model to rank proteins according to the posterior probability of their presence in a complex mixture, based on evidence derived from a shotgun proteomics experiment. 2010.
    23. Tide is an ultra-fast implementation of the SEQUEST algorithm for identifying fragmentation mass spectra. 2011.
    24. Fit-Hi-C is a tool for assigning statistical confidence estimates to intra-chromosomal contact maps produced by genome-wide genome architecture assays such as Hi-C. 2014.
    25. Pastis infers the three-dimensional structure of the genome on the basis of Hi-C data. 2015.
    26. DRIP Toolkit is a tandem mass spectrometry search engine that uses a dynamic Bayesian network model. 2016.

    Former lab members

    • Ferhat Ay, Institute Leadership Assistant Professor of Computational Biology, La Jolla Institute for Allergy and Immunology
    • Zafer Aydin, Assistant Professor, Computer Engineering Department, Abdullah Gul University, Kayseri, Turkey
    • Asa Ben-Hur, Associate Professor, Department of Computer Science, Colorado State University, Fort Collins
    • Xiaoyu Chen, Illumina
    • Eleazar Eskin, Associate Professor, Department of Computer Science, Department of Human Genetics, University of California, Los Angeles
    • Michael Hoffman, Scientist, Princess Margaret Cancer Centre, Toronto, Canada; Assistant Professor, Department of Medical Biophysics, University of Toronto
    • Victoria Haghighi, Associate Professor, Department of Psychiatry, Columbia University.
    • Lukas Käll, Assistant Professor, Center for Biomembrane Research, Department of Biochemistry & Biophysics, Stockholm University.
    • Attila Kertesz-Farkas, Assistant Professor, School of Data Analysis and Artificial Intelligence, the Faculty of Informatics, National Research University Higher School of Economics in Moscow, Russian Federation.
    • Aaron Klammer, Pacific Biosciences
    • Darrin Lewis, Postdoctoral fellow, Cold Spring Harbor Laboratory
    • Li Liao, Associate Professor, Department of Computer and Information Sciences, University of Delaware
    • Wenxiu Ma, Assistant Professor, Department of Statistics, UC Riverside
    • Tobias Mann, Director of Bioinformatics, Progenity
    • Sean McIlwain, Bioinformatics Researcher, Great Lakes Bioenergy Research Center, University of Wisconsin
    • Merja Oja, VTT Technical Research Centre of Finland
    • Paul Pavlidis, Professor of Psychiatry, University of British Columbia
    • Sheila Reynolds, Senior Research Scientist, Institute for Systems Biology
    • Oliver Serang, Assistant Professor of Computer Science, Freie Universität Berlin and the Leibniz-Institute of Freshwater Ecology and Inland Fisheries (IGB)
    • Ilan Wapinski, Systems Biology Fellow, Department of Systems Biology, Harvard University
    • Habil Zare, Assistant Professor, Department of Computer Science, Texas State University

    Hike to Heather Lake, May 2006. A party, July 2006. Annual picnic, August 2006. Hike to Lake 22, June 2007. Annual picnic, August 2007. Hike to Wallace Falls, May 2008. Annual picnic, October 2008. Hike to Heather Lake, June 2009. Hike to Gold Creek, June 2010. Annual picnic, August 2010. Goodbye party for Michael Mathews, December 2010. Hike to Boulder River, May 2011. Hike to Talapus Lake, June 2012. Hike to Bridal Veil Falls, July 2013. Goodbye party for Habil Zare, June 2014. Hike to Annette Lake, June 2014. Hike to Snow Lake, July 2015. Hike to Denny Creek, August 2016.

    The lab is located in Foege, room S220.
