Noble Research Lab

Department of Genome Sciences
University of Washington

Our research group develops and applies computational techniques for modeling and understanding biological processes at the molecular level. Our research emphasizes the application of statistical and machine learning techniques, such as hidden Markov models and support vector machines. We apply these techniques to various types of biological data, including DNA and protein sequence data, as well as gene expression data from microarray experiments. We are currently developing methods for analyzing shotgun proteomics data, for characterizing protein function, structure and interactions, and for understanding the structure and regulatory influence of chromatin.


Postdoctoral fellowships are available






    Front row: John Halloran, Charles Grant, Alice Cheng, Wenxiu Ma, Otis. Back row: Jeff Howbert, Dylan Holmes, Yvonne, Nelle Varoquaux, Max Libbrecht, Attila Kertesz-Farkas, Bill Noble.


    Lab members

    • William Stafford Noble, Professor, Genome Sciences

    • Jeff Howbert, Research Scientist, Genome Sciences

    • Ferhat Ay, Postdoctoral fellow, Genome Sciences

    • Attila Kertesz-Farkas, Postdoctoral fellow, Genome Sciences

    • Wenxiu Ma, Postdoctoral fellow, Genome Sciences

    • Timothy Durham, Ph.D. student, Department of Genome Sciences

    • Alex Hu, Ph.D. student, Department of Genome Sciences

    • Max Libbrecht, Ph.D. student, Department of Computer Science and Engineering

    • Haoran Cai, Masters student, Department of Statistics

    • Charles Grant, Senior programmer, Department of Genome Sciences

    • Kaipo Tamura, Software Engineer, Department of Genome Sciences

    • Alice Cheng, Undergraduate, Department of Computer Science and Engineering

    • Dylan Holmes, Undergraduate, Department of Computer Science and Engineering

    • Dawn Counts, Administrative assistant, Genome Sciences


    Poster produced for display in the lobby of the Genome Sciences Department, May 2012.

    Publications

    Software

    All of the software listed below is available with source code at the URLs specified. Where indicated, the software is also accessible through a free web server. Dates indicate software release dates; a range of years indicates multiple released versions.

    1. Meta-MEME is a motif-based hidden Markov model toolkit for modeling DNA and protein sequences. The Meta-MEME tools have been incorporated into the MEME Suite. 1998-2008.
    2. Family Pairwise Search is a protein homology detection algorithm that combines sequence similarity scores from a pairwise alignment algorithm such as Smith-Waterman or BLAST. Source code and a web server are available. 1999-2000.
    3. Gist implements the support vector machine learning algorithm for classification, as well as kernel principal components analysis. A web server based upon Gist is available. 1999-2006.
    4. matrix2png is a visualization tool for the display of matrix data. It is available for download or interactive web use. 2002-2006.
    5. Prism is a web interface to matrix2png that includes features specifically for visualizing microarray data. 2003.
    6. Rankprop uses diffusion across a network of protein similarities to identify remote homology relationships. Source code and a web server for searching the non-redundant protein database are available. 2004-2008.
    7. SVM-fold makes predictions of superfamily and fold level classifications of proteins based on the Structural Classification of Proteins hierarchy using the support vector machine learning algorithm. A web server is available. 2004-2007.
    8. ChargeCzar uses a support vector machine to discriminate between +2- and +3-charged tandem mass spectra, with the goal of reducing database search time by eliminating the need to search twice with each spectrum. 2005.
    9. BiblioSpec enables the identification of peptides from tandem mass spectra by searching against a database of previously identified spectra. 2006.
    10. HyFi identifies primer and microarray probe binding sites in genomic DNA. 2006.
    11. Percolator post-processes the results of a shotgun proteomics database search program, re-ranking peptide-spectrum matches so that the top of the list is enriched for correct matches. 2007-2008.
    12. HMMSeg performs wavelet smoothing and unsupervised HMM segmentation on genomic data sets. 2007.
    13. svmvia implements the full regularization path optimization algorithm for training a support vector machine. 2007.
    14. Ishtar designs PCR primers that target multiple loci. 2007.
    15. Pythia designs PCR primers from a thermodynamic point of view. 2007.
    16. Philius predicts protein transmembrane topology and signal peptides. 2008.
    17. Crux analyzes shotgun proteomics tandem mass spectra, associating peptides with observed spectra. 2008-2012.
    18. qvality performs nonparametric estimation of posterior error probabilities. 2008.
    19. Genomedata provides efficient storage of multiple tracks of numeric data anchored to a genome. 2010.
    20. Segway performs simultaneous segmentation and clustering of genomic signal data such as those from ChIP-seq and DNase-seq, finding recurring patterns in these data. 2010-2012.
    21. Segtools provides exploratory data analysis on genomic segmentations. 2010-2011.
    22. Fido uses a probability model to rank proteins according to the posterior probability of their presence in a complex mixture, based on evidence derived from a shotgun proteomics experiment. 2010.
    23. Tide is an ultra-fast implementation of the SEQUEST algorithm for identifying fragmentation mass spectra. 2011.
    24. Fit-Hi-C is a tool for assigning statistical confidence estimates to intra-chromosomal contact maps produced by genome-wide genome architecture assays such as Hi-C. 2014.

    Custom genome browser tracks

    • FIMO motif occurrences: These UCSC custom tracks show statistically significant matches between the human genome and motifs from the JASPAR CORE 2009 and TRANSFAC 10.2 motif databases. The quality of each match is summarized as a q-value, defined as the minimal false discovery rate threshold at which the match would be deemed significant. Only matches with a q-value ≤ 0.1 are shown. These tracks were generated using FIMO from the MEME Suite.
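
The q-value definition above can be illustrated with a minimal sketch. It assumes each match comes with a p-value and estimates the false discovery rate with the Benjamini-Hochberg procedure; this is a swapped-in, generic estimator, not necessarily the one FIMO uses. The function name `q_values` and the example p-values are hypothetical.

```python
def q_values(p_values):
    """Return Benjamini-Hochberg q-values, in the same order as the input.

    Assumption: FDR at a p-value threshold is estimated as p * m / rank,
    where m is the number of tests and rank is the 1-based position of p
    in sorted order. This is illustrative, not FIMO's own implementation.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the
    # minimum estimated FDR over all thresholds that accept this match.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        fdr = p_values[i] * m / rank
        running_min = min(running_min, fdr)
        q[i] = running_min
    return q


# Hypothetical p-values for four motif matches.
p = [0.001, 0.02, 0.03, 0.5]
q = q_values(p)
# Keep only matches passing the q-value <= 0.1 cutoff used for the tracks.
shown = [i for i, qi in enumerate(q) if qi <= 0.1]
```

Taking the running minimum is what makes this a q-value rather than a raw FDR estimate: a match is assigned the smallest FDR threshold at which it would still be called significant, so q-values never decrease as p-values increase.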

    Former lab members

    • Zafer Aydin, Assistant Professor, Department of Electrical and Electronics Engineering, Bahcesehir University, Istanbul, Turkey
    • Asa Ben-Hur, Assistant Professor, Department of Computer Science, Colorado State University, Fort Collins
    • Xiaoyu Chen, Illumina
    • Eleazar Eskin, Associate Professor, Department of Computer Science, Department of Human Genetics, University of California, Los Angeles
    • Michael Hoffman, Scientist, Princess Margaret Cancer Centre, Toronto, Canada; Assistant Professor, Department of Medical Biophysics, University of Toronto
    • Victoria Haghighi, Associate Professor, Department of Psychiatry, Columbia University
    • Lukas Käll, Assistant Professor, Center for Biomembrane Research, Department of Biochemistry & Biophysics, Stockholm University
    • Aaron Klammer, Pacific Biosciences
    • Darrin Lewis, Postdoctoral fellow, Cold Spring Harbor Laboratory
    • Li Liao, Associate Professor, Department of Computer and Information Sciences, University of Delaware
    • Tobias Mann, Director of Bioinformatics, Progenity
    • Sean McIlwain, Bioinformatics Researcher, Great Lakes Bioenergy Research Center, University of Wisconsin
    • Merja Oja, VTT Technical Research Centre of Finland
    • Paul Pavlidis, Associate Professor of Psychiatry, University of British Columbia
    • Sheila Reynolds, Senior Research Scientist, Institute for Systems Biology
    • Oliver Serang, Research Fellow, Department of Neurobiology, Harvard Medical School / Proteomics Center, Children's Hospital Boston
    • Ilan Wapinski, Systems Biology Fellow, Department of Systems Biology, Harvard University
    • Habil Zare, Assistant Professor, Department of Computer Science, Texas State University

    Hike to Heather Lake, May 2006. A party, July 2006. Annual picnic, August 2006. Hike to Lake 22, June 2007. Annual picnic, August 2007. Hike to Wallace Falls, May 2008. Annual picnic, October 2008. Hike to Heather Lake, June 2009. Hike to Gold Creek, June 2010. Annual picnic, August 2010. Goodbye party for Michael Mathews, December 2010. Hike to Boulder River, May 2011. Hike to Talapus Lake, June 2012. Hike to Bridal Veil Falls, July 2013. Goodbye party for Habil Zare, June 2014. Hike to Annette Lake, June 2014.

    The lab is located in Foege, room S220.
