Introduction to Statistical and Computational Genomics
GENOME 559
Department of Genome Sciences
University of Washington School of Medicine
Course description
Rudiments of statistical and computational genomics. Emphasis on basic probability and statistics, and an introduction to computer programming. This course is intended to introduce students with non-computer science backgrounds to the major concepts of programming and statistics.
Learning objectives
After taking this course, students will be able to describe and perform basic analysis tasks relating to biological sequence analysis, phylogenetics, pedigree analysis, genetic association studies, population genetics and microarray analysis. Students will be able to demonstrate an understanding of fundamental statistical concepts, such as p-values, t-tests, chi-squared tests and multiple testing correction. Finally, students will be able to write computer programs to perform statistical and bioinformatics analyses.
Instructional staff
Instructor: William Stafford Noble
Email: noble@gs.washington.edu
Instructor: Mary Kuhner
Email: mkkuhner@gs.washington.edu
Instructor: Larry Ruzzo
Email: ruzzo@cs.washington.edu
Meeting times and locations
Tue/Thu 3:30-4:50 pm in Hitchcock 220
The class meets in a computer lab and will involve writing computer programs during class time.
Prerequisites
Substantial background in molecular and cellular biology, genetics, biochemistry or related disciplines.
Course materials
Bioinformatics: Sequence and Genome Analysis by Mount. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2004. Second edition.
Learning Python by Lutz. O'Reilly, 2007. Third edition.
Course requirements
Students will complete eight homework assignments during the course. Assignments will typically involve some written questions and some programming problems.
Examinations
The final exam will be open book, and will cover the entire quarter. The final exam is scheduled for Thursday, March 20, 4:30-6:20 pm in Hitchcock 220.
Course grade
Each of the eight homework assignments counts for 10% of the course grade, and the final exam counts for 20%.
Class schedule
| Lecture | Instructor | Lecture topic | Concepts | Programming topic | Reading | Homework |
| --- | --- | --- | --- | --- | --- | --- |
| Tue Jan 6 | Noble | Sequence comparison: Introduction and motivation | Substitution matrices, gap penalties | Introduction to Python | | |
| Thu Jan 8 | Noble | Sequence comparison: Dynamic programming | Dynamic programming, Needleman-Wunsch | Strings | Mount: ch. 3; Lutz: ch. 1-4, 7 | HW1 assigned |
| Tue Jan 13 | Noble | Sequence comparison: More dynamic programming | | Numbers, lists and tuples | Mount: ch. 6; Lutz: ch. 5, 8 | |
| Thu Jan 15 | Noble | Sequence comparison: Local alignment | Smith-Waterman | File I/O, if-then-else | Lutz: ch. 9-12 | HW1 due; HW2 assigned |
| Tue Jan 20 | Noble | Sequence comparison: Significance of similarity scores | distribution, p-value, extreme value distribution | for loops | Mount: ch. 4; Lutz: ch. 13 | |
| Thu Jan 22 | Kuhner | Phylogeny: Parsimony | heuristic search, "assumption-free" methods | while loops, modules | | HW2 due |
| Tue Jan 27 | Kuhner | Phylogeny: Models of the mutational process | least squares | Dictionaries | Lutz: pp. 103-107 | HW3 assigned |
| Thu Jan 29 | Kuhner | Parsimony: Distance and likelihood | maximum likelihood | Defining functions | Lutz: ch. 11 | |
| Tue Feb 3 | Kuhner | Phylogeny: Bayesian methods and MCMC | Bayes' Theorem, Markov chains | Printing and sorting | Lutz: pp. 87-90, 455-456 | HW3 due; HW4 assigned |
| Thu Feb 5 | Kuhner | Phylogeny: Validating phylogenies | likelihood ratio test, bootstrap | Regular expressions | Lutz: pp. 447-450 | |
| Tue Feb 10 | Ruzzo | Pedigree analysis: Probabilities of genes on pedigrees | LOD score | Objects and classes (part 1) | Lutz: ch. 19-20 | HW4 due; HW5 assigned |
| Thu Feb 12 | Ruzzo | Pedigree analysis: Modes of inheritance | ascertainment bias | Objects and classes (part 2) | Lutz: ch. 21 | |
| Tue Feb 17 | Ruzzo | Pedigree analysis: Fitting models to pedigrees | nuisance parameters | Biopython | | HW5 due; HW6 assigned |
| Thu Feb 19 | Ruzzo | Association studies: Detecting association between a trait and a gene | chi-square, Bonferroni correction, relative risk | | | |
| Tue Feb 24 | Ruzzo | Association studies: Avoiding pitfalls in association studies | bias, correlation vs. causation | | | |
| Thu Feb 26 | Ruzzo | Population genetics: Categorical data analysis | chi-square and multinomial distributions, testing in population genetics | | | HW6 due; HW7 assigned |
| Tue Mar 2 | Ruzzo | Population genetics: Genetic Diversity | Statistical approaches for quantifying DNA sequence variation; Expectation of a random variable | | | |
| Tue Mar 4 | Ruzzo | Whole genome association | Linear regression | | | HW7 due; HW8 assigned |
| Tue Mar 9 | Ruzzo | Whole genome association | t-test, p-value | | | |
| Thu Mar 11 | Ruzzo | Whole genome association | family-wise error rate, false discovery rate | | | HW8 due |