Schedule: Tuesday & Thursday
Location: TTI-C conference room 526 on the 5th floor,
Instructor: Jinbo Xu (jinboxu@gmail.com, office: TTI-C room 528)
First meeting date:
Please drop me an email if you are interested in this course.
Each student will be required to present papers on a specific topic.
Students can register for this course through the
With the availability of large-scale genomic, expression and structural data, mathematics, statistics and computer science are being used extensively to understand biological data at the molecular level. This course focuses on the application of mathematical models and computer algorithms to problems in molecular biology. It consists of two major components. The first covers fundamental computational approaches to biological sequence analysis, including pairwise and multiple sequence alignment, homology search and sequence motif discovery. The second covers computational approaches to protein and RNA structural bioinformatics, including structure comparison, structure prediction and function prediction. If time permits, we will also touch on other topics such as biological network analysis.
Students are strongly encouraged to read the following materials before attending this class, since they will not be covered in class.
1. The Department of Energy's Primer on Molecular Genetics.
2. The Department of Energy's Overview of the Human Genome Project.
3. Hunter's molecular biology for computer scientists.
4. National New Biology Initiative: A New Biology for the 21st Century.
Here is an old syllabus for this course. A temporary reading list is available here.
This course is intended for graduate students or senior undergraduate students with a background in math, CS, statistics or biology. To be able to finish the assignment and the final project, students should also be able to do some programming in C++, Java, Matlab or other scientific computing software.
There will be no examination for this course. The final grade consists of three components: one assignment, one final project and attendance. There are two options for the assignment: present several papers on a specific topic in class (2 lectures in total), or implement an existing algorithm, analyze its performance and write a technical report (around 5 pages). The assignment accounts for 35% of the final grade. The final project requires students to write a survey paper on a specific topic or to propose a novel approach to a bioinformatics problem, and accounts for 55% of the final grade. All students are required to complete both the assignment and the final project; however, undergraduate students will be marked more generously. Students must attend class to earn the remaining 10%.
1. Assignment. If you choose to give 2 lectures, please choose papers on a specific topic from the reading list. If you choose to implement an algorithm, please choose one problem from the following list. (to be updated)
· Implement a dynamic programming algorithm for both global alignment and local alignment
· Implement a protein secondary structure prediction algorithm (SVM, CRF)
· Implement a protein structure alignment program
· Implement an RNA secondary structure prediction algorithm
The assignment is due in the middle of the fall quarter. You can use existing libraries or Matlab to implement your algorithm. However, please clearly point out your own contribution in your report. If you use other bioinformatics libraries, please pay more attention to the analysis of your results.
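As a rough starting point for the first problem in the list above, here is a minimal Python sketch of the dynamic programming recurrences for global and local pairwise alignment. It is only an illustration: the function name align_score, the scoring values (match, mismatch, gap) and the linear gap penalty are assumptions made for this example, and it computes only the optimal score, without the traceback that recovers the alignment itself.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal alignment score of sequences a and b.

    local=False: global alignment (Needleman-Wunsch style recurrence).
    local=True:  local alignment (Smith-Waterman style recurrence).
    The scoring values are illustrative assumptions (linear gap penalty).
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, n + 1):
            dp[i][0] = i * gap
        for j in range(1, m + 1):
            dp[0][j] = j * gap
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                dp[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                dp[i - 1][j] + gap,      # gap in b
                dp[i][j - 1] + gap,      # gap in a
            )
            if local:
                # Local alignment: a score never drops below zero,
                # and the answer is the best cell anywhere in the table.
                score = max(score, 0)
                best = max(best, score)
            dp[i][j] = score
    return best if local else dp[n][m]


if __name__ == "__main__":
    print(align_score("ACGT", "AGT"))                       # global score
    print(align_score("TTACGTT", "GGACGGG", local=True))    # local score
```

A full submission would also need the traceback, a realistic substitution matrix and gap model, and a performance analysis for the report.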
2. Final project. Please choose one topic from the following list. Students are also encouraged to propose topics that they are interested in. However, you cannot work on the same topic for both your assignment and your final project. That is, if you choose an assignment on protein secondary structure prediction, then you have to do a final project on a topic other than secondary structure prediction.
a. Homology search
b. Multiple sequence alignment
c. Sequence motif discovery
d. Protein structure alignment
e. Protein secondary structure prediction
f. Protein threading
g. Homology modeling
The final project is due in early December 2009. Please send me a brief abstract (one paragraph) to tell me what you are going to do before