Statistical Laboratory Seminar Notice

Juergen Kleffe
Charité-University Medicine, Campus Benjamin Franklin, Institute of Molecular Biology and Biochemistry
Section: Molecular Biology and Bioinformatics, Berlin, Germany

4:10 p.m., Monday, September 29, 2003
1104 Gilman

COFFEE: 3:45 p.m., 104 Snedecor Hall

Abstract
Expressed sequence tags (ESTs) offer a fast and inexpensive route to gene discovery and to the study of alternative splicing, gene expression, and gene regulation. But the more than five million human ESTs stored in GenBank also expose the limitations of current sequence comparison programs. For instance, given a single gene, the BLAST family of programs is very useful for finding evidence that supports or contradicts its annotation. A more complex problem is to find the largest possible subset of genes whose annotations are supported in a given way by matching ESTs or cDNAs. Here even milliseconds per comparison add up to many days if BLAST-based programs are applied in turn to a large number of candidate genes. We describe considerably faster methods for simultaneous sequence comparison based on suffix trees, suffix arrays, and their modifications, together with programs that answer questions such as: Which known genes of an organism have full cDNA support? Which genes have confirming or opposing EST or cDNA support? Which genes are annotated differently in different database entries?
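
To illustrate the idea of comparing one EST against many genes simultaneously, here is a minimal Python sketch of a suffix array built over several concatenated gene sequences. It is not the speaker's method: it handles exact substring matches only, uses a naive construction suitable for toy input, and ignores introns, sequencing errors, and reverse strands. The function names, the "$" separator, and the example sequences are all hypothetical.

```python
from bisect import bisect_right


def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order."""
    # Naive construction by sorting whole suffixes; fine for toy input only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions at which pattern occurs in text."""
    m = len(pattern)
    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


def index_genes(genes):
    """Concatenate all gene sequences and index them with one suffix array."""
    names, starts, parts = [], [], []
    pos = 0
    for name, seq in genes.items():
        names.append(name)
        starts.append(pos)
        parts.append(seq)
        pos += len(seq) + 1  # +1 for the separator
    # Assumption: "$" never occurs in a sequence, so no match spans two genes.
    text = "$".join(parts)
    return text, build_suffix_array(text), names, starts


def genes_supported_by(est, index):
    """Return the names of all genes that contain est as an exact substring."""
    text, sa, names, starts = index
    hits = find_occurrences(text, sa, est)
    # Map each hit position back to the gene whose range contains it.
    return sorted({names[bisect_right(starts, p) - 1] for p in hits})


# Hypothetical toy data, not real gene sequences.
genes = {
    "geneA": "ATGGCGTACGTTAG",
    "geneB": "ATGCCCGTACGAAA",
    "geneC": "TTTTGGGGCCCC",
}
index = index_genes(genes)
print(genes_supported_by("CGTACG", index))
print(genes_supported_by("GGGGCC", index))
```

The index is built once; each EST lookup is then a binary search over all genes at the same time, instead of one pairwise comparison per candidate gene.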