Juergen Kleffe
Charité-University Medicine, Campus Benjamin Franklin,
Institute of Molecular Biology and Biochemistry,
Section: Molecular Biology and Bioinformatics, Berlin, Germany
4:10 p.m., Monday, September 29, 2003
1104 Gilman
COFFEE: 3:45 p.m., 104 Snedecor Hall
Abstract
Expressed sequence tags (ESTs) offer a fast and inexpensive route to gene
discovery and to the study of alternative splicing, gene expression and
gene regulation. But the more than five million human ESTs stored in
GenBank also expose the limitations of current sequence comparison
programs. For instance, given a single gene, the BLAST family of programs
is very useful for finding evidence that supports or contradicts its
annotation. A more complex problem is to find the largest possible subset
of genes whose annotations are supported in a given way by matching ESTs
or cDNAs. Then even milliseconds add up to many days if we apply
BLAST-based programs in turn to a large number of candidate genes.
We describe considerably faster methods for simultaneous sequence
comparisons based on suffix trees, suffix arrays and their modifications,
and programs that answer questions such as: Which known genes of a given
organism have full cDNA support? For which genes is there confirming or
opposing EST or cDNA evidence? Which genes are annotated differently in
different database entries?
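To illustrate the idea of comparing against many sequences at once, the following is a minimal sketch of a suffix array built over a whole collection of ESTs, so that one binary search finds every EST containing a query fragment. It is not the speaker's implementation: the function names (build_index, find_hits), the "$" separator, the toy sequences and the naive sort-based construction are all assumptions made for illustration, and only exact matches are handled.

```python
def build_index(seqs):
    """Build a naive suffix array over all sequences joined by '$'.

    Hypothetical helper for illustration; sorting whole suffixes is
    simple but far too slow for real EST collections.
    """
    text = "".join(s + "$" for s in seqs)
    # owner[i] is the index of the sequence that covers position i of text.
    owner = []
    for k, s in enumerate(seqs):
        owner.extend([k] * (len(s) + 1))
    # Suffix array: start positions ordered by the suffix they begin.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return text, sa, owner


def _bound(text, sa, pattern, upper):
    """First suffix-array rank whose prefix is >= pattern (or > if upper)."""
    lo, hi = 0, len(sa)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_hits(index, pattern):
    """Return sorted indices of all sequences containing pattern exactly."""
    text, sa, owner = index
    first = _bound(text, sa, pattern, upper=False)
    last = _bound(text, sa, pattern, upper=True)
    # Every suffix ranked in [first, last) starts with the pattern.
    return sorted({owner[pos] for pos in sa[first:last]})


# Toy data, invented for this example.
ests = ["ACGTACGT", "TTGACC", "GGACGTT"]
index = build_index(ests)
hits = find_hits(index, "ACGT")
```

The index is built once and then reused for every query, which is where the saving over repeated pairwise searches would come from. A practical tool would need a linear-time construction and some tolerance for mismatches and gaps, neither of which this sketch attempts.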