Algorithmic Approaches to Data-Driven Knowledge Discovery in Bioinformatics
Dr. Vasant Honavar
Associate Professor of Computer Science
Iowa State University
Friday, November 10, 2000
12:10 pm
Room 171 Durham Hall
In this talk, I will give an overview of algorithmic approaches to data-driven knowledge discovery from large biological datasets, with emphasis on current research projects in the Artificial Intelligence Research Laboratory at Iowa State University. I will describe the application of these approaches to two problems in computational molecular biology: assignment of protein sequences to functional families, and genetic network inference from gene expression data. I will conclude with a brief discussion of some of our related work on an infrastructure for selective information retrieval, information fusion, and knowledge discovery from large, dynamic, distributed, autonomous biological data sources.
Complex Adaptive Systems Seminar Series
Using genetic algorithms to help align DNA strings
Dr. Steven Willson
Professor of Mathematics
Iowa State University
Friday, November 10, 2000
1:30 pm
Room 1304 Howe Hall
Abstract: Two DNA strings for analogous proteins in different taxa frequently need to be "aligned," so that it becomes clear at which sites substitutions occurred and in which regions insertions or deletions occurred. The standard procedure for aligning strings involves giving a score to each match, each mismatch, each opening of a gap, and each extension of a gap. There is a well-known, fairly fast algorithm to find an alignment with an optimal total score using those scoring parameters.
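As a rough illustration of the four-parameter scoring just described, the following minimal sketch computes the optimal global alignment score by dynamic programming with three tables (the approach usually attributed to Gotoh). It is not taken from the talk. The function name, the default parameter values, and the convention that a gap of length k scores gap_open + (k - 1) * gap_extend are all assumptions made for the example.

```python
NEG_INF = float("-inf")


def affine_alignment_score(a, b, match=1.0, mismatch=-1.0,
                           gap_open=-3.0, gap_extend=-1.0):
    """Return the best global alignment score of strings a and b.

    Assumed convention (not from the talk): the first position of a gap
    scores gap_open, every further position scores gap_extend.
    """
    n, m = len(a), len(b)
    # M[i][j]: best score with a[i-1] aligned to b[j-1]
    # X[i][j]: best score with a[i-1] aligned to a gap
    # Y[i][j]: best score with b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap from M or Y, or extend a gap already open in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            # Open a new gap from M or X, or extend a gap already open in Y.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_alignment_score("ACGT", "AGT"))
```

The running time is proportional to the product of the two string lengths. Changing any of the four parameters can change which alignment is optimal, which is why the choice of parameters matters.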
We consider scoring schemes with varying numbers of parameters. Choosing the scoring parameters themselves is not easy. We will look at some experiments in which scoring parameters were sought by means of genetic algorithms. We will look at the effects of some different kinds of scoring parameters on a variety of datasets. We may even discuss some methods for alignment that do not depend on scoring parameters. I hope that we will brainstorm about different approaches to the problem.
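To make the idea of searching for scoring parameters with a genetic algorithm concrete, here is a generic sketch. The abstract does not say how candidate parameters were evaluated in the experiments, so the fitness function is left to the caller. The function evolve_parameters, the toy fitness, the target vector, and all settings (population size, mutation width, bounds) are hypothetical choices for illustration only.

```python
import random


def evolve_parameters(fitness, bounds, pop_size=20, generations=30,
                      mutation_sd=0.5, seed=0):
    """Search for a parameter vector that maximizes fitness.

    Hypothetical sketch: truncation selection (the top half survives),
    uniform crossover, and Gaussian mutation clipped to the given bounds.
    pop_size must be at least 4 so that two distinct parents can be drawn.
    """
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # survivors carry over unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            # Uniform crossover: take each gene from one parent at random.
            child = [rng.choice(pair) for pair in zip(p, q)]
            # Gaussian mutation, clipped back into the allowed range.
            child = [max(lo, min(hi, g + rng.gauss(0.0, mutation_sd)))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy fitness for demonstration only: closeness to an arbitrary target
# vector (match, mismatch, gap_open, gap_extend). A real experiment would
# score candidates by the quality of the alignments they produce.
TARGET = [1.0, -1.0, -3.0, -1.0]


def toy_fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


if __name__ == "__main__":
    bounds = [(0.0, 5.0), (-5.0, 0.0), (-10.0, 0.0), (-5.0, 0.0)]
    print(evolve_parameters(toy_fitness, bounds))
```

Because the surviving half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.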