I am interested in large-scale analyses of proteins, genomes and metagenomes.
Metagenomics is the study of genomic material extracted directly from the environment. New sequencing technologies have enabled the study of whole populations of genomes taken from microbial communities in the field, as opposed to single-species clonal cultures in the lab. Metagenomics offers a way to study how genomes evolve to cope with the microbial biotic and abiotic environments. Our lab helped develop a method to study the correlation between the human gut microbiota and gut gene expression. We are applying this method to the study of infant gut development and of the effect of gut microbes on human health and wellness.
Bacterial Genome Evolution: Gene blocks are a common occurrence in bacteria: these are genes which lie close together on the chromosome, and may participate in a common cellular or biochemical function. Operons are gene blocks whose member genes are co-transcribed. We have developed a new method to describe the evolution of operons and gene blocks in bacteria. We describe a small set of evolutionary events that can take place in gene block evolution, and count these events to create a new type of molecular clock that tells us how fast or how slow certain gene blocks may have evolved. We hope to learn how new functions are acquired by ensembles of genes such as these.
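The idea of counting events can be illustrated with a minimal sketch. It is not our published method: the three event types used here (deletion, duplication, split), the function name `count_events`, the `max_gap` threshold, and the input layout are all assumptions made for illustration only.

```python
from collections import Counter


def count_events(reference, target, max_gap=1):
    """Count illustrative events between a reference gene block and a target genome.

    reference: list of gene names in the reference gene block.
    target:    list of (gene_name, gene_index) pairs for homologs found in the
               target genome; gene_index is the gene's ordinal position on the
               chromosome (hypothetical input format).
    max_gap:   largest index difference at which two neighboring homologs
               are still treated as adjacent (assumed threshold).
    """
    ref_genes = set(reference)
    found = Counter(gene for gene, _ in target if gene in ref_genes)

    # A reference gene with no homolog in the target counts as one deletion.
    deletions = sum(1 for gene in ref_genes if found[gene] == 0)

    # Every extra copy of a reference gene counts as one duplication.
    duplications = sum(n - 1 for n in found.values() if n > 1)

    # A gap larger than max_gap between neighboring homologs counts as one split.
    positions = sorted(pos for gene, pos in target if gene in ref_genes)
    splits = sum(1 for a, b in zip(positions, positions[1:]) if b - a > max_gap)

    return {"deletion": deletions, "duplication": duplications, "split": splits}


reference_block = ["a", "b", "c", "d"]
target_homologs = [("a", 10), ("b", 11), ("b", 12), ("d", 40)]
events = count_events(reference_block, target_homologs)
```

In this toy example gene `c` is missing, gene `b` appears twice, and gene `d` lies far from the rest of the block, so each event type is counted once. Summing such counts over many genomes is one simple way to compare how conserved different gene blocks are.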
Another interest of mine is the prediction of protein function. Genomics, proteomics and various other ``-omics'' inundate us with sequence and structure information, but the biological functions of those proteins in many cases still elude us. Computational prediction of protein and gene function is a rapidly growing research field in bioinformatics. I am the co-organizer of the automated computational protein function prediction meetings: AFP. The AFP meetings bring together researchers to discuss various methods for protein function prediction. My personal interest in function prediction lies in predicting function from protein structure. We have recently started work on predicting a gene's function from its genomic context in bacteria, using both genomic and metagenomic data towards that end.
We are interested in locating ``structural signatures'' that span different protein folds. My working hypothesis is that there are short local structural commonalities between proteins that otherwise share no obvious structure or function. Detecting these commonalities can help us understand protein evolution, folding, and design.
Different Representations of Protein Structures
The computational representation of a protein's 3D structure is a challenging problem because of varying and often conflicting considerations. At first sight it seems that, as far as information is concerned, more is better, hence the drive towards atomic-level description. However, elaboration at the atomic level can be very ``noisy'' and can be time- and memory-intensive. Therefore we often ask what is the minimal information we need to achieve a specific task, without going into the unnecessary detail of representing each and every atom. I am interested in different computational representations of protein structures suitable for different tasks. In one study we have shown that a 1D representation of protein structures can be used for fast database searching and alignments, and still preserve relevant structural information.
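To give a feel for what a 1D representation can look like, here is a minimal sketch that turns a C-alpha trace into a string of letters. It is not the representation used in our study: the choice of the angle between consecutive C-alpha atoms as the local descriptor, the four-letter alphabet, the equal-width bins, and the function name `encode_backbone` are all assumptions made for illustration.

```python
import math


def encode_backbone(ca_coords, alphabet="ABCD"):
    """Map a C-alpha trace to a string, one letter per interior residue.

    ca_coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    alphabet:  letters assigned to equal-width bins of the angle formed at
               each residue by its two neighbors, from 0 to 180 degrees.
    """
    bin_width = 180.0 / len(alphabet)
    letters = []
    for prev, cur, nxt in zip(ca_coords, ca_coords[1:], ca_coords[2:]):
        # Vectors from the current residue to its two neighbors.
        u = [p - c for p, c in zip(prev, cur)]
        v = [n - c for n, c in zip(nxt, cur)]
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        cosine = max(-1.0, min(1.0, dot / norm))
        angle = math.degrees(math.acos(cosine))
        # The last bin also takes the 180-degree endpoint.
        index = min(int(angle // bin_width), len(alphabet) - 1)
        letters.append(alphabet[index])
    return "".join(letters)


trace = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
encoded = encode_backbone(trace)
```

Once structures are reduced to strings like this, ordinary sequence tools (substring search, sequence alignment) can be applied to them, which is where the speed advantage over full 3D comparison comes from. The price is that only the local geometry captured by the chosen descriptor is retained.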