One problem bioinformatics seeks to address is the explosion of biological data now available for many canonical organisms. This data must be curated, organized, stored, and made accessible to researchers of various technical skill levels. A key challenge in storing such data is finding ways to compare and integrate data from different sources.
Here we consider the example of maize (corn). Two separate groups have created databases for this organism from the same starting gene model. Because the groups used different protein prediction algorithms and applied varying levels of curation, the resulting databases differ significantly. We perform a detailed comparison of these two databases.
We identify a key area of curation present in one database but not the other: high-quality manual Gene Ontology (GO) annotations. We seek to merge these annotations by transferring them automatically from one database to the other and connecting them with the equivalent objects in the second database. To this end, we designed and developed a Java-based tool with a graphical user interface that guides users through the transfer process.
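The core of such a transfer can be illustrated with a minimal sketch. It is not the tool described above. It assumes that annotations are available as a map from source object identifiers to sets of GO term identifiers, and that a mapping from source objects to their equivalents in the second database has already been derived. The class name `AnnotationTransfer`, the method `transfer`, and all identifiers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: copy GO annotations from objects in a source database
// onto the equivalent objects in a target database.
public class AnnotationTransfer {

    // Outcome of one transfer run.
    public static final class Result {
        // Target object ID -> GO term IDs attached to it by the transfer.
        public final Map<String, Set<String>> transferred = new TreeMap<>();
        // Source object IDs that have no equivalent in the target database.
        public final List<String> unmapped = new ArrayList<>();
    }

    // sourceAnnotations: source object ID -> GO term IDs (assumed input shape).
    // sourceToTarget:    source object ID -> equivalent target object ID (assumed to exist).
    public static Result transfer(Map<String, Set<String>> sourceAnnotations,
                                  Map<String, String> sourceToTarget) {
        Result result = new Result();
        for (Map.Entry<String, Set<String>> entry : sourceAnnotations.entrySet()) {
            String targetId = sourceToTarget.get(entry.getKey());
            if (targetId == null) {
                // No equivalent object: record it for manual review instead of guessing.
                result.unmapped.add(entry.getKey());
                continue;
            }
            // Several source objects may map to one target object, so merge the term sets.
            result.transferred
                  .computeIfAbsent(targetId, k -> new TreeSet<>())
                  .addAll(entry.getValue());
        }
        return result;
    }
}
```

Keeping unmapped source objects in a separate list, rather than silently dropping them, lets a curator see which manual annotations could not be carried over.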
Finally, we propose a method to integrate two different types of biological data: transcriptional regulation and metabolic reactions. In contrast to existing methods, which use transcriptional regulation networks to limit the solution space of the constraint-based metabolic model, we seek to define a transcriptional regulatory space that can be associated with the metabolic distribution of interest. This allows us to make inferences about how changes in the regulatory network could lead to improved metabolic flux.
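For orientation, the solution space of a constraint-based metabolic model is commonly written in the standard flux balance form below. This is a sketch under stated assumptions, not the formulation of the proposed method. The symbols are assumptions introduced here: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ an objective vector, and $v^{\min}_j$, $v^{\max}_j$ the flux bounds. The last constraint illustrates how a regulatory network can limit the solution space, as in the existing methods mentioned above.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& v^{\min}_j \le v_j \le v^{\max}_j \quad \text{for every reaction } j, \\
& v_j = 0 \quad \text{if the regulatory network marks reaction } j \text{ as inactive.}
\end{aligned}
```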