Vincent J. Carey
RDBMS in Bioinformatics: The Bioconductor Experience
****************************************************

Bioconductor (http://www.bioconductor.org/) is an open-source collection
of resources aimed at transparently advancing the theory and practice
of bioinformatics, with a focus on expression arrays and the R
statistical computing environment. I will sketch the key data
structures and data flow processes addressed in Bioconductor thus
far. I will review the role played by relational database management
systems (RDBMS) in the development and curation of packaged annotation
networks and in the analysis of
Serial Analysis of Gene Expression (SAGE) libraries.  Non-relational
database technologies such as BerkeleyDB and HDF5 have also played
a role in tools for archiving and navigating expression array data.
At present, the role of RDBMS in Bioconductor is less pronounced than
had been anticipated. That role will grow as requirements for query
optimization and data structure standardization emerge and as volumes
of data and metadata increase.