Conclusions
The development of high-throughput methods
and the establishment of commercial sources for even
highly specialized biochemical reagents for research in
molecular and cell biology over the past fifteen years have
brought a huge increase in the volume and diversity of
biological and biomedical data. Clinical use of these
technologies has already begun, and extensive, even
routine, application is imminent. Full, efficient
exploitation of these expensive investments in data
collection will require complementary investments in
data management technology.
To date, most efforts to manage this data have relied
either on commercial off-the-shelf DBMSs developed for
business data, or on homegrown systems that are neither
flexible nor scalable. Better data management
technology is needed to effectively address specific data
management needs of the life sciences. Such needs
include support for diverse data types (such as
sequences, graphs, and 3D structures) and queries (e.g.,
similarity-based retrieval), data provenance tracking, and
integration of numerous autonomous databases.