Medical data and applications are heterogeneous, extremely large and ever-growing, and expensive to store. Cloud-based systems are well suited to these challenges because they offer cost-effectiveness, disaster recovery, elasticity, manageability, and availability.
Our work focuses in particular on Digital Imaging and Communications in Medicine (DICOM: http://medical.nema.org), one of the most important medical standards. The primary objectives of this standard are to achieve interoperability between medical imaging systems and to facilitate the exchange of medical data.
The wide use of this standard in the medical domain has led to the development of several DICOM management systems: the Picture Archiving and Communication System (PACS) [14], the most widely used DICOM management system, which mostly relies on relational databases to store DICOM files; eDiaMoND [21], a grid-enabled database of mammogram images; and the ORDICOM data type in Oracle 11g [20], which allows a DICOM file to be stored as an object in a column of a database table. Unfortunately, such systems are highly expensive, dependent on IT experts, weak in expressiveness, and/or not scalable. In particular, in current systems the crash of a server may prevent doctors from retrieving a required image unless it is also stored on a separate portable disk.
The disparity of data management requirements (analytical/statistical needs) has led to the development of new DBMSs that depart from traditional databases. These systems propose either a column-oriented architecture (DSM [10], C-Store [24]) or a hybrid row-column one (Fractured Mirrors [23], HYRISE [17]). In our context, the drawbacks of such systems are their high tuple reconstruction time and/or their inability to overcome the DICOM heterogeneity issue. In fact, even data integration systems [22] [7] cannot cope with our heterogeneity problem, because DICOM files are subject to very large-scale heterogeneity and a continually evolving schema.
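To make the tuple reconstruction cost concrete, the following minimal Python sketch decomposes a few heterogeneous records into per-attribute columns, in the spirit of a DSM-style layout. It is an illustration under stated assumptions and not the storage model of any system cited above: the records, the attribute names, and the helper functions `to_column_store`, `scan_column`, and `reconstruct_tuple` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical DICOM-like records; attribute names are illustrative only.
# Different records carry different attributes (heterogeneity).
records = [
    {"PatientID": "P1", "Modality": "CT", "SliceThickness": 1.25},
    {"PatientID": "P2", "Modality": "MR", "EchoTime": 30.0},
    {"PatientID": "P3", "Modality": "CT", "SliceThickness": 2.5},
]


def to_column_store(rows):
    """Decompose rows into one {row_id: value} map per attribute."""
    columns = defaultdict(dict)
    for row_id, row in enumerate(rows):
        for attr, value in row.items():
            columns[attr][row_id] = value
    return dict(columns)


def scan_column(columns, attr):
    """Read a single attribute; only one column is touched."""
    return list(columns.get(attr, {}).values())


def reconstruct_tuple(columns, row_id):
    """Rebuild a full record; every column has to be probed."""
    return {attr: col[row_id] for attr, col in columns.items() if row_id in col}


columns = to_column_store(records)
modalities = scan_column(columns, "Modality")
rebuilt = reconstruct_tuple(columns, 1)
```

In this sketch, reading one attribute touches a single column, whereas rebuilding a record probes every column, so the reconstruction cost grows with the number of distinct attributes. Each new kind of record also adds columns that are empty for most rows, which hints at why a highly heterogeneous attribute set is hard to accommodate in a purely column-oriented layout.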
Data management in the cloud is a new research domain. Different systems have been proposed: DBMS instances (RDS [4], SQL Azure [5]), key-value stores (Amazon S3 [4]), column-oriented stores (Bigtable [9]), and MapReduce [12]. Current systems are not fully developed, do not address the schema evolution issue that is very common in our context, and/or cannot efficiently manage complex, heterogeneous data structures such as those found in medical data.
This paper presents the first steps towards a medical hybrid data management system for cloud computing. Our solution provides an appropriate storage model that overcomes the intense heterogeneity, complexity, and huge size of DICOM files. As a result, the system can support any DICOM file and provide high expressiveness. The system is extensible and adaptable to new imaging modalities. It takes advantage of cloud features to provide a highly available and cost-effective solution that enables users to find a good compromise between storage spaces
Cloud-I '12, August 31 2012, Istanbul, Turkey
Copyright 2012 ACM 978-1-4503-1596-8/12/08…$15.00.