However, there are many challenges
on the road to ever more advanced computing,
including, but not limited to,
system power consumption and environmentally
friendly cooling, massive
parallelism and component failures,
data and transaction consistency, metadata
and ontology management, precision
and recall at scale, and multidisciplinary
data fusion and preservation.
Above all, advanced computing systems
must not become so arcane and
complex that they and their services
are unusable by all but a handful of
experts. Open source toolkits (such as
Hadoop, Mahout, and Giraph), along
with a growing set of domain-specific
tools and languages, have allowed many research groups to apply machine
learning to large-scale scientific
data without deep knowledge of machine-
learning algorithms. The same
is true of community codes for computational
science modeling.