Data Storage
The systems we manage store information. The capacity for computers to
store information has doubled every year or two. The first home computers
could store 120 kilobytes on a floppy disk. Now petabytes—millions of millions
of kilobytes—are commonly bandied about. Every evolutionary jump
in capacity has required a radical shift in techniques to manage the data.
You need to know two things about storage. The first is that it keeps
getting cheaper—unbelievably so. The second is that it keeps getting more
expensive—unbelievably so.
This paradox will become very clear to you after you have been involved
in data storage for even a short time. The price of an individual disk keeps
getting lower. The price per megabyte has become so low that people now
talk about price per gigabyte. When systems are low on disk space, customers
complain that they can go to the local computer store and buy a disk for next
to nothing. Why would anyone ever be short on space?
Unfortunately, the cost of connecting and managing all these disks seems
to grow without bound. Previously, disks were connected with a ribbon cable
or two, which cost a dollar each. Now fiber-optic cables connected to massive
storage array controllers cost thousands. Data is stored multiple times, and
complicated protocols are used to access the data from multiple simultaneous
hosts. Massive growth also forces radical shifts in disaster-recovery systems,
that is, in backups. Compared with what it takes to manage data, the disks
themselves are essentially free.
The shift in emphasis from having storage to managing the data through
its life cycle is enormous. Now the discussion is no longer about price per
gigabyte but price per gigabyte-month. A study published in early 2006 by a
major IT research firm illustrated the variability of storage costs. For
array-based simple mirrored storage, the report found a difference of two
orders of magnitude between low-end and high-end offerings.
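
To make the price-per-gigabyte-month idea concrete, here is a minimal sketch of one way the figure might be computed. The function name, its parameters, and every dollar amount are hypothetical, chosen only for illustration; none of them comes from the study mentioned above. The sketch assumes the total cost is the purchase price plus a flat monthly operating cost, spread evenly over usable capacity and service life.

```python
def cost_per_gb_month(purchase_cost, monthly_operating_cost, usable_gb, service_months):
    """Return total cost of ownership per usable gigabyte per month.

    Assumption: operating cost (power, support, staff time, backups)
    is a flat monthly figure over the whole service life.
    """
    if usable_gb <= 0 or service_months <= 0:
        raise ValueError("usable_gb and service_months must be positive")
    total_cost = purchase_cost + monthly_operating_cost * service_months
    return total_cost / (usable_gb * service_months)


# Hypothetical figures for two mirrored arrays of the same usable size,
# each kept in service for 36 months. These are invented numbers.
low_end = cost_per_gb_month(2_000, 50, 1_000, 36)
high_end = cost_per_gb_month(200_000, 5_000, 1_000, 36)

print(f"low-end:  ${low_end:.2f} per GB-month")
print(f"high-end: ${high_end:.2f} per GB-month")
```

With these invented inputs the two results differ by a factor of 100, which is the scale of the gap the report describes. The useful point is that the monthly operating term contributes nearly as much as the purchase price in both cases, so a comparison based on purchase price alone would understate the real cost.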