When we discussed data explosion in the past, we would conclude that larger storage capacity was the way to manage it.
But that was three to four years ago or more. Large storage platform vendors now say that data explosion is caused by the absence of efficient copy data management practices.
It is now known that data is steadily growing by about two to five percent each year.
International Data Corporation (IDC), a market research firm, states that copy data accounts for over 60% of all enterprise disk capacity worldwide. It adds that by 2016, spending on copy data storage could reach $50 billion, and that copy data alone would consume more than 315 million terabytes (TB) of space!
Let us examine why this is happening.
If we look at testing and development workflows, numerous copies of application code are generated in several locations, where several teams access and use them. Each team then snapshots its copy, replicates it, and backs it up on disks, tapes, etc. By the time this process comes to an end, we have a couple of extra copies of the same data.
Database administrators then usually resort to practices such as replicating the original database. After that, the storage infrastructure department protects the data by snapshotting and replicating it. Then the backup and restore department takes the database data and exports it to the media of its choice.
So we finally have the same data on tapes and disks, ostensibly to protect it. Across this entire workflow, a minimum of 10 copies of the data will be generated.
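To make the tally concrete, the workflow above can be sketched as a rough count in Python. This is only an illustration: the function names and the per-stage counts (three copies per team, one database replica, two storage copies, two backup exports) are assumptions chosen to mirror the steps described, not measured figures.

```python
# Rough, hypothetical tally of the copies created along the workflow above.
# All per-stage counts are illustrative assumptions, not measured data.

def copies_in_workflow(teams: int) -> int:
    """Count extra copies of one dataset across the described stages."""
    per_team = 3          # each test/dev team: snapshot + replica + backup
    dba_replicas = 1      # DBA replicates the original database
    storage_copies = 2    # storage team: snapshot + replica
    backup_exports = 2    # backup team: one export to disk, one to tape
    return teams * per_team + dba_replicas + storage_copies + backup_exports


def footprint_tb(primary_tb: float, copies: int) -> float:
    """Total space used, assuming every copy is a full physical copy."""
    return primary_tb * (1 + copies)


if __name__ == "__main__":
    copies = copies_in_workflow(teams=2)
    print(f"extra copies: {copies}")
    print(f"space for a 1 TB dataset: {footprint_tb(1.0, copies)} TB")
```

Under these assumptions, even two teams push a single dataset past ten copies, and the space consumed grows in step with the copy count.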
This is just one example. Copy data is rife at every level and it is the main culprit behind data explosion.
This has led a few companies to develop platforms for copy data management. Notable among them are Actifio and Hitachi Data Systems.
These firms are trying to help us curb the exponential growth of copy data. At the same time, they are reducing the costly storage consumed by platforms that were created expressly to reproduce data.
Copy data management does not just mean reducing the number of data copies. It also entails doing away with them, permanently if possible.
Platforms differ in several ways, and some of your data copies may need high input/output (I/O) performance.
It is therefore advisable to understand what these platforms are capable of and how they function. You can then choose a platform that meets your requirements and works in your environment.