Digital Repositories and In situ “Virtualized” Data Curation

Data management has focused on buckets almost from Day One: file system buckets, database buckets, backup buckets, archival buckets, and buckets of many other types.  There is a natural tendency to consolidate objects with similar attributes in order to simplify their management and access.

Technologies like cloud computing and “born digital” information are disruptive because they do not gracefully coexist with the containerized management technique embodied by physical aggregation (i.e., buckets).  Born digital information can and does appear almost anywhere, in virtually any form, and with cloud-based storage systems it can be stored almost anywhere.

The proliferation of born digital information throughout the enterprise means that finding and moving it (e.g., to support litigation and eDiscovery) becomes a tedious and time-consuming responsibility.  To see why, consider the amount of information that has to be combed and the availability of that information, especially when it is on offline media or is geographically isolated and subject to access latency.  Consider as well how often such processes need to run to keep that information consolidated.

In an age of virtualization, what is needed is a virtualized bucket (“vBucket”).  In situ is Latin for "in place" or "in position."  In archeological terms, in situ refers to an artifact that has not been moved from its original place of deposition.  In other words, in situ means stationary, or "still."  Thus, instead of moving information around the enterprise to consolidate it, why not manage it in place, in situ?
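One way to picture a vBucket is as a catalog of references to objects that stay where they are. The sketch below is only an illustration under stated assumptions: the names VirtualBucket, CatalogEntry, scan, and verify are hypothetical, the membership rule is an arbitrary predicate on the file path, and a SHA-256 checksum stands in for whatever integrity metadata a real system would record. Nothing is copied or moved; the bucket holds pointers and fingerprints only.

```python
import hashlib
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


def _sha256(path: str) -> str:
    """Checksum a file in 1 MiB chunks so large objects are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record describing one object that remains in place."""
    path: str
    size: int
    sha256: str


@dataclass
class VirtualBucket:
    """Hypothetical vBucket: membership is a rule, not a physical location."""
    name: str
    matches: Callable[[str], bool]  # assumed membership rule, given a path
    entries: Dict[str, CatalogEntry] = field(default_factory=dict)

    def scan(self, roots: Iterable[str]) -> int:
        """Catalog matching files under the given roots without moving them."""
        for root in roots:
            for dirpath, _dirnames, filenames in os.walk(root):
                for filename in filenames:
                    path = os.path.join(dirpath, filename)
                    if self.matches(path):
                        self.entries[path] = CatalogEntry(
                            path, os.path.getsize(path), _sha256(path)
                        )
        return len(self.entries)

    def verify(self) -> List[str]:
        """Return cataloged paths that are missing or whose content changed."""
        damaged = []
        for path, entry in self.entries.items():
            if not os.path.isfile(path) or _sha256(path) != entry.sha256:
                damaged.append(path)
        return sorted(damaged)
```

A bucket for a litigation hold might then be declared as VirtualBucket("legal-hold", lambda p: p.endswith(".pdf")) and pointed at existing file system roots with scan(). The files never leave their original location; only the catalog changes.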

Managing archival objects in place avoids a number of disadvantages of physical consolidation:

·         The amount of data in the typical enterprise makes it unwieldy or impossible to continually scour file systems and move data into segregated containers; imagine cloud-based storage assets with petabytes or even exabytes of data.

·         Segregating archival information into physical containers concentrates it, which increases its vulnerability to media failures and other errors.

There are advantages to in situ management as well:

·         Information can be migrated to new technologies without specialized procedures or planning.