Data Is the New Oil, and That Makes It an Environmental Hazard


Data is valuable. However, it can become old, lose its value, and even become risky to keep. In this article, Bill Tolson, VP of global compliance and eDiscovery, Archive 360, discusses why organizations should actively manage their data.

Data is valuable, but it’s not precious.  

With that premise in mind, it’s painfully clear that many organizations don’t dispose of dark, aging, and valueless data. This is not by intent; it just builds up over time, with little or no management by anyone. Worse, companies have little idea of the data volumes and types they’re storing, how long they’ve had it, and how sensitive it is — personally identifiable information (PII), intellectual property (IP), litigation-related content, regulated content, etc. 

Some of this data is stored in on-premises enterprise storage resources such as application-specific repositories, file shares and ‘home’ drives, and email systems, as well as in individual cloud storage accounts such as Microsoft’s OneDrive. Additionally, up to 80% of all corporate data resides on individuals’ workstations and laptops, with zero insight into its value or sensitivity.

For decades, companies have only actively managed ‘records’: regulated files classified under government mandates such as SEC Rule 17a-4, MiFID II, IIROC, FDA 21 CFR Part 11, etc. Now, based on industry estimates, can you guess what percentage of data created and received by organizations actually meets this standard? 

It’s 5%. That’s all. 

Yes, 95% of all corporate data is not managed at all by the central corporate authority. Instead, individual employees are expected to manage their own electronic data. 

Personal insight: I’ve been in the information management/eDiscovery profession for almost 30 years. I have never met an employee who actually managed his or her individual electronic files. Instead, today’s information workers are so overloaded with data they don’t even have the time to determine if a given email or file is a regulated record, much less go back and categorize or delete files. Employee-controlled data is quickly forgotten and goes dark — the company can’t access it and doesn’t even know about it. 

See how quickly this can happen: 

Chart: Most content becomes inactive and goes dark very fast. Once it reaches a very low probability of reuse, long-term archiving or defensible disposal should be considered.

For most incoming data, the value falls off quickly. Be honest: How often do you go back to emails after a month? The same holds with spreadsheets, PDFs, etc. This means that most data quickly becomes valueless unless it’s considered a regulated record.

This accumulation of unmanaged and mostly dark data raises numerous risks, all based on value: how much of it contains PII (subject to GDPR, CCPA/CPRA, and other privacy regulations), how much of it contains corporate IP and other sensitive content, how much of it is potentially relevant to an eDiscovery request, and so on. 

Most importantly, how much is there that is valueless and should be disposed of?

In today’s business, legal, and regulatory environment, it is imperative that all company-owned data, and not just ‘corporate records,’ be actively managed. Active information management enables companies to retain regulated data for the required amount of time, find responsive files to respond to eDiscovery or compliance requests, and react to privacy ‘right to be forgotten’ requests. It also allows them to utilize non-records for analytics and delete valueless data to free up storage resources and reduce overall liability. 


Data: Cost vs. Value vs. Risk

Most data loses value quickly (see reuse graphic). Consider emails setting up a pool for March Madness. A day after the final, those emails offer no value to anyone and should be deleted (unless a loser complains to HR). Yet those emails remain in employee email boxes for years or until IT announces that inbox limits have been reached.  

It’s not just the cost and trouble of expensive storage resources; dark data raises the risk of IP theft, drives up eDiscovery costs and exposure, and invites privacy-related regulatory action. 

So why do organizations keep valueless data in pricey enterprise-class storage when it can be moved to more economical archives or even deleted? The basic answer: corporate culture. Other than perhaps financial services, most industries ignore non-records and see everything as employee property.  

Many years ago, meeting with the General Counsel of a large North American company, I asked what his retention policies were. He said they had decided to retain Canadian data for 30 years and all U.S. data for 15 years. I ran some quick numbers and told him that with those standards, he would need tens of petabytes of spinning disk (remember, this was many years ago). He smiled and said he was retiring in two years. 

Here’s a different metric: The average information worker creates and receives between 50 and 200 megabytes per day. Over time, that’s not an oil well; it’s a tsunami. 
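
To put that range in perspective, here is a rough back-of-the-envelope sketch in Python. The headcount of 10,000 employees and the 250 working days per year are illustrative assumptions, not figures from any study; only the 50 to 200 megabytes per day comes from the estimate above.

```python
# Rough estimate of how much data a workforce creates and receives per year.
# ASSUMPTIONS (illustrative only): 10,000 employees, 250 working days per year.
# Only the 50-200 MB/day range comes from the article.

MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def yearly_volume_tb(employees, mb_per_day, working_days=250):
    """Return the terabytes created and received by all employees in one year."""
    return employees * mb_per_day * working_days / MB_PER_TB


low = yearly_volume_tb(10_000, 50)    # low end of the per-worker range
high = yearly_volume_tb(10_000, 200)  # high end of the per-worker range

print(f"Roughly {low:.0f} to {high:.0f} TB per year")
```

Under those assumptions, the organization accumulates somewhere between 125 and 500 terabytes every year, and nearly all of it lands outside any central management. Plug in your own headcount to see where your organization falls.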

Leveraging the right data is indeed critical; managing and disposing of the rest is vital too.  
