How to Govern the Data Lifecycle in the Age of Data Sprawl

Data security has become increasingly challenging and demanding in today’s data-agile world. Here, Thi Thumasathit, VP Products and Customer Success at Dasera, a data lifecycle security platform, explains how IT and Security teams can safely govern the use of their sensitive data. 

It’s shocking to think about how much data the average consumer generates in just one day. But it’s even more shocking to think about how many employees have access to that consumer data. In 2020, the average employee had access to 10 million records or datasets. In larger companies, that number grows to 20 million records or datasets. Imagine how much of that consumer data is sensitive (confidential to the company collecting the data, confidential to the consumer, or both) and needs to be properly monitored and governed.

Forty-eight percent of employees have more access to data than they actually need to perform their day-to-day jobs. It’s the role of IT or SecOps to manage access control: to ensure that every team and every individual employee has the data they need to do their job and, ideally, access only to that data.

So why is it so complex to govern how data is stored and used? Why are permissions so difficult for companies to manage? And why do companies that recognize the limitations of permissions continue to believe that access control policies are enough to protect their data?

In the world we live in now, data is everywhere: in the first cup of coffee we buy on our way to work, the picturesque moments we capture on our phones, and the series of TV episodes we stream late into the night. Companies around the world compete to get their hands on that data and mine it. Technology has advanced so much that companies can collect and analyze more data than ever to meet their business needs.

Data is continuously in motion throughout an organization, and it’s touched by various employees along the way. In the process, data is copied, manipulated, edited, and transformed in every way we can imagine, sometimes in places we cannot see, until at the end of its life it is finally archived or deleted.

We call this journey the Data Lifecycle. 

What is the Data Lifecycle? 

At its essence, the data lifecycle is the series of steps that data goes through from the moment it is created to when it is eventually archived or deleted. In short, the data lifecycle is composed of three key stages:

  • Pre-usage — Preparing data for use 
  • Usage — Infinite ways data is used 
  • Post-usage — Data archival or deletion

Let’s try to understand these stages from the point of view of one of your customers: say, Susan Phillips, a thirty-year-old who gives you her personal information when creating an account with your business.

  • Pre-usage

The data lifecycle starts when Susan gives her data to your company and gives (or withholds) consent for how the data will be used. Every interaction Susan has with your company is then recorded to a production database, loaded into a data lake for storage, and likely transformed and loaded into a data warehouse for analysis. Along the way, Susan’s data likely goes through a series of steps to prepare it for business use: cataloging, classification, cleaning, and organization.

  • Usage

This phase is where your data custodians lose their line of sight. Loss of visibility means decisions are made in the dark with little to no context, there’s less control over the safe, compliant use of sensitive data, and your company is inherently more at risk. 

During this stage, hundreds (and maybe even thousands) of your employees view, copy, edit, manipulate, and transform Susan’s data, spreading it to all corners of your infrastructure using different tools and applications. Each interaction with Susan’s data becomes more complex and harder to trace as more copies of the data are made.

Furthermore, employees have different needs for accessing data, different skill sets for handling data, and different levels of training on regulated data (e.g., data covered by GDPR, CCPA, or PCI). To make matters more complicated, each employee is part of a larger functional team, such as Data Science, Sales, Customer Success, or Marketing. To correctly understand appropriate (or inappropriate) use of data, the context of an individual employee’s role, team, and department must also be considered.
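To make the idea of context concrete, here is a minimal Python sketch of checking data use against a team’s expected needs. Everything in it is an assumption made for illustration: the team-to-category mapping, the category names, and the UsageEvent fields are hypothetical and do not describe any particular product or policy.

```python
from dataclasses import dataclass

# Hypothetical mapping: team -> data categories that team's work normally requires.
TEAM_ALLOWED = {
    "Data Science": {"behavioral", "demographic"},
    "Sales": {"contact"},
    "Customer Success": {"contact", "behavioral"},
    "Marketing": {"contact", "demographic"},
}


@dataclass(frozen=True)
class UsageEvent:
    user: str      # who touched the data
    team: str      # the functional team the user belongs to
    category: str  # kind of data touched, e.g. "contact" or "payment"
    action: str    # what was done, e.g. "view" or "copy"


def flag_out_of_context(events, team_allowed=TEAM_ALLOWED):
    """Return events whose data category falls outside the team's expected needs."""
    return [e for e in events if e.category not in team_allowed.get(e.team, set())]


events = [
    UsageEvent("carol", "Sales", "contact", "view"),
    UsageEvent("dana", "Marketing", "payment", "copy"),
    UsageEvent("erin", "Data Science", "behavioral", "view"),
]

flagged = flag_out_of_context(events)
for event in flagged:
    print(event)
```

In this sketch all three users may well hold valid permissions. Only the Marketing copy of payment data is flagged, because it falls outside what that team is assumed to need.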

  • Post-usage

Susan’s data lifecycle ends when her data is archived for future use or deleted. Deletion can occur because Susan requests that all of her personal data be deleted (a right she has under GDPR, for example, if she is in the EU) or simply because the retention period for her data has been met. However, it’s highly likely that your company is unaware of all the copies or duplicates of her data that exist in your data stores.
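As a rough illustration of why unknown copies matter at this stage, the Python sketch below selects every known copy of one person’s data that should be removed, either because its retention period has elapsed or because an erasure request was received. The DataCopy record, the store names, and the 365-day retention period are all hypothetical. The sketch also assumes an inventory of copies already exists, which is exactly what many companies lack.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataCopy:
    subject_id: str  # whose data this is
    location: str    # hypothetical data store or file path
    created: date    # when this copy was made


def copies_to_delete(copies, subject_id, today, retention_days, erasure_requested=False):
    """Return the subject's copies that are past retention, or all of them on an erasure request."""
    cutoff = today - timedelta(days=retention_days)
    return [
        c for c in copies
        if c.subject_id == subject_id and (erasure_requested or c.created <= cutoff)
    ]


inventory = [
    DataCopy("susan", "prod_db.customers", date(2020, 1, 1)),
    DataCopy("susan", "lake/raw/customers", date(2020, 1, 2)),
    DataCopy("susan", "analyst_scratch.susan_export", date(2023, 6, 1)),
    DataCopy("bob", "prod_db.customers", date(2019, 1, 1)),
]

today = date(2024, 1, 1)
expired = copies_to_delete(inventory, "susan", today, retention_days=365)
erasure = copies_to_delete(inventory, "susan", today, retention_days=365, erasure_requested=True)
print([c.location for c in expired])
print([c.location for c in erasure])
```

Any copy missing from the inventory is invisible to a check like this, so it would survive both the retention rule and the erasure request.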

Why Is Data Governance an Issue Even With Access Control?

Most companies depend on access control as the first line of defense for data governance and compliance. So why do so many of them still have data governance and compliance issues? The problem is twofold.

First, access control policies are binary. They control who has and doesn’t have access to data, but they can’t keep up with all the changes made to data after access is given. 

Second, access control policies work best in a static environment. With the proliferation of cloud technologies, businesses have become more agile. New data stores and database clusters can be spun up at a moment’s notice. DevOps and data proliferation now move faster than SecOps can verify that every access control is accurate. If SecOps teams can’t keep up with these changes, then too many users may be able to access data in these new data stores.
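Both limitations can be sketched in a few lines of Python. The access control list, the dataset names, and the copy events below are hypothetical, and real systems record this information very differently. The sketch only shows that a binary check answers one yes/no question about one dataset, while the copies made afterwards may sit in locations that no policy covers.

```python
# Hypothetical access control list: dataset -> users allowed to read it.
ACL = {"warehouse.customers": {"alice", "bob"}}

# Hypothetical copy events (source, destination) observed after access was granted.
COPY_EVENTS = [
    ("warehouse.customers", "alice_scratch.customers_export"),
    ("alice_scratch.customers_export", "shared_drive/customers.csv"),
    ("warehouse.orders", "bob_scratch.orders"),
]


def can_read(acl, user, dataset):
    """Binary access check: a yes/no answer and nothing more."""
    return user in acl.get(dataset, set())


def downstream_copies(copy_events, source):
    """Follow copy events transitively to find every location derived from source."""
    found = set()
    frontier = [source]
    while frontier:
        current = frontier.pop()
        for src, dst in copy_events:
            if src == current and dst not in found:
                found.add(dst)
                frontier.append(dst)
    return found


def ungoverned_copies(acl, copy_events, source):
    """Derived locations that have no access control entry at all."""
    return sorted(d for d in downstream_copies(copy_events, source) if d not in acl)


allowed = can_read(ACL, "alice", "warehouse.customers")
ungoverned = ungoverned_copies(ACL, COPY_EVENTS, "warehouse.customers")
print(allowed)
print(ungoverned)
```

The access check passes, as it should, but it says nothing about the two derived locations, neither of which has an entry in the access control list.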

Why Is Collaboration Chaotic and Fragmented in Today’s Data Security Landscape?

It’s easy to say that one person, the Chief Information Security Officer, should be responsible for protecting the entire data lifecycle. In reality, data security and compliance have many stakeholders, including IT, Data Custodians, Compliance, Legal, DevOps, and all data users throughout the organization.

But without the context of how sensitive data is being used, these teams are flying blind and operating in silos. Since they can’t monitor data use and act collaboratively, the data lifecycle resembles the Wild West, where sensitive data is being moved, changed, over-permissioned, and viewed outside of intended data governance and security guidelines. That’s why companies need to adopt a data lifecycle mindset and bring all stakeholders together with a single agenda: to keep data secure and compliant throughout the data lifecycle.
