The Power of Data Lineage and the Story It Tells


Many organizations have limited information on their data – let alone who’s accessing it – putting them at risk of data breaches, privacy violations and legal headaches. Jim Vint, MD and CRO, and Rebecca Patterson, MD of Breakwater, explore common pitfalls of data management and how data lineage can empower organizations to make informed business decisions.

You don’t know what you don’t know. This can come back to bite organizations, especially ones handling sensitive data. Organizations need to know what happens (and has happened) to data as it moves through the organization. Unfortunately, many organizations are at risk of data breaches, privacy violations and legal headaches because they don’t know their data well enough: what data is there, where it is stored, how it can be accessed and who can access it.

Enter data lineage: the comprehensive and compounding history of where the information came from, what happened to it, and where it ultimately went. Establishing data lineage is key to protecting your organization and identifying potential vulnerabilities. It can avert cybersecurity threats and limit legal liability while helping streamline and improve business processes. 

The simple but ominous reality “you don’t know what you don’t know” should keep you up at night. Every business, company, and organization has data that comes with inherent security concerns and overall data management considerations. And yet, most businesses, companies, and organizations are only informed of, and therefore focus on, one-half of their data’s story.


Data at Rest

The familiar half of the story is “data at rest”: data that is maintained on computer data storage in any digital form (e.g., cloud storage, file hosting services, databases, data warehouses, archives, tapes, off-site or cloud backups, and mobile devices). But because businesses, companies, and organizations are generally familiar with their data at rest, they can have a false sense of security about it.

For example, you may think your email data sits in a designated repository, but in reality, access to it resides in multiple locations, including servers, local drives, and mobile devices, sometimes even outside your organization. This increases the security footprint and makes the data more challenging to manage. And once data starts to move, it becomes “data in motion,” which is even more complicated to monitor.

Knowing what happens (and has happened) to your data as it moves through an organization is knowledge about data in motion. This includes who has accessed it, what they did with it, and where it went over time. The combination of these data points creates data lineage: the comprehensive and compounding history of where the information came from, what happened to it, and where it ultimately went. That, coupled with intelligence about data at rest, tells a more comprehensive and realistic story about an organization’s data.
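To make the idea concrete, here is a minimal Python sketch of how the “who, what, where, when” data points could be recorded and combined into a per-asset history. It is an illustration only: the `LineageEvent` record, its field names, and the helper functions are hypothetical and do not describe any particular lineage product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LineageEvent:
    """One hypothetical data-in-motion observation about a single asset."""
    asset_id: str        # what: the file, record, or message involved
    actor: str           # who: the person or system that touched it
    action: str          # what they did, e.g. "copied", "emailed"
    location: str        # where the data ended up
    timestamp: datetime  # when it happened


def build_lineage(events):
    """Group events by asset and order them in time to form each asset's history."""
    history = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        history[event.asset_id].append(event)
    return dict(history)


def locations_of(history, asset_id):
    """Return the distinct locations an asset has reached, in first-seen order."""
    seen = []
    for event in history.get(asset_id, []):
        if event.location not in seen:
            seen.append(event.location)
    return seen
```

Calling `locations_of(build_lineage(events), "payroll.xlsx")` on such a log would list every place that file has travelled, which is the kind of question data at rest alone cannot answer.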

Lineage puts your data to work for you. It creates a vast wealth of information that allows you to better manage endpoints and devices with access to your network; facilitate discovery requests; respond faster to cyber breaches; enhance compliance with privacy requirements; manage, classify, and identify ROT (redundant, obsolete, and trivial) data for reduction; reduce lift-and-shift efforts during data migrations; and create a foundational baseline for better decision-making across the organization.

It provides clarity into how data is being used by individuals, business units, cross-functional teams, and the entire organization. It supports both retrospective and forward-looking business decisions about which workflows and processes are best suited for automation and efficiency, because you will better understand the business economics of data usage.

How Does Data Lineage Actually Work?

In theory, it sounds great, but how does data lineage truly help an organization?

For one, it significantly reduces the data risks associated with document management in litigation or an investigation. Lineage facilitates rapid identification, culling, and analysis of data and documents. Imagine redacting privileged information across a document population once and having that same information automatically redacted every subsequent time a document needs to be sent to a third party. The risk of producing privileged information or any sensitive data is materially reduced.

Similarly, lineage can save an organization significant time and expense because it provides instant, consumable information about data, which facilitates more effective proactive data strategies and quicker responses to data-driven events. To illustrate, lineage creates the foundation for the identification, analysis, and strategic migration of data between systems in a legally defensible manner. Data lineage can reveal within minutes whether any sensitive data was compromised in a cyber breach. It can also identify failures in a controlled environment before a security incident occurs.
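As a rough illustration of the breach scenario, the Python sketch below checks a lineage log for sensitive assets that had reached a compromised location by the time of the breach. Everything in it is an assumption made for the example: the log layout, the asset and location names, and the `sensitive_assets_exposed` helper are hypothetical, and a real system would also need to track data leaving a location.

```python
from datetime import datetime

# Hypothetical lineage log: (asset_id, location, arrived_at, is_sensitive).
LINEAGE_LOG = [
    ("payroll.xlsx", "hr-share", datetime(2023, 1, 5), True),
    ("payroll.xlsx", "laptop-42", datetime(2023, 2, 1), True),
    ("newsletter.docx", "laptop-42", datetime(2023, 2, 3), False),
    ("contracts.pdf", "legal-share", datetime(2023, 2, 10), True),
]


def sensitive_assets_exposed(log, breached_location, breach_time):
    """List sensitive assets that had arrived at the breached location by breach_time.

    Simplification: the log records arrivals only, so an asset that was later
    deleted from or moved off the location would still be reported.
    """
    exposed = {
        asset_id
        for asset_id, location, arrived_at, is_sensitive in log
        if is_sensitive
        and location == breached_location
        and arrived_at <= breach_time
    }
    return sorted(exposed)
```

With a log like this already in place, answering “was anything sensitive on that laptop?” becomes a single lookup rather than a forensic project.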

Lineage also creates a tremendous opportunity for insight and intelligence from employee and organization mapping. It lets you see quickly and easily how data is being used, and, perhaps more importantly, it adds dimension to data risk by marrying the “who, what, when, where” of data to the organization.


The Way Forward with Data Lineage

Data lineage animates the organizational reality of data usage and can change the way companies make intelligent, informed, risk-based decisions across the business, from Discovery to Legal to Security to HR and beyond. 

Harnessing the power of data lineage provides an incredible opportunity to transform how an organization conducts discovery, reacts to breaches, chooses technology, and makes informed business decisions. Data lineage helps put companies in a position to reduce spend, manage data risk, minimize data, move off legacy platforms, and provide better visibility into regulatory compliance and litigation response. With this much power and insight into your data at rest and data in motion, it’s the type of solution your company can’t afford to ignore.

What benefits of data lineage have you been harnessing? Tell us on Facebook, Twitter, and LinkedIn. We’d love to know!