How Graph Analytics Can Transform Enterprise Data Protection


Graph analytics is slowly rising to prominence, yet knowledge graphs, however useful, are still not given their due. In this article, Cyberhaven’s CEO Howard Ting explains how graph analytics can provide fresh perspectives and useful insight for better business decisions. He also discusses its use cases, the difficulties it can help overcome, and the specific ways it can aid data security.

The economic value of data has risen over the years as businesses increasingly rely on big data for strategic decision-making. Corporate customers want data more than ever, whether structured or unstructured, and from a variety of application sources. However, traditional query tools and languages like SQL fall short when it comes to evaluating this kind of complex, highly connected data at scale. Enter graph analytics. Although it is still in its infancy, it is attracting large enterprises that want to be among its pioneers.

In a conversation with Spiceworks, Howard Ting, CEO of Cyberhaven, goes into further detail about graph analytics, including its definition, use cases, how it contributes to data protection, and more. 


How Graph Analytics is Changing Data Security

What is graph analytics, and how can it be instrumental in data protection?

When people think about the word “graph,” they may picture a diagram with bars or lines in an Excel document or the Wall Street Journal. But in computing, a graph is a way of storing information. Specifically, a graph database is optimized for storing data types that have many relationships between individual pieces of data. This makes graphs the ideal way for social networks like Facebook and LinkedIn to store people and their connections to each other. But how would a graph be useful in preventing sensitive data from leaving a company?

Figure 1: A graph is a way of storing large amounts of data that is highly connected 

After all, what is the connection when a user tries to upload a confidential file to a personal cloud storage service? “It helps to think of a file or a piece of information not as an isolated object, but as the latest in a long series of edits, versions, and transformations,” says Ting. A single file can splinter into hundreds of copies stored in different places. People may copy content from a file or application and paste it into another file or application.

“It turns out that tracking the history of every piece of data solves many challenges in protecting information from theft or misuse and a graph database is an underlying technology to store these billions or trillions of data points.” 
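To make the idea of data history concrete, here is a minimal Python sketch of a lineage graph. It is an illustration only, not Cyberhaven's implementation: the class name LineageGraph, its methods, and the "location:filename" labels are all hypothetical. Each recorded event adds an edge from the place the data came from to the place it ended up, and a backward walk recovers where a file originally came from.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical lineage store: an edge source -> target means
    the target was derived from the source."""

    def __init__(self):
        self.parents = defaultdict(set)  # target -> set of sources
        self.events = {}                 # (source, target) -> event details

    def record(self, source, target, action, user):
        """Record one event, such as a download, copy-paste, or upload."""
        self.parents[target].add(source)
        self.events[(source, target)] = {"action": action, "user": user}

    def origins(self, node):
        """Walk backward through the history and return the root
        ancestors, i.e. nodes with no recorded parent."""
        seen, stack, roots = set(), [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            sources = self.parents.get(current)
            if not sources:
                roots.add(current)
            else:
                stack.extend(sources)
        return roots


graph = LineageGraph()
graph.record("crm:customer_export.csv", "laptop:export_copy.csv", "download", "alice")
graph.record("laptop:export_copy.csv", "laptop:notes.txt", "copy_paste", "alice")
graph.record("laptop:notes.txt", "gdrive-personal:notes.txt", "upload", "alice")

print(graph.origins("gdrive-personal:notes.txt"))  # {'crm:customer_export.csv'}
```

A production system would keep these edges in a graph database rather than in memory, but the traversal idea is the same: the uploaded text file looks harmless on its own, while its history shows it came from the CRM.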

How graph analysis resolves DLP challenges

Content-based signatures and content tagging have been the two major methods that data loss prevention (DLP) solutions use to identify and control data. Both strategies face significant obstacles, which graph analysis can help overcome. Ting explains how.

One challenge with relying on keywords or regular expressions (RegEx) to classify sensitive data is that it “produces numerous false positives.” Let’s say you create a policy to protect customer PII. “The problem is that the company has many files containing addresses, phone numbers, and birthdates so you get alerts for things like employees sending in an application for their gym membership. But customer data tends to be stored in only a few specific systems.” 

Graph analysis enables you to craft a more targeted policy that triggers only when those patterns are present and the data comes from a system that handles customer information.
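A targeted policy of this kind can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the phone-number pattern, the system names in CUSTOMER_SYSTEMS, and the function should_alert are hypothetical, and the origin system is assumed to have already been looked up from the data's recorded history.

```python
import re

# Simplified pattern for US-style phone numbers such as 555-123-4567 (illustrative only).
PHONE_PATTERN = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")

# Hypothetical names for the few systems that hold customer data.
CUSTOMER_SYSTEMS = {"crm", "billing-db"}


def should_alert(content, origin_system):
    """Alert only when the content matches a PII pattern AND the data
    originated in a system that handles customer information."""
    has_pattern = bool(PHONE_PATTERN.search(content))
    return has_pattern and origin_system in CUSTOMER_SYSTEMS


# A gym membership form typed on an employee's laptop: pattern matches, origin does not.
print(should_alert("Member phone: 555-123-4567", "employee-laptop"))  # False

# The same pattern in data traced back to the CRM.
print(should_alert("Customer phone: 555-123-4567", "crm"))  # True
```

A content-only policy would fire in both cases. Adding the origin check is what removes the gym membership false positive.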

Another limitation of looking at the content is that “it’s very difficult to classify something that’s confidential or commercially valuable IP.” Minutes from the last board meeting, product designs, proprietary research, schematics, unreleased financial results, and source code are just some examples of material that keywords struggle with, even though a human being can plainly see that it is sensitive. “That’s why many companies invested in tools that allow employees to tag sensitive files as such. These products are usually limited to a few file types like Microsoft Office files and the tag is attached to the file so if you copy content and paste it somewhere else the tag is lost.”

Figure 2: Tracking every single event for every piece of data in a graph makes it possible to classify and protect forms of data that are not possible with content patterns alone

“Not all sensitive information contains a recognizable pattern, but all data has a history.” When you track the origin of every piece of information, who handled it, and every step it takes as it flows through the organization, you can better classify what is important and protect it. A data security policy built on this context can stop source code from leaving the company, because it sees a developer accessing the source code repository on GitHub, copying content from their browser window or IDE, pasting it into a TXT file on their computer, and then uploading that file to their personal Google Drive.

Advantages and use cases of graph analytics

Ting believes that some of the greatest threats to enterprise data come from inside the company, whether it is a malicious employee taking data with them when they quit to join a competitor or a well-intentioned employee mistakenly sharing something with the wrong person. “Companies that invested in insider risk management (IRM) products to prevent these types of threats found that they overwhelmed SOC teams with erroneous alerts. Two of the biggest limitations of these tools are they look at behavior without considering the sensitivity of the data and they create anomalies based on a limited set of events without connecting them to every related event.”

For instance, an insider risk tool might create an alert for every large upload to a file-sharing website. “By combining behavior analysis with context about the data, a graph-based approach can distinguish between an employee uploading a video from the company picnic and one uploading a CAD design file.” He suggests that instead of flagging every large export from Salesforce, graph analysis can connect that event with everything else a user subsequently does with the data. An employee exporting data to create charts on their computer for the upcoming board meeting is different from a sales rep uploading the exported file to their personal Dropbox before quitting.
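The Salesforce example can be sketched as a forward traversal: start at the export and follow everything derived from it. The Python below is a minimal illustration with hypothetical names (PERSONAL_DESTINATIONS, reaches_personal_destination, and the "location:filename" labels); it assumes events have already been collected as (source, target) pairs.

```python
from collections import defaultdict, deque

# Hypothetical labels for destinations outside the company's control.
PERSONAL_DESTINATIONS = {"dropbox-personal", "gdrive-personal"}


def reaches_personal_destination(events, start):
    """Breadth-first walk over everything derived from `start`;
    return True if any descendant lands in a personal destination."""
    children = defaultdict(list)
    for source, target in events:
        children[source].append(target)

    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        location = node.split(":", 1)[0]
        if location in PERSONAL_DESTINATIONS:
            return True
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return False


board_prep = [
    ("salesforce:export.csv", "laptop:export.csv"),
    ("laptop:export.csv", "laptop:board_charts.xlsx"),
]
departing_rep = [
    ("salesforce:export.csv", "laptop:export.csv"),
    ("laptop:export.csv", "dropbox-personal:export.csv"),
]

print(reaches_personal_destination(board_prep, "salesforce:export.csv"))     # False
print(reaches_personal_destination(departing_rep, "salesforce:export.csv"))  # True
```

Both users trigger the same large-export event. Only the follow-up events in the graph separate routine work from a likely exfiltration attempt.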


How can graph analysis help enterprises handle their data?

Graph analysis can enable enterprises to see and manage their data and risk in new ways. Ting believes that by significantly reducing false positives compared with legacy DLP and IRM products, data protection based on graph analysis can give companies the confidence to implement real-time blocking. This stops threats without constantly blocking users from doing their everyday jobs.

“Obviously stopping incidents from happening has a huge impact on reducing the overall risk to the company’s sensitive data.” Graph analysis can also surface more latent risks. As sensitive data is shared through messaging applications like Slack, internal file-sharing applications, and email, it can fall into the hands of employees who would not otherwise have access to it at the source. He says that auditing the data on employees’ computers and comparing it with their access permissions at the source can highlight risks from data sprawl.
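The audit described here amounts to a comparison between two lookups: which source systems the data on each employee's computer traces back to, and who is permitted to access each source. The following Python sketch assumes both lookups already exist; the function find_sprawl and the sample users and systems are hypothetical.

```python
def find_sprawl(holdings, source_permissions):
    """Return (user, source) pairs where a user holds data that traces
    back to a source system they have no permission to access.

    holdings:           user -> set of source systems their local data came from
    source_permissions: source system -> set of users with access at the source
    """
    findings = []
    for user, sources in sorted(holdings.items()):
        for source in sorted(sources):
            if user not in source_permissions.get(source, set()):
                findings.append((user, source))
    return findings


holdings = {
    "alice": {"finance-db"},
    "bob": {"finance-db", "wiki"},  # bob received a finance file over chat
}
source_permissions = {
    "finance-db": {"alice"},
    "wiki": {"alice", "bob"},
}

print(find_sprawl(holdings, source_permissions))  # [('bob', 'finance-db')]
```

In this sketch a source with no recorded permissions is treated as off-limits to everyone, which is a conservative assumption rather than a rule from the article.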

Do you think graph analytics can replace traditional data analytics tools? Let us know on LinkedIn, Facebook, and Twitter. We would love to hear from you!
