7 Essential Tools to Ensure Data Protection and Drive Innovation


Privacy laws continue to evolve as consumers are becoming increasingly aware of the security of their personal data. Ulf Mattsson, Chief Security Strategist at Protegrity, discusses seven critical tools that businesses can use to boost data protection while deriving relevant insights from stored data.

The winds of change are gusting through the business world. More than a dozen state governments are crafting data compliance legislation, while Virginia recently passed a data privacy law that closely resembles the California Consumer Privacy Act (CCPA). The flurry of legislation adds to a long list of established regulations such as GDPR, HIPAA, PCI DSS, and Sarbanes-Oxley. At the same time, consumers are paying closer attention to how businesses control their data.

Responsible businesses feel these headwinds and recognize customers and employees have a fundamental right to privacy—and these organizations are taking active steps to ensure sensitive data is fully protected.

But it’s a big undertaking—especially as organizations look to unlock the full value of data while keeping it secure. Advanced analytics, machine learning (ML), cloud, the Internet of Things (IoT), and extended supply chains place new and often onerous demands on organizations. As data streams in from various sources and across organizational boundaries, protecting trade secrets and personally identifiable information (PII) is imperative.

Techniques for preserving privacy can be divided into three categories, each with its benefits and constraints: reversible (R) data transformations, non-reversible (N) software-based techniques, and hardware-based security mechanisms (TEE). 

Here is a look at seven privacy-preserving tools and technologies that rely on those techniques. They could benefit your organization, especially as it further pursues data-driven AI projects.

1. Differential Privacy (N)

This framework allows organizations to publicly share information from a dataset using an algorithm to compile statistics about the dataset. It extracts characteristics and data without revealing personal or private data, regardless of how unique an individual’s data is. This ensures the identity of individuals in the database stays protected. 

However, differential privacy and the related k-anonymity technique work best on larger datasets, and it’s important to note that differential privacy deliberately adds “noise” to data. This means that the technique can degrade the accuracy of a statistical operation in some cases.
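
To make the noise trade-off concrete, here is a minimal sketch of one common way to achieve differential privacy for a counting query, the Laplace mechanism. The function names, the sample ages, and the epsilon value are illustrative assumptions, not part of any particular product; a production system would normally rely on a vetted library rather than hand-rolled sampling.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean `scale`
    # follows a Laplace distribution centered on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1
    # (sensitivity 1), so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical dataset: ages of ten individuals.
ages = [23, 35, 41, 29, 62, 57, 33, 48, 71, 19]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
print(f"Noisy count of people aged 40+: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy. On a dataset this small the noise can swamp the true count of 5, which is why the technique favors larger datasets.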


2. Data De-identification (N or R)

Frequently, there’s a need to strip identifying information from data. Identifying data may include names, phone numbers, medical record numbers, and sensitive dates. Data de-identification accomplishes this task. In some cases, it can function as a subset of differential privacy. The technology doesn’t alter or impact the original dataset; it simply extracts relevant data and creates a new dataset that’s typically referred to as the “destination dataset.”

De-identification technologies frequently work at the dataset level, the FHIR store level, and the DICOM store level, depending on the project’s requirements. They can share information with non-privileged groups, assemble datasets from multiple sources, and anonymize data for machine learning. Some techniques include two-way reversible methods and non-reversible one-way methods.
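
As a rough illustration of the non-reversible, one-way case, the sketch below copies records into a destination dataset while dropping direct identifiers and coarsening dates to the year. The field names and the record layout are hypothetical and chosen only for this example; real de-identification services apply far richer rules.

```python
# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "phone", "medical_record_number"}
# Hypothetical date fields, assumed to hold ISO strings such as "2021-03-14".
DATE_FIELDS = ("admission_date",)


def deidentify(records, identifiers=DIRECT_IDENTIFIERS, date_fields=DATE_FIELDS):
    """Build a destination dataset; the source records are left untouched."""
    destination = []
    for record in records:
        # Copy every field except the direct identifiers.
        clean = {k: v for k, v in record.items() if k not in identifiers}
        for field in date_fields:
            if field in clean:
                clean[field] = clean[field][:4]  # keep only the year
        destination.append(clean)
    return destination


source = [
    {
        "name": "Ann Lee",
        "phone": "555-0100",
        "medical_record_number": "MRN-1",
        "admission_date": "2021-03-14",
        "diagnosis": "J45",
    }
]
destination = deidentify(source)
print(destination)
```

Because the function builds new dictionaries, the source dataset still contains the original values after the call.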

3. Tokenization (R)

This technique is increasingly popular because it preserves the format of data. It substitutes a sensitive data element with a non-sensitive, randomized equivalent that can be used for various analytics and ML tasks and is resistant to attacks from quantum computers. The replacement data element is known as a token, which essentially serves as a mapping or translation mechanism. 

Tokenization is suitable for analytical applications as well as other applications that may require fast operations. It can also support searching on protected data values, translating clear-text values, and enabling “fuzzy search” on protected data. The high level of flexibility built into tokenization technology is appealing to businesses.
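
The mapping idea can be sketched with a small in-memory token vault. This is an assumption-laden toy: the `TokenVault` class and its methods are hypothetical names, the vault lives in a Python dictionary rather than a hardened store, and commercial products may use vaultless schemes instead.

```python
import secrets
import string


class TokenVault:
    """Toy vault that maps sensitive values to random, format-preserving tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # Reuse the existing token so equal values can still be joined or counted.
        if value in self._token_by_value:
            return self._token_by_value[value]
        while True:
            # Replace each digit with a random digit; keep separators as they are.
            token = "".join(
                secrets.choice(string.digits) if ch in string.digits else ch
                for ch in value
            )
            if token not in self._value_by_token:  # avoid token collisions
                break
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token):
        # Only a caller with access to the vault can reverse the mapping.
        return self._value_by_token[token]


vault = TokenVault()
card = "4111-1111-1111-1111"
token = vault.tokenize(card)
print(token)                    # same length and layout, random digits
print(vault.detokenize(token))  # original value
```

Since the token is random rather than derived from the value, nothing about the original can be computed from the token alone; security rests on protecting the vault.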


4. Format-Preserving Encryption (R) 

This technique preserves the format of data. It substitutes a sensitive data element with a non-sensitive encrypted equivalent that can be used for various analytics and ML tasks. It’s often used when there’s a need for a masked data set or when it’s desirable to maintain the actual number of digits on a credit card number or Social Security number, so the data can be used by legacy systems or adhere to regulatory standards. It is less secure than tokenization and approximately ten times slower than traditional encryption. Like the Advanced Encryption Standard (AES), it is not resistant to attacks from quantum computers.
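
To show what "preserving the format" means in practice, here is a toy Feistel-style cipher over decimal digit strings. It is only a sketch for illustration: it is not the standardized FF1 or FF3-1 algorithm, the function names and round count are assumptions, and it should not be used to protect real data.

```python
import hashlib
import hmac

ROUNDS = 10  # must be even so the two halves end up back in their original order


def _round_value(key, round_no, half, modulus):
    # Keyed pseudo-random value derived from the round number and one half.
    digest = hmac.new(key, bytes([round_no]) + half.encode("ascii"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits):
    if len(digits) < 2 or not all(ch in "0123456789" for ch in digits):
        raise ValueError("expected a string of at least two decimal digits")


def encrypt_digits(key, digits):
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % (10 ** m)
        a, b = b, str(c).zfill(m)
    return a + b


def decrypt_digits(key, digits):
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c, b = b, a  # undo the swap performed during encryption
        a = str((int(c) - _round_value(key, i, b, 10 ** m)) % (10 ** m)).zfill(m)
    return a + b


key = b"demo-key-not-for-production"
protected = encrypt_digits(key, "4111111111111111")
print(protected)                       # 16 digits, like the input
print(decrypt_digits(key, protected))  # 4111111111111111
```

The ciphertext has the same length and character set as the input, so a legacy system that expects a 16-digit field can store it unchanged, and the key holder can reverse it.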

5. Hashing (N)

A hash function converts data of any size into a compact, fixed-size value known as a hash or hash value. While encryption works two ways (encrypt and decrypt), hashing involves an irreversible one-way operation. Although the technique can be used for various purposes, it is particularly valuable when applied to certain security and privacy aspects.

For example, hashing technology makes it possible to store passwords and other sensitive data without revealing the actual string. A company or website can’t view the plaintext password—and if a user forgets or loses it, a complete reset is required. Yet, there are risks associated with hashing, including the incorrect use of the technology, which can open the door to security breaches. 
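
The password case can be sketched with Python's standard library. This minimal example assumes PBKDF2-HMAC-SHA256 with a random per-user salt; the iteration count and function names are illustrative assumptions, and the work factor should be tuned to current guidance.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor, an assumption for this sketch


def hash_password(password, salt=None):
    """Return (salt, derived_hash); only these two values are stored."""
    if salt is None:
        salt = os.urandom(16)  # a unique salt defeats precomputed lookup tables
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(password, salt, expected):
    _, derived = hash_password(password, salt)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

Storing a fast, unsalted hash is one example of the incorrect use mentioned above: identical passwords would produce identical hashes, and attackers could test guesses very quickly.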

6. Trusted Execution Environments (TEE)

Another tool for protecting data is a TEE. It relies on an isolated area on a processor that functions independently from the main operating system. This trusted environment allows data to be stored and processed while in a protected state. The technology is already widely used in smartphones, tablets, smart TVs, set-top boxes, and IoT devices.

TEE often complements encryption and essentially uses a root of trust, a set of functions that can always be trusted, usually because a TEE resides at the silicon level and cannot be accessed by outside devices. Another thing that makes the technology attractive is that it can operate on clear-text information, meaning it’s faster and more scalable than homomorphic encryption, particularly in clouds.


7. Homomorphic Encryption (R)

This emerging technique makes it possible to perform computations on encrypted data. Because the underlying data remains invisible, it’s ideal for industries like finance and healthcare, or when there’s a need for multi-party computing. For example, a group of credit card companies could use it to share data to improve fraud detection without revealing customer data.

Homomorphic encryption is well suited to advanced analytics and ML tasks. Partially homomorphic encryption (PHE) is simpler but supports only one type of operation on encrypted data, such as addition or multiplication, while fully homomorphic encryption (FHE) supports arbitrary computations. The technology remains in a nascent state, and it isn’t widely used due to relatively slow speeds. But algorithms are improving, and homomorphic encryption will likely be a powerful tool for protecting data when more powerful quantum computers appear.
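
Computing on encrypted data can be illustrated with a toy version of the Paillier cryptosystem, a partially homomorphic scheme that supports addition. The sketch assumes tiny hard-coded primes and hypothetical transaction amounts so the arithmetic stays readable; real deployments use keys of thousands of bits and an audited library. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets

# Toy key material: real keys use primes of 1024 bits or more.
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse of LAM modulo N


def encrypt(m):
    if not 0 <= m < N:
        raise ValueError("message out of range for this toy key")
    while True:
        r = secrets.randbelow(N - 1) + 1  # random blinding factor in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % N_SQ


# Two parties encrypt their own amounts; a third party adds them blindly.
total_cipher = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(total_cipher))  # 1545
```

The party that performs the addition never sees 1200 or 345; only the holder of the private values (`LAM` and `MU` here) can decrypt the total.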

In some cases, an organization may want to use more than one of these seven methods with the same data—or at various points in the data lifecycle. Also, specific industry standards and solutions call for different privacy-preserving techniques. In the end, it’s critical to understand how and where you need the different data protection tools and how they can lead you down a path toward responsible AI. 

By preserving privacy in analytics and machine learning, it’s possible to enjoy the best of both worlds: extracting the maximum value from data while building greater trust with customers, business partners, and others. Effective use of these techniques can fuel business growth and success. Privacy and innovation go together. You can no longer have one without the other, and neither is optional.
