How to Build AI and ML Applications in the Age of GDPR

As AI/ML technology matures, ethical compliance and the GDPR must play a central role in conversations around AI development. This article discusses three ways the GDPR could influence AI/ML development and recommends a five-pronged action plan to address them.

Strict ethical compliance or game-changing innovation? The search for an answer to this old debate is at the heart of recent developments in artificial intelligence (AI) and machine learning (ML), which try to minimize ethical trade-offs around data privacy, consumer rights, and user wellbeing.

Being data-intensive technologies, AI/ML can sometimes overstep the boundaries of ethical compliance, especially because it is so difficult to define what exactly counts as ethical. Human biases can bleed into AI/ML training data, skewing a model's behavior. When real-world user data is leveraged for AI/ML training (for example, teaching object recognition models by feeding them real faces with personal identifiers), it may inadvertently breach a person's data rights. And AI may be used to make decisions in cases where the customer would rather interact with a human.

The EU General Data Protection Regulation (GDPR) lays down clear guidelines for technology companies to navigate through this complex terrain.

Understanding the GDPR’s Implications for AI and ML Development

The GDPR was essentially formulated to regulate how companies use personal data. It has been in force since May 2018 but has reached a new level of maturity in the past year. Between January 2020 and January 2021, regulators issued a total of $193.4 million in fines – and this number could rise further in 2021 if companies fail to build due process around AI/ML development and stay aware of its compliance implications.

Broadly, these implications can be summed up as:

  • Training data compliance – If AI/ML models are trained on real-world datasets (aggregations of information collected from thousands of users), the potential for breaching an individual's rights is high.
  • The right to explanation – Articles 13, 14, and 15 of the GDPR establish that customers can ask providers to explain how their data is used. Additional proposals under discussion would require greater transparency depending on an AI system's magnitude of impact.
  • 100% automation vs. human involvement – Under Article 22 of the GDPR, consumers have the right to request active human involvement when AI makes a major decision – for example, determining someone's mortgage interest rate.

In 2021, AI/ML development must take these factors into account if companies are to avoid heavy penalties and – more importantly – earn consumer and industry trust.

5 Ways to Develop Ethical and Privacy-sensitive AI and ML Applications

Fortunately, there are a number of strategies for staying on the right side of compliance without any significant trade-off in terms of innovation.

1. Partner with a specialized synthetic data provider

When scrubbing training datasets before AI/ML development (i.e., removing any personal identifiers that could violate user rights), there is always a concern around efficacy. AI/ML training thrives on context, so without specific information, is it possible to build truly effective AI apps?

This is where synthetic data providers come in. These companies specialize in the creation and ingestion of non-real (but still realistic) information to power AI/ML training. Consider providers like Tonic, which mimics production data without recreating identifiable information, and Synthesis, which provides artificially generated image recognition data.
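
The same idea can be prototyped in-house before committing to a vendor. Below is a minimal sketch in Python that swaps the personal identifiers in a training table for synthetic stand-ins using the open-source Faker library. The column names and the customers.csv input file are illustrative assumptions, not any vendor's API.

```python
# A minimal sketch: replace personal identifiers in a training dataset
# with synthetic stand-ins before the data reaches an ML pipeline.
# Assumes the open-source pandas and Faker libraries are installed;
# column names and the input file are illustrative, not a vendor API.
import pandas as pd
from faker import Faker

fake = Faker()

def synthesize_identifiers(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with direct identifiers replaced by fake values."""
    out = df.copy()
    out["name"] = [fake.name() for _ in range(len(out))]
    out["email"] = [fake.email() for _ in range(len(out))]
    out["address"] = [fake.address().replace("\n", ", ") for _ in range(len(out))]
    # Non-identifying feature columns (e.g., purchase totals) are left intact
    # so the statistical signal the model learns from is preserved.
    return out

df = pd.read_csv("customers.csv")  # hypothetical input file
synthesize_identifiers(df).to_csv("customers_synthetic.csv", index=False)
```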

2. Use blockchain architecture to maintain transparency and accountability

This is an emerging idea, one that could revolutionize data transparency and accountability in AI development. Blockchain uses distributed ledger technology to store data in immutable blocks. This would allow users to trace the end-to-end data utilization pipeline and know exactly how information was collected, applied, and stored. There have been some advancements in this direction, such as the DeepBrainChain project, which is a high-performance computing network built on blockchain that also ensures data security and privacy through encryption and smart contracts.
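
To make the mechanism concrete, here is a toy append-only ledger in Python in which every data-usage event is chained to the previous one by a SHA-256 hash, so any later edit to the record is detectable. This is a teaching-scale sketch of the hash-chaining idea, not DeepBrainChain's actual architecture.

```python
# A toy append-only ledger illustrating how hash-chaining makes a record
# of data-usage events tamper-evident. A teaching-scale sketch, not a
# production blockchain.
import hashlib
import json
import time

class UsageLedger:
    def __init__(self):
        self.blocks = []

    def record(self, event: dict) -> dict:
        """Append an event (e.g., 'used for training') as a new block."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = {"event": event, "timestamp": time.time(), "prev_hash": prev_hash}
        payload["hash"] = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        self.blocks.append(payload)
        return payload

    def verify(self) -> bool:
        """Recompute every hash; editing an earlier block breaks the chain."""
        prev_hash = "0" * 64
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if block["prev_hash"] != prev_hash or recomputed != block["hash"]:
                return False
            prev_hash = block["hash"]
        return True

ledger = UsageLedger()
ledger.record({"subject": "user-123", "action": "consent granted"})
ledger.record({"subject": "user-123", "action": "used in model training"})
assert ledger.verify()
```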

3. Forego black-boxing in favor of explainable AI (XAI)

One of the primary traits of AI that gets in the way of ethical compliance is its black-boxed nature. End users can only see the final insights generated by AI or observe the automated action it executes, without a glimpse under the hood. This conflicts with the GDPR, which makes it mandatory to reveal precisely how data is used. In fact, a new set of proposals published in April 2021 categorizes AI risk (in terms of legal oversight) as unacceptable, high, limited, or minimal, with a corresponding need for explainability.

To address this, XAI replaces the black-boxed function with an explainable model and an explainable interface, which the consumer can peruse before a task is executed. Google has an excellent XAI service to enable this capability.
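
To illustrate what a per-decision explanation can look like, the sketch below trains a simple scikit-learn logistic regression and reports how much each feature pushed an individual decision up or down. The feature names and data are invented for illustration, and this is a generic linear-attribution technique, not Google's XAI service.

```python
# A minimal sketch of per-decision explainability: for a linear model,
# coefficient * standardized feature value gives each feature's signed
# contribution to the decision score. Data and names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["credit_score", "income", "debt_ratio"]  # hypothetical
X = np.array([[680, 55_000, 0.40],
              [720, 72_000, 0.25],
              [590, 31_000, 0.55],
              [750, 90_000, 0.10]], dtype=float)
y = np.array([1, 1, 0, 1])  # 1 = loan approved, 0 = declined

# Standardize so contributions are comparable across features.
mu, sigma = X.mean(axis=0), X.std(axis=0)
model = LogisticRegression().fit((X - mu) / sigma, y)

def explain(applicant: np.ndarray) -> None:
    """Print each feature's signed contribution to the decision score."""
    z = (applicant - mu) / sigma
    contributions = model.coef_[0] * z
    decision = model.predict(z.reshape(1, -1))[0]
    print(f"decision: {'approved' if decision else 'declined'}")
    for name, c in sorted(zip(feature_names, contributions),
                          key=lambda t: -abs(t[1])):
        print(f"  {name}: {c:+.2f}")

explain(np.array([610, 38_000, 0.50]))
```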

4. Comply with GDPR Article 22 and provide the option to have a human in the loop

According to Article 22, consumers have the right to question a fully automated task or decision executed by AI and ask to have a human in the loop. Imagine a scenario where an AI assists the hiring process or aids in calculating loan interest rates based on a person's credit score. As per the GDPR, there has to be a provision for having a human in the loop (HITL), and this has to be a key consideration when designing AI apps and workflows. HITL shouldn't just be used for training ML, but also for executing AI-based decisions. For example, AI automation company JIFFY.ai has a provision for HITL in automated workflows, so that AI actions can be ratified, modified, and revisited by a human.
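
A common way to implement this is a gate that auto-executes only low-impact, high-confidence decisions and queues everything else for human review. The Python sketch below outlines that pattern; the threshold, field names, and queue are illustrative assumptions, not JIFFY.ai's actual mechanism.

```python
# A minimal sketch of a human-in-the-loop gate: automated decisions that
# are high-impact or low-confidence are queued for human review instead
# of executing directly. Thresholds and the queue are illustrative.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Decision:
    subject: str
    action: str          # e.g., "set_mortgage_rate"
    confidence: float    # model confidence in [0, 1]
    high_impact: bool    # legally or financially significant (GDPR Art. 22)

@dataclass
class ReviewQueue:
    pending: List[Decision] = field(default_factory=list)

    def submit(self, d: Decision) -> None:
        self.pending.append(d)

CONFIDENCE_THRESHOLD = 0.9  # assumed policy value

def route(d: Decision, queue: ReviewQueue) -> None:
    """Auto-execute only low-impact, confident decisions; otherwise
    hold the decision for a human to ratify, modify, or reject."""
    if d.high_impact or d.confidence < CONFIDENCE_THRESHOLD:
        queue.submit(d)
        print(f"queued for human review: {d.action} for {d.subject}")
    else:
        print(f"auto-executed: {d.action} for {d.subject}")

queue = ReviewQueue()
route(Decision("user-123", "set_mortgage_rate", 0.97, high_impact=True), queue)
route(Decision("user-456", "send_reminder_email", 0.95, high_impact=False), queue)
```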

5. Leverage stringent data governance at the implementation end

Finally, enterprises implementing AI/ML apps have to be mindful of data privacy considerations and the GDPR. For instance, if a customer data platform is being used to personalize campaigns, product recommendations, or ad content, proper masking and redaction processes must be followed. The April 2021 proposals apply to AI developers and to enterprises using the technology alike, which makes it crucial to deploy stringent data governance at the implementation end. This will help regulate data collection, utilization, consent management, and data retention internally, reducing non-compliance risk.
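
As a small example of what masking and redaction can look like in practice, the sketch below pseudonymizes a stable identifier with a salted hash and redacts email addresses from free text before a record leaves the governance layer. Field names are invented, and the hard-coded salt is a placeholder for a secret that a real deployment would keep in a key-management system.

```python
# A minimal sketch of masking at the implementation end: pseudonymize
# stable identifiers with a salted hash and redact emails from free text
# before records reach personalization systems. Field names and the
# hard-coded salt are illustrative; real deployments manage secrets in
# a vault and cover many more identifier types.
import hashlib
import re

SALT = b"replace-with-secret-from-a-vault"  # assumption: managed elsewhere
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def pseudonymize(value: str) -> str:
    """Deterministic salted hash: joins still work, identity does not leak."""
    return hashlib.sha256(SALT + value.encode()).hexdigest()[:16]

def redact_text(text: str) -> str:
    """Strip email addresses from free-text fields."""
    return EMAIL_RE.sub("[REDACTED]", text)

def govern(record: dict) -> dict:
    return {
        "customer_id": pseudonymize(record["customer_id"]),
        "note": redact_text(record["note"]),
        "segment": record["segment"],  # non-identifying field passes through
    }

raw = {"customer_id": "C-1001",
       "note": "Prefers contact at jane.doe@example.com",
       "segment": "premium"}
print(govern(raw))
```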

The Bottom Line: AI/ML and GDPR Don't Have To Be at Loggerheads

These five strategies highlight how compliance and innovation can intersect in AI/ML development – and, in fact, why they should. A focus on data privacy and individual consumer rights from the early stages of AI/ML training and development can preempt severe ethical concerns later on. It also builds trust and transparency around the field, encouraging more widespread adoption. Ultimately, the success of AI/ML at future stages of maturity will hinge on its ability to align with regulations like the GDPR while achieving meaningful and sustainable progress.

What are your thoughts on the innovation vs. compliance debate? Comment below or tell us on LinkedIn, Twitter, or Facebook. We would love to hear from you!