How to Improve the Accuracy of AI Systems With Diversified Data


With AI playing a greater role in our lives, it is important to ensure that AI projects work for everyone. This includes minimizing biased data by prioritizing accurate and responsible data labeling methods, says Kerri Reynolds, SVP of human resources and crowdsourcing at Appen. 

Artificial intelligence (AI) has progressed from science fiction to a fact of life. According to Gartner, “AI is starting to deliver on its potential and its benefits for businesses are becoming a reality.” A BCG MIT Sloan report reveals that more than half of companies are already piloting or deploying AI. However, the current state of AI has also renewed concerns about its impact on society. Some worry that AI models built on partial or poor-quality data won’t benefit all citizens equally. Others fear the disruption that will occur when AI begins eliminating jobs on a large scale. To smooth the transition to the AI economy, analysts, enterprises, and governments are all asking how we can make AI work for everyone.

Building AI That Works for Everyone

An AI project that doesn’t work for everyone doesn’t really work for anyone – not for users, and not for the businesses developing it. Take this simple example: a speech recognition engine trained only to recognize native English-speaking males will fail to recognize many women and almost anyone with an accent. If this speech engine is deployed in a vehicle routing application, many drivers may become frustrated, late, or lost when their “driving assistant” misunderstands their destination. They may even become distracted while trying to communicate with the system, creating a dangerous situation. The result for the manufacturer? Consumer complaints, lawsuits, canceled orders, and lasting brand damage if the company name becomes associated with injured drivers and passengers.

Biased data will similarly skew or undermine AI-powered applications, from recommendation engines and supply chain optimization to medical diagnostics, facial recognition technologies and robotics.

Eliminating biased or insufficient data depends on proper model training. Before being processed by an algorithm, training data must be collected, cleaned, and annotated by humans. Annotation refers to the application of labels and tags to raw data. These tags identify the key features relevant to the decisions the machine learning algorithm will make: Is that a recognized English word? Is there a human in this image? Is that a spot on a heart or just a shadow? The accuracy of these labels directly impacts the accuracy of the machine’s future predictions. A machine trained on poorly labeled data will commit errors, make low-confidence predictions, and ultimately never produce the desired results.
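To make the idea concrete, annotation simply attaches human judgments to raw samples before any algorithm sees them. The sketch below is a minimal, hypothetical illustration – the field names, labels, and examples are invented for this article, not drawn from any particular annotation platform:

```python
# Minimal sketch of annotation: pairing raw data with human-applied labels.
# All field names and example values here are hypothetical.

raw_samples = [
    {"id": 1, "audio_transcript": "navigate to the nearest gas station"},
    {"id": 2, "audio_transcript": "blorptown plz"},
]

def annotate(sample, is_recognized_english, contains_destination):
    """Attach human-judged labels to a raw sample."""
    return {
        **sample,
        "labels": {
            "recognized_english": is_recognized_english,
            "contains_destination": contains_destination,
        },
    }

# A human annotator reviews each sample and applies the tags the
# model will later learn from.
training_data = [
    annotate(raw_samples[0], True, True),
    annotate(raw_samples[1], False, False),
]

for record in training_data:
    print(record["id"], record["labels"])
```

If these labels are wrong or inconsistent, every downstream prediction inherits that error, which is why label accuracy matters as much as data volume.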

The only way to ensure unbiased and accurate labeling is to rely on a vast and highly diverse group, or “crowd,” of smart, dedicated annotators representing the broadest possible range of values, cultures, education levels, and experiences. Ensuring this diversity is one pillar of what is often called “responsible AI.” The number of AI annotators worldwide is already in the millions – and growth will continue to accelerate. This raises another question: Is AI working for the global crowd creating it?
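One common way a diverse crowd guards against any single annotator’s bias is label aggregation: collecting several independent judgments per item and keeping only the consensus, while routing low-agreement items to expert review. The function and threshold below are illustrative assumptions for this article, not any specific platform’s method:

```python
from collections import Counter

def consensus_label(judgments, min_agreement=0.7):
    """Aggregate independent annotator judgments by majority vote.

    Returns (label, agreement) when the top label meets the agreement
    threshold, else (None, agreement) to flag the item for review.
    """
    counts = Counter(judgments)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(judgments)
    if agreement >= min_agreement:
        return label, agreement
    return None, agreement  # low agreement: route to expert review

# Five annotators independently label the same image.
print(consensus_label(["human", "human", "human", "human", "shadow"]))
# -> ('human', 0.8)

# A split vote falls below the threshold and is flagged for review.
print(consensus_label(["human", "shadow", "human", "shadow"]))
# -> (None, 0.5)
```

The more varied the annotators behind those judgments, the less likely a shared blind spot survives the vote – which is exactly why the composition of the crowd matters.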


Designing Responsible AI Supply Chains

The need for millions of new annotators will undoubtedly create opportunities for those who want to be part of the AI economy and the flurry of new applications built on it. However, AI leaders have an ethical responsibility to understand who these annotators are and to ensure they are treated fairly. It is the right thing to do – and fairly treated annotators are also the ones who deliver consistent, high-quality results.

Data annotators and collectors are the unsung heroes of the AI economy, ensuring AI works in the real world. They have unique skills and stories, and their different backgrounds and experiences make this global crowd an essential community in making AI work. Most annotators are contractors, and they are an incredibly diverse group: single parents supporting their families, students needing extra income, job seekers with limited opportunities in their local area, and people who are simply excited to participate in the AI economy.

In my role, I oversee our engagements with over one million contributors in our crowd and am continually impressed by their stories. One contributor from Sweden shared how data annotation helped her continue to earn money while undergoing chemotherapy. Another from the U.S. described her battle with multiple sclerosis: “When I found out I had multiple sclerosis, I went into a deep depression because my whole life had just been stolen from me. It felt like my goals and dreams were lost, everything had changed. Annotating for AI helped me get over that. I was able to do something productive.”

Contributors are also proud of their impact on making AI work in the real world. As one annotator from Brazil shared: “Algorithms learn from us and then learn from each other. If we can improve them with good knowledge based on strong ethics, we’re moving forward as a society.” And an annotator from India wrote, “My contribution in building AI for the generations to come not only gives me pride but a greater sense of fulfillment. I feel that we as a team are building a foreground for one of the important technological advancements of tomorrow.”

Hearing these stories and acknowledging the importance of these contributors to AI’s success demands that we commit to responsible AI. Enterprises and governments must adopt an AI annotator code of ethics that ensures the wellness of annotators through fair wages, healthy working conditions, inclusion, respect for privacy and confidentiality, and more. Such a code of ethics also makes good business sense: it helps ensure a steady supply of consistent, high-quality training data created by contributors who know someone is looking out for their wellbeing.

AI is part of our daily lives, and there’s no turning back, so it is up to all of us to ensure that AI works for everyone. Companies that commit to responsible AI projects based on unbiased, high-quality data will ensure project success, glean better business insights, accelerate innovation, and enjoy increased ROI – all while taking a responsible and ethical approach that protects their brands and makes the world a better place.

Do you think it is possible to train an AI that works for everyone? Let us know your thoughts on LinkedIn, Twitter, or Facebook. We would love to hear from you!