Managing data remains a key challenge for organizations looking to launch their own AI initiatives. Appen’s recent State of AI 2021 Report highlights findings in this area and the additional challenges organizations face implementing AI, explains Wilson Pang, CTO, Appen.
In the latest State of AI 2021 Report, Appen set out to capture a snapshot of the current landscape of AI. We specifically made a point to ask our survey respondents about their priorities around data because data remains one of the most significant hurdles to the successful deployment of AI. In doing so, we found several commonalities in how organizations, large and small, are approaching data governance, many of which may prove interesting for AI practitioners working on their own initiatives.
Similar Approaches to Data Governance
As indicated by the report, the following findings help paint a picture of the current state of data governance in AI:
1. Priorities vary by size
The first distinction to make here is that priorities vary with the size of the organization. Small and medium-sized organizations, for instance, rate data diversity as a high priority in their AI endeavors. Large organizations, on the other hand, are more focused on scaling, which shouldn’t come as a surprise: larger companies have a wide range of business units and teams, so scalability is a top concern.
2. Increased reliance on external data providers
The vast majority of organizations pursuing AI now rely on external data providers. This may be the natural consequence of the fact that obtaining high-quality data is mission-critical to AI development, and any additional expertise or tools would give teams a leg up in the competition. Notably, organizations using external data providers are also more likely to consider themselves market leaders, have more deployments, and achieve greater ROI on their projects.
3. Companies are committed to data security
Out of the organizations we surveyed, nine out of ten say their business is good or excellent about addressing privacy and security issues related to AI. Data security should be of paramount importance to AI: these projects need very large volumes of data to function, and that data may include personally identifiable information. Commonly available tools like encryption and anonymization are key elements of AI data security. It seems evident from the report that teams working with AI data are seeing the value of having data protections in place and rightfully investing in these protocols.
Organizations that rely on third-party data providers are 1.8 times more likely to say their company is excellent at data security and nine times less likely to say they’re poor at it. This tracks with the fact that many vendors offer strict security protocols that protect data privacy.
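As a rough illustration of the anonymization tools mentioned above, the sketch below replaces personally identifiable fields in a record with keyed hashes before the record enters a training dataset. The field names, the secret key, and the `pseudonymize` helper are all hypothetical and are not taken from the report. Strictly speaking, this is pseudonymization rather than full anonymization, so it is a starting point and not a complete privacy solution.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only; a real key would be
# generated securely and loaded from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical set of field names treated as PII in this example.
PII_FIELDS = {"name", "email"}


def pseudonymize(record, key=SECRET_KEY, pii_fields=PII_FIELDS):
    """Return a copy of `record` with PII fields replaced by keyed hashes."""
    cleaned = {}
    for field, value in record.items():
        if field in pii_fields:
            # HMAC-SHA256 gives a stable token per value that is hard to
            # reverse without the key.
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            cleaned[field] = digest.hexdigest()[:16]
        else:
            cleaned[field] = value
    return cleaned


record = {"name": "Jane Doe", "email": "jane@example.com", "label": "positive"}
print(pseudonymize(record))
```

Because the same input always maps to the same token, records belonging to one person can still be linked for training purposes without exposing the underlying name or email address.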
4. Majority are open to sharing data
Interestingly, the majority of companies report being open to sharing their data. This varies by size: larger companies are much less likely to share data than their smaller counterparts due to compliance concerns. Companies that aren’t using an external data provider are also less likely to be open to sharing, perhaps an indication that they are working with more sensitive, proprietary data.
Of course, data governance often comprises more than just the factors mentioned above. Nonetheless, it’s interesting to see from a numbers perspective what organizations consider their higher priorities when it comes to data management. This can be a helpful starting point for understanding how these businesses are working to overcome the data challenge inherent to AI development.
Solving Major AI Deployment Challenges
Collecting sufficient high-quality data is certainly a massive hurdle in itself, but there are additional barriers to deployment that AI practitioners face. We’ll cover a few common problems and, more importantly, how to overcome them.
1. Select the right problem
Where many AI practitioners get tripped up is at the start: selecting the right business problem. How can you ensure you’re solving for the right thing? A few things to keep in mind:
- Identify problems that AI should solve. Keep in mind there are still many problems that don’t require AI and would be better left without it. Make sure the problem you select is one that AI is fit to solve.
- Follow the data. If you have a lot of easily accessible data to help solve a particular problem, that could be indicative of which problem you should choose.
- Start small with quick wins. To prove the value of AI to the key stakeholders in your organization, let them see it successfully in action. Choose smaller problems to solve that will demonstrate ROI. From there, you can build momentum for larger initiatives.
2. Determine the right datasets
One of the data issues that often appears with AI is that teams don’t select the right data. Ask yourself key questions about your data upfront, such as where your data will come from, whether you will have enough data, and whether you can source it ethically. You want to ensure you have pipelines for obtaining high-quality, secure data that covers all of the use cases of your selected problem.
3. Build the right organization
Determine if your organization is AI-ready. AI development requires an organization with the right structure as well as the right team members equipped with specialized skills. Without essential processes and people in place, your endeavor is very unlikely to succeed.
4. Make the right decision on build vs. buy
Determining whether you should build your own AI or buy all or parts of it is a key strategic decision that can make or break your success. Take into account several considerations when making that choice, including:
- Timing: If you need a fast turnaround, buying may be a more prudent expenditure.
- Budget: Even with a perfectly budgeted project, there could be hidden costs in building AI. Building should only be done if you have plenty of budget available.
- Scaling: If you only need AI for one use case, building may be the easiest method. But, if you plan to scale the solution, you may want to consider buying.
These factors will be organization-dependent and use case-dependent, of course, but keep these questions in mind as a baseline.
Getting the Data Right Is Top Priority
Among other elements, data remains a formidable challenge in AI. But it’s one that organizations must get right. Fortunately, it appears as though companies of all sizes are prioritizing data governance as a crucial piece of their AI initiatives. They’re practicing better data security, they’re seeking out help from external data providers, and overall they value the impact that good data has on their models. Those that are making this effort are seeing the fruits of their labor with more deployments and greater ROI. In an ideal future, AI practitioners will learn from these successes and see that comprehensive data governance is essential to deploying AI that works well.