It’s Time for the Industry to Sunset SDKs for Collecting User Data


Data collected by SDKs is often used for innovative business needs, but companies really have no control over how the data is used. Antonio Tomarchio, founder and CEO, Cuebiq, calls for the industry to protect consumers and change how data is collected through mobile apps by sunsetting SDKs for all new data partners. He further explores how this can be done.

The average mobile app in the Apple App Store or Google Play store has 18 SDKs (Software Development Kits). 

Each SDK sends data straight from the end user’s device to the company that produced the SDK, which is not the owner of that data. 

In most cases, SDK technology serves legitimate needs such as app analytics or location analytics.

Regardless, the data owner has no control over that data. Once it leaves the device and is transmitted to a third-party organization, the mobile app developer has no visibility into how it is used.

And even if an SDK is not collecting PII (personally identifiable information, such as emails, phone numbers, or names), it is still possible for unscrupulous actors to re-identify anonymous data by overlaying additional datasets, such as real estate ownership databases.

For these reasons alone, SDKs in mobile apps present significant challenges to the protection and proper use of data.

So to better protect the rights and privacy of the real data owner – the end-user – this has to change.

Therefore, I call on the broader industry to sunset SDKs for all new data owner partners. Moreover, the industry should come together to fundamentally change how data is collected through mobile apps and leveraged for innovation.

Instead of collecting data through SDKs, data owner partners should be integrated into a platform as a service (PaaS), much like an API, shifting away from a data-sharing model to a sandboxed environment. There, data can be queried, sliced, and diced, while every export is interrogated (by both AI and human reviewers) for policy compliance. This better empowers advertisers, analysts, and researchers while maximizing user privacy.
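To make the idea concrete, here is a minimal sketch of the kind of automated policy gate such a sandbox could run before any result leaves the platform. All names, thresholds, and data shapes here are illustrative assumptions, not any vendor's actual API:

```python
# Hypothetical export gate for a sandboxed analytics platform: analysts
# may query freely inside the sandbox, but a result may only be exported
# if every row is an aggregate over a sufficiently large group of users.

MIN_GROUP_SIZE = 100  # assumed policy: no exported row may describe fewer users


def approve_export(rows: list[dict]) -> bool:
    """Return True only if every row aggregates enough users to export."""
    return all(row.get("user_count", 0) >= MIN_GROUP_SIZE for row in rows)


# An aggregate-level result: every row summarizes many users -> exportable.
result = [
    {"region": "downtown", "user_count": 4200, "avg_dwell_min": 23.5},
    {"region": "suburb-a", "user_count": 150, "avg_dwell_min": 41.0},
]

# A row describing a single device would be blocked at the gate.
leaky = [{"device_id": "abc123", "user_count": 1}]
```

A production system would of course layer on richer checks (query auditing, human review, rate limits), but the principle is the same: the granular data never leaves, and only policy-compliant aggregates do.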


The creation of a privacy operating system rooted in differential-privacy algorithms can dramatically increase the degree of anonymization of data.

Such algorithms can obfuscate device IDs, as well as locations in residential areas, ensuring that the highest precision available in those areas covers hundreds of households. They can also eliminate sensitive POIs such as single-disease healthcare centers, places of worship, and government buildings, and provide many other privacy protections. We also created a proprietary alternative ID system that protects data in the event of a data breach.
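The core differential-privacy mechanism behind protections like these can be sketched in a few lines: calibrated random noise is added to each released count, so no individual's presence or absence is detectable in the output. The epsilon value, sensitivity, and sample counts below are illustrative assumptions, not parameters from any real deployment:

```python
# Minimal differential-privacy sketch: release per-area visit counts only
# after adding Laplace noise calibrated to the privacy budget (epsilon).
import math
import random


def laplace_noise(scale: float) -> float:
    """Sample from a Laplace(0, scale) distribution via the inverse CDF."""
    u = random.random() - 0.5  # uniform in [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


epsilon = 1.0      # assumed privacy budget; smaller = more private
sensitivity = 1.0  # one user contributes at most one visit per area
scale = sensitivity / epsilon

true_counts = {"area-1": 340, "area-2": 12}

# Each released count gets independent noise and is clamped at zero.
noisy_counts = {
    area: max(0, round(count + laplace_noise(scale)))
    for area, count in true_counts.items()
}
```

With epsilon = 1, the released counts stay close to the truth for large areas while any single user's contribution is statistically masked; real systems also track the cumulative budget spent across queries.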

PaaS solutions such as this offer customers a protected sandbox environment (a "clean data room") for working with privacy-enhanced location data (data with differential-privacy techniques applied) and limit exports to approved aggregate analytics – either aggregated data or outputs created with privacy-preserving granularity – without the granular data ever leaving the premises. This eliminates data sharing altogether.
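"Privacy-preserving granularity" for location data often means snapping raw coordinates to a coarse grid before any analytics run, so the finest location that can ever be released covers a whole neighborhood rather than a single home. A minimal sketch, with an assumed grid size (the real cell size would be tuned per region to cover hundreds of households):

```python
# Illustrative coordinate coarsening: snap raw lat/lon to the center of a
# coarse grid cell before analytics, so single-home precision never exists
# inside the clean room. The grid size here is an assumption.

GRID_DEG = 0.01  # ~1.1 km of latitude; coarse enough to span many households


def coarsen(lat: float, lon: float) -> tuple[float, float]:
    """Snap a point to its grid cell, discarding household-level precision."""
    return (
        round(lat / GRID_DEG) * GRID_DEG,
        round(lon / GRID_DEG) * GRID_DEG,
    )


snapped = coarsen(40.74178, -73.99213)  # many nearby homes map to one cell
```

Because coarsening is applied at ingestion, even a fully compromised analytics environment could not recover household-level locations.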

I invite ALL players in the ecosystem to operate under a similarly structured framework. 

Apple and Google can also play a very critical role here in converging to a privacy-safe data ecosystem. 

Both companies are placing increasing restrictions on SDKs. We agree with this direction but urge them to support this new concept of a privacy-safe open data ecosystem by launching a certification program for data collection, processing, and storage practices.

This would enable an ecosystem of decentralized innovation that abides by their privacy and data-security rules while constantly investing in privacy technology.

Big tech companies must show clearly that they are partners, not enemies, of open innovation.

The next few years will be critical in deciding which model of an AI-driven data society will emerge. Are we going to have an AI society where data is centralized and controlled in the hands of 5 to 10 organizations (or even worse in the hands of a centralized government in the case of authoritarian regimes)? Or, are we going to have an AI society where thousands and thousands of organizations can have a chance to innovate by accessing data in a privacy-safe way for a multitude of use cases? 

The first model will be a failure for society, as it will dramatically increase ethical risks, data-bias risks, and inequality. It will drastically curb innovation potential, as all AI development will follow the agenda of a very limited set of companies.

The second model must prevail, as it is the only one with a chance to generate a more equal and prosperous society, where AI development can be a springboard for decentralized, useful innovation free of controlling agendas.

The United States has a great opportunity to lead again by the example of its values and support the creation of a more equitable AI and data society. We hope that this will be the direction, and we will keep working relentlessly to contribute in any way we can.
