Google Updates Privacy Policy to Enable Data Scraping Across the Internet

  • Google has made changes to its privacy policy, allowing the company to use any public data for training AI models.
  • The change expands the list of data sources the company uses to include platforms that are not Google services.

Google has updated its privacy policy, which now states that publicly available data may be used to train the company’s language and AI models. While the earlier wording limited this use to Google Translate, the new policy also covers products such as Cloud AI and Bard. Consequently, any data uploaded to a public space online is fair game for Google and can be used to develop its AI systems going forward.

The updated policy says that, in addition to collecting publicly available online data for AI training, the company will also collect business information from websites for display on Google services. This is a marked departure from the conventional practice of companies extracting data only from their own platforms and services.

Google's update came soon after OpenAI was hit with a class-action lawsuit in California over scraping private data from the internet. OpenAI allegedly used data from social media, blogs, and other public websites to train ChatGPT without users' consent.

Privacy Policy Updates Create New IP and Privacy Challenges

Web scraping has drawn growing attention, with platforms such as Reddit and Twitter being particularly vocal about their concerns. Twitter has already set limits on the number of tweets an account can view each day. Both Reddit and Twitter have eliminated free access to their APIs, even though such moves have proven controversial.

With Google and OpenAI setting new precedents for the use of data available online, internet users have to consider not only who can see their data but also how that data can be used. In addition, the unregulated use of publicly available data raises concerns about the use of copyrighted materials and other forms of intellectual property.

With Google’s business primarily focused on collecting user data and selling it to advertisers, data scraping could arguably be considered a core aspect of its business practices.

Is your organization taking any measures to protect website content from AI chatbots? Let us know on LinkedIn, Twitter, or Facebook. We’d love to hear from you!

Image source: Shutterstock