Exercise caution when using Gemini, Google’s family of GenAI apps, and avoid sharing anything incriminating or sensitive that you wouldn’t want someone else to see.
In a public service announcement of sorts, Google today published a support document explaining how it collects data from users of its Gemini chatbot apps for the web, Android, and iOS.
According to Google, human annotators routinely read, label, and process conversations with Gemini to improve the service, albeit conversations that have first been disconnected from Google Accounts. (Google doesn’t say whether the annotators are in-house or outsourced, which could matter for data security.) These conversations are retained for up to three years, along with related data such as the languages used, device information, and location. Google does, however, give users some control over which Gemini data is kept and how.
Toggling off Gemini Apps Activity in Google’s My Activity dashboard (it’s enabled by default) stops future conversations with Gemini from being saved to a Google Account for review, meaning the three-year retention window won’t apply to them. Individual prompts and conversations can also be deleted from the Gemini Apps Activity screen.
That said, Google notes that even with Gemini Apps Activity turned off, Gemini conversations are saved with a Google Account for up to 72 hours to maintain the safety and security of the apps and to improve them.
Google accordingly advises users not to enter confidential information, or any data they wouldn’t want a reviewer to see or Google to use to improve its products, services, and machine-learning technologies.
To be fair, Google’s GenAI data collection and retention policies don’t differ much from those of its rivals. OpenAI, for example, saves all ChatGPT chats for 30 days even when conversation history is disabled, unless the user is subscribed to an enterprise-level plan with a custom data retention policy.
Still, Google’s policy illustrates the challenge of balancing privacy against developing GenAI models that feed on user data to improve. Permissive data retention policies have already landed some vendors in trouble with regulators.
Last summer, the FTC requested detailed information from OpenAI about how the company vets data used to train its models, including consumer data, and how that data is protected when accessed by third parties. Abroad, the Italian Data Protection Authority has said OpenAI lacks a legal basis for the mass collection and storage of personal data to train its GenAI models.
As GenAI tools proliferate, organizations are growing increasingly wary of the privacy risks they pose.
A recent Cisco survey found that 63% of companies have placed limits on what data can be entered into GenAI tools, while 27% have banned their use outright. The same survey found that roughly 45% of employees have nonetheless entered “problematic” data into these tools, including information about co-workers and non-public files about their employer.
OpenAI, Microsoft, Amazon, Google, and others offer GenAI products geared toward enterprises that explicitly don’t retain data for any length of time, whether for model training or any other purpose. Consumers, as is often the case, get the short end of the stick.