OpenAI has announced that it will halt the use of one of its ChatGPT voices, “Sky,” following allegations from actress Scarlett Johansson, who claimed that the voice sounded “eerily similar” to her own. The company made the announcement on Monday, stating on the social media platform X that it is “working to pause” Sky, one of five voices available to ChatGPT users. The decision came in response to concerns raised by Johansson and others about the striking resemblance between Sky’s voice and her own.
Johansson’s Concerns and OpenAI’s Response
The controversy began when Johansson, known for voicing a futuristic AI assistant in the 2013 film “Her,” issued a public statement. She disclosed that OpenAI CEO Sam Altman had approached her in September with an offer to lend her voice to the system, believing it would provide comfort to people who might feel uneasy with the technology. Johansson declined the offer. She later expressed shock and anger upon hearing the demo of Sky’s voice, noting that it was so similar to her own that even her closest friends and news outlets could not distinguish between the two.
Johansson stated, “When I heard the released demo, I was shocked, angered, and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference.” She further explained that OpenAI agreed to take down the Sky voice reluctantly after she hired legal counsel, who wrote to Altman, questioning the process by which the company created the voice.
In response, OpenAI reiterated that Sky’s voice was not Johansson’s and was never intended to mimic hers. The company emphasized in a blog post that AI voices should not deliberately imitate a celebrity’s distinctive voice, and clarified that Sky’s voice belongs to a different professional actress whose identity it is withholding for privacy reasons.
Legal Action and OpenAI’s Apology
Johansson’s legal team sent letters to Altman, demanding clarification on how the Sky voice was created. This legal pressure led to OpenAI’s decision to halt the use of Sky’s voice. In a statement sent to The Associated Press, Altman clarified that the voice actor for Sky was cast before any outreach to Johansson. He apologized for the miscommunication, stating, “The voice of Sky is not Scarlett Johansson’s, and it was never intended to resemble hers. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better.”
Advancements in ChatGPT’s Voice Capabilities
OpenAI first introduced voice capabilities for ChatGPT in September, initially available only to paid subscribers. These capabilities allow users to engage in natural, back-and-forth conversations with the AI assistant. In November, OpenAI announced that the voice feature would become free for all users with the mobile app, significantly expanding access to this advanced technology.
The latest update to OpenAI’s generative AI model, known as GPT-4o, has made ChatGPT’s interactions noticeably more sophisticated. The model can mimic human speech patterns with greater accuracy and infer users’ moods from their verbal responses. During a demonstration of GPT-4o, the AI bot showed it could add emotion to its voice and attempt to deduce a person’s emotional state from a selfie video. These capabilities allow ChatGPT to hold more realistic and emotionally nuanced conversations, marking a significant step in the development of AI communication tools.
Future of AI Voices and Ethical Considerations
The controversy surrounding Sky’s voice has sparked broader discussions about the ethical implications of AI in replicating human voices. Critics argue that tech companies have historically used gendered and subservient female voices for AI assistants, reinforcing certain biases and stereotypes. This issue was highlighted in 2019 by the United Nations’ culture and science organization, which pointed to the “hardwired subservience” in default female-voiced assistants like Apple’s Siri and Amazon’s Alexa, even when confronted with sexist insults and harassment.
The fight over the rights to actors’ voices and images is becoming a significant concern in Hollywood as studios and tech companies explore using AI to create new forms of entertainment. Johansson’s case underscores the need for clear guidelines and ethical standards when developing AI systems that can replicate human voices and likenesses. The technology’s ability to produce highly realistic audio and visual content raises pressing questions about consent, intellectual property rights, and the potential for misuse.
Comparisons to ‘Her’ and Public Reactions
The situation with Johansson has also drawn comparisons to the 2013 film “Her,” directed by Spike Jonze, in which Johansson voices an AI operating system that develops a romantic relationship with a human. The parallels between the film and the current controversy have not gone unnoticed, with many commentators highlighting the irony of the situation. OpenAI CEO Sam Altman appeared to tap into this narrative by simply posting the word “her” on the social media platform X on the day of GPT-4o’s unveiling.
Reactions to the model’s demos have also sparked discussions about the tone and nature of AI interactions. Some have noted that the interactions struck a strangely flirtatious tone, which has raised questions about the gendered ways in which tech companies develop and engage voice assistants. In one video posted by OpenAI, a female-voiced ChatGPT compliments a company employee on “rocking an OpenAI hoodie,” and in another instance, the chatbot responds with “Oh, stop it, you’re making me blush” after being praised.
These interactions have led to criticism that the technology is programmed to cater to certain user expectations, particularly those of male users. Desi Lydic, a senior correspondent for The Daily Show, commented on this issue, stating, “This is clearly programmed to feed dudes’ egos. You can really tell that a man built this technology.” Such critiques highlight the ongoing debate about the representation and roles of AI voices, especially those that are gendered, in digital interactions.
Information for this article was collected from KTLA and Yahoo.