OpenAI’s ChatGPT Can Now Speak, Hear, and See: A Major Multimodal Upgrade

By adding new voice and image capabilities to ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change how users engage with the AI model, making the experience more intuitive and immersive.

Voice Conversations with ChatGPT

One of the most notable additions in this release is the ability to hold voice conversations with ChatGPT. Users can now converse with their AI assistant in real time, which opens up a range of new uses. ChatGPT’s voice capability can help whether you’re on the go, looking for a bedtime story for your family, or settling a dinner-table debate.

To begin using voice, go to Settings in the mobile app, select “New Features,” and opt in to voice conversations. Once it is enabled, tap the headphone icon in the top-right corner of the home screen and choose one of five voices. OpenAI worked with professional voice actors to create these voices, which are designed to sound human-like. On the input side, Whisper, OpenAI’s open-source speech recognition system, transcribes your spoken words into text.

Image Interaction with ChatGPT

The ability to share images with ChatGPT is another major change. Users can now show ChatGPT one or more images to troubleshoot problems, explore content, or analyze complex data. For example, ChatGPT can help you figure out why your grill won’t start, plan a meal based on the contents of your fridge, or interpret a data graph for work.

To use this feature, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also upload several images or use the drawing tool to guide your assistant to a specific part of an image. These image capabilities are powered by multimodal versions of GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents containing both text and images.

Gradual Deployment for Security and Resilience

Voice and image capabilities will be rolled out gradually to Plus and Enterprise subscribers over the next two weeks. Voice will be available on iOS and Android, where users opt in via Settings, while image input will be available on all platforms.

OpenAI acknowledges the risks that come with these expanded capabilities. For voice, the technology is being used specifically for voice chat, and the voices were created in partnership with voice actors to ensure authenticity and safety. Notably, Spotify is using this technology for its Voice Translation service, which lets podcasters reach a wider audience by translating their content into several languages in their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s ability to analyze people in images and make direct statements about them. Real-world usage and user feedback will be critical to refining these safeguards while keeping the tool useful.

