OpenAI’s ChatGPT Can Now Speak, Hear, and See: A Historic Multimodal Upgrade


With the addition of new voice and image capabilities in ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change how users engage with the AI model, providing a more intuitive and immersive experience.

Voice Conversations with ChatGPT

One of the most notable features of this release is the ability to hold voice conversations with ChatGPT. Users can now have a spoken back-and-forth conversation with their AI assistant, opening up a world of possibilities. ChatGPT’s voice capabilities are ready to help whether you’re on the go, looking for a bedtime story for your family, or settling a dinner table debate.

To begin using voice, go to Settings in the mobile app, select “New Features,” and opt in to voice conversations. Once it is enabled, tap the headphone icon in the top-right corner of the home screen and choose one of five voices. OpenAI worked with professional voice actors to create these voices, which are designed to sound human-like. Whisper, OpenAI’s open-source speech recognition system, transcribes your spoken words into text so that ChatGPT can process them.

Image Interaction with ChatGPT

The ability to share images with ChatGPT is another major addition. Users can now show ChatGPT one or more images to troubleshoot problems, explore what is in a photo, or analyze complex data. ChatGPT can help you figure out why your grill won’t start, plan a dinner based on the contents of your fridge, or analyze a graph of work-related data.

To use this feature, tap the photo button to capture or choose an image. On iOS or Android, tap the plus button first. You can also discuss multiple images or use the drawing tool to guide your assistant. Multimodal versions of GPT-3.5 and GPT-4 power these image capabilities by applying their language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents containing both text and images.

Gradual Deployment for Safety

Voice and image capabilities will be rolled out gradually to Plus and Enterprise subscribers over the next two weeks. Voice will be available on iOS and Android, where users opt in through Settings, while image capabilities will be available on all platforms.

OpenAI acknowledges the risks that come with these new capabilities. For voice, it is limiting the technology to a specific use case, voice chat, and it created the voices with voice actors it worked with directly. Notably, Spotify is using the same technology for its Voice Translation feature, which lets podcasters expand their audience by translating content into other languages in their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s ability to analyze people in images and make direct statements about them. Real-world usage and user feedback will be critical to refining these safeguards while keeping the tool useful.

