OpenAI’s ChatGPT Now Can Speak, Hear, and See: a Multimodal Upgrade in History

ChatGPT Now Can Speak Hear and See

With the addition of additional speech and image capabilities in ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change the way users engage with the AI model, providing a more intuitive and immersive experience.

Voice Conversations with ChatGPT

One of the most notable aspects of this version is the ability to do voice conversations using ChatGPT. Users may now converse with their AI helper in real time, bringing up a world of possibilities. ChatGPT’s voice skills are ready to assist you whether you’re on the go, looking for a bedtime story for your family, or settling a dinner table disagreement.

To begin using voice, go to the Settings menu in the mobile app, pick “New Features,” and enable voice conversations. Once activated, press the headphone icon in the top-right corner of the home screen to select one of five voices. Professional voice actors have meticulously developed these voices to provide a human-like audio experience. Furthermore, Whisper, OpenAI’s open-source speech recognition system, transcribes spoken words into text, improving the overall quality of the conversation.

Image Interaction with ChatGPT

The ability to share photos with ChatGPT is another game changer. Users can now use ChatGPT to troubleshoot difficulties, explore material, and evaluate complex data by displaying one or more photos. ChatGPT can help you figure out why your grill won’t start, design a dinner based on the contents of your fridge, or analyze a data graph for work.

Tap the photo button to capture or select an image to use this function. Tap the addition button first on iOS or Android to upload several photographs, or use the sketching tool to lead your assistant. Multimodal models, such as GPT-3.5 and GPT-4, power these picture capabilities by applying language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents comprising text and images.

Gradual Deployment for Security and Resilience

Voice and image capabilities will be gradually handed out to Plus and Enterprise subscribers over the next two weeks. Voice will be available on both the iOS and Android platforms, with the option to opt in via settings, while photos will be available on all devices.

OpenAI recognizes the hazards involved with these increased capabilities. The emphasis for voice is on voice chat, and the technology was created in partnership with voice actors to assure authenticity and safety. Notably, Spotify is leveraging this technology for its Voice Translation service, which allows podcasters to increase their audience by translating content into several languages using their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s capacity to analyze and make direct statements about them using image input. Real-world usage and user input will be critical in further improving these safeguards while ensuring the tool’s usability.


Subscribe to Our Newsletter

Related Articles

Top Trending

best action camera for vlogging
10 Best Action Cameras for Vlogging [GoPro Alternatives]
Habit Stacking
The Power of Habit Stacking: Small Changes, Big Results! Transform Your Life
Best Countries For English-Speaking Students
Top 5 Countries for English-Speaking Students [Outside US/UK]
Best Coding Bootcamps
Are Best Coding Bootcamps Still Relevant for Tech Jobs in 2026? Unlock Careers!
Ramadan
A Look At Ramadan And How Muslims Observe The Holy Month

Fintech & Finance

Robo-Advisors vs DIY Trading
Robo-Advisors Vs DIY Trading: Which Platform Style Fits You Best?
low spread forex brokers
12 Best Forex Trading Brokers With Low Spreads
Best small business credit cards 0% APR
13 Best Small Business Credit Cards with 0% APR Intro Rates
topstep dashboard
Mastering the Topstep Dashboard: Your Central Hub for Funded Trading Success
Family Banking Teaching Kids Financial Literacy with Credit
Family Banking: Teaching Kids Financial Literacy With Credit

Sustainability & Living

Corporate Greenwashing
What is Corporate Greenwashing: How to Spot Fake Eco-Friendly Brands?
Renewable Energy Jobs
Renewable Energy Jobs: The Fastest Growing Career Path [The Next Big Thing]
Ocean Acidification
Unveiling Ocean Acidification: The Silent Killer Of Marine Life!
Indigenous Knowledge In Climate Change
The Role of Indigenous Knowledge In Fighting Climate Change for a Greener Future!
best durable reusable water bottles
Top 6 Reusable Water Bottles That Last a Lifetime

GAMING

how much is 100 gifted subs on twitch
How Much Is 100 Gifted Subs on Twitch? A Complete Breakdown of Costs & Earnings
PlayMyWorld Latest News
Navigating the Future: PlayMyWorld Latest News and Platform Evolution
best gaming chair with footrest
13 Best Gaming Chairs With Footrests And Lumbar Support
best screen recording software
13 Best Screen Recording Software for Tutorials and Gaming in 2026
best streaming microphones
10 Best Streaming Microphones for Twitch and YouTube

Business & Marketing

carolyn chambers
Carolyn Chambers: A Pioneer in Telecommunications and Media Leadership
Robo-Advisors vs DIY Trading
Robo-Advisors Vs DIY Trading: Which Platform Style Fits You Best?
Best Real Estate Crowdfunding Platforms
10 Best Crowdfunding Platforms for Real Estate Investing
Best small business credit cards 0% APR
13 Best Small Business Credit Cards with 0% APR Intro Rates
topstep dashboard
Mastering the Topstep Dashboard: Your Central Hub for Funded Trading Success

Technology & AI

best action camera for vlogging
10 Best Action Cameras for Vlogging [GoPro Alternatives]
Best Coding Bootcamps
Are Best Coding Bootcamps Still Relevant for Tech Jobs in 2026? Unlock Careers!
apps and software aliensync
Mastering Digital Ecosystems: How Apps and Software AlienSync Streamlines Modern Workflows
Best Zoom Alternatives
14 Best Video Conferencing Alternatives to Zoom
Biotech Scalability Tools: What Investors Need to Know
What Investors Should Know About the Tools That Make Biotech Scalable

Fitness & Wellness

Prerona Roy Transformation
Scars, Science, and Scent: The Profound Rebirth of Prerona Roy
mabs brightstar login
Mastering the MABS Brightstar Login: A Professional Guide to the BrightStar Care ABS Portal
noblu glasses
Noblu Glasses Review: Do They Deliver Effective Blue Light Protection?
The Psychological Cost of Climate Anxiety Coping Mechanisms for 2026
The Psychological Cost of Climate Anxiety: Coping Mechanisms for 2026
Modern Stoicism for timeless wisdom
Stoicism for the Modern Age: Ancient Wisdom for 2026 Problems [Transform Your Life]