OpenAI’s ChatGPT Can Now Speak, Hear, and See: A Historic Multimodal Upgrade

With the addition of new voice and image capabilities in ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change the way users engage with the AI model by letting them talk to it and show it what they are looking at, making for a more intuitive experience.

Voice Conversations with ChatGPT

One of the most notable aspects of this release is the ability to hold voice conversations with ChatGPT. Users can now have a back-and-forth spoken conversation with their AI assistant, which opens up a range of new uses. ChatGPT’s voice feature can help whether you’re on the go, looking for a bedtime story for your family, or settling a dinner table disagreement.

To begin using voice, go to the Settings menu in the mobile app, select “New Features,” and opt in to voice conversations. Once the feature is enabled, tap the headphone icon in the top-right corner of the home screen and choose one of five voices. OpenAI created each voice in collaboration with professional voice actors, using a new text-to-speech model that generates human-like audio. On the input side, Whisper, OpenAI’s open-source speech recognition system, transcribes your spoken words into text.

Image Interaction with ChatGPT

The ability to share images with ChatGPT is the other major addition. Users can now show ChatGPT one or more images to troubleshoot problems, explore content, or analyze complex data. ChatGPT can help you figure out why your grill won’t start, plan a meal based on the contents of your fridge, or analyze a chart for work.

To use this feature, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also discuss several images at once or use the drawing tool to point your assistant to a specific part of an image. Multimodal versions of GPT-3.5 and GPT-4 power these image capabilities, applying their language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents containing both text and images.

Gradual Deployment for Security and Resilience

Voice and image capabilities will be rolled out gradually to Plus and Enterprise subscribers over the next two weeks. Voice will be available on iOS and Android, where users opt in through Settings, while image input will be available on all platforms.

OpenAI recognizes the risks that come with these expanded capabilities. Because realistic synthetic voices could be misused to impersonate people or commit fraud, the company is limiting the voice technology to one specific use case, voice chat, and created the voices with actors it worked with directly. OpenAI is also collaborating with selected partners in a similar way. Notably, Spotify is using this technology to pilot its Voice Translation feature, which helps podcasters reach a wider audience by translating their shows into other languages in their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s ability to analyze and make direct statements about people who appear in images. Real-world usage and user feedback will be critical to refining these safeguards while keeping the tool useful.

