OpenAI’s ChatGPT Now Can Speak, Hear, and See: a Multimodal Upgrade in History

ChatGPT Now Can Speak Hear and See

With the addition of additional speech and image capabilities in ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change the way users engage with the AI model, providing a more intuitive and immersive experience.

Voice Conversations with ChatGPT

One of the most notable aspects of this version is the ability to do voice conversations using ChatGPT. Users may now converse with their AI helper in real time, bringing up a world of possibilities. ChatGPT’s voice skills are ready to assist you whether you’re on the go, looking for a bedtime story for your family, or settling a dinner table disagreement.

To begin using voice, go to the Settings menu in the mobile app, pick “New Features,” and enable voice conversations. Once activated, press the headphone icon in the top-right corner of the home screen to select one of five voices. Professional voice actors have meticulously developed these voices to provide a human-like audio experience. Furthermore, Whisper, OpenAI’s open-source speech recognition system, transcribes spoken words into text, improving the overall quality of the conversation.

Image Interaction with ChatGPT

The ability to share photos with ChatGPT is another game changer. Users can now use ChatGPT to troubleshoot difficulties, explore material, and evaluate complex data by displaying one or more photos. ChatGPT can help you figure out why your grill won’t start, design a dinner based on the contents of your fridge, or analyze a data graph for work.

Tap the photo button to capture or select an image to use this function. Tap the addition button first on iOS or Android to upload several photographs, or use the sketching tool to lead your assistant. Multimodal models, such as GPT-3.5 and GPT-4, power these picture capabilities by applying language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents comprising text and images.

Gradual Deployment for Security and Resilience

Voice and image capabilities will be gradually handed out to Plus and Enterprise subscribers over the next two weeks. Voice will be available on both the iOS and Android platforms, with the option to opt in via settings, while photos will be available on all devices.

OpenAI recognizes the hazards involved with these increased capabilities. The emphasis for voice is on voice chat, and the technology was created in partnership with voice actors to assure authenticity and safety. Notably, Spotify is leveraging this technology for its Voice Translation service, which allows podcasters to increase their audience by translating content into several languages using their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s capacity to analyze and make direct statements about them using image input. Real-world usage and user input will be critical in further improving these safeguards while ensuring the tool’s usability.


Subscribe to Our Newsletter

Related Articles

Top Trending

Technical SEO Audit Tools
The Best 13 Technical SEO Audit Tools to Dominate SERPs
optimization obsession
The 'Optimization' Obsession Is Making Us Sick: Why Wellness Went Too Far!
AI Workflows for Educators to Save Time and Improve Teaching Quality
8 AI Workflows for Educators to Save Time and Improve Teaching Quality
Adding AI Music
A Practical Guide to Adding AI Music to Videos
morning habits better energy
9 Morning Habits for Better Energy

Fintech & Finance

Understanding SIP Investing in Mutual Funds for New Investors
Understanding SIP Investing in Mutual Funds for New Investors
Using an SIP Return Calculator for Mutual Fund Investment Planning
Using an SIP Return Calculator for Mutual Fund Investment Planning
Split AC Installation Tips
Buying a Split AC in 2026: Six Installation Tips to Know Before the Technician Arrives
Multi Asset Allocation Fund: Simple Diversification for Investors
Multi Asset Allocation Fund - A Single Fund Approach for Investors Who Want Diversification Without the Guesswork
Building Wealth Through Cashflow Investing for Time-Rich Lifestyles
Building Wealth Through Cashflow Investing for Time-Rich Lifestyles

Sustainability & Living

Sustainable Food Brands
13 Sustainable Food Brands Worth Knowing for Smarter Grocery Choices
sustainable home goods brands
7 Sustainable Home Goods Brands for a Lower-Waste Home
Compostable Adhesive Tech
6 US SMEs Perfecting Compostable Adhesive Tech for Zero-Waste Brands
sustainable childrens brand
9 Sustainable Children’s Brands Parents Can Actually Trust
Sustainable Footwear Brands
10 Sustainable Footwear Brands for Eco Shoes That Actually Feel Worth Buying

GAMING

Gaming Genres Guide
The Ultimate Gaming Genres Guide: From RPG Mechanics to Esports Mastery
Best Game Streaming Platforms
7 Best Game Streaming Platforms Compared for Creators, Gamers, and Growing Channels
Online Gaming Brands
What Online Brands Can Learn from Casino Sites in 2026 and Beyond
best indie gaming communities
9 Best Indie Gaming Communities for Gamers, Developers, and Hidden-Gem Hunters
Visual Novels and Narrative Games
Visual Novels and Narrative Games Explained: Why Story Beats Mechanics

Business & Marketing

AI Workflows Real Estate Agents
13 AI Workflows for Real Estate Agents to Generate Leads and Close Faster
How to Help Business Growth in UK with Charfen.CO.UK
Charfen.CO.UK: Business Growth Help For UK Entrepreneurs
7 AI Workflows for E-Commerce Brands to Increase Sales and Automate Growth
7 AI Workflows for E-Commerce Brands to Increase Sales and Automate Growth
Understanding SIP Investing in Mutual Funds for New Investors
Understanding SIP Investing in Mutual Funds for New Investors
SaaS growth marketing
SaaS Growth and Marketing Complete Guide: A Practical Roadmap

Technology & AI

AI Workflows for Educators to Save Time and Improve Teaching Quality
8 AI Workflows for Educators to Save Time and Improve Teaching Quality
AI Workflows Real Estate Agents
13 AI Workflows for Real Estate Agents to Generate Leads and Close Faster
7 AI Workflows for E-Commerce Brands to Increase Sales and Automate Growth
7 AI Workflows for E-Commerce Brands to Increase Sales and Automate Growth
AI Music Generation
The Reality Behind the Magic of AI Music Generation
AI podcast production
AI Podcast Production: A Practical Workflow for Planning, Editing, and Publishing Better Episodes

Fitness & Wellness

optimization obsession
The 'Optimization' Obsession Is Making Us Sick: Why Wellness Went Too Far!
morning habits better energy
9 Morning Habits for Better Energy
best healthy habits
33 Healthy Habits Worth Building This Year
eating for fitness goals
Eating for Specific Fitness Goals: How to Eat for Muscle Gain, Fat Loss and Performance
Plant-Based Diets for Athletes
Plant-Based Diets for Athletes