OpenAI’s ChatGPT Now Can Speak, Hear, and See: a Multimodal Upgrade in History

ChatGPT Now Can Speak Hear and See

With the addition of additional speech and image capabilities in ChatGPT, OpenAI is once again pushing the boundaries of AI technology. These features are expected to change the way users engage with the AI model, providing a more intuitive and immersive experience.

Voice Conversations with ChatGPT

One of the most notable aspects of this version is the ability to do voice conversations using ChatGPT. Users may now converse with their AI helper in real time, bringing up a world of possibilities. ChatGPT’s voice skills are ready to assist you whether you’re on the go, looking for a bedtime story for your family, or settling a dinner table disagreement.

To begin using voice, go to the Settings menu in the mobile app, pick “New Features,” and enable voice conversations. Once activated, press the headphone icon in the top-right corner of the home screen to select one of five voices. Professional voice actors have meticulously developed these voices to provide a human-like audio experience. Furthermore, Whisper, OpenAI’s open-source speech recognition system, transcribes spoken words into text, improving the overall quality of the conversation.

Image Interaction with ChatGPT

The ability to share photos with ChatGPT is another game changer. Users can now use ChatGPT to troubleshoot difficulties, explore material, and evaluate complex data by displaying one or more photos. ChatGPT can help you figure out why your grill won’t start, design a dinner based on the contents of your fridge, or analyze a data graph for work.

Tap the photo button to capture or select an image to use this function. Tap the addition button first on iOS or Android to upload several photographs, or use the sketching tool to lead your assistant. Multimodal models, such as GPT-3.5 and GPT-4, power these picture capabilities by applying language reasoning skills to a wide range of visual input, such as photos, screenshots, and documents comprising text and images.

Gradual Deployment for Security and Resilience

Voice and image capabilities will be gradually handed out to Plus and Enterprise subscribers over the next two weeks. Voice will be available on both the iOS and Android platforms, with the option to opt in via settings, while photos will be available on all devices.

OpenAI recognizes the hazards involved with these increased capabilities. The emphasis for voice is on voice chat, and the technology was created in partnership with voice actors to assure authenticity and safety. Notably, Spotify is leveraging this technology for its Voice Translation service, which allows podcasters to increase their audience by translating content into several languages using their own voices.

To protect people’s privacy, OpenAI has limited ChatGPT’s capacity to analyze and make direct statements about them using image input. Real-world usage and user input will be critical in further improving these safeguards while ensuring the tool’s usability.


Subscribe to Our Newsletter

Related Articles

Top Trending

Monster Hunter Wilds Affinity
Monster Hunter Wilds Affinity Explained: Critical Chance And Negative Crits
Akuma Layered Armor
How to Get the Akuma Layered Armor in Monster Hunter Wilds
How to Earn Passive Income Without Trading
How to Earn Passive Income Without Trading in a Volatile Market
How to Make Profits With Digital Drop-Servicing
How to Make Profits With Digital Drop-Servicing: A Guide to Earn Big in 2026
Witch Hunt
The Witch Hunt: Why Momoka’s Game Was the Ultimate Test of Trust [Not Intelligence]

Fintech & Finance

How to Earn Passive Income Without Trading
How to Earn Passive Income Without Trading in a Volatile Market
high yield savings accounts in January 2026
Top 5 High-Yield Savings Accounts (HYSA) for January 2026
What Is Teen Banking
What Is Teen Banking: The Race To Capture The Gen Alpha Market [The Next Big Thing]
How to Conduct a SaaS Audit Cutting Bloat in Q1 2026
How To Conduct A SaaS Audit: Cutting Bloat In Q1 2026
The Evolution of DAOs Are They Replacing Corporations
The Evolution Of DAOs: Are They Replacing Corporations?

Sustainability & Living

What Is The Sharing Economy
What Is The Sharing Economy: Borrowing Tools Instead Of Buying [Save Big]
Net-Zero Buildings
Net-Zero Buildings: How To Achieve Zero Emissions [The Ultimate Pathway to a Greener Future]
Fusion Energy
Fusion Energy: Updates on the Holy Grail of Power [Revisiting The Perspective]
Tiny homes
Tiny Homes: A Solution to Homelessness or Poverty with Better Branding?
Smart Windows The Tech Saving Energy in 2026 Skyscrapers
Smart Windows: The Tech Saving Energy in 2026 Skyscrapers

GAMING

Monster Hunter Wilds Affinity
Monster Hunter Wilds Affinity Explained: Critical Chance And Negative Crits
Akuma Layered Armor
How to Get the Akuma Layered Armor in Monster Hunter Wilds
Is Monster Hunter Wilds Open World
Is Monster Hunter Wilds An Open World Game? The Map & Regions Explained
Monster Hunter Wilds Story Length
How Many Chapters Are In Monster Hunter Wilds? Story Length Guide
steam deck alternatives in 2026
Top 5 Handheld Consoles to Buy in 2026 (That Aren't the Steam Deck)

Business & Marketing

How to Make Profits With Digital Drop-Servicing
How to Make Profits With Digital Drop-Servicing: A Guide to Earn Big in 2026
15 Best AI Productivity Tools for Remote Teams in 2026
15 Best AI Productivity Tools for Remote Teams in 2026
Side Hustles to Avoid
5 Popular Side Hustles That Are A Complete Waste of Time in 2026
Digital Drop-Servicing is the King of 2026
Forget Dropshipping: Why "Digital Drop-Servicing" Is The King Of 2026
How To Sell Notion Templates
Write Once, Sell Forever: How To Sell Notion Templates In 2026 [Profit Blueprint]

Technology & AI

15 Best AI Productivity Tools for Remote Teams in 2026
15 Best AI Productivity Tools for Remote Teams in 2026
best free SaaS tools
Work, Wealth, And Wellness: 50 Best Free SAAS Tools to Optimize Your Life in 2026
Why Local SaaS Hosting Matters More Than Ever
Data Sovereignty: Why Local SaaS Hosting Matters More Than Ever
Prompt Engineering Is Dead Here Are the 4 Tech Skills Actually Paying
Prompt Engineering Is Dead: Here Are the 4 Tech Skills Actually Paying in 2026
high income skills
Stop Driving Uber: 5 High-Paying Digital Skills You Can Learn in a Weekend

Fitness & Wellness

Mental Health First Aid for Managers
Mental Health First Aid: A Mandatory Skill for 2026 Managers
The Quiet Wellness Movement Reclaiming Mental Focus in the Hyper-Digital Era
The “Quiet Wellness” Movement: Reclaiming Mental Focus in the Hyper-Digital Era
Cognitive Optimization
Brain Health is the New Weight Loss: The Rise of Cognitive Optimization
The Analogue January Trend Why Gen Z is Ditching Screens for 30 Days
The "Analogue January" Trend: Why Gen Z is Ditching Screens for 30 Days
Gut Health Revolution The Smart Probiotic Tech Winning CES
Gut Health Revolution: The "Smart Probiotic" Tech Winning CES