Updated ChatGPT Adds Voice Conversation and Image Recognition Support: Details

Published

September 26, 2023

ChatGPT will soon be able to discuss user-submitted photos and hold ‘back-and-forth’ conversations using five voices

OpenAI announced on Monday that ChatGPT is gaining support for voice conversations and image recognition. The company’s AI-powered chatbot will soon be able to interpret images that users upload or share and provide context or related information on all platforms where it is available. On the ChatGPT mobile app, it will also be able to hold back-and-forth voice conversations using OpenAI’s Whisper speech recognition system and a new text-to-speech (TTS) technology from the company that is said to produce “human-like” audio.

In a blog post, OpenAI said that ChatGPT’s new image recognition capability will be available on all platforms, while voice conversations will be an opt-in feature on iOS and Android. Both capabilities will be available to ChatGPT Plus and Enterprise subscribers; the company has not said whether they will be offered to free users in the future.

Voice conversations can be enabled in the ChatGPT app by selecting the Voice Conversations option under Settings > New Features. You can then choose from five voices, which OpenAI says it created in partnership with professional voice actors. The app transcribes your spoken queries into text that the chatbot can process, and the chatbot’s responses are converted into audio using the company’s new TTS technology.

ChatGPT is not the only service that will use OpenAI’s new TTS technology. On Monday, Spotify launched an AI-based voice translation tool for podcast creators that automatically translates a podcast from English into French, German, and Spanish. According to the streaming service, the tool is currently being tested by a select group of podcasters, and translated episodes will be accessible to all users wherever Spotify is available.

According to OpenAI, the new image recognition feature is powered by the company’s multimodal GPT-3.5 and GPT-4 models and can process both the images and the text found in photos, screenshots, and documents. To get insights from ChatGPT, users can either take a new photo or share an existing one from their phone.

According to OpenAI, ChatGPT will also let users upload multiple images and discuss them with the chatbot. If you want it to focus on a specific area, you can mark that part of the image with the built-in drawing tool. For instance, if you share a photo of a bicycle whose chain has come off and circle the chain, ChatGPT may be able to suggest how to fix it.