
What does the future look like for humans interacting with technology? Most futurists believe that voice will become the predominant way we interact with it. For voice interactions to work, however, the AI systems that underpin them must be finely tuned to support natural conversation at scale and with very low latency.
Speech recognition and natural language processing (NLP) company Deepgram recently announced the public release of Aura, a text-to-speech (TTS) API that, according to the company, delivers human-like voice quality while being faster and more compute-efficient than other voice AI alternatives. Aura is designed for developers who want to build real-time, conversational voice AI agents that can interact with customers, employees and other users in a natural and engaging way.
Aura, which complements Deepgram's Nova-2 speech-to-text API, can generate speech from any text input, including responses from large language models (LLMs) like ChatGPT, in fractions of a second. This enables fluid and natural-sounding conversations with AI agents that can handle complex and dynamic scenarios. Aura offers a selection of diverse voices suited to conversational use cases, including those that require high degrees of safety, security, speed and scale.
Deepgram's Nova-2 speech-to-text API, which provides accurate and fast transcription of audio streams, is currently used by a variety of global enterprises and organizations, including Spotify, Citibank, NASA and Twilio. With this release, the company said, Deepgram offers developers a complete voice AI platform, giving them the essential building blocks they need, from transcription to sentiment analysis to voice synthesis, to build the high-throughput, real-time AI agents of the future.
“Aura is the result of years of research and development by our team of world-class AI scientists and engineers, who have leveraged the latest advances in deep learning and GPU technology to create a state-of-the-art TTS solution that outperforms anything else on the market,” noted Scott Stephenson, CEO and co-founder of Deepgram. “With Aura, we are empowering developers to create voice AI applications that can truly understand and respond to human speech, opening up new possibilities for enhancing customer experience, productivity, and innovation.”
Edited by
Alex Passett