Future of Work News

AssemblyAI Raises $50M to Develop 'Superhuman' Speech AI, Unlocking New Applications


The global AI market is predicted to reach $1,811.8 billion by 2030, up from $136.6 billion in 2022, a CAGR of 38.1%, according to Grand View Research. Within that broad category sits speech AI, which is gaining momentum thanks to industry giants Google, Amazon and Microsoft as well as startups like Deepgram and Otter.ai. Otter, for example, aims to make meetings more productive by transcribing them and taking notes as they happen.
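For readers who want to check the growth figures, here is a minimal sketch of the compound annual growth rate arithmetic. It assumes eight compounding periods (2022 to 2030), which the article does not state; the function names are illustrative only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value and period."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures quoted in the article, in billions of dollars.
START_2022 = 136.6
END_2030 = 1811.8
YEARS = 2030 - 2022  # assumption: 8 annual compounding periods

implied_rate = cagr(START_2022, END_2030, YEARS)
projected_2030 = project(START_2022, 0.381, YEARS)

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"2030 value at 38.1% CAGR: ${projected_2030:,.1f}B")
```

Under that assumption, the implied rate comes out at roughly 38%, in line with the quoted figure.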

Another platform that is accelerating the momentum around speech AI is AssemblyAI, a speech-to-text and speech intelligence platform. To give itself a distinct edge in this competitive landscape, AssemblyAI recently secured $50 million in Series C funding to develop "superhuman" speech AI models, aiming to revolutionize voice-driven applications across industries.

AssemblyAI was founded with the ambition of creating speech AI models that unlock a new wave of applications powered by voice data. Think of the knowledge contained in company meetings, podcasts, videos, customer calls or even voice-based machine interactions. Accurate understanding and analysis of voice data open the door to a multitude of new opportunities.

Over the past two years, advancements in data availability, computing power and neural network architectures like the Transformer have significantly propelled AI models across various domains, making the dream of superhuman speech AI models more attainable. As an example, AssemblyAI's latest Conformer-2 model, trained on 1.1 million hours of voice data, shows gains in accuracy and robustness in tasks like speech-to-text and speaker identification: a 43% reduction in errors on noisy data compared to other models and a nearly 50% accuracy improvement over previous generations.
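The article does not say how "errors" are measured. A common metric for speech-to-text is word error rate (WER), and the sketch below assumes that metric to show how a relative error reduction such as 43% would be computed. The function names and sample WER values are hypothetical, not AssemblyAI's benchmark data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which the new model's error rate is lower than the baseline's."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values chosen only to illustrate the arithmetic.
sample_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
reduction = relative_error_reduction(baseline_wer=0.10, new_wer=0.057)
print(f"Sample WER: {sample_wer:.3f}, relative reduction: {reduction:.0%}")
```

Note that a relative reduction depends on the baseline: the same 43% figure could describe a drop from 10% to 5.7% WER or from 20% to 11.4%.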

AssemblyAI's current suite of speech-to-text models pairs accurate transcription with additional features like speaker identification, sentiment analysis and chapter detection. The company reports more than 25 million inference calls per day and processes more than 10TB of voice data, serving clients across media, education, healthcare, finance and more.

Taking speech AI further, AssemblyAI is developing its next-generation Universal model, intended for multilingual speech AI tasks. The model is being trained on a dataset of over 10 million hours of voice data using Google's new TPU chips. That represents a 1,250-fold increase in training data compared to the company's first model, released in 2019.

The emergence of powerful LLMs capable of ingesting accurately recognized speech and generating summaries, insights and classifications also opens new possibilities for voice-data-driven products and workflows. This LLM technology underpins AssemblyAI offerings like Audio Intelligence models for automated chapter detection and content moderation, which support brand safety and content management at scale for leading enterprises. Additionally, the new LeMUR product utilizes LLMs for text generation tasks over recognized speech.

"This new capital will support our ambitious research plans, new model development, training compute, market expansion, as well as help us build our team," AssemblyAI founder and CEO Dylan Fox stated in the announcement. "We believe that the best way for us to continue to innovate is to bring together some of the best minds in AI. And, with 10,000-plus new organizations signing up for our API every month, we're just scratching the surface of the new voice-powered AI applications we'll see enter the market over the next year."

The "superhuman" ambition seems to go beyond accuracy. If successful, the development and deployment of "superhuman" speech AI models could have profound implications for various sectors. Imagine classrooms where AI tutors analyze student conversations to personalize learning, or healthcare systems where AI agents interpret medical consultations and identify potential health risks. The possibilities are vast, and AssemblyAI is poised to be at the forefront of this transformative technology.

The round, led by Accel, brings AssemblyAI's total funds raised to $115 million — 90% of which was raised in the last 22 months, as organizations across virtually every industry have raced to embed Speech AI capabilities into their products, systems and workflows.




Edited by Alex Passett

Future of Work Contributor
