AssemblyAI Raises $50M to Develop 'Superhuman' Speech AI, Unlocking New Applications

The global AI market is predicted to reach $1,811.8 billion by 2030, up from $136.6 billion in 2022, a CAGR of 38.1%, according to Grand View Research. Under that broad AI umbrella sits speech AI, which is gaining momentum thanks to industry giants such as Google, Amazon and Microsoft as well as startups like Deepgram and Otter.ai. Otter, for example, aims to make meetings more productive by transcribing them and taking notes throughout.
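As a quick sanity check on the market figures quoted above, the growth rate implied by the two endpoints can be recomputed with the standard compound annual growth rate formula. This is a minimal sketch: the function name `implied_cagr` is illustrative, and the eight-year span (2022 to 2030) is an assumption about how the forecast period is counted.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
market_2022 = 136.6
market_2030 = 1811.8
years = 2030 - 2022  # assumed span of the forecast

cagr = implied_cagr(market_2022, market_2030, years)
print(f"Implied CAGR: {cagr:.1%}")
```

Under that assumption, the result lines up with the 38.1% figure cited from Grand View Research.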

AssemblyAI, a speech-to-text and speech intelligence platform, is also adding to the momentum around speech AI. To give itself a distinct edge in this competitive landscape, the company recently secured $50 million in Series C funding to develop "superhuman" speech AI models, aiming to revolutionize voice-driven applications across industries.

AssemblyAI was founded with the ambition of creating speech AI models that unlock a new wave of applications powered by voice data. Think of the knowledge contained in company meetings, podcasts, videos, customer calls or even voice-based machine interactions. Accurate understanding and analysis of voice data opens doors to a multitude of novel opportunities.

Over the past two years, advances in data availability, computing power and neural network architectures like the Transformer have propelled AI models across various domains, making the goal of superhuman speech AI models more attainable. For example, AssemblyAI's latest Conformer-2 model, trained on 1.1 million hours of voice data, targets accuracy and robustness in tasks like speech-to-text and speaker identification. It offers a 43% reduction in errors on noisy data compared to other models and a nearly 50% accuracy improvement over previous generations.

In short, AssemblyAI's current suite of speech-to-text models pairs accurate transcription with additional features like speaker identification, sentiment analysis and chapter detection. The company reports over 25 million daily inference calls and processes more than 10TB of voice data, serving clients across media, education, healthcare, finance and more.

Taking speech AI further, AssemblyAI is developing its next-generation Universal model, which it intends to be a strong performer in multilingual speech AI tasks. The model is being trained on a dataset of over 10 million hours of voice data using Google's new TPU chips. That represents a 1,250-times increase in training data compared to the company's first model, released in 2019.
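To put those training-data figures in perspective, the arithmetic can be worked backward. The sketch below assumes the "over 10 million hours" figure is treated as exactly 10 million (so the results are lower bounds), and the variable names are illustrative. The size of the 2019 dataset is inferred from the quoted multiplier, not stated in the announcement.

```python
# Figures quoted in the article; "over 10 million" is treated as a lower bound.
universal_hours = 10_000_000   # next-generation Universal model
conformer2_hours = 1_100_000   # Conformer-2
scale_vs_first_model = 1_250   # quoted increase over the 2019 model

# Inferred, not stated in the source: hours behind the first model.
first_model_hours = universal_hours / scale_vs_first_model

# How much larger the Universal dataset is than Conformer-2's.
scale_vs_conformer2 = universal_hours / conformer2_hours

print(f"Implied 2019 training set: about {first_model_hours:,.0f} hours")
print(f"Universal vs. Conformer-2: about {scale_vs_conformer2:.1f}x")
```

On these assumptions, the first model would have been trained on roughly 8,000 hours of audio, and the Universal dataset is about nine times the size of Conformer-2's.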

The emergence of powerful LLMs capable of ingesting accurately recognized speech and generating summaries, insights and classifications also opens new possibilities for voice-data-driven products and workflows. This LLM technology underpins AssemblyAI offerings like Audio Intelligence models for automated chapter detection and content moderation, which support brand safety and content management at scale for leading enterprises. Additionally, the new LeMUR product utilizes LLMs for text generation tasks over recognized speech.

“This new capital will support our ambitious research plans, new model development, training compute, market expansion, as well as help us build our team,” AssemblyAI founder and CEO Dylan Fox stated in the announcement. “We believe that the best way for us to continue to innovate is to bring together some of the best minds in AI. And, with 10,000-plus new organizations signing up for our API every month, we're just scratching the surface of the new voice-powered AI applications we'll see enter the market over the next year.”

The "superhuman" ambition seems to go beyond accuracy. If successful, the development and deployment of "superhuman" speech AI models could have profound implications for various sectors. Imagine classrooms where AI tutors analyze student conversations to personalize learning, or healthcare systems where AI agents interpret medical consultations and identify potential health risks. The possibilities are vast, and AssemblyAI is poised to be at the forefront of this transformative technology.

The round, led by Accel, brings AssemblyAI's total funds raised to $115 million — 90% of which was raised in the last 22 months, as organizations across virtually every industry have raced to embed Speech AI capabilities into their products, systems and workflows.




Edited by Alex Passett
Future of Work Contributor
