Gladia, an AI transcription and audio intelligence provider, has raised $16 million in funding.
The Paris, France-based company will use the funding to develop an end-to-end audio infrastructure, starting with a new real-time audio transcription and analytics engine, enabling voice-first platforms to deliver more value to their users across borders with cutting-edge AI.
It’s a challenge to rivals such as Otter.ai and Fireflies.ai, as well as other AI-based services that transcribe voice conversations to text. In an interview with VentureBeat, CEO Jean-Louis Quéguiner explained to me why he started the company.
“As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner said. “That’s why I founded the company.”
I got a demo of the AI transcription, and it worked in real time as Quéguiner spoke English with his heavy French accent. I’m used to services like Otter getting a lot of words wrong in a transcription, but in the first page of results from Gladia, I saw no errors. He also showed how he could speak two different languages and the system could shift from one language to another as needed.
XAnge led the round, with participation by Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.
Founded in 2022, Gladia has now raised a total of $20.3 million, with earlier seed investments led by New Wave, Sequoia Capital (as part of the First Sequoia Arc program), Cocoa, and GFC. Gladia was recently selected to participate in the AWS generative AI accelerator program.
“Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” said Alexis du Peloux, partner at XAnge, in a statement. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”
Given that most speech recognition models today are trained predominantly on English audio data and are therefore inherently biased, Gladia prioritized building the first real-time product that’s truly multilingual.
The new fine-tuned engine delivers superior real-time transcription in over 100 languages, along with enhanced support for accents and the unique ability to adapt to different languages on the fly.
Gladia’s new engine is unique in its ability to extract insights from a call, such as the caller’s sentiment, key information, and conversation summary, in real time. This means it takes less than a second to generate both transcript and insights from a call or meeting using Gladia.
New real-time AI transcription
Building an accurate, low-latency, and multilingual engine in-house is a complex and resource-intensive task. It requires extensive expertise in language understanding and real-time data handling, along with continuous optimization and maintenance. Real-time models require more computing power and may struggle to produce accurate output immediately due to limited context.
Gladia’s new product allows companies to bypass these challenges. The real-time speech-to-text engine boasts an industry-leading latency of under 300 milliseconds without compromising accuracy, regardless of the language, geography, or tech stack used.
“Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” said Jonathan Soto, CTO of Gladia, in a statement. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”
What’s ahead
The company’s first async transcription and audio intelligence API launched in June 2023 and was based on a proprietary version of Whisper ASR.
It quickly gained traction in the enterprise market, notably with meeting recorders and note-taking assistants. The API is now adopted by over 600 customers around the world, including Attention, Circleback, Method Financial, Recall, Sana, and VEED.IO, and has more than 70,000 users.
“Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platform, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner said. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”
Gladia will use the new capital to advance its R&D efforts, soon bring to market a one-stop AI toolkit for audio, and expand its product offering with additional à la carte models, including large language models (LLMs) and retrieval-augmented generation (RAG). With several design partners in the contact-center-as-a-service (CCaaS) segment, the company is currently piloting an agent-assist solution powered by Gladia’s real-time AI engine. Additionally, Gladia will continue to grow its talent base as it prepares for international expansion.
“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner said. “You can start with the language and switch to another.”
He went on to show me that he could start a call in English and initiate the transcription. Then he spoke French words, and the model correctly transcribed them in French.
“Keep in mind that [others] are not real time right now, and this one is real time,” he said. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.”
The service has an AI summarizer, and it will have new optional features in the coming months. Quéguiner said that his service can also get acronyms right and detect the switch to another language.
“The model we use is similar to LLMs (large language models). It has an encoder-decoder architecture, which isn’t the case for most of the models that you’ve seen with Fireflies, for instance,” Quéguiner said.
The market includes “meeting recorders,” Quéguiner said. The results can be passed on to real-time insights, which can help people like sales leads close deals faster.
The company also works with call centers, giving them 30% faster time to completion when they are on the phone, thanks to better accuracy. The company will charge a flat fee, such as per-hour pricing.