How I Tried Seamless, the AI Translator That Sounds Like Me
A tech enthusiast’s review of Meta AI’s new product that enables expressive, real-time cross-lingual communication
Watch this Spanish example below! Thanks, Graham Walker, for posting about this on LinkedIn.
Meta AI has just launched a new product for public testing: Seamless, a suite of AI language translation models that preserve vocal expression and translate in near real time. As a tech enthusiast and an aspiring multilingual speaker, I was eager to try it out and share my experience with you.
German example below!
You can try it yourself here for free!
Here's what I found out about Seamless and its features.
Seamless is a system that enables expressive and real-time cross-lingual communication. It consists of three components: SeamlessExpressive, SeamlessStreaming, and SeamlessM4T v2, which are built on the fairseq2 toolkit and the UnitY2 architecture. Translation…cool AI stuff.
SeamlessExpressive is a model that transfers tone, emotion, style, rhythm, and speech rate from the source speech to the target speech. It supports six languages: English, Spanish, German, French, Italian, and Chinese. As you saw above, I tested it with a few simple sentences, and I was impressed by how natural and expressive the translated speech sounded. It was like listening to a native speaker with the same personality as me. I even checked the accuracy of the German example above with one of our past au pairs from Germany!
SeamlessStreaming is a model that generates the translation while the speaker is still talking, using the EMMA (Efficient Monotonic Multihead Attention) algorithm. It reduces the delay to around two seconds. It supports nearly 100 input and output languages for speech recognition and speech-to-text translation, and nearly 100 input languages and 36 output languages for speech-to-speech translation. You could try this with a live video stream of a news report in another language, and you should be able to follow along with the English translation without much lag or interruption. It is like having a simultaneous interpreter in your ear, like the ones at those cool UN meetings.
SeamlessM4T v2 is the foundation model for Seamless. According to Meta, it achieves state-of-the-art results for automatic speech recognition, speech-to-speech, speech-to-text, and text-to-speech translation across nearly 100 languages. It also incorporates toxicity mitigation and audio watermarking techniques to ensure responsible and safe use of the technology. This would be like having access to a universal translator that can handle almost any task and language. Mind blown!
Seamless by Meta appears to be a groundbreaking product that offers a new level of expressiveness and efficiency in language translation. It is currently available for public testing on Meta AI's website. Take it for a spin! I highly recommend playing with it and thinking about how this technology could bring people together, regardless of the language they speak.