There are happy days and there are days when you find out that a new technology can automatically transcribe interviews.
Trint was born as an idea in 2013, during the Mozfest conference in London, when Mark Boas met Jeff Kofman, a reporter with three decades of experience. At that time Boas, Laurian Gridinoc and Mark Panaghiston were working on a technology to synchronize text with audio or video. Kofman wondered whether it could be used to automate the speech-to-text process.
In January 2016, that idea came to life and the beta version of Trint was launched. Trint is a new word that combines “transcription” with “interview”. Its creators hope that it may become a popular verb, even though it is not in the dictionary. Yet. “I am gonna trint it”. Doesn’t sound that bad.
In September, the GEN team published an interview with Kofman about Trint’s recent updates and the feedback they had received from reporters and news organizations. “I think we are on our way to becoming a cult. We constantly hear users call Trint ‘magic’.”
When I read about Kofman’s optimism (“transcripts you can trust,” he said), I got optimistic too. In those days I was struggling with the transcription of my conversation with Duy Linh Tu, a professor at Columbia University who visited Chile. The interview was 48 minutes long. He talks very fast and I am a slow typist. A nightmare.
I checked Trint’s website and I used the free 30 minutes they offer.
I uploaded half an hour of conversation. Trint asked me for the accent of the English (British, Australian, American) and then took two to three minutes to process it. When it finished I opened the file and the work was done. A robot had transcribed my interview. As I was checking the result I smiled and remembered all those hours typing like a resigned automaton. Trint not only reduced hours of work to a few minutes. It also synchronized the text with the audio, which makes it easy to locate a passage and correct errors.
Enough words. See for yourself.
In short: you can upload audio or video, the text is “tied” to the audio, and you can search, check and correct it.
After that moment of euphoria, reality set in. Like any work done by machines, it needs further verification. I started reading the transcript and, while almost 90 percent was accurate, the text had mistakes that were impossible to ignore. And they were not necessarily attributable to the robot. For example: catchphrases that spread like a plague and that a human would have omitted. Or incoherent combinations when voices mingled. Or mispronounced words: I was disappointed in my own pronunciation when Trint transcribed “China” instead of “Chile”.
These things can and will improve. It is a fluid and fast tool. Its potential for growth is surprising: live automatic translation, for example. For interviews or spontaneous conversations it can be volatile and imprecise at times. But for more controlled settings, such as a clearly delivered speech, it is ideal.
Trint recognizes English (British, American and Australian) and “European Spanish.” Hopefully it will soon expand to other latitudes and adapt to more pronunciations, modulations and accents.
Not bad for a robot.
Originally published here.