The AI bots have cast their vote for the 2020 presidential candidates in a study released by Trint, the leading automated transcription platform. The clear winners in the AI age: Senators Bernie Sanders and Amy Klobuchar, who both landed perfect scores with every word transcribed correctly by Trint’s AI. Former Vice President Joe Biden and former Representative Beto O’Rourke scored the lowest in speech-to-text accuracy.
Reflecting how accurate automated transcription has become, Sanders received a 100 percent score, despite his thick New York accent. His clear enunciation, high volume and slow, measured speaking style enabled Trint’s audio-to-text algorithm to transcribe the senator’s every word correctly, including transcribing “faw-tee” as “forty” and “pov-a-tee” as “poverty.”
Biden earned the lowest score, partly due to his tendency to slur words together, causing the AI to miss shorter words like “the” and “in.” The AI was unable to recognize and transcribe words spoken by Biden in five instances, and also substituted words like “we” for “to” and “our” for “his” in other cases. O’Rourke, who tends to trail off at the end of sentences, earned the second-lowest score. For example, while O’Rourke said “those students marching not just for their lives but for all of ours,” the AI transcribed it as “those students marching not just for their lives but for all of us.”
President Donald Trump came in third to last. Although he speaks slowly, he often swallows suffixes, causing the AI to transcribe “tariffed” as “tariffs,” for example.
To conduct the analysis for the Trint Index, Trint ran the closing statements from the June 26th and 27th Democratic presidential primary debates through its automated speech-to-text software, using samples from the 11 leading candidates. Trint also tested President Trump, using his remarks from a June 29th press conference at the G20 Summit. Trint scored each transcript using Word Error Rate (WER), a common speech recognition metric that counts the substituted, deleted, and inserted words in a transcript and divides that total by the number of words actually spoken. A transcript with a WER of 5.5 percent is 5.5 percent inaccurate or, conversely, 94.5 percent accurate.
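For readers who want to see how such a score is computed, here is a minimal sketch of the standard WER calculation in Python. It is not Trint's implementation: the function name `word_error_rate`, the lowercasing, and the whitespace tokenization are assumptions made for illustration, since the study does not describe how Trint normalizes text before scoring. The sample sentence is the O'Rourke fragment quoted above, scored on its own rather than as part of his full closing statement.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: (S + D + I) / words in reference * 100."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    # prev[j] is the distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: a spoken word is missing
                curr[j - 1] + 1,     # insertion: an extra word was added
                prev[j - 1] + cost,  # match, or substitution of one word
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


spoken = "those students marching not just for their lives but for all of ours"
transcribed = "those students marching not just for their lives but for all of us"

wer = word_error_rate(spoken, transcribed)
print(f"WER: {wer:.2f}%  accuracy: {100 - wer:.2f}%")
```

One substituted word in a 13-word fragment gives a WER of about 7.69 percent for that fragment alone. Over a full closing statement the same single error is divided by a much larger word count, which is why the overall scores reported in the index are far higher than any short excerpt would suggest.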
“Thanks to huge advances in the last few years, the accuracy of automated transcription has become staggeringly good, proving that the pain of manual transcription is a thing of the past,” said Trint CEO and Founder Jeff Kofman, who spent more than 30 years as a reporter with CBS, ABC and CBC News. Kofman estimates he manually transcribed thousands of hours of audio during his broadcasting career. “The proof is in the results: nine out of the twelve candidates in this Trint Index were transcribed with better than 98 percent accuracy. The results show that with clear audio and little background noise, A.I. can even transcribe Senator Sanders’ distinctive accent with 100 percent accuracy.”
Here are the Trint Index results for the top Democratic candidates and Donald Trump:
1. Amy Klobuchar: 100.00% (tie)
1. Bernie Sanders: 100.00% (tie)
2. Cory Booker: 99.50%
3. Andrew Yang: 99.24%
4. Elizabeth Warren: 98.88%
5. Pete Buttigieg: 98.84% (tie)
5. Kamala Harris: 98.84% (tie)
6. Marianne Williamson: 98.58%
7. Julian Castro: 98.47%
8. Donald Trump: 97.37%
9. Beto O'Rourke: 95.83%
10. Joe Biden: 95.33%
Kofman notes that even the candidates at the back of the pack have impressively accurate results, but he cautions that transcripts still need to be checked: “While automated speech-to-text has made huge strides, users can’t assume its results will be perfect. With Trint’s interactive editor it’s easy for users to listen, verify and correct so they have transcripts they can trust.”
Since its launch, more than 300,000 users have transcribed and edited over 1.5 million audio/video files using Trint. The company’s enterprise and team clients include major media organizations (The Associated Press, Washington Post), video production companies (America’s Test Kitchen, NowThis News), universities (Princeton, New York University) and market researchers (Clutch).