AVSubtitles - Forum

Speech recognition: testing Rev.ai

truc1979

2021-07-17 17:08:46

Hi there,
Thanks to @javabeanies, I spent a little time testing rev.ai.
So far, I've only tested 3 English samples:
- One from Xena XXX (good audio)
- One from Chastity Johnson (vintage, so average audio)
- One from Elvis XXX (a song)

If you're interested, here are the results: https://1fichier.com/?1ez5wivgtt12l6qwwyst
There are 2 SRT files for each sample; the *_yt.srt ones were generated by YouTube.

At first sight, the main difference is that rev.ai chooses very good timestamps for the SRT file.
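
For anyone who wants to put numbers on the timing difference instead of eyeballing it, here is a minimal Python sketch. It only assumes the standard SRT layout (index line, "HH:MM:SS,mmm --> HH:MM:SS,mmm" line, then text); the function names parse_srt and timing_stats are made up for this example and are not part of any rev.ai or YouTube tool.

```python
import re
from datetime import timedelta

# SRT cue timing line, e.g. "00:01:02,500 --> 00:01:05,000"
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt(text):
    """Return a list of (start, end, text) tuples from SRT content."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            m = TIMING.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
                end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
                # Everything after the timing line is the subtitle text.
                cues.append((start, end, " ".join(lines[i + 1:])))
                break
    return cues


def timing_stats(cues):
    """Summarize cue count, average duration (seconds) and overlapping cues."""
    durations = [(end - start).total_seconds() for start, end, _ in cues]
    # A cue overlaps when it starts before the previous cue has ended.
    overlaps = sum(1 for prev, cur in zip(cues, cues[1:]) if cur[0] < prev[1])
    avg = sum(durations) / len(durations) if durations else 0.0
    return {"cues": len(cues), "avg_duration": avg, "overlaps": overlaps}


# Hypothetical two-cue sample, just to show the expected input shape.
SAMPLE = """1
00:00:01,000 --> 00:00:03,000
Hello there

2
00:00:02,500 --> 00:00:04,500
we need to go check it out
"""

if __name__ == "__main__":
    print(timing_stats(parse_srt(SAMPLE)))
```

Running timing_stats on each file of a pair (the rev.ai one and the *_yt.srt one) gives a rough comparison of cue length and overlap. It says nothing about whether the cues actually line up with the speech; that still needs a listen.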

YouTube was the only one able to get the name "Xena", but I think rev.ai is a little closer to what is actually said. (However, on the sentence "just like a total Kyson we need to go check it out", I don't know which one is right :( )

By the way, we can see the same problem we noticed with the Japanese sample, when words sound similar. You can see in these samples that some "know" became "no".
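
A quick way to spot this kind of swap is to diff the two transcripts word by word. The sketch below uses Python's standard difflib; the helper name word_substitutions and the example sentences are invented for illustration, they are not taken from the samples.

```python
import difflib
import re


def word_substitutions(text_a, text_b):
    """List (words_in_a, words_in_b) pairs where the two transcripts differ."""
    # Lowercase and keep only word characters and apostrophes.
    a = re.findall(r"[\w']+", text_a.lower())
    b = re.findall(r"[\w']+", text_b.lower())
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    subs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" means a run of words in a was swapped for a run in b.
        if tag == "replace":
            subs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return subs


if __name__ == "__main__":
    # Made-up sentences showing a homophone swap.
    print(word_substitutions("I know we need to go", "I no we need to go"))
```

This only shows where the two engines disagree, not which one is right, so the pairs it returns still have to be checked against the audio.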

Unfortunately, both YouTube and rev.ai were unable to detect the sung part from "Elvis XXX" :(

Their API is quite simple to use, and their engine seems a little faster than YouTube (but yeah, I should test Google Cloud instead).

In conclusion, for speech recognition on the 3 samples, I'd say it's a draw, 50/50. But for SRT generation, rev.ai is far better.

I hope to have time to check with different languages soon.