The Captions

This post is severely out of chronological order, since working with captions is more or less the last thing you do when publishing a course. It’s what I’m doing now, and believe me, it’s the least fun part and the part taking the most time (except for actually preparing and recording the course material, but that’s not my problem; Hania did all that).

The way this works with Udemy is that once you’ve created your entire course and uploaded it, you can ask the system to generate captions for the videos. They’re using speech-to-text for that, and it’s amazingly good. It’s almost perfect, but that “almost” is the problem. It may be great for a cooking course taught by a native speaker of American English, but for mathematics… not so much.

Bad caption
An “interesting” interpretation of some mathematical terms.

Just so you know, nothing about the Middle East is ever discussed in this course. The major problem with these weird caption interpretations is that it takes me a while to recover enough from all the laughing to get on with my work.

Udemy has a built-in editor for captions, looking like this:

The Udemy caption editor.

The idea behind this caption editor is fine, but it’s very hard to use. The problem is the lag, no doubt caused by the heaps of scripts needed to make the page work. It’s hard to click in the right places, and moving around in general is laggy [1].

Another thing I think Udemy needs to fix is the activation of the captions. As soon as the captions are auto-generated, they’re made public. You can deactivate them, but then you can’t edit them anymore. So you have to let the horrible auto-generated version remain public while you try to edit and fix it as quickly as possible. Remember that for a 50-hour course like our first one, listening through everything and correcting it will take at least double or triple the running time, which means 100 to 150 hours. Oh, boy… and meanwhile students risk having to ask themselves why they’re going to Iraq or what a “Santorum center” is doing in this context.

Anyway, back to the actual editing. I quickly tired of the Udemy editor. But there’s a way to download the caption text files and edit them offline. You can download one file at a time, edit it, and upload it again, but if you do, take care to give it a file extension of “vtt”. Udemy forgot to mention that little fact, which caused me some aggravation.
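A minimal sketch of that extension fix in Python, for anyone who edits many single files. The function name `ensure_vtt_extension` is hypothetical, and the header check is an extra: it relies only on the fact that a WebVTT file begins with the text “WEBVTT”.

```python
from pathlib import Path


def ensure_vtt_extension(path: Path) -> Path:
    """Rename a caption file so it ends in .vtt and return the new path.

    Hypothetical helper, not part of any Udemy tooling. It refuses files
    that do not begin with the "WEBVTT" signature of the WebVTT format.
    """
    # utf-8-sig silently drops a byte order mark if the file has one.
    text = path.read_text(encoding="utf-8-sig")
    if not text.startswith("WEBVTT"):
        raise ValueError(f"{path} does not look like a WebVTT file")

    target = path.with_suffix(".vtt")
    if target != path:
        # Note: on macOS and Linux this overwrites an existing target file.
        path.rename(target)
    return target
```

Calling `ensure_vtt_extension(Path("lecture01.txt"))` would leave you with `lecture01.vtt`, ready for upload; the file name here is only an example.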

Or you can download the lot of them in one go using the menu at the top right of the “captions” screen in Udemy. It says “bulk download”. In that zip file, you’ll find all the caption files in WebVTT format, with pretty useful file names and the correct extension.
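To get an overview of how much work is waiting in that zip file, something like the following Python sketch could help. The function names and the archive name `captions.zip` are assumptions for illustration; the only format detail it relies on is that every WebVTT cue has a timing line containing `-->`.

```python
import zipfile
from pathlib import Path


def unpack_captions(zip_path: Path, dest: Path) -> list[Path]:
    """Extract the bulk-download archive and return its .vtt files, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # rglob also finds files if the archive contains subfolders.
    return sorted(dest.rglob("*.vtt"))


def count_cues(vtt_path: Path) -> int:
    """Count the cues in a caption file by counting its timing lines."""
    text = vtt_path.read_text(encoding="utf-8-sig")
    return sum(1 for line in text.splitlines() if "-->" in line)
```

Looping over `unpack_captions(Path("captions.zip"), Path("captions"))` and printing `count_cues(f)` for each file gives a rough per-lecture measure of how many caption lines remain to be checked.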

The three dots way up to the right next to the “disable” box.

So what you do is open a text editor (in my case Sublime Text) with the caption file on the right and the Udemy caption editor on the left. You can start and stop the Udemy playback with shift-space, back up a second with shift-left arrow, and go forward a second with shift-right arrow. Obviously, this is nothing but pain, since it means shifting back and forth between the two apps. Something needs to be done.

Udemy caption editor and Sublime Text next to each other.

Keyboard Maestro to the rescue. I created three macros in KM to do this dance for me. All three are in a group that includes only Sublime Text, so the macros have to be started with ST in focus. Each of the macros shifts to Brave (Udemy needs to be in the foreground window in Brave), sends the keystrokes (shift-space or shift-arrow), then shifts back to ST. I trigger the macros with F1, F2, and F3, so they turn into “back”, “play/stop”, and “forward”, respectively.

The “back” hot key definition in Keyboard Maestro. I’m using three repeats of the keystroke to make it a more reasonable three-second interval.
Start and stop. It works better, though not 100%, with the slight delay there.
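For readers without Keyboard Maestro, the same dance can be sketched in Python by handing a small AppleScript to macOS’s `osascript`. This is not the macros described above, only a rough equivalent under assumptions: the application names “Brave Browser” and “Sublime Text”, the 0.2 second delay, and the function names are all guesses for illustration, and macOS will ask for accessibility permission before it lets a script send keystrokes.

```python
import subprocess

# macOS virtual key codes: 49 = space, 123 = left arrow, 124 = right arrow.
KEY_CODES = {"play_stop": 49, "back": 123, "forward": 124}


def build_script(action: str, repeats: int = 1,
                 browser: str = "Brave Browser",
                 editor: str = "Sublime Text") -> str:
    """Return AppleScript that focuses the browser, sends shift+key
    `repeats` times, and then returns focus to the editor."""
    code = KEY_CODES[action]
    lines = [
        f'tell application "{browser}" to activate',
        "delay 0.2",  # assumed value; gives the browser time to take focus
        'tell application "System Events"',
        f"    repeat {repeats} times",
        f"        key code {code} using shift down",
        "    end repeat",
        "end tell",
        f'tell application "{editor}" to activate',
    ]
    return "\n".join(lines)


def send(action: str, repeats: int = 1) -> None:
    """Run the generated AppleScript (macOS only)."""
    subprocess.run(["osascript", "-e", build_script(action, repeats)], check=True)
```

With this, `send("back", repeats=3)` would mimic the three-second “back” macro and `send("play_stop")` the start/stop one. Binding these to F1, F2, and F3 would still need some hot key tool.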

So that’s where I am right now. A long hard slog getting through these captions, but it has to be done.

Once the English captions are done, we can activate automatic translations to, I think, eight other languages. But I’m pretty sure we won’t edit any of those by hand.

  1. This may be better with a faster machine, though. I’ll get back to that subject later.
