I’m working on a quick 20-minute video for a non-profit group (and I do mean quick: less than two weeks from initial contact to the premiere). Everything has come together pretty nicely and I’ve finished the editing to my satisfaction in Cinelerra. Because some of the interviewees in the video have impaired speech, we’ve decided to add English subtitles to help viewers follow the audio. Since we don’t want to stigmatize the people with speech difficulties, I’m subtitling the whole video. This is fine since it isn’t too long, and it has the added benefit of making the video accessible to the hard of hearing.

Rather than use Cinelerra’s title effect to add the captions, I decided to teach myself how to do it the “proper” way with a separate subtitle file.

Transcribing the Subtitles

Surprisingly, it isn’t completely obvious how and when to place the subtitles in the video. How long should they appear on screen? Should they be synchronized precisely with the video? How do you punctuate people’s verbal stream-of-consciousness ramblings? Do you include the speaker’s pauses and “um”s and so on? What do you do about words you simply cannot decipher? There’s lots of technical info out there on closed captioning, but not much in the way of tips for the beginner who is creating do-it-yourself video. The best synopsis I could find was a quote from this thread about live streaming closed captioning:

The real keys are timing and readability. Captioning is not just typing what the person is saying. You also decide when the caption appears and disappears. Tying a new caption to a shot change or any sort of movement that indicates that a person is about to speak is a good idea. The idea is that the caption itself should work with the rest of the visuals and not be too distracting. Don’t let any captions be too short or too long either. Right around two seconds a piece is a good rule of thumb. As far as readability, be wary of where you break a sentence if a caption is more than one line long. Again, the idea here is that the captions should flow.

Based on these thoughts, plus some common sense, I’ve devised the following informal guidelines for my project:

1) Try to transcribe precisely all of the actual words. Indicate long pauses by creating a new caption, or with ellipses or whatever punctuation seems appropriate. Ignore interjected particles such as “uh” or “um” unless they seem to be essential to the meaning of the utterance.
2) Synchronize the caption to start and end at the same time as the audio, unless the audio is less than two seconds long, in which case let the caption linger for two seconds.
3) Try to break captions on cuts in the video where possible.
4) Watch out for dramatic conclusions to sentences — keep the parts synchronized separately, so you don’t give away the ending in advance.
5) When a speaker is talking at length, use two-line captions as required and try to break on sentence clauses or on pauses in the speech. Shorter is usually better.
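
Guideline 2 is simple arithmetic, so here is a minimal sketch of it as a shell function. The function name caption_end and the idea of passing times as plain seconds are my own assumptions for illustration; the subtitle editors do not work this way internally. It also ignores the case where the extended caption would overlap the next one, which you would still have to check by eye.

```shell
# Hypothetical helper: print the end time (in seconds) for a caption.
# If the audio is shorter than the minimum (default 2 seconds), the caption
# is extended so that it lingers on screen for the minimum duration.
caption_end() {
    awk -v s="$1" -v e="$2" -v min="${3:-2}" 'BEGIN {
        if (e - s < min) e = s + min
        printf "%.3f\n", e
    }'
}

caption_end 10.0 10.8   # audio lasts 0.8 s, so this prints 12.000
caption_end 10.0 14.5   # already long enough, so this prints 14.500
```

The optional third argument lets you try a different minimum, for example caption_end 10.0 10.8 1.5.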

Creating the Subtitle File

The technical process of creating subtitles is fairly easy, if tedious. To transcribe most of the titles, I used the subtitleeditor application (tips here and here and here), which can be easily installed in Ubuntu. Unfortunately, the application sometimes seemed to arbitrarily ignore what I had typed, forcing me to re-enter all or most of many subtitles. Just for this step, I switched to the “Gaupol Subtitle Editor”, which worked much the same way and didn’t seem to have the same bug. Next time, I’d investigate using a spreadsheet and then importing the data into the subtitle editor program. I found that transcribing and synchronizing had to be done as two separate steps anyhow.

For the synchronizing, I found the easiest method was to go back to the first Subtitle Editor app, use it to generate a waveform from the audio track of my video, then use the mouse and keyboard to set the correct start and end points for each subtitle. I set the “play/pause” shortcut to Super (Windows key) plus space, “short” (skip backward) to Super-left arrow, and “short” (skip forward) to Super-right arrow (yes, there are two keyboard shortcuts with the same name, but you can tell which is which by clicking on them in the shortcuts preferences). The default is for left-mouse-click to set the start point for the selected subtitle line, and right-mouse-click to set the end point.  Middle-click restarts the playback at the point in the wave timeline where you clicked.  Since I already had all the words transcribed, I just had to run through the audio track, listening and watching the waveform graph, using the shortcuts to move around, and click to select the right in and out points for each line. I found I had to split some of the subtitles, but that was also easy using the Subtitle Editor tool.  Overall, synchronization went pretty smoothly.

There are many different subtitle formats, but the default format worked fine for me.

Combining the Subtitles with the Video

According to the Cinelerra manual, there are three obvious choices for combining subtitles with your video:

# Distribute it with your video. People will have to load the appropriate subtitle file in their video player to actually see the subtitles.
# Use it with dvdauthor, to add the subtitles in a DVD. Read dvdauthor’s documentation for more information.
# Incrust the subtitles into the video using mencoder.

I used the second method, which seems to be the most flexible option, but also the most complicated.  In my case, I wanted to force the subtitles to appear rather than allowing the user to select whether to play them.  Dvdauthor provides a tool called spumux, which allows you to add a subtitle track to the video.  Then, you use dvdauthor (as you normally would) to build the dvd filesystem.   From there, the process is the same as for burning a non-subtitled video (I’ll write up my own process at some point, but there’s already a very good explanation from Crazed Mule: look here for exporting from Cinelerra and look here for creating a DVD).  You’ll need to create an xml file for spumux and another for dvdauthor.

Here is my subtitle xml file for the first step:

<subpictures>
<stream>
<textsub filename="/home/kevin/Video/video_ykacl/project_xml/en_subtitles.sub" characterset="UTF-8"
fontsize="32.0" font="Arial.ttf"
horizontal-alignment="center"
vertical-alignment="bottom" left-margin="60"
right-margin="60"
top-margin="20" bottom-margin="30" subtitle-fps="29.97"
movie-fps="29.97" movie-width="720" movie-height="480"
force="yes"/>
</stream>
</subpictures>

Note that for spumux to work, you MUST have a copy of, or a symbolic link to, the TrueType font file (in this case, Arial.ttf, but use whatever you like) in the .spumux folder in your home directory. I did it by running this command:

ln -s /usr/share/fonts/truetype/msttcorefonts/Arial.ttf /home/kevin/.spumux/Arial.ttf

The above command assumes you have the msttcorefonts package installed.  Also note the force="yes" option, which tells the DVD player to always display these subtitles.
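
If the .spumux folder does not exist yet, the ln command has nowhere to put the link. A slightly more defensive version of the same step might look like the sketch below. The function name link_spumux_font is made up for this example, and the font path in the sample call is only valid if msttcorefonts is installed in the same location as on my machine.

```shell
# Hypothetical helper: make a TrueType font visible to spumux by linking it
# into ~/.spumux, creating that directory first if it does not exist yet.
link_spumux_font() {
    font_path=$1
    if [ ! -f "$font_path" ]; then
        echo "font not found: $font_path" >&2
        return 1
    fi
    mkdir -p "$HOME/.spumux"
    # -s makes a symbolic link; -f replaces a stale link from an earlier run.
    ln -sf "$font_path" "$HOME/.spumux/$(basename "$font_path")"
}

# Example call (path assumes the msttcorefonts package):
# link_spumux_font /usr/share/fonts/truetype/msttcorefonts/Arial.ttf
```

The file name in .spumux has to match the font attribute in the subtitle xml file, which is why the link keeps the original base name.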

Once you’ve created the xml, you run spumux to combine your subtitle file with your video:

spumux -s0 $VIDEO_PROJECT/project_xml/en_subtitles.xml < $VIDEO_TMP/full_render.mpg > $VIDEO_TMP/full_render_subtitle.mpg

I put this command in a script in which I defined $VIDEO_PROJECT and $VIDEO_TMP. You can substitute whatever paths are appropriate.
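
For illustration, a wrapper script along these lines could look like the sketch below. This is not my actual script: the default values for the two variables are inferred from the paths in the xml files in this post, and the add_subtitles function with its file checks is a hypothetical addition. The spumux call itself is the same as the command above.

```shell
#!/bin/sh
# Sketch of a wrapper script. The two default paths are inferred from the
# xml files in this post; substitute your own.
VIDEO_PROJECT=${VIDEO_PROJECT:-/home/kevin/Video/video_ykacl}
VIDEO_TMP=${VIDEO_TMP:-/home/kevin/Video/video_tmp/video_ykacl_tmp}

# Hypothetical helper: mux the subtitle stream described by an xml file into
# an MPEG file. spumux reads the video on stdin and writes it to stdout.
add_subtitles() {
    xml=$1
    input=$2
    output=$3
    for f in "$xml" "$input"; do
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
    done
    spumux -s0 "$xml" < "$input" > "$output"
}

# Example call, matching the command above:
# add_subtitles "$VIDEO_PROJECT/project_xml/en_subtitles.xml" \
#     "$VIDEO_TMP/full_render.mpg" "$VIDEO_TMP/full_render_subtitle.mpg"
```

Checking for the input files first avoids ending up with an empty output file when a path is mistyped.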

Now you need to create another xml file for dvdauthor to use when building the dvd menu and filesystem (in my case the video is supposed to play automatically, so there is no actual menu):

<dvdauthor>
<vmgm />
<titleset>
<titles>
<subpicture lang="en" />
<pgc>
<pre> subtitle=64; </pre>
<vob file="/home/kevin/Video/video_tmp/video_ykacl_tmp/full_render_subtitle.mpg" />
</pgc>
</titles>
</titleset>
</dvdauthor>

The most important option in this file is the subtitle=64, which tells the dvd player to automatically display subtitle track 0 (see the links below for details on this). Now you just run dvdauthor:

dvdauthor -o dvd/ -x $VIDEO_PROJECT/project_xml/dvdauthor.xml

At this point you should have a working dvd filesystem ready to be burned onto an actual dvd. Before burning, you can test your output by running

vlc dvd:///path/to/dvd/
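
A footnote on the subtitle=64 setting in the dvdauthor xml: as I understand the dvdauthor documentation, subtitle tracks are numbered 0 to 31, and adding 64 to the track number sets the flag that makes the player always display that track. The sketch below just does that arithmetic; the function name forced_subtitle_value is my own invention and is not part of dvdauthor.

```shell
# Hypothetical helper: compute the value to assign to "subtitle" in a <pre>
# block so that the given subtitle track is selected and always displayed.
# The value 64 is the "display" flag; the track number (0-31) is added to it.
forced_subtitle_value() {
    track=$1
    case $track in
        ''|*[!0-9]*)
            echo "track must be a number from 0 to 31" >&2
            return 1
            ;;
    esac
    if [ "$track" -gt 31 ]; then
        echo "track must be a number from 0 to 31" >&2
        return 1
    fi
    echo $(( 64 + track ))
}

forced_subtitle_value 0   # prints 64, the value used in the xml above
forced_subtitle_value 1   # prints 65, for a second subtitle track
```

So if you had a second subtitle track and wanted that one forced instead, the pre command would presumably become subtitle=65.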

More Information About Subtitles

Explanation of how to add subtitles in dvdauthor is here.

Useful details on spumux are here.

Good examples here.