In addition to captions, audio and video content should have transcripts. A transcript is a text version of the content that someone can read along with. Transcripts should be formatted for reading, with an indication of who is speaking. For example:
Jeff Dubrow
I'm Jeff Dubrow in my normal voice, but in my umpire voice I am [yells] Jeff Dubrow!
You may have seen a transcript formatted like the example below. This is a file formatted for creating video captions; the timestamp tells the application where in the video to place the text.
1
00:00:00,690 --> 00:00:08,700
I'm Jeff Dubrow in my normal voice, but in my umpire voice I am JEFF DUBROW!
When we provide transcripts for users, they should be formatted the first way, with speaker names and no timestamps, so they are easy to read.
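If the only text available is a caption file like the one above, a short script can strip the cue numbers and timestamps as a starting point for a readable transcript. The following is a minimal sketch in Python, not a complete caption parser. It assumes each cue is a number line, a timing line, and then the caption text, with cues separated by blank lines; the function name srt_to_text is hypothetical.

```python
import re

# Matches a caption timing line such as "00:00:00,690 --> 00:00:08,700".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt: str) -> str:
    """Return only the caption text, one paragraph per cue.

    Assumes cues are separated by blank lines and that each cue starts
    with a number line followed by a timing line (an assumption about
    the input, not a full implementation of any caption format).
    """
    paragraphs = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Drop the cue number, if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line, if present.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        if lines:
            paragraphs.append(" ".join(lines))
    return "\n\n".join(paragraphs)


caption = """1
00:00:00,690 --> 00:00:08,700
I'm Jeff Dubrow in my normal voice, but in my umpire voice I am JEFF DUBROW!
"""

print(srt_to_text(caption))
```

The script only removes the timing information. Speaker names and descriptions such as [yells] still have to be added by a person, because the caption file does not contain them.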