Transcribing and Captioning Video and Audio Files

National Centers and other vendors are required to transcribe and caption their own multimedia, including webinars and audio conferences, for use on the Early Childhood Learning and Knowledge Center (ECLKC). An accurate transcript and captioning file must be created for each video. Audio-only presentations do not require closed captioning.

Review examples of, and suggestions for, addressing various issues frequently flagged during content and technical reviews by the Head Start Information and Communications Center (HSICC) team.

Transcribing the Audio

Identify Speakers

Identify all speakers using their full name in the first instance and just the first name after that, followed by a colon (:). If the speaker's name is not provided, use clear descriptions. Such descriptions may include: Teacher, Girl, Father, Narrator, and Moderator. Remember to label speakers each and every time they speak.

If there are multiple speakers with the same description, use numerals to differentiate them. Do not include the pound/number sign (#) or the abbreviation for number (No.) in speaker labels. When referring to the Director of the Office of Head Start, use the current incumbent’s last name and the appropriate prefix (e.g., Dr., Mr., Mrs.).

DO NOT:
BRANDY BLACK THACKER: Thank you, Kiersten.
DO:
Brandi Black Thacker: Thank you, Kiersten.

DO NOT:
[Marco Beltran] And welcome, everyone.
DO:
Marco Beltran: And welcome, everyone.

DO NOT:
Bernadine: Hi, everyone.
DO:
Dr. Futrell: Hi, everyone.

DO NOT:
Teacher #2: This is a great resource.
Teacher No. 3: Yes, I agree.
DO:
Teacher 2: This is a great resource.
Teacher 3: Yes, I agree.

DO NOT:
Adriana Bernal: We are so happy to have you here.
Roselia Ramirez: Yes, this is very exciting.
– Next caption frame –
Adriana Bernal: To begin, we want to hear from you.
DO:
Adriana Bernal: We are so happy to have you here.
Roselia Ramirez: Yes, this is very exciting.
– Next caption frame –
Adriana: To begin, we want to hear from you.
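The speaker-label rules above lend themselves to a quick automated check before a transcript goes to review. The following is a minimal Python sketch, not an HSICC tool: the function name `check_speaker_label` is hypothetical, and the all-caps and bracket checks are inferred from the DO NOT examples rather than stated as rules in the guidance.

```python
import re


def check_speaker_label(label: str) -> list[str]:
    """Return the problems found in a speaker label (the text before the colon)."""
    problems = []
    if "#" in label:
        problems.append("contains '#'")
    if re.search(r"\bNo\.", label):
        problems.append("contains 'No.'")
    # Inferred from the examples: names are not wrapped in square brackets.
    if label.startswith("[") or label.endswith("]"):
        problems.append("label is in brackets")
    # Inferred from the examples: names are not written in all capitals.
    letters = [c for c in label if c.isalpha()]
    if len(letters) > 1 and all(c.isupper() for c in letters):
        problems.append("label is all caps")
    return problems


for label in ["Teacher #2", "Teacher No. 3", "Teacher 2", "Dr. Futrell"]:
    print(label, "->", check_speaker_label(label))
```

An empty list means the label passed these checks. A check like this cannot tell whether a full name was used in the first instance, so that still needs a human read.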

Ensure Accuracy and Avoid Interruptions

The transcript and captions must match the audio/video as closely as possible to align with Section 508 (federal accessibility) standards.

  • Use [Inaudible] if the speaker cannot be heard clearly.
  • Do not include stuttered words or verbal pauses at the beginning of sentences (e.g., “So, …”, “And …”).
  • Do not include brief interruptions to another speaker’s dialogue. This may include someone saying, “um,” “right,” “OK,” or “mm hmm,” in response to or over what someone else is saying.
  • Use ellipses (…) for breaks or pauses at the end of a sentence.
  • Use an “en” dash (–) between repetitious words and phrases that cannot be eliminated. Do not use double dashes (--).
DO NOT:
We'd like to, um - we need to orient families.
DO:
We'd like to – We need to orient families.

DO NOT:
It was as if. [end of speaking]
DO:
It was as if …

DO NOT:
I'd like to – to welcome you.
DO:
I'd like to welcome you.

DO NOT:
Staff took pictures with different emotions --
smiling, frowning, or frustrated.
DO:
Staff took pictures with different emotions –
smiling, frowning, or frustrated.

DO NOT:
Adia: If you go onto the ECLKC, there is a place where you can go.
Ann: Right.
Adia: It’s the What’s New page.
DO:
Adia: If you go onto the ECLKC, there is a place where you can go. It’s the What’s New page.

DO NOT:
Blair: And so, where can we find them?
Melanie: So, it really depends.
DO:
Blair: Where can we find them?
Melanie: It really depends.

DO NOT:
Wendy: We were gonna go there.
DO:
Wendy: We were going to go there.

Non-spoken Content

To capture unspoken content and sound effects such as music, applause, or laughter, use square brackets. Brackets are also used to show any inaudible or indecipherable spoken content and to denote video clips within a video. Do not include in brackets or transcribe any slides or other visuals that are displayed on screen. Capitalize the first character in a bracket. All other characters and words should be lowercase.

DO NOT:
[Clears Throat]
DO:
[Clears throat]

DO NOT:
[humming]
DO:
[Humming]

DO NOT:
[LAUGHTER]
DO:
[Laughter]

Example: A speaker named Paul throws his chalk, then says, “Let’s take a look at this function.”
Incorrect: Paul: (throws chalk) Let’s take a look at this function.
Correct: Paul: [Throws chalk] Let’s take a look at this function.

If there is a video within the video, use [Video begins] to mark the beginning of the video and [Video ends] to mark the end of it. These bracketed phrases must appear on their own line.
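The bracket capitalization rule can be applied mechanically. Here is a minimal Python sketch, assuming bracketed cues never nest. The names `normalize_brackets` and `KEEP_CAPITALIZED` are hypothetical, and the exception list for language names is an assumption added so that labels such as [Speaking Spanish] keep their proper-noun capital.

```python
import re

# Assumption: language names keep their capital letter inside brackets.
KEEP_CAPITALIZED = {"english", "spanish", "french"}


def _normalize_one(match: re.Match) -> str:
    """Capitalize the first word of one bracketed cue and lowercase the rest."""
    words = match.group(1).strip().lower().split()
    words = [w.capitalize() if w in KEEP_CAPITALIZED else w for w in words]
    if words:
        words[0] = words[0].capitalize()
    return "[" + " ".join(words) + "]"


def normalize_brackets(text: str) -> str:
    """Apply the capitalization rule to every [bracketed] cue in the text."""
    return re.sub(r"\[([^\[\]]*)\]", _normalize_one, text)


print(normalize_brackets("[LAUGHTER] Paul: [Throws Chalk] Let's take a look."))
```

Text outside the brackets is left untouched. Parenthetical cues such as (throws chalk) are not converted, because parentheses can also be legitimate punctuation in the spoken text.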

Punctuation and Formatting

Follow these punctuation and formatting tips:

  • Use quotation marks for titles, not italics.
  • Transcripts should be split into short to medium paragraphs (4 to 8 lines).
  • Delete double spaces between words and sentences.

Creating a Caption File

Break Up the Transcript

Once the transcript has been reviewed and approved, break it up into captions that will appear on screen. In the file, a captioned phrase:

  • Is no more than 80 characters, including spaces (up to 40 characters per line)
  • Appears on up to two lines within a single timecode

It is important to follow this guidance because phrases that exceed the limit will not render properly on screen. Follow these additional captioning tips:

  • Do not start a new sentence near the end of a caption frame.
  • Do not include two or more speakers in a single frame.
  • Do not separate compound words or phrases in different frames (e.g., Head Start, Early Head Start, Dr. Smith, family-centered services, the director).
  • Delete spaces at the beginning or end of each line to remove the extra character.
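The numeric limits above are easy to verify with a short script before a caption file is submitted. The following Python sketch is illustrative only: `frame_problems` is a hypothetical helper, and it assumes each caption frame is available as a string with a newline between its lines and that the line break itself does not count toward the 80 characters.

```python
MAX_LINES = 2
MAX_LINE_CHARS = 40
MAX_FRAME_CHARS = 80


def frame_problems(frame: str) -> list[str]:
    """Return the limit violations found in one caption frame."""
    lines = frame.split("\n")
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines (limit {MAX_LINES})")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_CHARS:
            problems.append(
                f"line {number} has {len(line)} characters (limit {MAX_LINE_CHARS})"
            )
        if line != line.strip():
            problems.append(f"line {number} has leading or trailing spaces")
    total = sum(len(line) for line in lines)
    if total > MAX_FRAME_CHARS:
        problems.append(f"frame has {total} characters (limit {MAX_FRAME_CHARS})")
    return problems


print(frame_problems("That's a great success for a program\nthat was conceived of during"))
print(frame_problems("Hello there "))
```

An empty list means the frame is within the limits. The script does not check the editorial tips, such as keeping compound phrases together or avoiding a new sentence near the end of a frame.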

Make the 80 Characters Reader-Friendly

The 80-character phrases should be broken into two lines that will appear simultaneously in a single frame. Insert hard returns where the text should break in the frame rather than relying on the natural break, which may be awkward (e.g., long top line and only one or two words on the second line). In a two-line caption, each line may be up to 40 characters long, including spaces.

DO NOT:
That's a great success for a program that was conceived of during
DO:
That's a great success for a program
that was conceived of during

DO NOT:
The higher education community is also a great partner. We have
– Next caption frame –
provided on-campus Early Head Start services at our local college;
DO:
The higher education
community is also a great partner.
– Next caption frame –
We have provided on-campus Early
Head Start services at our local college;
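A first-pass line break can be suggested automatically and then adjusted by hand. The Python sketch below is one possible heuristic, not a prescribed method: the hypothetical `split_caption` function picks the word boundary that makes the two lines closest in length, with ties going to the later break. It knows nothing about grammar or compound phrases, so the result still needs an editor's review.

```python
def split_caption(phrase: str, max_line: int = 40) -> list[str]:
    """Split a phrase into one or two lines, choosing the most balanced break."""
    words = phrase.split()
    text = " ".join(words)  # also collapses double spaces
    if len(text) <= max_line:
        return [text]
    best = None
    for i in range(1, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        if len(top) <= max_line and len(bottom) <= max_line:
            gap = abs(len(top) - len(bottom))
            # "<=" lets a later break win when two breaks are equally balanced.
            if best is None or gap <= best[0]:
                best = (gap, [top, bottom])
    if best is None:
        raise ValueError(f"phrase does not fit on two lines of {max_line} characters")
    return best[1]


for line in split_caption(
    "That's a great success for a program that was conceived of during"
):
    print(line)
```

A `ValueError` signals that the phrase is too long for a single frame and has to be divided across frames first.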

Insert Time Codes

The first word of each captioned phrase must be synced to the audio as closely as possible, and no more than half a second off. Captions need to appear as the words are spoken so that hearing viewers who also use the closed captioning function are not confused.

Time each caption to stay on the screen for a minimum of 1.5 seconds so that it meets Section 508 criteria. Captions cannot be followed if they flash across the screen too quickly.
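The minimum display time can be checked from the timecodes alone. Here is a minimal Python sketch, assuming the start and end times of each caption have already been converted to whole milliseconds; the function name `short_captions` is hypothetical. Sync accuracy against the audio cannot be verified this way and still requires watching the video.

```python
MIN_DURATION_MS = 1500  # 1.5 seconds


def short_captions(cues: list[tuple[int, int]]) -> list[int]:
    """Return the 1-based positions of captions shown for less than 1.5 seconds."""
    return [
        position
        for position, (start_ms, end_ms) in enumerate(cues, start=1)
        if end_ms - start_ms < MIN_DURATION_MS
    ]


# Each pair is (start, end) in milliseconds.
cues = [(0, 2000), (2000, 3000), (3000, 4500)]
print(short_captions(cues))
```

In this sample the second caption is on screen for only one second, so it is the one reported.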

Spanish Transcripts and Captions

For captioning in Spanish, follow the tips above as well as the guidelines in “Transcribing and Captioning Video and Audio Files in Spanish.”

Embedded Audio/Video in Another Language

Sometimes, recorded presentations include an embedded video or audio element in another language. In these cases, please follow the guidelines below.

Presented in English with Embedded Spanish

When a presentation is facilitated in English (e.g., BabyTalks), the embedded Spanish-language audio or video needs to be translated into English and included in the transcript for captioning with the clarification in square brackets “[Speaking Spanish]”. For example:

“[Video begins]

Teacher: [Speaking Spanish] When the ball comes to you, it's your turn to show us the funniest face you can make.

[Video ends]”

Presented in Spanish with Embedded English

When a presentation is facilitated in Spanish (e.g., Conexiones), the embedded English-language audio or video needs to be translated into Spanish and included in the transcript for captioning with the clarification in square brackets “[En inglés]”. For example:

“[Inicio del video]

Docente: [En inglés] Podemos hacerle un lugar a él también para que juegue con nosotros.

[Fin del video]”

Embedded Languages Other Than English and Spanish

For any unidentified language, the clarification in square brackets should be: 

  • English – “Teacher: [Speaking a foreign language]”
  • Spanish – “Docente: [En idioma extranjero]” 

If the language is clearly identified in the video or audio, use the name of the language in the square brackets; for example:

“Teacher: [Speaking French]”