All about transcripts

June 13, 2022

Transcripts are similar to captions, but they extend beyond the spoken words. Transcripts convey important sound effects and visual descriptions to people who:

Are hard of hearing
Are deaf
Are deafblind

For deafblind users, transcripts are the only way to access this type of content.

Transcripts should:

Identify who is speaking
Identify sounds such as applause and laughter
Describe any visual information

How to present transcripts

Transcripts are usually presented on the same webpage as the media content, directly above or below the video. Alternatively, they can be placed on a separate page or in a dialog box, with a link to that location near the video.

When are transcripts needed

While transcripts are strictly required only for prerecorded audio-only content, providing them for both audio-only and video-only content benefits accessibility.

Prerecorded multimedia (video plus audio) should have a text transcript

This is important for deafblind users, who can access the information through a refreshable braille device.

Prerecorded audio-only must be accompanied by an easily reachable transcript

This transcript should provide any dialog, narration, and significant sounds or descriptions that occur in the audio. You must also ensure that the transcript is easy to reach by placing it directly under or very close to the audio player. A transcript that is hard to find or access provides little benefit.

Prerecorded video-only must include audio descriptions or a text transcript

These should include all the text, graphics, facial expressions, actions, and any other significant visual information. Transcripts are the better alternative because they are the only way deafblind users can access your content. Audio descriptions provide this information to users who are blind but can hear, but they do not benefit users who are deafblind.

A text transcript describing the visual aspects of the video should be provided for video-only content

This transcript should function as a long description of the video-only content. It should also appear near (above or below) the video-only content and be easy to find and access.

Methods of presenting transcripts

Place directly on the page with the audio / video player
- This allows easy access
- Also helps SEO since search results will lead users directly to your page
Provide a link to the transcript
- Helps not clutter the page and gives the users an option to see the transcript if needed
- Put the link either next to or directly below the video / audio player for your users to access
Provide an interactive transcript that allows users to jump to specific places within the video / audio content

Interactive transcripts

Interactive transcripts allow users to search a video and navigate anywhere in it by selecting a sentence or word in the transcript. This is also great for SEO because it allows search engines to crawl the transcript's text.
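As a rough illustration of the search-and-navigate idea, here is a minimal Python sketch. It assumes the transcript is stored as a list of timed cues; the Cue class, the find_cues helper, and the sample text are hypothetical names invented for this example, not part of any particular player.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One transcript segment (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the media
    text: str


def find_cues(cues, query):
    """Return every cue whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [cue for cue in cues if needle in cue.text.casefold()]


cues = [
    Cue(0.0, "Welcome to the show."),
    Cue(4.5, "Today we are talking about transcripts."),
    Cue(9.0, "Transcripts help deafblind users."),
]

# A player could seek to any of these start times when the user selects a match.
matches = find_cues(cues, "transcripts")
seek_times = [cue.start for cue in matches]
```

In a real page the seek time would be handed to the media player; that wiring depends on the player and is left out here.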

What to include in a transcript

Transcripts must be verbatim for scripted content

If audible content is based on a script, the transcript must present the script verbatim, including any ums and uhs that are intentionally part of the script.

Transcripts should be verbatim for unscripted or live content, with optional exceptions for stuttering or filler words

This applies to broadcasts, interviews, and any other unscripted content. If there are so many ums and uhs that they make the transcript hard to read, you can trim back these utterances to improve readability.
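If you clean up unscripted transcripts in bulk, the trimming can be sketched in Python as below. This is only one possible approach: the FILLERS pattern and the trim_fillers name are assumptions made for this example, and the pattern only covers "um" and "uh". It should not be applied to scripted content, which must stay verbatim.

```python
import re

# Matches "um"/"uh" (including stretched forms like "umm"), plus an
# optional trailing comma or period and any following whitespace.
FILLERS = re.compile(r"\b(?:um+|uh+)\b[,.]?\s*", re.IGNORECASE)


def trim_fillers(text):
    """Remove filler utterances from an unscripted transcript line."""
    cleaned = FILLERS.sub("", text)
    # Collapse any double spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


line = trim_fillers("Um, I think, uh, we should start.")
```

Always review the result by hand, since removing a filler at the start of a sentence can leave the next word without a capital letter.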

Important visual events must be described in the transcript

It's very important to describe any visual information on screen that contributes to the meaning or understanding of what is being presented. This ensures that all your users have access to the same content.

Important background sounds must be conveyed in transcripts (preferably in brackets or parentheses)

Important sounds could be:

Background sounds
Sound effects
Background music
A person's tone
Any other sounds that contribute to setting the mood and context of the video and audio being presented

Speech that is spoken off-screen must be captured in the transcript

Off-screen speech can be indicated by italicizing the spoken text, but since deafblind users cannot see the italics, you can add descriptive wording before the text:

Ex. Through the telephone: 'Hello'

The transcript must identify the person speaking

The best way to do this is to use a label for the speaker. This can be a name or a role in all caps followed by a colon, then the spoken text. Uppercase formatting is generally reserved for descriptive wording and labels. Mixed-case formatting should be reserved for the actual dialog or narration.
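The labeling convention above can be sketched in a few lines of Python. The format_line helper and the sample speaker are hypothetical, written only to show the pattern of an uppercase label, a colon, and mixed-case speech.

```python
def format_line(speaker, speech):
    """Build one transcript line: uppercase speaker label, colon, speech."""
    return f"{speaker.strip().upper()}: {speech.strip()}"


line = format_line("Interviewer", "Thanks for joining us today.")
```

The speech itself is left untouched so that its original mixed-case formatting is preserved.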

Transcripts should use punctuation to convey emphasis whenever possible, rather than writing extra text to explain the emphasis

This helps keep transcripts clean and readable. Sometimes punctuation is not enough, in which case you may need to use descriptive wording.

Transcripts must not reveal intentionally withheld information before its appropriate time

If information is intentionally withheld in the content, the transcript must not reveal it sooner than the content does. Revealing it early in the transcript can spoil the experience for your users.

Music should be identified by title and artist whenever possible in the transcripts, unless doing so would be inappropriate in the content

Any music that is a part of the action or has some significance should be identified. You can use a label of MUSIC followed by the title in quotation marks and the artist's name (if it is known).
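One way to keep music labels consistent is a small helper like the Python sketch below. The music_label name, the square brackets, and the word "by" before the artist are assumptions made for this example; the title and artist shown are placeholders.

```python
def music_label(title, artist=None):
    """Build a MUSIC label with a quoted title and, if known, the artist."""
    label = f'MUSIC: "{title}"'
    if artist:
        # Only add the artist when it is actually known.
        label += f" by {artist}"
    return f"[{label}]"


with_artist = music_label("Example Song", "Example Artist")
title_only = music_label("Example Song")
```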

Important music lyrics should be included in transcripts if they are relevant to the meaning of the content

In captions you can use music notes to differentiate singing from spoken words, but in a text transcript you must provide a label to indicate that the lyrics are being sung. You can do this by coding the music notes in addition to a text label. It is best, though, to provide a text label within brackets or parentheses to identify singing.

Ex. [Singing]: O say can you see

Transcripts should indicate when speech is whispered or mouthed

This helps keep your audience engaged in what is taking place. You can do this by providing a descriptive label in all caps that can be used with either brackets or parentheses.

When speech is inaudible or difficult to perceive clearly, the transcript should say so using neutral language

Avoid terms such as "unintelligible" or "incoherent babbling". Instead, use neutral language such as "unclear" in brackets or parentheses.

Strong language should be retained and not edited out of transcripts whenever possible, or should be bleeped or muted to match the style of the content

If strong language is bleeped out, replace the word with BLEEP. If the word is partially muted, you can use dashes or ellipses between its first and last characters.
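The two masking styles can be sketched in Python as follows. The function names and the choice of one dash per hidden character are assumptions for this example; a mild word is used as the sample input.

```python
def bleep(word):
    """Replace a fully bleeped word with the BLEEP label."""
    return "BLEEP"


def partial_mute(word, filler="-"):
    """Keep the first and last characters and mask everything between."""
    if len(word) <= 2:
        # Nothing sits between the first and last characters to mask.
        return word
    return word[0] + filler * (len(word) - 2) + word[-1]


bleeped = bleep("darn")
muted = partial_mute("darn")
```

Whichever style you use, match what the audience actually hears: use BLEEP only where the audio is bleeped, and partial masking only where the audio is partially muted.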