About Captioning

Captioning involves a combination of:

  • Speech transcription, e.g. ">> Let me begin by acknowledging..."; and
  • Intersemiotic translation of non-speech audio information into text, e.g. [Church bells chime in the distance].

Captions v. Subtitles

Captions and subtitles are easy to confuse because the words are sometimes used interchangeably and because they both typically appear at the bottom of video content.

Captions are a text representation of spoken words and audio information in the same language, e.g. words spoken in English and written in English. They are designed to assist people who are deaf or hard of hearing, people listening in an environment where they can't access audio, and people accessing audio content that is not in their native language.

Subtitles are a text representation of spoken words in one language that have been translated into another, e.g. words spoken in Spanish and written in English. They are designed to assist people who do not speak the source language, or who have it as a second language.

Presentation

Captions are often overlaid at the bottom of video content, so that users can watch the video and glance down to clarify words that they are unsure about.

But captions can also be presented in a number of other ways. Whilst a presenter speaks in front of a group, an audience member might access captions on a projector or on their laptop.

Prerecorded video v. Live video

Where video content is created prior to publication, the most effective way of providing captions is to produce a caption file before publishing the video. There are two types of captions: open captions, which are burnt into the video itself, and closed captions, which are stored in a separate, time-coded text file that the player renders over the video.
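As a sketch of what a closed caption file looks like, here is a short example in the WebVTT format (the cue text and timings below are illustrative, reusing the examples from earlier in this page):

```
WEBVTT

00:00:01.000 --> 00:00:04.000
>> Let me begin by acknowledging...

00:00:05.500 --> 00:00:08.000
[Church bells chime in the distance]
```

Each cue pairs a start and end time with the text to display, which is how players keep captions synchronised with speech.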

The benefits of closed captioning are:

  • The captions are synchronized to speech, making it much easier for users to follow
  • Because closed captions are in a separate file, they are more readable and their appearance can be customised by end users
  • No additional work is required on the day of the event
  • Captions are regularly used by all users, not just those who are hard of hearing or deaf
  • Captions can be indexed by search engines
  • Caption files are supported by major platforms, including Facebook, Twitter, YouTube and Vimeo

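The "separate file" benefit is what makes end-user customisation possible. On the web, for example, a closed caption file can be attached to a video with the HTML track element; a minimal sketch, using placeholder file names:

```html
<!-- "lecture.mp4" and "lecture-en.vtt" are placeholder file names -->
<video controls>
  <source src="lecture.mp4" type="video/mp4">
  <!-- kind="captions" signals that the text includes non-speech audio information -->
  <track kind="captions" src="lecture-en.vtt" srclang="en" label="English" default>
</video>
```

Because the captions live in a separate file rather than being burnt into the picture, the player can let users toggle them on and off and adjust their size and appearance.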
In the case of live content, captions can either be created by a human transcriber listening from a separate location (live remote captioning) or by a computer converting audio information into text (auto-captioning).

Human captioning v. Auto-captioning

People who are deaf or hard of hearing usually prefer captions created by humans because they are about 99% accurate. Captions created by computers are about 85-95% accurate. The problem is that the words a computer gets wrong, such as domain-specific language, are exactly the words that people who are hard of hearing want to clarify. In addition, once users know that some words are inaccurate, they are left guessing which computer-generated words are incorrect. That increases cognitive load and reduces efficacy. As a result, hard of hearing students often choose to turn off videos with auto-captions.

Despite the drawbacks of auto-captions, there are contexts in which they are preferred by people who are hard of hearing, such as:

  • Where there isn't sufficient time to arrange captioning by a human, e.g. a hastily organised office meeting
  • Where people are having confidential discussions and do not feel comfortable with a third person listening in, e.g. a meeting with a counsellor

Types of Captioning Solutions

There is no single captioning solution that will suit everyone.

The content, context and user will affect the choice of captioning solution.

  • Content: What sort of content is being presented? Is it a pre-recorded video, a live webinar or an external learning resource?
  • Context: How and where is the content being provided? Is the content being delivered face to face, in a lecture or tutorial? Is it being delivered online as part of a subject? Is it a staff meeting? Is it a public event?
  • User: Who will be accessing the content? Are they prospective students, current students, alumni, staff or the wider community?

The Future of Captioning

Altered study and working arrangements during the Covid-19 pandemic have highlighted the need for captioning. An alternative perspective is that there has always been a need for captioning, but it has simply been ignored. Regardless of which view you take, the increased demand for captioning is here to stay.

Contact Us

For assistance or to report accessibility problems please contact:

Andrew Normand
Web Accessibility Lead
Email: anormand@unimelb.edu.au
Phone: +61 3 9035 4867