Overview

The Speech Test Videos corpus consists of recordings of common speech test materials (see table below). The recordings include multiple talkers and repetitions, with utterance length and context ranging from isolated vowels and syllables to highly predictable sentences. Brief test utterances embedded in the carrier phrase (“You will mark ____ please.”) have been marked for easy extraction of the audio. The recorded audio and video signals have been carefully synchronized. Most sets of materials were recorded with two male and two female talkers; the two number sets were recorded with one male and one female talker.

The table below summarizes each group of recordings; each group is described in more detail in its own section.

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
CV syllables | yes | 2M, 2F | 3 | 1320 (330 per talker)
VC syllables | yes | 2M, 2F | 3 | 1200 (300 per talker)
hVd syllables | yes | 2M, 2F | 3 | 180 (45 per talker)
MRT words | yes | 2M, 2F | 3 | 3276 (819 per talker)
Numbers 0-10 | yes | 1M, 1F | 3 | 66 (33 per talker)
Numbers 0-99 | no | 1M, 1F | 3 | 600 (300 per talker)
High-Probability Spin Sentences | no | 2M, 2F | 1 | 200 (200 per talker)
Nonsense Sentences | no | 2M, 2F | 1 | 200 (200 per talker)
Structured Sentences | no | 2M, 2F | 1 | 2000 (500 per talker)

File Formats

 

All materials are available as:

(i) videos — with full HD 1920×1080 pixel resolution, stored in .mov files with H.264 video encoding and a single channel of uncompressed 24-bit audio;
and
(ii) audio-only — single-channel, 24-bit .wav files.

The .mov files for the test items spoken in the carrier phrase include chapter markers indicating the start and end times of the item within the phrase. MATLAB .mat files are also provided that contain the keyword start and stop times for each file.
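As a rough illustration of how the keyword start and stop times might be used, the Python sketch below cuts the marked item out of a carrier-phrase recording. The function name, the sampling rate, and the example times are hypothetical. The layout of the .mat files is not described here, so the times are passed in directly rather than read from a file; in practice they could be loaded with a MATLAB-file reader such as scipy.io.loadmat once the variable names are known.

```python
def extract_keyword(samples, sample_rate, start_s, stop_s):
    """Return the samples between start_s and stop_s (both in seconds)."""
    if not 0 <= start_s < stop_s:
        raise ValueError("need 0 <= start_s < stop_s")
    first = round(start_s * sample_rate)
    # Clamp the end index so a stop time past the end of the file is harmless.
    last = min(round(stop_s * sample_rate), len(samples))
    return samples[first:last]


# Hypothetical example: a 2-second signal at 48 kHz with the item
# marked from 0.75 s to 1.25 s (values invented for illustration).
sample_rate = 48_000
signal = [0.0] * (2 * sample_rate)
keyword = extract_keyword(signal, sample_rate, 0.75, 1.25)
print(len(keyword) / sample_rate)  # duration of the extracted item in seconds
```

The same start and stop times apply to the audio track of the .mov file and to the corresponding .wav file only if both share a time origin, which is assumed here.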

All signal levels have been normalized to an RMS of -25 dB re Full Scale; .mat files provide the original RMS levels.
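The normalization can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the corpus: it assumes that full scale corresponds to a sample amplitude of 1.0 and that the level is computed as 20·log10 of the RMS value, and the function names and the test tone are hypothetical.

```python
import math

TARGET_DB = -25.0  # target RMS level in dB re full scale, as stated above


def rms_db(samples):
    """RMS level in dB, assuming full scale is an amplitude of 1.0."""
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("signal is silent; its level is undefined")
    return 10.0 * math.log10(mean_square)  # same as 20*log10(rms)


def normalize(samples, target_db=TARGET_DB):
    """Scale samples so that their RMS level equals target_db."""
    gain = 10.0 ** ((target_db - rms_db(samples)) / 20.0)
    return [x * gain for x in samples]


# Hypothetical example: one second of a 440 Hz tone, peak amplitude 0.5, at 48 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 48_000) for n in range(48_000)]
original_level = rms_db(tone)  # the kind of value a .mat file might record
normalized = normalize(tone)
print(round(original_level, 2), round(rms_db(normalized), 2))
```

Keeping the original level alongside the normalized signal makes the scaling reversible: applying a gain of (original level + 25) dB restores the recorded level.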

CV Syllables

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
CV syllables | yes | 2M, 2F | 3 | 1320

Each of 22 initial consonants was combined with each of five vowels to form 110 CV syllables.

Initial Consonants

/ p, t, tʃ, k, f, θ, s, ʃ, b, d, ʤ, g, v, ð, z, ʒ, m, n, l, r, w, j /

Vowels

/ ɑ, eɪ, i, o, u /
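The item counts follow directly from these two lists. The short Python sketch below enumerates the consonant-vowel combinations and reproduces the totals given in the table; the variable names are illustrative, and the consonants are written with standard IPA symbols.

```python
from itertools import product

# The 22 initial consonants and five vowels, in standard IPA symbols.
CONSONANTS = ["p", "t", "tʃ", "k", "f", "θ", "s", "ʃ", "b", "d", "ʤ",
              "g", "v", "ð", "z", "ʒ", "m", "n", "l", "r", "w", "j"]
VOWELS = ["ɑ", "eɪ", "i", "o", "u"]

REPETITIONS = 3  # repetitions per talker
TALKERS = 4      # two male, two female

# Every consonant paired with every vowel.
syllables = [c + v for c, v in product(CONSONANTS, VOWELS)]

per_talker = len(syllables) * REPETITIONS
total = per_talker * TALKERS
print(len(syllables), per_talker, total)  # 110 330 1320
```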

Sample

M1

VC Syllables

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
VC syllables | yes | 2M, 2F | 3 | 1200

Twenty final consonants were combined with the same five vowels as used with CVs to form 100 VC syllables.

Final Consonants

/ p, t, tʃ, k, f, θ, s, ʃ, b, d, ʤ, g, v, ð, z, ʒ, m, n, l, r /

Vowels

/ ɑ, eɪ, i, o, u /

Sample

F1

hVd Syllables

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
hVd syllables | yes | 2M, 2F | 3 | 180

Fifteen vowels were spoken in hVd context in the carrier phrase.

Vowels

/ i, ɪ, e, æ, ɑ, ɔ, o, ʊ, u, ɜr, eɪ, ʌ, aɪ, ɔɪ, aʊ /

Sample

M2

MRT Words

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
MRT words | yes | 2M, 2F | 3 | 3276

The words of the modified rhyme test (House et al., 1965) were recorded in the carrier phrase.

went sent bent dent tent rent
hold cold told fold sold gold
pat pad pan path pack pass
kit bit fit hit wit sit
must bust gust rust dust just
teak team teal teach tear tease
din dill dim dig dip did
bed led fed red wed shed
pin sin tin fin din win
dug dung duck dud dub dun
sum sun sung sup sup sud
seep seen seethe seek seem seed
not tot got pot hot lot
vest test rest best west nest
pig pull pin pip pit pick
back bath bad bass bat ban
way may say pay day gay
pig big dig wig rig fig
pale pace page pane pay pave
cane case cape cake came cave
shop mop cop top hop pop
coil oil soil toil boil foil
tan tang tap tack tam tab
fit fib fizz fill fig fin
same name game tame came fame
peel reel feel eel keel heel
hark dark mark bark park lark
heave hear heat heal heap heath
cup cut cud cuff cuss cub
thaw law raw paw jaw saw
pig big dig wig rig fig
pen hen men then den ten
puff puck pub pus pup pun
bean beach beat beak bead beam
heat neat feat seat meat beat
dip sip hip tip lip rip
kill kin kit kick king kid
hang sang bang rang fang gang
took cook look hook shook book
mass math map mat man mad
ray raze rate rave rake race
save same sale sane sake safe
fill kill will hill till bill
sill sick sip sing sit sin
bale gale sale tale pale male
wick sick kick lick pick tick
peace peas peak peach peat peal
bun bus but bug buck buff
sag sat sass sack sad sap
fun sun bun gun run nun

Sample

F2

Numbers 0-10

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
Numbers 0-10 | yes | 1M, 1F | 3 | 66

The eleven numbers (“zero”, “one”, . . ., “ten”) were spoken by two talkers in the carrier phrase.

Sample

F1

Numbers 0-99

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
Numbers 0-99 | no | 1M, 1F | 3 | 600

The one hundred numbers (“zero”, “one”, . . ., “ninety-nine”) were spoken by two talkers with no carrier phrase.

Sample

M1

High-Probability Spin Sentences

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
High-Probability Spin Sentences | no | 2M, 2F | 1 | 200

This set comprises 200 high-context SPIN sentences (Bilger et al., 1984); for example, “Hold the baby on your lap.”

Sample

M2

Nonsense Sentences

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
Nonsense Sentences | no | 2M, 2F | 1 | 200

This set comprises 200 syntactically correct but semantically meaningless sentences; for example, “He was fought by the highway on your lap.”

These sentences were constructed so that the final word in the ith sentence is the same as that in the ith High-Probability Spin Sentence.

Sample

F2

Structured Sentences

Test Material | Carrier Phrase | Talkers | Repetitions | Number of Items
Structured Sentences | no | 2M, 2F | 1 | 2000

Two thousand unique five-word, syntactically constrained sentences were constructed by selecting randomly from the ten choices for each word position given below. The sentences were divided into four sets of 500, and each of the four talkers spoke one set.

Peter | got | three | large | desks
Kathy | sees | nine | small | chairs
Lucy | bought | seven | old | tables
Allen | gives | eight | dark | toys
Rachel | sold | four | heavy | spoons
William | prefers | nineteen | green | windows
Steven | has | two | cheap | sofas
Thomas | kept | fifteen | pretty | rings
Doris | ordered | twelve | red | flowers
Nina | wants | sixty | white | houses
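The construction described above can be sketched in Python. This is an illustration under stated assumptions, not the procedure actually used: the description does not say how uniqueness was enforced or how sentences were assigned to talkers, so the sketch simply discards duplicates and deals the shuffled sentences out in equal blocks. The function name and the seed are hypothetical.

```python
import random

# Ten choices for each of the five word positions, taken from the table above.
NAMES = ["Peter", "Kathy", "Lucy", "Allen", "Rachel",
         "William", "Steven", "Thomas", "Doris", "Nina"]
VERBS = ["got", "sees", "bought", "gives", "sold",
         "prefers", "has", "kept", "ordered", "wants"]
NUMBERS = ["three", "nine", "seven", "eight", "four",
           "nineteen", "two", "fifteen", "twelve", "sixty"]
ADJECTIVES = ["large", "small", "old", "dark", "heavy",
              "green", "cheap", "pretty", "red", "white"]
NOUNS = ["desks", "chairs", "tables", "toys", "spoons",
         "windows", "sofas", "rings", "flowers", "houses"]
COLUMNS = [NAMES, VERBS, NUMBERS, ADJECTIVES, NOUNS]


def build_corpus(n_sentences=2000, n_talkers=4, seed=0):
    """Draw unique random sentences and split them evenly across talkers."""
    rng = random.Random(seed)
    sentences = set()
    # Keep drawing until enough distinct sentences exist (assumed approach).
    while len(sentences) < n_sentences:
        sentences.add(" ".join(rng.choice(column) for column in COLUMNS))
    ordered = sorted(sentences)  # fixed order so the shuffle is reproducible
    rng.shuffle(ordered)
    per_talker = n_sentences // n_talkers
    return [ordered[i * per_talker:(i + 1) * per_talker]
            for i in range(n_talkers)]


sets = build_corpus()
print([len(s) for s in sets])  # [500, 500, 500, 500]
print(sets[0][0])              # one example sentence
```

With ten choices in each of five positions there are 100,000 possible sentences, so 2000 unique ones can be drawn with few collisions.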

Sample

F1

Purchase

Speech materials are sold in the following sets, which can be ordered for individual talkers.

Test Material | Audio | Video
CV syllables | $400 | $600
VC syllables | $400 | $600
hVd syllables | $100 | $200
MRT words | $700 | $1100
Numbers 0-10 | $100 | $200
Numbers 0-99 | $400 | $600
High-Probability Spin Sentences | $500 | $900
Nonsense Sentences | $500 | $900
Structured Sentences | $800 | $1300

Notes:

  • Sets of videos include the corresponding sets of audio-only .wav files.
  • All selections include multiple repetitions if available.
  • Prices listed are for one talker.

For additional talkers, add:

  • 75% of the one-talker price for the second talker
  • 50% of the one-talker price for the third talker
  • 25% of the one-talker price for the fourth talker

Thus, for all four talkers, multiply the one-talker price by 2.5.
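The multi-talker arithmetic can be written out as a short Python sketch. The function name is hypothetical; the percentages and the example price come from the list and table above.

```python
# Fraction of the one-talker price charged for the 1st, 2nd, 3rd, and 4th talker.
TALKER_FACTORS = [1.00, 0.75, 0.50, 0.25]


def set_price(one_talker_price, n_talkers):
    """Total price of one set of materials for n_talkers talkers (1 to 4)."""
    if not 1 <= n_talkers <= len(TALKER_FACTORS):
        raise ValueError("n_talkers must be between 1 and 4")
    return one_talker_price * sum(TALKER_FACTORS[:n_talkers])


# CV syllables, video ($600 for one talker), all four talkers.
print(set_price(600, 4))  # 1500.0
```

For the two number sets, which were recorded with only two talkers, the largest multiplier that applies is 1.75.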

To purchase STeVi (the Speech Test Videos corpus), please contact Sensimetrics.