LUCID Corpus – London UCL Clear speech in interaction

Authors: Rachel Baker and Valerie Hazan


The LUCID corpus is made available to the speech and language research community for non-commercial research purposes.

The participants:

  • 40 native southern British English speakers (20 pairs)
  • 20 male, 20 female
  • 18-29 years old (mean age: 23 years)
  • non-bilingual, with normal hearing thresholds (20 dB or below)

For each talker: 

Session 1 DiapixUK (3 pix): casual speech in good listening conditions
Session 2 DiapixUK (6 pix): clear speech (cochlear implant simulation); each person hears vocoded speech for 3 pix
Session 3 DiapixUK (3 pix): clear speech (Noise or L2, 20 participants in each); one person hears speech in multi-talker babble (Noise) or has a partner who is a low-proficiency English speaker (L2)
Session 4 Picture naming & sentence reading: casual speech (‘speak casually as if talking to a friend’)
Session 5 Picture naming & sentence reading: clear speech (‘speak clearly as if talking to someone who is hearing-impaired’)

Speech type                          Speech style     Minutes per participant
spontaneous (DiapixUK)               casual, clear    110
read (sentences)                     casual, clear    30
semi-spontaneous (picture naming)    casual, clear    30

Format:

  • Stereo wav files for each two-way dialog and individual wav files for each speaker
  • Word-aligned orthographic transcriptions (Praat TextGrid format), except for the speech of the L2 speakers
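
The corpus already provides individual wav files for each speaker, so the following is only a minimal sketch for anyone who would rather derive per-channel files from a stereo dialog file themselves. It assumes uncompressed PCM audio and that each talker occupies one channel of the stereo file, neither of which is stated above; the function name and file paths are hypothetical.

```python
import wave


def split_stereo(stereo_path, left_path, right_path):
    """Write the two channels of a stereo PCM WAV file to separate mono files.

    Assumption: uncompressed PCM, with one talker per channel.
    """
    with wave.open(stereo_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo (2-channel) file")
        width = src.getsampwidth()   # bytes per sample
        rate = src.getframerate()    # samples per second
        frames = src.readframes(src.getnframes())

    # Each stereo frame holds one left sample followed by one right sample.
    frame_size = 2 * width
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + width]
        right += frames[i + width:i + frame_size]

    # Write each channel as a mono file with the same sample format.
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))
```

A call such as `split_stereo("dialog.wav", "talker_a.wav", "talker_b.wav")` (hypothetical file names) would produce two mono files of the same duration as the input. The sample-by-sample loop favours readability over speed, so it may be slow on long recordings.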

Access:

Audio files, TextGrids and picture materials are all available from the Speechbox Resource at Northwestern University.

We are grateful to our collaborator Prof. Ann Bradlow and her colleagues for enabling us to include the LUCID Corpus within this excellent resource.

Acknowledgments:

An acknowledgment should be included in all publications reporting work that has made use of the LUCID corpus, regardless of the format or the medium in which the publications are made. If using the DiapixUK materials or recordings, the following can be referenced:

Baker, R., & Hazan, V. (2011). DiapixUK: task materials for the elicitation of multiple spontaneous speech dialogs. Behavior Research Methods, 43(3), 761-770. doi:10.3758/s13428-011-0075-y

If using the read materials, please contact v.hazan @ ucl.ac.uk for an appropriate reference.

Get in touch!

We would be very interested to hear of research that is making use of the LUCID corpus or DiapixUK materials, so please contact Valerie (v.hazan @ ucl.ac.uk) to let us know how you are using these.