Overview

The following case study explores the challenges faced by a client who required a custom speech dataset collected from 1,000 native speakers with diverse accents, dialects, genders, and age profiles. The client had specific requirements for the dataset, including the recording of scripted sentences. This case study highlights the strategies and solutions our team implemented to meet the client’s needs and overcome the obstacles encountered during the dataset collection process.

Client's Method of Data Collection

The client had created a process to collect speech data while working with a small group of people. It followed the steps outlined below:

  • The client sends an Excel file containing 100 sentences to each participant.
  • The participant opens a specific open-source sound recorder and records the sentences from the Excel file one at a time.
  • The participant downloads each recording and renames it.
  • For each recording, the participant creates a text file containing the corresponding sentence and renames that file as well.
  • Once all recordings and text files are done, the participant creates one more text file containing their metadata, such as age, gender, language, and country.
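To make the amount of manual bookkeeping in this process concrete, here is a minimal Python sketch of the files one participant has to produce. The naming scheme (`<speaker>_<index>.wav` and `.txt`), the `metadata.txt` layout, and the function name `write_submission` are assumptions for illustration only; the source does not say how the client's files were actually named or formatted.

```python
from pathlib import Path


def write_submission(root, speaker_id, sentences, metadata):
    """Create the text files one participant produces by hand.

    Hypothetical layout: one transcript per recording plus one metadata file.
    Returns the audio file names the participant is expected to supply.
    """
    speaker_dir = Path(root) / speaker_id
    speaker_dir.mkdir(parents=True, exist_ok=True)

    expected_audio = []
    for index, sentence in enumerate(sentences, start=1):
        stem = f"{speaker_id}_{index:03d}"
        # One transcript file per recording, holding only that sentence.
        (speaker_dir / f"{stem}.txt").write_text(sentence + "\n", encoding="utf-8")
        # The audio comes from the recorder; here we only note its expected name.
        expected_audio.append(f"{stem}.wav")

    # One extra text file with the speaker's metadata as "key: value" lines.
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    (speaker_dir / "metadata.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return expected_audio
```

For a batch of 100 sentences, each participant ends up handling 201 files by hand (100 recordings, 100 transcripts, and one metadata file), and every rename is a chance for a mismatch between audio and text.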

Yugo is our state-of-the-art mobile application for collecting speech data.
Here’s how it works:

  • To get started, users can download the app from the Play Store. During sign-up, we collect information like age, gender, country, and dialect, which we later use as
    metadata.
  • We no longer have to send a script to each individual for recording. Now, we can create a project in the admin panel, upload the scripts, and assign them to specific users.
  • Once a user is assigned a batch of sentences, they can see their progress and status, including how many sentences they’ve completed and their QA status. The user
    receives one sentence at a time on their screen and can record it, play it back, re-record it if necessary, and submit it from the app.
  • When a user has finished recording their assigned batch of sentences, we can assign the entire batch for review to QA from the admin panel. The reviewer listens and
    compares each recording with its corresponding sentence on the screen, one at a time, to ensure that the quality is sufficient. If it’s not, the reviewer will reject the
    recording with a comment, and it will be automatically assigned to the specific user to re-record.
  • We can also use the app to provide instructions and sample recordings for recorders and reviewers, ensuring that they know what we expect from them in terms of
    recording quality. These instructions are available at any time within the app.
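The record, review, and re-record loop described above can be sketched as a small state machine. This is an illustrative model only, not Yugo's actual implementation: the class names, status values, and the `progress` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    ASSIGNED = "assigned"    # waiting for the user to record
    SUBMITTED = "submitted"  # recorded and submitted from the app
    APPROVED = "approved"    # passed QA review
    REJECTED = "rejected"    # failed QA, back with the same user


@dataclass
class Task:
    sentence: str
    user: str
    status: Status = Status.ASSIGNED
    comments: list = field(default_factory=list)

    def submit(self):
        # A user may submit a fresh task or re-record a rejected one.
        if self.status not in (Status.ASSIGNED, Status.REJECTED):
            raise ValueError(f"cannot submit from {self.status.value}")
        self.status = Status.SUBMITTED

    def review(self, ok, comment=""):
        # Only submitted recordings can be reviewed.
        if self.status is not Status.SUBMITTED:
            raise ValueError(f"cannot review from {self.status.value}")
        if ok:
            self.status = Status.APPROVED
            return
        if not comment:
            raise ValueError("a rejection needs a comment")
        # The task keeps its original user, so the re-record goes back to them.
        self.comments.append(comment)
        self.status = Status.REJECTED


def progress(tasks):
    """Count tasks per status, similar to a per-user progress view."""
    counts = {status.value: 0 for status in Status}
    for task in tasks:
        counts[task.status.value] += 1
    return counts
```

Keeping the rejected task bound to its original user, with the reviewer's comment attached, is what lets the re-record request happen without any manual hand-off.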

To accomplish our client’s goal, we onboarded native German and Spanish participants from our global crowd community for this project. We aimed for an even distribution of
participants across age groups and genders, as shown below:
