Commit
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3822 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
Showing 4 changed files with 50 additions and 82 deletions.
@@ -1,50 +1,13 @@
 
 This directory contains example scripts that demonstrate how to
 use Kaldi. Each subdirectory corresponds to a corpus that we have
-example scripts for. Currently these are all corpora available from
-the Linguistic Data Consortium (LDC).
+example scripts for.
 
-Explanations of the corpora are below.
-Note: the easiest examples to work with are rm/s3 and wsj/s3.
-
-wsj: The Wall Street Journal corpus. This is a corpus of read
-sentences from the Wall Street Journal, recorded under clean conditions.
-The vocabulary is quite large.
-Available from the LDC as either: [ catalog numbers LDC93S6A (WSJ0) and LDC94S13A (WSJ1) ]
-or: [ catalog numbers LDC93S6B (WSJ0) and LDC94S13B (WSJ1) ]
-The latter option is cheaper and includes only the Sennheiser
-microphone data (which is all we use in the example scripts).
-
-rm: Resource Management. Clean speech in a medium-vocabulary task consisting
-of commands to a (presumably imaginary) computer system.
-Available from the LDC as catalog number LDC93S3A (it may be possible to
-get the same data using combinations of other catalog numbers, but this
-is the one we used).
-
-tidigits: The TI Digits database, available from the LDC (catalog number LDC93S10).
-This is one of the oldest speech databases; it consists of a bunch of speakers
-saying digit strings. It's not considered a "real" task any more, but can be useful
-for demos, tutorials, and the like.
-
-yesno: This is a simple recipe with some data consisting of a single person
-saying the words "yes" and "no", that can be downloaded from the Kaldi website.
-It's a very easy task, but useful for checking that the scripts run, or if
-you don't yet have any of the LDC data.
-
-
-Recipes in progress (these may be less polished than the ones above).
-
-swbd: Switchboard (from LDC). A fairly large amount of telephone speech (2-channel, 8kHz
-sampling rate).
-This directory is a work in progress.
-
-gp: GlobalPhone (from ELDA). This is a multilingual speech corpus.
-
-timit: TIMIT (from LDC), which is an old corpus of carefully read speech.
-LDC corpous LDC93S1
-
-voxforge: A recipe for the free speech data available from voxforge.org
-
-hkust: A recipe for HKUST Mandarin Telephone Speech (available from LDC)
+Note: we now have some scripts using free data, including voxforge,
+vystadial_{cz,en} and yesno. Most of the others are available from
+the Linguistic Data Consortium (LDC), which requires money (unless you
+have a membership).
 
+If you have an LDC membership, probably rm/s5 or wsj/s5 should be your first
+choice to try out the scripts.
 