Reproducible ASR - Project RASR notes
These are the notes I took during the meeting with @DavidRisinamhodzi_46 to discuss preparation for the Sci-GaIA workshop in Dar es Salaam. We started work on the app during the Hackfest. The project and its planning are on GitHub. We discussed this before.
In order to demonstrate progress and the actual contribution of the application, we need to define a baseline. The ASR template is already widely used, so what is our work going to contribute?
So we define a baseline reproducibility experiment. The question is:
Can we get the same training results for different languages using the same parameters?
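One way to make this question testable is to decide up front how close "the same" has to be. The sketch below is only an illustration: the function name, the tolerance value and the idea of comparing a single accuracy figure per language are assumptions, not something agreed in the meeting.

```python
from typing import Mapping


def reproducible_across_languages(
    accuracy_by_language: Mapping[str, float],
    tolerance: float = 1.0,
) -> bool:
    """Return True when all accuracies lie within `tolerance` of each other.

    `accuracy_by_language` maps a language name to the accuracy obtained
    with one fixed parameter set. `tolerance` uses the same unit as the
    accuracies (e.g. percentage points); 1.0 is an arbitrary placeholder.
    """
    if len(accuracy_by_language) < 2:
        raise ValueError("need results for at least two languages to compare")
    values = list(accuracy_by_language.values())
    # The spread between best and worst language decides the outcome.
    return max(values) - min(values) <= tolerance
```

The tolerance would have to be chosen once we know how much the accuracy varies between repeated runs on the same language.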
The speech recognition workflow consists of a few tasks:
- Feature extraction
- Create lists
We will be doing all of them, including training and testing. Training builds models based on the parameters in the template. These models are contained in hmm files. We need to re-use the models, so they will be stored in the OAR.
Another open question is how language-specific the models may be.
The model is used when you extend to different recognition steps. Parameters are in the different scripts, mostly in
Experiments will be run and their results stored on gLibrary.
Tasks for Dar es Salaam
This brings us to the to-do list for the meeting in Dar:
- Ensure that the HTK ASR template is executable on the sites (in CODE-RADE)
- Define the baseline experiment on a particular language - IsiNdebele
- Define what the baseline experiment will do:
  - Get the corpus
  - Change parameters
  - Do training and testing
After this baseline run we will have:
- a set of models (gLibrary)
- a set of accuracy results
- model + parameters + language = result.
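The "model + parameters + language = result" relation could be kept as one small record per run. This is a minimal sketch under assumptions: the class name, the field names and the hashed key are hypothetical, and `model_id` stands for whatever identifier the stored model ends up with.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class BaselineRun:
    """One baseline run: the inputs that were used and the accuracy obtained."""

    language: str
    parameters: dict  # parameter name -> value, as set in the template
    model_id: str     # hypothetical identifier of the stored model
    accuracy: float

    def key(self) -> str:
        """Stable identifier for the inputs (language, parameters, model).

        Sorting the keys means the same parameters always give the same
        key, regardless of the order in which they were written down.
        """
        payload = json.dumps(
            {
                "language": self.language,
                "parameters": self.parameters,
                "model_id": self.model_id,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two runs with the same key but different accuracy would point to a reproducibility problem.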
We also need some idea of how to extend things, and where they can be extended. For example:
- Can we swap out the language and get the same accuracy?
- How does the accuracy depend on variation in the parameters?
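The second question could be explored by varying one parameter at a time around the baseline. The sketch below assumes a callable `run_experiment` that takes a parameter dictionary and returns an accuracy; that callable and all names here are hypothetical stand-ins for the actual training and testing run.

```python
from typing import Any, Callable, Dict, Iterable, Mapping


def sweep_parameter(
    run_experiment: Callable[[Dict[str, Any]], float],
    base_parameters: Mapping[str, Any],
    name: str,
    values: Iterable[Any],
) -> Dict[Any, float]:
    """Run the experiment once per value of `name`, keeping the rest fixed."""
    results = {}
    for value in values:
        params = dict(base_parameters)  # copy, so the baseline stays untouched
        params[name] = value
        results[value] = run_experiment(params)
    return results


def accuracy_spread(results: Mapping[Any, float]) -> float:
    """Difference between the best and the worst accuracy in a sweep."""
    return max(results.values()) - min(results.values())
```

A large spread for a given parameter would suggest that the result depends strongly on it.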