Kaldi is a speech recognition toolkit written in C++ and released under the Apache License. It is intended for use by speech recognition researchers.
Kaldi is similar in aims and scope to HTK. The goal is to have modern and flexible code, written in C++, that is easy to modify and extend. Important features include:
Code-level integration with Finite State Transducers (FSTs)
Extensive linear algebra support
Extensible design
Open license
Complete recipes
Releasing complete recipes is an important aspect of Kaldi. Since the code is publicly available under a license that permits modification and re-release, we would like to encourage people to release their code, along with their script directories, in a format similar to Kaldi’s own example scripts.
We have tried to make Kaldi’s documentation as complete as possible given time constraints, but in the short term we cannot hope to generate documentation that is as thorough as HTK’s. In particular, there is a lot of introductory material in the HTKBook, explaining statistical speech recognition for the uninitiated, that will probably never appear in Kaldi’s documentation. Much of Kaldi’s documentation is written in such a way that it will only be accessible to an expert. In the future we hope to make it somewhat more accessible, bearing in mind that our intended audience is speech recognition researchers or researchers-in-training. In general, Kaldi is not a speech recognition toolkit “for dummies”: it will not stop you from performing many kinds of operations that don’t make sense.