Video Demo: Real-time speech recognition on VisionFive2 with next-gen Kaldi

I would like to share the news that we just managed to run sherpa-ncnn, a subproject of next-gen Kaldi, on VisionFive2 for real-time speech recognition with a USB microphone.

You can find the documentation at

Everything is open-source: the code, the model, the data, and the documentation.

The video demo is available at


Good news!

Have you tried the same thing on the Raspberry Pi 4? If so, how do they compare?

Yes, I have tried it on a Raspberry Pi 4 Model B. It is faster than VisionFive2 and can also run a larger model in real time.

VisionFive2 can only run a smaller model in real time.

By real-time, I mean the real-time factor (RTF), i.e., the processing time divided by the duration of the audio, is less than 1.
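
For anyone who wants to check the RTF on their own board, here is a minimal sketch of how it can be measured. It is not taken from sherpa-ncnn: `real_time_factor`, `measure_rtf`, and the `decode` callable are hypothetical names used only for illustration, and `decode` stands in for whatever function runs your recognizer over a buffer of samples.

```python
import time


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (the RTF)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(decode, samples, sample_rate: int) -> float:
    """Time a hypothetical decode(samples) call and return its RTF.

    `decode` is a placeholder for your own recognizer call; it is an
    assumption of this sketch, not an API of any particular library.
    """
    start = time.perf_counter()
    decode(samples)
    elapsed = time.perf_counter() - start
    # Audio duration in seconds = number of samples / samples per second.
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

For example, if decoding 10 seconds of audio takes 2 seconds, `real_time_factor(2.0, 10.0)` gives 0.2, which is well below 1 and therefore real-time in the sense above.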