5.1.2 Speech to Text (ASR)
Last Updated: 11/09/2025
Overview
This guide explains how to use Automatic Speech Recognition (ASR) to convert spoken words into text. The process involves capturing audio from a microphone and using a model to transcribe it automatically.
Project repository: ⭐ Bianbu AI Demo Zoo | NLP
Preparation
Clone Code
Clone the repository and navigate to the correct directory:
git clone https://gitee.com/bianbu/spacemit-demo.git
cd spacemit-demo/examples/NLP
Install Environment Dependencies
It is recommended to use a virtual environment for dependency isolation:
# Install the virtual environment package
sudo apt install python3-venv
# Create and activate the virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install project dependencies
pip install -r requirements.txt
Detect System Recording Devices
Follow the instructions in the Detect System Recording Devices section to check the available recording devices in the system.
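As a rough illustration of checking for capture devices from Python, the sketch below parses the output of `arecord -l`, the standard ALSA listing command. The helper names `parse_arecord_list` and `list_capture_devices` are hypothetical and not part of the repository. Whether the demo's device_index corresponds to the ALSA card number shown here is an assumption, so treat the referenced section as authoritative.

```python
import re
import subprocess
from typing import List, NamedTuple


class CaptureDevice(NamedTuple):
    card: int      # ALSA card number
    device: int    # ALSA device number on that card
    name: str      # human-readable card name from the square brackets


# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
_CARD_LINE = re.compile(r"^card (\d+): .*?\[(.*?)\], device (\d+):", re.MULTILINE)


def parse_arecord_list(text: str) -> List[CaptureDevice]:
    """Extract (card, device, name) entries from `arecord -l` output."""
    return [
        CaptureDevice(card=int(card), device=int(device), name=name)
        for card, name, device in _CARD_LINE.findall(text)
    ]


def list_capture_devices() -> List[CaptureDevice]:
    """Run `arecord -l` and parse its output (requires alsa-utils)."""
    result = subprocess.run(
        ["arecord", "-l"], capture_output=True, text=True, check=True
    )
    return parse_arecord_list(result.stdout)
```

Calling `list_capture_devices()` on a board with a USB microphone attached should return one entry per capture device. An empty list suggests that no recording hardware was detected.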
Execute Example Code
Run the ASR example:
python 03_asr_demo.py
After the program starts, it waits for your input:
- Press Enter to begin recording.
- The built-in Voice Activity Detection (VAD) detects human speech automatically and stops recording once silence is detected.
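To make the stop-on-silence behaviour concrete, here is a minimal sketch of endpointing logic. It is not the demo's actual VAD: it substitutes a simple RMS energy threshold for the speech detector, and the function name `record_until_silence`, the `energy_threshold` value, and the choice to ignore silence before the first speech frame are all assumptions made for illustration. The `sld` and `max_time` names mirror the demo's parameters.

```python
import math
from typing import Iterable, List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def record_until_silence(
    frames: Iterable[Sequence[float]],
    frame_ms: int,
    sld: float = 1.0,                # silence duration threshold (seconds)
    max_time: float = 10.0,          # hard cap on recording length (seconds)
    energy_threshold: float = 0.02,  # assumed stand-in for a real VAD decision
) -> List[Sequence[float]]:
    """Keep frames until trailing silence reaches sld or max_time elapses."""
    sld_ms = round(sld * 1000)
    max_ms = round(max_time * 1000)
    kept: List[Sequence[float]] = []
    elapsed_ms = 0
    silent_ms = 0
    heard_speech = False

    for frame in frames:
        kept.append(frame)
        elapsed_ms += frame_ms
        if rms(frame) >= energy_threshold:
            heard_speech = True
            silent_ms = 0            # speech resets the silence counter
        else:
            silent_ms += frame_ms
        # Assumption: sld == 0 disables the silence-based stop entirely.
        if sld_ms > 0 and heard_speech and silent_ms >= sld_ms:
            break
        if elapsed_ms >= max_ms:
            break
    return kept
```

With 100 ms frames and `sld=0.3`, five loud frames followed by silence yield eight kept frames: the five speech frames plus the three silent frames needed to reach the threshold.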
Parameter Description
| Parameter Name | Description | Usage |
|---|---|---|
| sld | Silence duration threshold (seconds) | Speech is considered finished once silence lasts ≥ sld seconds. Set to 0 to disable. |
| max_time | Maximum recording time (seconds) | Recording stops automatically after this duration to prevent overly long recordings. |
| channels | Audio channel count | Usually set to 1 (mono). Mono input is recommended for speech recognition. |
| rate | Sample rate (Hz) | Number of samples per second, e.g., 16000 or 48000. Must match the model input. |
| device_index | Input device index | Specifies the recording device. Find the index using arecord or search_device.py. |
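One way to keep these parameters together and catch invalid values early is a small configuration object. The sketch below is illustrative only: the class name `RecordingConfig`, the default values, and the validation rules are assumptions, not taken from 03_asr_demo.py. Only the field names follow the parameters described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordingConfig:
    # Default values here are placeholders, not the demo's real defaults.
    sld: float = 1.0                     # silence duration threshold (seconds)
    max_time: float = 5.0                # maximum recording time (seconds)
    channels: int = 1                    # 1 = mono
    rate: int = 16000                    # sample rate in Hz
    device_index: Optional[int] = None   # None = let the system pick a device

    def __post_init__(self) -> None:
        if self.sld < 0:
            raise ValueError("sld must be >= 0 (0 disables silence detection)")
        if self.max_time <= 0:
            raise ValueError("max_time must be positive")
        if self.channels < 1:
            raise ValueError("channels must be at least 1")
        if self.rate <= 0:
            raise ValueError("rate must be positive")
        if self.device_index is not None and self.device_index < 0:
            raise ValueError("device_index must be non-negative")

    @property
    def silence_detection_enabled(self) -> bool:
        """Silence-based stopping is off when sld is 0."""
        return self.sld > 0

    def max_samples(self) -> int:
        """Upper bound on recorded samples across all channels."""
        return int(self.max_time * self.rate) * self.channels
```

For example, `RecordingConfig(rate=16000, max_time=5.0)` caps a mono recording at 80000 samples, which gives a quick sanity check on buffer sizes before any audio is captured.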