LocalAI

Introduction

LocalAI is a complete AI stack for running AI models locally. Designed to be simple, efficient, and accessible, it provides an alternative to OpenAI's API. Users can run large language models (LLMs), image generation, speech transcription, and other AI tasks on consumer-grade hardware (including CPU environments) while maintaining data privacy and security.

This document describes how to compile, install, and use LocalAI from source on our Bianbu platform, and how to add custom inference backends.

Compilation and Installation Steps

Install System Dependencies

sudo apt update
sudo apt install cmake golang libgrpc-dev make protobuf-compiler-grpc python3-grpc-tools

# Uninstall existing protobuf
sudo apt-get remove --purge protobuf-compiler libprotobuf-dev
sudo apt-get autoremove
sudo rm /usr/local/bin/protoc # Remove executable
sudo rm -rf /usr/local/include/google # Remove headers
sudo rm -rf /usr/local/lib/libproto* # Remove libraries
sudo rm -rf /usr/lib/protoc # Other possible paths

sudo apt-get install autoconf automake libtool curl make gcc-14 g++-14 unzip

# Switch to /usr/bin, delete symlinks for:
# gcc, g++, gcc-ar, gcc-nm, gcc-ranlib,
# riscv64-linux-gnu-gcc, riscv64-linux-gnu-gcc-ar, riscv64-linux-gnu-gcc-nm,
# riscv64-linux-gnu-gcc-ranlib, riscv64-linux-gnu-g++
# Then create new symlinks pointing to version 14
# Example:
# sudo rm /usr/bin/gcc
# sudo ln -s /usr/bin/gcc-14 /usr/bin/gcc
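# A compact sketch of the same switch for all of the tools listed above.
# Assumption: the gcc-14/g++-14 packages installed the matching *-14 binaries
# under /usr/bin; verify with "ls /usr/bin/*-14" before running.
for tool in gcc g++ gcc-ar gcc-nm gcc-ranlib \
            riscv64-linux-gnu-gcc riscv64-linux-gnu-g++ \
            riscv64-linux-gnu-gcc-ar riscv64-linux-gnu-gcc-nm \
            riscv64-linux-gnu-gcc-ranlib; do
  sudo rm -f /usr/bin/$tool
  sudo ln -s /usr/bin/$tool-14 /usr/bin/$tool
done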

# Compile and install protobuf from source
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.20.3/protobuf-cpp-3.20.3.tar.gz
tar xvzf protobuf-cpp-3.20.3.tar.gz
cd protobuf-3.20.3/cmake
cmake -DCMAKE_INSTALL_PREFIX=/usr/local .
cmake --build . --parallel 8
ctest --verbose
sudo cmake --install .
sudo ldconfig
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
cd ../../

sudo apt install libgrpc++-dev
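
As a quick sanity check before building, confirm that the shell now picks up the freshly installed toolchain (these commands only report versions and change nothing):

protoc --version   # expect: libprotoc 3.20.3
gcc --version      # expect: gcc 14.x
go version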

Compile LocalAI

Download our source code and compile:

# Navigate to project directory
cd localai

# Add Go binary path to environment
export PATH=$PATH:$(go env GOPATH)/bin

# Use Aliyun proxy for faster module downloads
export GOPROXY=https://mirrors.aliyun.com/goproxy/,direct

# Compile
make build

# If errors occur, run make clean before rebuilding
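
# A successful build leaves the local-ai binary in the repository root.
# Optional smoke test (not part of the upstream build steps):
ls -lh local-ai
./local-ai --help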

Adding Custom Inference Backends

Add RISC-V Accelerated llama.cpp Backend

We modified llama.cpp with RISC-V acceleration and packaged it as a grpc-server binary. Deploy it with:

cd backend/cpp/spacemit-llama-cpp
bash install.sh

The script downloads the RISC-V accelerated llama-cpp-grpc-server binary and quantized models, sets up the required directories, and generates the configuration files. Run ./local-ai --debug from the project root to start LocalAI, then open http://localhost:8080/chat/ in a browser to test it.
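
Besides the web UI, the OpenAI-compatible HTTP API can be exercised directly. The model name below is a placeholder; list the models the script actually installed with curl http://localhost:8080/v1/models and substitute one of them:

curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"<installed-model-name>","messages":[{"role":"user","content":"Hello"}]}'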

If you encounter a shared library error such as:

stderr llama-cpp-riscv-spacemit: error while loading shared libraries: libabsl_synchronization.so.20220623: No such file.

Fix with:

sudo apt install libabsl-dev
sudo ln -s /usr/lib/riscv64-linux-gnu/libabsl_synchronization.so /usr/lib/riscv64-linux-gnu/libabsl_synchronization.so.20220623

To use other LLM models with our accelerated backend, place the model file under the models/ directory and add a matching model definition, as sketched below.
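
LocalAI's usual pattern is a GGUF file in models/ plus a YAML file next to it describing the model. The sketch below is hypothetical: the model name, file name, and backend value are assumptions, so copy the backend value from the configuration that install.sh generated rather than from here:

cat > models/my-model.yaml <<'EOF'
name: my-model
backend: llama-cpp        # replace with the backend name used in the generated configs
parameters:
  model: my-model.gguf    # GGUF file placed under models/
EOF

Restart ./local-ai --debug afterwards so the new model definition is picked up.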

Add RISC-V Accelerated ASR Backend

cd backend/cpp/spacemit-asr-cpp
bash build.sh

cd ../../../
./local-ai --debug

Test with:

# Prepare audio file test.wav
curl -X POST http://localhost:8080/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F "file=@test.wav" \
-F "model=sensevoicesmall-cpp"

Add C++ TTS Backend

cd backend/cpp/matcha-tts-cpp
bash build.sh

cd ../../../
./local-ai --debug

Test with:

curl -X POST "http://localhost:8080/tts" \
-H "Content-Type: application/json" \
-d '{"input":"Hello, how is the weather today","model":"matcha-tts-cpp"}' \
-o output.wav
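
The synthesized speech is written to output.wav; to confirm the result, play the file back, for example with aplay from alsa-utils (if installed):

aplay output.wav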