Model Gallery

Discover and install AI models from our curated collection

3 models available

1 repositories

Documentation

Find Your Perfect Model

Filter by Model Type

Browse by Tags

voxtral-mini-4b-realtime

Voxtral Mini 4B Realtime is a speech-to-text model from Mistral AI. It is a 4B parameter model optimized for fast, accurate audio transcription with low latency, making it ideal for real-time applications. The model uses the Voxtral architecture for efficient audio processing.

Repository: localaiLicense: apache-2.0

lfm2.5-audio-1.5b-realtime

LFM2.5-Audio-1.5B is LiquidAI's any-to-any audio foundation model. The 1.2B LFM2.5 backbone plus a FastConformer audio encoder and an LFM2-based audio detokenizer give real-time speech-to-speech with text + audio output interleaved at 12.5 Hz / 24 kHz. This entry runs in S2S (speech-to-speech) mode and is the model the LocalAI realtime API any-to-any path consumes. Switch to ASR, TTS, or chat by picking the sibling gallery entries.

Repository: localaiLicense: LFM-Open-License-v1.0

localvqe-v1.1-1.3m

LocalVQE v1.1 (1.3 M parameters, F32) — joint acoustic echo cancellation, noise suppression, and dereverberation for 16 kHz mono speech. DeepVQE-style architecture with an S4D bottleneck and an in-graph DCT-II filterbank. ~9.6× realtime on a desktop CPU; 16 ms algorithmic latency. ~5 MB on disk. v1.1 ships the v16 echoaware checkpoint with improved double-talk and near-end single-talk AECMOS scores.

Repository: localaiLicense: apache-2.0