Discover and install AI models from our curated collection
Repository: localaiLicense: apache-2.0
Voxtral Mini 4B Realtime is a speech-to-text model from Mistral AI. It is a 4B parameter model optimized for fast, accurate audio transcription with low latency, making it ideal for real-time applications. The model uses the Voxtral architecture for efficient audio processing.
Links
Tags
Repository: localaiLicense: LFM-Open-License-v1.0
LFM2.5-Audio-1.5B is LiquidAI's any-to-any audio foundation model. The 1.2B LFM2.5 backbone plus a FastConformer audio encoder and an LFM2-based audio detokenizer give real-time speech-to-speech with text + audio output interleaved at 12.5 Hz / 24 kHz. This entry runs in S2S (speech-to-speech) mode and is the model the LocalAI realtime API any-to-any path consumes. Switch to ASR, TTS, or chat by picking the sibling gallery entries.
Links
Tags
Repository: localaiLicense: apache-2.0
LocalVQE v1.1 (1.3 M parameters, F32) — joint acoustic echo cancellation, noise suppression, and dereverberation for 16 kHz mono speech. DeepVQE-style architecture with an S4D bottleneck and an in-graph DCT-II filterbank. ~9.6× realtime on a desktop CPU; 16 ms algorithmic latency. ~5 MB on disk. v1.1 ships the v16 echoaware checkpoint with improved double-talk and near-end single-talk AECMOS scores.
Links
Tags