Meet OmniVoice Studio: A Local, Open-Source Alternative to ElevenLabs

by Josh
May 26, 2026


OmniVoice Studio — How to Use It
01 / 08

What Is OmniVoice Studio?

OmniVoice Studio is an open-source desktop application for voice cloning, video dubbing, real-time dictation, and speaker diarization. Everything runs locally on your machine. No API keys, no cloud account, no subscription required.

  • 646 languages supported for TTS via the default OmniVoice engine
  • 99 languages for transcription via WhisperX
  • Available on macOS, Windows, and Linux
  • GPU is optional — full pipeline runs on CPU
  • Free for personal, educational, and research use (FSL-1.1-ALv2)

System Requirements

A GPU is optional. Without one, TTS runs approximately 3× slower on CPU. On GPUs with 8 GB of VRAM or less, TTS automatically offloads to the CPU while transcription is running; no configuration is needed.

Component   Minimum                              Recommended
OS          Win 10 / macOS 12+ / Ubuntu 20.04+   Any modern 64-bit OS
RAM         8 GB                                 16 GB+
VRAM        4 GB (auto-offloads)                 8 GB+ (RTX 3060+)
Disk        10 GB free                           20 GB+ SSD
Python      3.10+                                3.11–3.12
GPU         Optional                             CUDA / MPS / ROCm
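
To see which of these accelerators a machine actually exposes, a quick check along the following lines can help. This is a minimal sketch, not OmniVoice Studio's own detection logic: it assumes PyTorch is the runtime (the article does not say so), and pick_device is a hypothetical helper name.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can see.

    Assumption: PyTorch is the underlying runtime. If it is not installed,
    the function simply reports 'cpu'.
    """
    try:
        import torch  # optional dependency for this sketch
    except ImportError:
        return "cpu"

    # ROCm builds of PyTorch also report through the torch.cuda API.
    if torch.cuda.is_available():
        return "cuda"

    # MPS is the Apple Silicon backend; guard for older PyTorch versions.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"Would run on: {pick_device()}")
```

If this prints cpu on a machine with a GPU, the usual suspects are a CPU-only PyTorch build or missing drivers.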

Installation

The project recommends running from source. Install three prerequisites first: ffmpeg, Bun (JS runtime), and uv (Python package manager).

git clone https://github.com/debpalash/OmniVoice-Studio.git
cd OmniVoice-Studio
uv sync
bun install
bun dev

The frontend loads at http://localhost:5173 and the API runs on port 8000.
Model weights download automatically on first generation.

Pre-built installers are available for macOS (DMG), Windows (MSI), and Linux (AppImage and .deb); see the Releases page on GitHub.
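
If you go the from-source route, it can save time to confirm the prerequisites are in place before cloning. The sketch below is an illustrative preflight check and not part of the project: preflight and the constants are hypothetical names, and the thresholds are taken from the minimum requirements listed above (Python 3.10+, 10 GB of free disk).

```python
import shutil
import sys

# Tools the from-source install relies on (git is needed for the clone step).
REQUIRED_TOOLS = ("git", "ffmpeg", "bun", "uv")
MIN_PYTHON = (3, 10)   # minimum from the requirements table
MIN_FREE_GB = 10       # minimum free disk from the requirements table


def preflight(path: str = ".") -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []

    if sys.version_info < MIN_PYTHON:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python {found} found, 3.10+ expected")

    for tool in REQUIRED_TOOLS:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < MIN_FREE_GB:
        problems.append(f"only {free_gb:.1f} GB free, {MIN_FREE_GB} GB expected")

    return problems


if __name__ == "__main__":
    issues = preflight()
    print("Ready to install." if not issues else "\n".join(issues))
```

An empty result only means nothing obvious is missing; it does not validate versions of ffmpeg, Bun, or uv.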

Voice Cloning

Voice cloning uses zero-shot learning — it clones a voice from a clip as short as 3 seconds, without prior training on that voice. The default OmniVoice engine conditions a diffusion-based TTS model on the reference audio.

  • Go to the Voice Clone tab in the UI
  • Upload or record a reference clip of the target voice (as little as 3 seconds is enough)
  • Enter your text and select a target language (646 available)
  • Click Generate — output is saved to your project library
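
Since the 3-second figure is the floor for a usable reference, it can be worth checking a clip's length before uploading it. The sketch below does that for WAV files using only Python's standard library. It is an illustration, not something the app requires: the function names and the in-memory test clip are hypothetical, and other formats (MP3, M4A) would need a different reader.

```python
import io
import wave

MIN_REFERENCE_SECONDS = 3.0  # shortest clip length mentioned in the article


def wav_duration_seconds(source) -> float:
    """Duration of a WAV file, given a path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(source, minimum: float = MIN_REFERENCE_SECONDS) -> bool:
    """True when the clip meets the minimum reference length."""
    return wav_duration_seconds(source) >= minimum


def make_silent_wav(seconds: float, rate: int = 16000) -> io.BytesIO:
    """Build a silent 16-bit mono WAV in memory, for trying the check out."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(is_long_enough(make_silent_wav(4.0)))
    print(is_long_enough(make_silent_wav(1.0)))
```

Length is only one factor; a clean recording without background noise generally matters as much as duration.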

Voice Gallery: Search YouTube, browse categories, and download reference clips directly inside the app to build your voice library.

Video Dubbing

The full dubbing pipeline runs locally: transcribe → translate → synthesize → mux. Demucs isolates vocals so the original background audio is preserved in the final export.

  • Go to the Dub tab — paste a YouTube URL or upload a local file
  • WhisperX transcribes speech with word-level alignment
  • Select a target language; translation runs automatically
  • TTS engine re-voices the transcript; Demucs preserves background audio
  • Export the final MP4 with dubbed audio mixed in
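
The last stage, mux, is the easiest to picture concretely. As a rough sketch of what such a step could look like (the article does not describe how OmniVoice Studio invokes ffmpeg internally), the helper below builds an ffmpeg command that keeps the original video stream, mixes a separated background track with the synthesized speech, and writes an MP4. build_mux_command and all file names are hypothetical.

```python
import shlex


def build_mux_command(video: str, background: str, speech: str,
                      output: str) -> list[str]:
    """Build an ffmpeg command that replaces a video's audio with a mix.

    Inputs (all hypothetical file names):
      video      - original video, used for its picture only
      background - music/ambience track, e.g. from a source separator
      speech     - synthesized dubbed speech
    """
    return [
        "ffmpeg", "-y",
        "-i", video,        # input 0: video
        "-i", background,   # input 1: background audio
        "-i", speech,       # input 2: dubbed speech
        # Mix the two audio inputs into one stream labelled [mix].
        "-filter_complex", "[1:a][2:a]amix=inputs=2:duration=longest[mix]",
        "-map", "0:v",      # take the picture from the original video
        "-map", "[mix]",    # take the audio from the mix
        "-c:v", "copy",     # do not re-encode the video
        "-c:a", "aac",
        output,
    ]


if __name__ == "__main__":
    cmd = build_mux_command("talk.mp4", "background.wav", "dub_es.wav",
                            "talk_es.mp4")
    print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```

Copying the video stream avoids a re-encode, so this step is typically fast compared with transcription and synthesis.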

Batch Queue: Drop up to 50 videos and walk away. Each job has its own progress bar tracking through the full pipeline.

Dictation & Speaker Diarization

Dictation works system-wide from any application. Diarization identifies individual speakers in a multi-speaker audio file using Pyannote + WhisperX.

  • Press ⌘+⇧+Space (macOS) to open the floating dictation widget
  • Speech streams via WebSocket and auto-pastes into the active input field
  • Upload a multi-speaker file to the Diarization tab
  • Pyannote identifies who said what; each speaker gets an auto-extracted voice profile
  • Assign a TTS voice per speaker for per-speaker dubbing

A Hugging Face token is required for Pyannote diarization. See docs/setup/huggingface-token.md in the repo.
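
To make "who said what" more concrete: a diarizer produces speaker turns with start and end times, a transcriber produces words with timestamps, and the two are joined by time overlap. The sketch below shows that join in plain Python. It is a simplified illustration under assumed data shapes; Turn, Word, and assign_speakers are hypothetical names, not the Pyannote or WhisperX APIs.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarization segment: one speaker talking from start to end."""
    start: float
    end: float
    speaker: str


@dataclass
class Word:
    """A transcribed word with its aligned timestamps."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: list[Word], turns: list[Turn]) -> list[tuple[str, str]]:
    """Label each word with the speaker whose turn overlaps it the most."""
    labelled = []
    for word in words:
        best = max(
            turns,
            key=lambda t: overlap(word.start, word.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(word.start, word.end, best.start, best.end) == 0.0:
            labelled.append(("UNKNOWN", word.text))  # no turn covers this word
        else:
            labelled.append((best.speaker, word.text))
    return labelled


if __name__ == "__main__":
    turns = [Turn(0.0, 2.0, "SPEAKER_00"), Turn(2.0, 4.0, "SPEAKER_01")]
    words = [Word(0.5, 0.9, "hello"), Word(2.1, 2.5, "hi")]
    print(assign_speakers(words, turns))
```

Once every word carries a speaker label, grouping consecutive words by speaker gives the per-speaker segments that a different TTS voice can be assigned to.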

TTS Engines

Six TTS engines are built in. Switch via Settings → TTS Engine or the env var:
OMNIVOICE_TTS_BACKEND=cosyvoice

Engine                Languages         Clone    Platform
OmniVoice (default)   600+              ✓        CUDA / MPS / CPU
CosyVoice 3           9 + 18 dialects   ✓        CUDA / MPS / CPU
MLX-Audio             Multi             Varies   Apple Silicon only
VoxCPM2               30                ✓        CUDA / MPS / CPU
MOSS-TTS-Nano         20                ✓        CUDA / CPU
KittenTTS             English           ✗        CPU only
Custom engine: Subclass TTSBackend in backend/services/tts_backend.py and add it to _REGISTRY. This takes roughly 50 lines of Python.
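
The subclass-and-register pattern, together with the OMNIVOICE_TTS_BACKEND switch, can be sketched as follows. Only the names TTSBackend, _REGISTRY, and the environment variable come from the article. Everything else is an assumption for illustration: the real base class in backend/services/tts_backend.py almost certainly has a different interface, and synthesize, SilenceBackend, and get_backend are hypothetical.

```python
import os
from abc import ABC, abstractmethod


class TTSBackend(ABC):
    """Stand-in for the project's base class; the real interface may differ."""

    @abstractmethod
    def synthesize(self, text: str, language: str) -> bytes:
        """Return raw audio for the given text (hypothetical method)."""


class SilenceBackend(TTSBackend):
    """Toy engine that returns silence, just to show the moving parts."""

    sample_rate = 16000

    def synthesize(self, text: str, language: str) -> bytes:
        # 16-bit mono PCM: 50 ms of silence per input character.
        samples_per_char = self.sample_rate // 20
        return b"\x00\x00" * (samples_per_char * len(text))


# Name-to-class mapping; a custom engine is added here.
_REGISTRY: dict[str, type[TTSBackend]] = {"silence": SilenceBackend}


def get_backend(default: str = "silence") -> TTSBackend:
    """Pick the engine named by OMNIVOICE_TTS_BACKEND, else the default."""
    name = os.environ.get("OMNIVOICE_TTS_BACKEND", default)
    try:
        return _REGISTRY[name]()
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown TTS backend {name!r}; known: {known}") from None


if __name__ == "__main__":
    engine = get_backend()
    print(type(engine).__name__, len(engine.synthesize("hello", "en")), "bytes")
```

A registry keyed by name keeps engine selection in one place, so the settings UI and the environment variable can resolve to the same lookup.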

MCP Server & Resources

OmniVoice Studio ships a built-in MCP Server, exposing voice and dubbing capabilities to any MCP-compatible client — Claude, Cursor, or your own tooling — without opening the desktop UI.

  • The MCP server starts alongside the FastAPI backend when you run bun dev
  • Point your MCP client at the local server to access all endpoints
  • AudioSeal (Meta) embeds an invisible neural watermark in all generated audio for AI provenance
  • GitHub: github.com/debpalash/OmniVoice-Studio
  • Install docs: docs/install/ (macos / windows / linux / docker)
  • Troubleshooting: docs/install/troubleshooting.md
  • Discord: discord.gg/bzQavDfVV9


