Chatterbox Egyptian Arabic (Masri) TTS
Model Summary
Chatterbox Egyptian Arabic (Masri) TTS is a Text-to-Speech model built on top of the Chatterbox Multilingual TTS architecture and configured to generate Egyptian Arabic (Masri) speech.
The model supports:
- Egyptian Arabic text input (language_id = "ar")
- Natural prosody and conversational tone
- Optional reference audio prompting for speaker/style transfer
This repository contains the model checkpoints and assets required for inference.
Supported Language
- Arabic (ar)
- Intended usage: Egyptian Arabic (Masri) text
- Not optimized for Modern Standard Arabic (MSA) pronunciation
Intended Use
Primary Use Cases
- Egyptian Arabic voice synthesis
- Conversational agents and assistants
- Prototyping Arabic voice UX
- Content creation (narration, demos, accessibility)
- Research and experimentation in Arabic TTS
Out-of-Scope Uses
- Voice impersonation without consent
- Identity spoofing or deceptive content
- Legal, medical, or emergency-critical systems
- Applications that require guaranteed accent purity across all Arabic dialects
Inference Behavior
Input
- Text in Egyptian Arabic
- Maximum recommended length: ~300 characters
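Longer passages can be split before synthesis so that each call stays under the recommended length. The following is a minimal sketch, not part of the model's API: `split_text` and `MAX_CHARS` are hypothetical names, and treating ~300 characters as a hard limit is an assumption made for illustration.

```python
import re

MAX_CHARS = 300  # the recommended length above, treated here as a hard limit (assumption)

# Split on whitespace that follows sentence-ending punctuation,
# including the Arabic question mark.
_SENTENCE_END = re.compile(r"(?<=[.!?؟])\s+")


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        # A single over-long sentence falls back to word-level packing.
        # A single word longer than max_chars is still emitted as its own chunk.
        pieces = [sentence] if len(sentence) <= max_chars else sentence.split()
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to the model separately and the resulting waveforms concatenated. Prosody may still differ between chunks.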
Optional Reference Audio
- A short reference clip may be provided to influence:
  - Speaker identity
  - Voice style
  - Prosody
If the reference audio is not Egyptian Arabic, accent leakage may occur.
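Reference prompting might be wired in as sketched below. This assumes the model's `generate` method accepts an `audio_prompt_path` keyword, as the upstream Chatterbox models do; that has not been verified against this checkpoint. `build_generate_kwargs` is a hypothetical helper, and the default sampling values are copied from the usage example in this card.

```python
from pathlib import Path
from typing import Optional


def build_generate_kwargs(text: str, reference_wav: Optional[str] = None) -> dict:
    """Assemble keyword arguments for model.generate(), optionally with a reference clip."""
    kwargs = {
        "text": text,
        "language_id": "ar",
        "temperature": 0.8,
        "cfg_weight": 0.5,
        "exaggeration": 0.5,
    }
    if reference_wav is not None:
        path = Path(reference_wav)
        # Fail early with a clear message instead of inside the model call.
        if not path.is_file():
            raise FileNotFoundError(f"Reference clip not found: {path}")
        # Assumption: `audio_prompt_path` is the parameter name used upstream.
        kwargs["audio_prompt_path"] = str(path)
    return kwargs
```

With a loaded model, a call would look like `model.generate(**build_generate_kwargs(text, "masri_speaker.wav"))`, where `masri_speaker.wav` is a placeholder file name. An Egyptian Arabic reference clip reduces the risk of accent leakage.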
Example Usage
import torch
from huggingface_hub import snapshot_download
from chatterbox.mtl_tts import ChatterboxMultilingualTTS
# Select device
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
# Download model checkpoint
ckpt_dir = snapshot_download(
    repo_id="oddadmix/chatterbox-egyptian-v0",
    repo_type="model",
    revision="main",
)
# Load model
model = ChatterboxMultilingualTTS.from_checkpoint(
    str(ckpt_dir) + "/",
    DEVICE,
)
# Optional: move to device explicitly
if hasattr(model, "to"):
    model.to(DEVICE)
# Egyptian Arabic (Masri) text
text = "أنا رايح الشغل دلوقتي وهكلمك أول ما أوصل."
# Generate speech
wav = model.generate(
    text=text,
    language_id="ar",
    temperature=0.8,
    cfg_weight=0.5,
    exaggeration=0.5,
)
# Save output audio
import soundfile as sf
sf.write(
    "egyptian_tts.wav",
    wav.squeeze(0).cpu().numpy(),
    model.sr,
)
print("Audio saved as egyptian_tts.wav")
Limitations
- Accent transfer from reference audio can override dialect
- Long-form synthesis may lose prosodic consistency
- Not fine-tuned exclusively on Egyptian Arabic corpora
- No speaker identity guarantees
- Dialectal spelling variations affect pronunciation
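One way to reduce the effect of spelling variation is to map input text to a single preferred spelling before synthesis. The sketch below is illustrative only: `canonicalize` and the `SPELLING_VARIANTS` table are hypothetical, and the entries are examples rather than spellings the model is known to prefer. It matches whole whitespace-separated words, so a word with punctuation attached is left unchanged.

```python
# Hypothetical variant -> preferred spelling table. Replace the entries with
# spellings you have verified by listening to the model's output.
SPELLING_VARIANTS = {
    "دلوقت": "دلوقتي",
    "ازاي": "إزاي",
}

TATWEEL = "\u0640"  # purely typographic elongation character


def canonicalize(text: str, variants: dict = SPELLING_VARIANTS) -> str:
    """Remove tatweel, collapse whitespace, and map whole words to preferred spellings."""
    text = text.replace(TATWEEL, "")
    words = text.split()  # also collapses runs of whitespace
    return " ".join(variants.get(word, word) for word in words)
```

Keeping the table small and checked by ear is likely safer than broad automatic normalization, which can change pronunciation in unintended ways.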
Ethical Considerations
This model can generate realistic human-like speech.
Users must:
- Disclose synthetic audio where appropriate
- Obtain consent for reference voices
- Avoid misuse for impersonation or deception
Citation
If you use this model in research or demos, please cite:
@misc{chatterbox_egyptian_tts,
title={Chatterbox Egyptian Arabic (Masri) Text-to-Speech},
author={oddadmix},
year={2025},
howpublished={\url{https://huggingface.co/oddadmix/chatterbox-egyptian-v0}}
}
Base model: ResembleAI/chatterbox