Enhanced Restart Flow After New Voice Installation
Fixed issues causing delayed or incorrect restarts when adding new voices. The system now restarts cleanly and reliably after each addition.
Smarter Auto-Language Switching
Improved detection and switching logic for multilingual content, resulting in smoother transitions and fewer misclassifications.
Rehnuma Manager UI Upgrade
Polished interface for better usability and workflow.
Added support for registering other Piper-based English voices/models.
Significantly improved handling of single words and individual letters, making the system more reliable for TTS training, dictionary work, and educational use cases.
New Voice: Rehnuma Arfa (Female)
Added a natural-sounding female voice named Arfa to broaden the selection for users.
Miscellaneous
Various stability fixes, performance tuning, refactoring, and internal improvements.
Version 1.0.4
Release Date: 30 October 2025
Enabled ONNX Runtime graph optimizations
Set GraphOptimizationLevel to Level3 and enabled memory pattern reuse for inference sessions
Impact: 15–45% faster model inference; fewer allocations during repeated calls
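On the implementation side this is a small change on the session builder. A minimal sketch, assuming the ort Rust bindings for ONNX Runtime (builder method names vary across ort versions, and the model path here is illustrative):

```rust
use ort::{GraphOptimizationLevel, Session};

// Hypothetical loader; real voices come from the voice registry.
fn build_session() -> ort::Result<Session> {
    Session::builder()?
        // Level3 enables all basic, extended, and layout optimizations.
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        // Reuse memory allocation patterns across repeated inference calls.
        .with_memory_pattern(true)?
        .commit_from_file("voice.onnx")
}
```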
Added phonemization result caching
Thread-safe cache keyed by voice and processed text (includes Arabic diacritized text)
Impact: 50–90% lower latency for repeated or similar utterances; lower CPU usage
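The cache itself needs nothing beyond the standard library. A std-only sketch of the idea, keyed on (voice, processed text), with hypothetical type and method names; the real phonemizer output type may differ:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Key combines the voice id and the fully processed input text
/// (for Arabic, the diacritized form), so the same sentence rendered
/// by two voices caches independently.
type Key = (String, String);

#[derive(Clone, Default)]
struct PhonemeCache {
    inner: Arc<Mutex<HashMap<Key, Vec<i64>>>>,
}

impl PhonemeCache {
    /// Return cached phoneme ids, or run `compute` once and store the result.
    fn get_or_insert_with<F>(&self, voice: &str, text: &str, compute: F) -> Vec<i64>
    where
        F: FnOnce() -> Vec<i64>,
    {
        let key = (voice.to_string(), text.to_string());
        let mut map = self.inner.lock().unwrap();
        map.entry(key).or_insert_with(compute).clone()
    }
}
```

On a hit the phonemizer closure is never invoked, which is where the latency and CPU savings come from.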
Fixed exponential growth in real-time streaming chunk size
Replaced multiplicative chunk size update with constant base chunk size per stream
Impact: predictable latency, reduced memory usage, prevents performance degradation over time
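The fix replaces a per-chunk multiplier with a fixed base size. A std-only sketch of the corrected chunking logic (the constant mirrors the chunk_size=64 default; function names are illustrative):

```rust
/// Fixed base chunk size per stream. Previously the size was multiplied
/// after each chunk, so per-chunk memory and latency grew unboundedly
/// over the life of a stream.
const BASE_CHUNK: usize = 64;

/// Split a buffer of `total_samples` into constant-size chunks,
/// with a final short chunk for the remainder.
fn chunk_sizes(total_samples: usize) -> Vec<usize> {
    let mut sizes = Vec::new();
    let mut remaining = total_samples;
    while remaining > 0 {
        let n = remaining.min(BASE_CHUNK);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}
```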
Enabled GPU acceleration on Windows via DirectML
ONNX Runtime provider ordering: DirectML (preferred) → CPU fallback
Impact: large speedups on compatible GPUs without configuration changes
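The ordering is expressed as a prioritized execution-provider list on the session builder. A sketch, again assuming the ort bindings (provider registration APIs differ between ort versions):

```rust
use ort::execution_providers::{CPUExecutionProvider, DirectMLExecutionProvider};
use ort::Session;

// Hypothetical loader mirroring the DirectML-first ordering.
fn build_session() -> ort::Result<Session> {
    Session::builder()?
        // Providers are tried in order: DirectML when a compatible GPU
        // and runtime are present, otherwise silent fallback to CPU.
        .with_execution_providers([
            DirectMLExecutionProvider::default().build(),
            CPUExecutionProvider::default().build(),
        ])?
        .commit_from_file("voice.onnx")
}
```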
Tuned real-time streaming parameters
Updated defaults: chunk_size=64, chunk_padding=2
Impact: lower initial latency while maintaining stream smoothness
Increased gRPC channel buffer sizes
mpsc channel capacity raised from 512 to 1024 for both synthesis endpoints
Impact: reduced backpressure and blocking under bursty load; smoother streaming
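With tokio's bounded mpsc channels, the capacity is the single argument to `channel`. A sketch; the chunk type stands in for the generated protobuf response type:

```rust
use tokio::sync::mpsc;

/// Illustrative stand-in for the generated protobuf audio-chunk type.
struct AudioChunk {
    samples: Vec<f32>,
}

fn make_stream_channel() -> (mpsc::Sender<AudioChunk>, mpsc::Receiver<AudioChunk>) {
    // 1024 slots absorb bursts of synthesized chunks before the client
    // drains them; at the old capacity of 512, senders blocked under load.
    mpsc::channel(1024)
}
```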
Configured gRPC server for high concurrency
Set concurrency_limit_per_connection=1024, max_concurrent_streams=1000, tcp_keepalive=60s, tcp_nodelay=true
Impact: improved throughput and responsiveness under concurrent client load
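Assuming a tonic-based gRPC server, these settings map directly onto the transport builder; `svc` and `addr` are placeholders, and only the tuning calls are the point of the sketch:

```rust
use std::time::Duration;
use tonic::transport::Server;

// `svc` and `addr` stand in for the synthesis service and bind address.
Server::builder()
    .concurrency_limit_per_connection(1024)        // per-connection request cap
    .max_concurrent_streams(Some(1000))            // HTTP/2 stream limit
    .tcp_keepalive(Some(Duration::from_secs(60)))  // detect dead peers
    .tcp_nodelay(true)                             // disable Nagle for low latency
    .add_service(svc)
    .serve(addr)
    .await?;
```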
Right-sized synthesis thread pool
Reduced from num_cpus × 4 to min(num_cpus × 2, 16), with a floor of 2
Impact: less context switching, more stable latency, better CPU utilization
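The sizing rule is easy to state in code. A sketch with an illustrative function name; the real pool presumably obtains the logical CPU count from the num_cpus crate:

```rust
/// Synthesis worker count: twice the logical CPUs, capped at 16, with a
/// floor of 2. The old 4x-CPUs pool oversubscribed cores and caused
/// excess context switching under concurrent load.
fn synthesis_threads(num_cpus: usize) -> usize {
    (num_cpus * 2).min(16).max(2)
}
```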