Rehnuma Awaz Changelog



Version 1.1.2

Release Date: 08 January, 2026

  • Enhancements:

    Improved Urdu character recognition
    Special characters and diacritics such as Superscript Alef (ٰ), Do-chashmi Heh (ھ), and other Urdu-specific marks are now recognized as Urdu text instead of falling back to “unknown” or “Arabic letter” tokens.

    Numeric handling update
    Arabic numerals (0–9) are now automatically treated as part of Urdu text, so numbers embedded in Urdu sentences are pronounced consistently by the TTS engine.

    Impact:
    TTS output for Urdu text with diacritics and embedded numbers is now more accurate and natural.
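
As a rough illustration of the recognition and numeric handling described above, the sketch below classifies characters by Unicode code point. It assumes that any code point in the Unicode Arabic block (U+0600 to U+06FF, which contains both Superscript Alef U+0670 and Do-chashmi Heh U+06BE) and any ASCII digit counts as Urdu. The names `is_urdu_char` and `segment` are hypothetical; the actual detection logic in Rehnuma Awaz is not shown in this changelog.

```python
# Hypothetical sketch: the names below are illustrative, not Rehnuma Awaz APIs.

ARABIC_BLOCK = range(0x0600, 0x0700)  # Unicode Arabic block, U+0600 to U+06FF


def is_urdu_char(ch):
    """Assumed rule: Arabic-block code points and ASCII digits count as Urdu."""
    return ord(ch) in ARABIC_BLOCK or "0" <= ch <= "9"


def segment(text):
    """Split text into (label, substring) runs; whitespace stays with the current run."""
    runs = []
    for ch in text:
        if ch.isspace() and runs:
            label = runs[-1][0]  # whitespace never starts a new run
        else:
            label = "urdu" if is_urdu_char(ch) else "other"
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1] + ch)
        else:
            runs.append((label, ch))
    return runs


print(is_urdu_char("\u0670"))  # Superscript Alef
print(is_urdu_char("\u06BE"))  # Do-chashmi Heh
print(segment("abc \u0679\u06BE\u06CC\u06A9 5"))
```

Under this assumed rule, a digit that follows Urdu text stays in the same Urdu run rather than being split off into a separate token.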

Version 1.1.0

Release Date: 27 November, 2025

  • Enhanced Restart Flow After New Voice Installation
    Fixed issues causing delayed or incorrect restarts when adding new voices. The system now restarts cleanly and reliably after each addition.
  • Smarter Auto-Language Switching
    Improved detection and switching logic for multilingual content, resulting in smoother transitions and fewer misclassifications.

Rehnuma Manager UI Upgrade

  • Polished interface for better usability and workflow.
  • Added support for registering other Piper-based English voices/models.
  • Introduced quick voice selection directly inside the Rehnuma Voice Manager panel.

Voice Improvements

  • Rehnuma Akber
    Significantly improved handling of single words and individual letters, making it more reliable for TTS training, dictionary work, and educational use cases.
  • New Voice: Rehnuma Arfa (Female)
    Added a natural-sounding female voice named Arfa to broaden the selection for users.

Miscellaneous

  • Various stability fixes, performance tuning, refactoring, and internal improvements.

Version 1.0.4

Release Date: 30 October, 2025

  • Enabled ONNX Runtime graph optimizations
    Set GraphOptimizationLevel to Level3 and enabled memory pattern reuse for inference sessions
    Impact: 15–45% faster model inference; fewer allocations during repeated calls
  • Added phonemization result caching
    Thread-safe cache keyed by voice and processed text (includes Arabic diacritized text)
    Impact: 50–90% lower latency for repeated or similar utterances; lower CPU usage
  • Fixed exponential growth in real-time streaming chunk size
    Replaced multiplicative chunk size update with constant base chunk size per stream
    Impact: predictable latency, reduced memory usage, no performance degradation over time
  • Enabled GPU acceleration on Windows via DirectML
    ONNX Runtime provider ordering: DirectML (preferred) → CPU fallback
    Impact: large speedups on compatible GPUs without configuration changes
  • Tuned real-time streaming parameters
    Updated defaults: chunk_size=64, chunk_padding=2
    Impact: lower initial latency while maintaining stream smoothness
  • Increased gRPC channel buffer sizes
    mpsc channel capacity raised from 512 to 1024 for both synthesis endpoints
    Impact: reduced backpressure and blocking under bursty load; smoother streaming
  • Configured gRPC server for high concurrency
    Set concurrency_limit_per_connection=1024, max_concurrent_streams=1000, tcp_keepalive=60s, tcp_nodelay=true
    Impact: improved throughput and responsiveness under concurrent client load
  • Right-sized synthesis thread pool
    Reduced from num_cpus * 4 to min(num_cpus * 2, 16), with a floor of 2
    Impact: less context switching, more stable latency, better CPU utilization
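
The phonemization caching entry above describes a thread-safe cache keyed by voice and processed text. A minimal sketch of that idea follows, assuming a lock-protected dictionary. The class name `PhonemeCache` and the `phonemize` callback are hypothetical, and the real cache may differ (for example in eviction policy, which the changelog does not describe).

```python
import threading


class PhonemeCache:
    """Hypothetical thread-safe cache keyed by (voice, processed_text)."""

    def __init__(self, phonemize):
        self._phonemize = phonemize  # assumed callback: (voice, text) -> phonemes
        self._lock = threading.Lock()
        self._entries = {}

    def get(self, voice, text):
        key = (voice, text)
        with self._lock:
            if key in self._entries:
                return self._entries[key]
        # Compute outside the lock so slow phonemization does not block readers.
        result = self._phonemize(voice, text)
        with self._lock:
            # If another thread stored a value first, keep and return that one.
            return self._entries.setdefault(key, result)


calls = []


def fake_phonemize(voice, text):
    """Stand-in for a real phonemizer; records each call."""
    calls.append((voice, text))
    return text.upper()


cache = PhonemeCache(fake_phonemize)
cache.get("akber", "salam")
cache.get("akber", "salam")  # served from the cache
cache.get("arfa", "salam")   # different voice, separate entry
print(len(calls))
```

Keying on the voice as well as the text keeps phonemes produced for one voice from being reused for another.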
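
To picture the streaming chunk size fix above, the sketch below compares a multiplicative update with a constant base size. The growth factor of 2 is an assumption chosen for illustration (the changelog does not state the original factor), and both function names are hypothetical.

```python
def multiplicative_chunk_sizes(base, factor, count):
    """Old behaviour (assumed shape): each chunk is `factor` times the previous one."""
    sizes = []
    size = base
    for _ in range(count):
        sizes.append(size)
        size *= factor
    return sizes


def constant_chunk_sizes(base, count):
    """New behaviour: every chunk in a stream uses the same base size."""
    return [base] * count


print(multiplicative_chunk_sizes(64, 2, 5))  # [64, 128, 256, 512, 1024]
print(constant_chunk_sizes(64, 5))           # [64, 64, 64, 64, 64]
```

With a constant size, the amount of audio buffered per chunk no longer depends on how long the stream has been running.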
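
The thread pool sizing rule above can be sketched as follows, reading the entry as "twice the CPU count, capped at 16, and never fewer than 2". The function name `synthesis_thread_count` is hypothetical, and `os.cpu_count()` stands in for whatever CPU detection the service actually uses.

```python
import os


def synthesis_thread_count(num_cpus=None):
    """Assumed rule: min(num_cpus * 2, 16), with a floor of 2."""
    if num_cpus is None:
        num_cpus = os.cpu_count() or 1  # cpu_count() can return None
    return max(2, min(num_cpus * 2, 16))


for cpus in (1, 4, 32):
    old = cpus * 4  # previous sizing: num_cpus * 4
    print(cpus, old, synthesis_thread_count(cpus))
```

On a 32-core machine the previous rule would request 128 threads, while the capped rule requests 16.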