Abstract: In spoken scenarios, achieving personalized and controllable zero-shot spontaneous style speech synthesis is highly significant, particularly in generating natural and expressive speech for ...
Abstract: Automatic detection of synthetic speech is becoming increasingly important as current synthesis methods are both near indistinguishable from human speech and widely accessible to the public.
Kokoro 82M is an 82-million-parameter text-to-speech model that beats many TTS APIs while running locally on CPUs, including ...