What is your favorite Text To Speech Solution

kwannie · January 31

For those who use DAZ to create original animations with a bit of dialogue between characters, which Text To Speech platform do you find works best? I used the TextAloud reader from NextUp and purchased several of the voice bundles, and it worked fine until I tried to install the voices in Windows 10; I couldn't really get it to function after that. The Ivona voices were very natural sounding back then but still couldn't use any expressive SSML tags, that I know of.

I know the most obvious solution today is to use an online AI voice generator, but I really don't like using online platforms and paying for plans with limits of a few hundred words. Are there any TTS readers and voices that can be run on a local host and that allow SSML tags? I have tried Balabolka, but very few SSML tags work with it that I am aware of.

Any information would be appreciated.

wolf359 · February 1

I have tried Balabolka as well and it's way too robotic.

Honestly there are no really good local install solutions IMHO, particularly any that can add emotional inflection.

I did buy a one year subscription to
https://voicemaker.in/
to make my last full length animated film, but I have since let it lapse until I have another major movie project planned

Here are some of the “voicemaker” voices in the film

I have also used this service, which has nice free voices with a 250-character limit per generation, but you can just break up your dialog into chunks of 250 characters or fewer and connect the generated clips in Audacity or similar
https://app.artflow.ai/my-creations
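
For anyone who would rather not count characters by hand, here is a minimal sketch of that chunking step in Python. It is only an illustration: the function name `split_dialog` is made up for this post, and the 250 default just mirrors the limit mentioned above. It breaks at sentence ends where it can and falls back to word boundaries for a single overlong sentence.

```python
import re


def split_dialog(text, limit=250):
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries, falls back to
    word boundaries, and hard-cuts only when a single word is too long.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A sentence longer than the limit is cut at the last space that fits.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no usable space: hard cut
            piece, sentence = sentence[:cut], sentence[cut:].lstrip()
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece)
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the open chunk
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can be pasted into the generator on its own, and the resulting clips joined in order afterwards.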

Here is a sample of one of the free Artflow.ai voices

WendyLuvsCatz · February 1

I have been using some AI based ones lately

had one I used the other day that I ran on my own PC and could train on a minute of voice recording but I seem to have accidentally deleted it and forgot what it was called

I am now questioning if I am going demented

to be fair I don't think it had a logical sounding name and I haven't apparently saved the download zip either

FirstBastion · February 1

I've used Replica a few times. It's not cheap but you do get an actual performance. https://www.replicastudios.com/voice-library

WendyLuvsCatz · February 1

oh I am not going mad, it was one of the many Visions of Chaos machine learning options

not a standalone

XTTS

anyway I cloned my own voice as well as my late mother's

don't really think either sounds like us but as unique text to speech voices they are not bad

there is a Hugging Face Demo https://huggingface.co/spaces/coqui/xtts

and a GitHub repository https://github.com/coqui-ai/TTS
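
For those who do want to try the repository directly, a rough sketch of voice cloning with XTTS through the Coqui Python API might look like the following. Treat it as an assumption-laden example rather than a recipe: it assumes the package was installed with `pip install TTS`, that the XTTS v2 model name below is still the one published by the project, and the wrapper name `clone_and_speak` and the file names in the comment are invented for illustration. The model is downloaded the first time it is loaded.

```python
def clone_and_speak(text, speaker_wav, out_path, language="en"):
    """Speak `text` in the voice sampled in `speaker_wav` and save a WAV file.

    Hypothetical wrapper around the Coqui TTS Python API.
    """
    # Imported here so a script using this helper still loads
    # when the TTS package is not installed.
    from TTS.api import TTS

    # Assumed model name for XTTS v2; check the repository for the current one.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,  # short reference recording of the target voice
        language=language,
        file_path=out_path,
    )
    return out_path


# Example call (file names are placeholders):
# clone_and_speak("Where did you hide the map?", "my_voice.wav", "line_01.wav")
```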

I like VOC as it downloads the files and models, creates the Python environment, user interface and everything for me as I am a dummy

kwannie · February 1

I came across EmotiVoice - Multi Voice TTS Engine on YouTube, and it looks like it is a local installation with many voices and emotive capabilities. It seems to require some coding skills, which I am behind the power curve on, so I have not dared give it a go. If anybody knows anything about it I'd love to know if it works well. (I placed a link below.)

I am not really looking for perfect AI quality. I just want to find an engine that can use SSML tags with some of the mainstream voices like Ivona, Nuance, Acapela and the like: tags like pitch, speed and volume, and in the past I have seen the (emph tag), which allows you to emphasize parts of a word. Generally, phonetic dictionaries let you define how a word is pronounced. It seems like a few years ago there were several TTS engines that could do this.

http://www.youtube.com/watch?v=jcj-KDPB_-o
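
To make the kind of markup discussed above concrete, here is a small sketch that builds an SSML snippet with the standard `<prosody>` (pitch, rate, volume) and `<emphasis>` elements using only the Python standard library. The helper name `build_ssml` and its default values are made up for this example, and whether a given engine or voice actually honors each tag is an open question that varies by product. A strict SSML document also carries version and language attributes on `<speak>`, which are left out here for brevity.

```python
import xml.etree.ElementTree as ET


def build_ssml(before, stressed, after, pitch="+10%", rate="slow", volume="loud"):
    """Return an SSML string with one emphasized phrase inside a prosody block.

    Hypothetical helper; the attribute values are ordinary SSML prosody values.
    """
    speak = ET.Element("speak")
    prosody = ET.SubElement(
        speak, "prosody", {"pitch": pitch, "rate": rate, "volume": volume}
    )
    prosody.text = before  # text spoken before the emphasized phrase
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = stressed
    emphasis.tail = after  # text spoken after the emphasized phrase
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("I ", "really", " mean it.")
print(ssml)
```

Building the markup with an XML library rather than by hand keeps the tags balanced and escapes characters such as `&` in the dialogue text.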

TheMysteryIsThePoint · February 1

Eleven Labs is amazing. Does speech-to-speech as well.

wsterdan · February 1

TheMysteryIsThePoint said:

Eleven Labs is amazing. Does speech-to-speech as well.

Agreed, very good option. I also use play.ht quite a bit.

FirstBastion · February 1

wsterdan said:

TheMysteryIsThePoint said:

Eleven Labs is amazing. Does speech-to-speech as well.

Agreed, very good option. I also use play.ht quite a bit.

I just tried out Eleven Labs, and you're right, it's quite impressive.

kwannie · February 2

Has anybody continued using the local host TTS platforms? The original question was about finding emotive TTS solutions that are installed and run on a computer as opposed to an AI solution that requires a renewable service. I know that just a few years ago some companies were marketing realistic voices that could be installed on a computer and had SSML tag capability. They were so expensive at the time that I largely ignored them, but I can't seem to find them anymore. That is why I was wondering if anybody else has come across that type of TTS product.

wolf359 · February 2

The original question was about finding emotive TTS solutions that are installed and run on a computer as opposed to an AI solution that requires a renewable service. I know that just a few years ago some companies were marketing realistic voices that could be installed on a computer that had SSML tag capability. They were so expensive at the time that I largely ignored them,

It is likely that TTS companies no longer see selling expensive one-off licenses as a viable business model when they can just set up a subscription-based AI service online (and they are probably farming the actual generation processing out to a third party at that).
I know you say you do not personally care about high quality, but the utter lack of believable machine-generated voices has been the bane of us solo animators for decades. In short order, AI voice creation has become far superior to any of the local install TTS software in terms of quality and multilingual vocal diversity, so I would not hold out any hope of new local TTS apps arriving on the scene in the future.

WendyLuvsCatz · February 3

well the one I use with Visions of Chaos is local

I also use RVC to change wav and mp3 files to trained voices

it does singing voices too and isolates vocals and music

WillowRaven · February 3

I use Eleven Labs. Both free and paid options.

kwannie · February 3

Wendy, what was the name of the software you were talking about that does singing? I'm somewhat confused about what Visions of Chaos and RVC are; could you explain a bit?

WendyLuvsCatz · February 3

https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main

is the one that can swap voices

it's a self contained zip installer

VOC is software for hosting apps like Automatic 1111 and various other machine learning GUIs including Audiocraft, Zeroscope, Large Language Models etc

its advantage for me is it will download and install all the prerequisites for you and also has disk management for removing stuff you no longer want

I linked the Hugging Face demo of the TTS software I was using earlier in this thread and its GitHub repository if you want to set up your own Python environment

VOC does all that for me

https://softology.pro/voc.htm
