What is your favorite Text To Speech Solution?

For those of you who use DAZ to create original animations with a bit of dialogue between characters, which Text To Speech platform do you find works best? I used the TextAloud reader from NextUp and purchased several of the voice bundles. It worked fine until I tried to install the voices in Windows 10, and I couldn't really get it to function after that. The Ivona voices were very natural sounding back then, but still couldn't use any expressive SSML tags that I know of.

I know the most obvious solution today is to use an online AI voice generator, but I really don't like using online platforms and paying for plans with limits of a few hundred words. Are there any TTS readers and voices that can be run on a local host and that allow SSML tags? I have tried Balabolka, but very few SSML tags work with it that I am aware of.

Any information would be appreciated.

Comments

  • wolf359 Posts: 3,828

    I have tried Balabolka as well, and it's way too robotic.


    Honestly, there are no really good local install solutions IMHO, particularly any that can add emotional inflection.

    I did buy a one year subscription to
    https://voicemaker.in/
    to make my last full-length animated film, but I have since let it lapse until I have another major movie project planned.

    Here are some of the “voicemaker” voices in the film

    I have also used this service, which has nice free voices with a 250-character limit per generation, but you can just break up your dialog into 250-character chunks and join the generated clips in Audacity or similar:
    https://app.artflow.ai/my-creations
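
    The splitting step can be scripted rather than done by hand. This is only a rough Python sketch, assuming the limit counts characters including spaces and that no single word is longer than the limit; the function name `split_dialog` is made up for illustration and is not part of any service mentioned here.

```python
import re


def split_dialog(text, limit=250):
    """Split text into chunks of at most `limit` characters.

    Sentence boundaries are preferred; a sentence that is itself longer
    than the limit falls back to word-by-word packing. A single word
    longer than the limit is kept whole (assumed not to occur in dialog).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Keep short sentences intact; break overly long ones into words.
        pieces = [sentence] if len(sentence) <= limit else sentence.split()
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

    Each returned chunk can be pasted into the generator in order, and the resulting clips joined afterwards in Audacity as described above.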


    Here is a sample of one of the free Artflow.ai voices

     

     

  • WendyLuvsCatz Posts: 38,226
    edited February 1

    I have been using some AI based ones lately

    had one I used the other day that I ran on my own PC and could train on a minute of voice recording but I seem to have accidentally deleted it and forgot what it was called

    I am now questioning if I am going demented

    to be fair I don't think it had a logical sounding name and I haven't apparently saved the download zip either

  • FirstBastion Posts: 7,762

    I've used Replica a few times.  It's not cheap  but you do get an actual  performance. https://www.replicastudios.com/voice-library

  • WendyLuvsCatz Posts: 38,226
    edited February 1

    oh I am not going mad, it was one of the many Visions of Chaos machine learning options

    not a standalone

    XTTS

    anyway I cloned my own voice as well as my late mother's

    don't really think either sounds like us but as unique text to speech voices they are not bad

    there is a Hugging Face Demo https://huggingface.co/spaces/coqui/xtts

    and a GitHub repository https://github.com/coqui-ai/TTS
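
    For anyone setting this up without VOC, a minimal sketch of calling XTTS from Python follows. It assumes the Coqui `TTS` package from that repository is installed and uses the model name given in the project's README; the function name and file paths are placeholders, and the first call downloads the model weights.

```python
def speak_with_cloned_voice(text, reference_wav, out_path="line.wav", language="en"):
    """Generate speech in the voice of a short reference recording.

    `reference_wav` is a path to a clean sample of the target voice
    (hypothetical file; supply your own recording).
    """
    # Imported lazily so this file still loads where Coqui TTS is not installed.
    from TTS.api import TTS

    # Model name as listed in the Coqui TTS README (assumed still current).
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

    A call would look like `speak_with_cloned_voice("Hello there.", "my_voice_sample.wav")`, where the sample file name is a placeholder.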

    I like VOC as it downloads the files and models, creates the Python environment, user interface and everything for me as I am a dummy

  • kwannie Posts: 865

    I came across EmotiVoice - Multi Voice TTS Engine on YouTube, and it looks like it is a local installation with many voices and emotive capabilities. It seems to require some coding skills, which I am behind the power curve on, so I have not dared give it a go. If anybody knows anything about it, I'd love to know if it works well. (I placed a link below.)

    I am not really looking for perfect AI quality. I just want to find an engine that can use SSML tags with some of the mainstream voices like Ivona, Nuance, Acapela and the like: tags like pitch, speed and volume. In the past I have also seen the (emph tag), which lets you emphasize parts of a word. Phonetic dictionaries generally let you define how a word is pronounced. It seems like a few years ago there were several TTS engines that could do this.
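
    To make the tags concrete, here is a small Python sketch that builds an SSML snippet using the standard `prosody` element (pitch, rate, volume) and the `emphasis` element from the W3C SSML specification. The helper name `build_ssml` and the sample values are only illustrative, and whether a given engine or voice honors each attribute varies, so treat this as a starting point rather than something guaranteed to work with Ivona, Nuance or Acapela voices.

```python
import xml.etree.ElementTree as ET


def build_ssml(before, stressed, after, pitch="+10%", rate="slow", volume="loud"):
    """Return an SSML string with one prosody span and one emphasized word."""
    speak = ET.Element(
        "speak",
        {
            "version": "1.1",
            "xmlns": "http://www.w3.org/2001/10/synthesis",
            "xml:lang": "en-US",
        },
    )
    # prosody controls pitch, speaking rate and volume for everything inside it.
    prosody = ET.SubElement(
        speak, "prosody", {"pitch": pitch, "rate": rate, "volume": volume}
    )
    prosody.text = before
    # emphasis marks the part of the line that should be stressed.
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = stressed
    emphasis.tail = after
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("I ", "really", " mean it.")
print(ssml)
```

    The printed string can be pasted into any reader that accepts SSML input to check which of the tags the installed voice actually respects.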

     

    http://www.youtube.com/watch?v=jcj-KDPB_-o

  • Eleven Labs is amazing. Does speech-to-speech as well.

  • wsterdan Posts: 2,349

    TheMysteryIsThePoint said:

    Eleven Labs is amazing. Does speech-to-speech as well.

    Agreed, very good option. I also use play.ht quite a bit. 

  • FirstBastion Posts: 7,762

    wsterdan said:

    TheMysteryIsThePoint said:

    Eleven Labs is amazing. Does speech-to-speech as well.

    Agreed, very good option. I also use play.ht quite a bit. 

    I just tried out Eleven Labs, and you're right, it's quite impressive.

  • kwannie Posts: 865

    Has anybody continued using the local host TTS platforms? The original question was about finding emotive TTS solutions that are installed and run on a computer, as opposed to an AI solution that requires a renewable service. I know that just a few years ago some companies were marketing realistic voices that could be installed on a computer and that had SSML tag capability. They were so expensive at the time that I largely ignored them, but I can't seem to find them anymore. That is why I was wondering if anybody else has come across that type of TTS genre.

  • wolf359 Posts: 3,828
    edited February 3

    The original question was about finding emotive TTS solutions that are installed and run on a computer as opposed to an AI  solution that requires a renewable service. I know that just a few years ago some companies were marketing realistic voices that could be installed on a computer that had SSML tag capability. They were so expensive at the time that I largely ignored them,

     

    It is likely that TTS companies no longer see selling expensive one-off licenses as a viable business model when they can just set up a subscription-based AI service online (and they are probably farming the actual generation processing out to a third party at that).
    I know you say you do not personally care about high quality, but the utter lack of believable machine-generated voices has been the bane of us solo animators for decades. In short order, AI voice creation has become far better than any of the local-install TTS software in terms of quality and multilingual vocal diversity, so I would not hold out any hope of new local TTS apps arriving on the scene in the future.

  • WendyLuvsCatz Posts: 38,226

    well the one I use with Visions of Chaos is local

    I also use RVC to change wav and mp3 files to trained voices

    it does singing voices too and isolates vocals and music

  • WillowRaven Posts: 3,787

    I use Eleven Labs. Both free and paid options.

  • kwannie Posts: 865

    Wendy, what was the name of the software you were talking about that does singing? I'm somewhat confused about what Visions of Chaos and RVC are. Could you explain a bit?

  • WendyLuvsCatz Posts: 38,226

    https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main

    is the one that can swap voices

    it's a self contained zip installer


    VOC is software for hosting apps like Automatic 1111 and various other machine learning GUIs, including Audiocraft, Zeroscope, Large Language Models etc

    its advantage for me is it will download and install all the prerequisites for you and also has disk management for removing stuff you no longer want

    I linked the Hugging Face demo of the TTS software I was using earlier in this thread and its GitHub repository if you want to set up your own Python environment

    VOC does all that for me

    https://softology.pro/voc.htm
