Categories
News

🗣️ New German “Thorsten” TTS model released 🎉

YEARS of passion for open voice tech,
MONTHS of recording sessions,
WEEKS of compute time for training,
DAYS of audio optimization,
HOURS of disillusion.

All for that ONE MOMENT, to share the next generation of the open “Thorsten-Voice” with the community!

This model is based on a completely new, freshly recorded and optimized voice dataset (Thorsten-22.05-neutral).

It was trained using Coqui 🐸 TTS (for all “TTS insiders”: it’s a VITS model).

tl;dr

- pip install tts==0.7.1
- tts-server --model_name tts_models/de/thorsten/vits
- Open your web browser at http://localhost:5002
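
Once the server from the steps above is running, the voice can also be requested from a script instead of the browser. The following is only a minimal sketch: it assumes the server answers GET requests on `/api/tts` with a `text` query parameter and returns WAV audio, and the helper names `build_tts_url` and `synthesize_to_file` as well as the output file name are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_tts_url(text, base_url="http://localhost:5002"):
    """Return the URL that asks the local tts-server to speak `text`."""
    # Assumption: the server exposes GET /api/tts with a `text` query parameter.
    return f"{base_url}/api/tts?{urlencode({'text': text})}"


def synthesize_to_file(text, out_path="thorsten.wav", base_url="http://localhost:5002"):
    """Fetch synthesized audio from a running tts-server and save it to out_path."""
    # Assumption: the response body is the raw WAV data.
    with urlopen(build_tts_url(text, base_url), timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as wav_file:
        wav_file.write(audio)
    return out_path
```

With the server running, calling `synthesize_to_file("Ich bin jetzt bereit.")` should leave a `thorsten.wav` file in the current directory.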

Just have fun 🗣️🎉😄

Dominik & Thorsten


“Thorsten” samples from Mycroft skills

Dominik and I are still experimenting to provide a new version of the “Thorsten” voice for use with Mycroft installations.

This is the current “work-in-progress” state
(thx Olaf for supporting us with compute power for the HiFi-GAN training).

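“Bitte warte einen Moment, bis ich fertig mit dem booten bin.” (“Please wait a moment until I have finished booting.”)
“Ich bin jetzt bereit.” (“I am ready now.”)
“Ich verstehe das nicht, aber ich lerne jeden Tag neue Dinge.” (“I don’t understand that, but I learn new things every day.”)
“Es ist im Moment klarer Himmel bei 18 Grad.” (“Right now the sky is clear at 18 degrees.”)
“Mein Name ist Mycroft und ich bin funky.” (“My name is Mycroft and I am funky.”)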

Audio samples of next “Thorsten” voice model

After I (again) invested months of my free time in audio recordings (this time with a good microphone and recording setup) and Dominik applied his “audio magic”, things really got started for both of us.

We have tried (and are still trying) various configurations, but we want to share our current result with you.

  • More than 12,000 mono audio recordings made by me at a sample rate of 22 kHz
  • Trained with Coqui TTS (0.5.0)
  • Tacotron2 DDC (TTS model)
  • HiFi-GAN (vocoder) – Thanks Olaf, for supporting us with compute power.
  • Lots of love 🙂

Of course, speech can still be generated offline with this “Thorsten” model, and the model is available free of charge under the CC0 license.

But how does it sound?

Info about “Berlin” (Source: Wikipedia)

There is no release date yet for the model and the underlying dataset, as the “fine tuning” work is still ongoing. However, we are closer to the goal than to the beginning :-).

We would appreciate your feedback on the current status of the model, either via the contact form or by email to tm@thorsten-voice.de.

Interested in open voice technology? Take a look at my YouTube channel on the topic.