Adding new voices #5
Comments
I could not get through the maths and the papers, so I am going to do this with MaryTTS for its system first, and then come back to RHVoice once I know a little more about all of it. =) https://github.com/marytts/marytts/wiki/New-Language-Support
Any updates regarding MaryTTS? If not, I am really interested in learning and using MaryTTS for our case. I personally know someone who is familiar with MaryTTS development, so I hope to get some help from him. I should also have a chance to contact the developers of the system directly.
@YauhenMinsk what do you think: could we find the resources to prepare a proposal for an H2020 grant, so that we could work full time with the MaryTTS developers and RHVoice on support for the Belarusian language? This work requires more free time than I have to spare, so it may be more effective to spend that time on the proposal.
I can't say I'm very familiar with H2020. What are the requirements?
For H2020, the requirements are to write a proposal explaining our idea, find an appropriate call, and submit it. Then we wait and/or find three partners in the EU to participate in it with us. Some progress on creating new voices: Alexander Severin pointed me to the source sound data for the existing voices, which is available from http://tiflo.info/rhvoice/
Made some progress by analysing the VoxForge files for German. Learned a bit about cepstral analysis from this awesome presentation: http://www.speech.cs.cmu.edu/15-492/slides/03_mfcc.pdf The concept of the spectral envelope is also pretty amazing, but I don't see yet how it helps with creating new voices. There is also good news. Nickolay provided the critical insight that splitting the audio data into chunks by phrase is enough for training, and that more fine-grained markup of the text is not needed. This fine-grained markup is called phonetic segmentation and is done automatically.
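For anyone else reading along, here is a rough sketch of the cepstrum and spectral envelope idea from those slides, as far as I understand it. It is only an illustration: the function names (`real_cepstrum`, `spectral_envelope`) and the test frame are made up for this example, it uses a naive pure-Python DFT instead of an FFT, and it skips the mel warping that MFCCs add. It is not taken from RHVoice, MaryTTS, or HTS.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, the counterpart of dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def log_magnitude(frame):
    """Log of the magnitude spectrum; the small constant avoids log(0)."""
    return [math.log(abs(v) + 1e-12) for v in dft(frame)]


def real_cepstrum(frame):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    return [c.real for c in idft(log_magnitude(frame))]


def spectral_envelope(frame, n_keep):
    """Smoothed log spectrum built from the lowest n_keep cepstral coefficients.

    Zeroing the high-quefrency coefficients (liftering) removes the fine
    detail of the log spectrum and leaves its slowly varying shape.
    """
    ceps = real_cepstrum(frame)
    n = len(ceps)
    # The cepstrum of a real frame is symmetric, so keep both mirrored ends.
    liftered = [
        c if (i < n_keep or i > n - n_keep) else 0.0
        for i, c in enumerate(ceps)
    ]
    return [v.real for v in dft(liftered)]


# Hypothetical test frame: a damped resonance, 64 samples long.
frame = [0.9 ** t * math.cos(0.7 * t) for t in range(64)]
envelope = spectral_envelope(frame, n_keep=8)
```

Keeping fewer coefficients gives a smoother envelope, and keeping all of them gives back the original log spectrum. My guess is that this compact envelope description is the kind of feature a statistical voice model gets trained on, but that is an assumption on my side.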
Nice. Btw, the tool for working with audio signals that is used in the presentation is Praat [http://www.fon.hum.uva.nl/praat/]
I am starting from http://hts.sp.nitech.ac.jp/?cmd=read&page=Tutorial&word=tutorial