Adding new voices #5

abitrolly · 2015-10-14T07:01:51Z

I'm starting from http://hts.sp.nitech.ac.jp/?cmd=read&page=Tutorial&word=tutorial

abitrolly · 2015-10-21T14:28:06Z

I couldn't get through the maths and papers, so I'm going to try this with MaryTTS first and then come back to RHVoice once I know a little more about all of it. =)

https://github.com/marytts/marytts/wiki/New-Language-Support

yauhen-info · 2015-10-25T10:55:51Z

Any updates regarding MaryTTS? If not, I'm really interested in learning MaryTTS and using it for our case. I think I personally know someone who is familiar with MaryTTS development, so I hope to get some help from him. I should also have a chance to contact the developers of the system directly.

abitrolly · 2015-10-28T17:27:32Z

@YauhenMinsk what do you think: could we find the resources to prepare a proposal for an H2020 grant, so that we could work full time with the MaryTTS developers and RHVoice to get support for the Belarusian language? Since this requires more free time than I have, it may be more effective to spend that time on the proposal.

yauhen-info · 2015-10-28T20:00:24Z

I can't say I'm very familiar with H2020. What are the requirements?
For the time being, I think we need to estimate the amount of work and continue the conversation only after we have set expectations, which requires a roadmap.

abitrolly · 2015-10-29T12:53:24Z

H2020: the requirements are to write a proposal explaining our idea, find an appropriate call, and submit it. Then wait for and/or find three partners in the EU to participate in it with us.

Some progress on creating new voices: Alexander Severin pointed me to the source sound data for the existing voices, which is available from http://tiflo.info/rhvoice/

abitrolly · 2015-12-12T19:47:02Z

Made some progress by analysing the VoxForge files for German. Learned a bit about cepstral analysis from this awesome presentation: http://www.speech.cs.cmu.edu/15-492/slides/03_mfcc.pdf The concept of the spectral envelope is also pretty amazing. But I don't yet see how it helps with creating new voices.
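For anyone else trying to picture the spectral envelope, here is a rough sketch in plain Python. It is not taken from the slides or from RHVoice. It computes the real cepstrum of one frame (the inverse DFT of the log magnitude spectrum), zeroes the high-quefrency part, and transforms back to get a smoothed log spectrum. The function names, the `n_keep` parameter, and the naive O(n^2) DFT are assumptions made for illustration; real tools use an FFT and windowing.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short frames only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def real_cepstrum(frame):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    # The small constant avoids log(0) on empty frequency bins.
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(frame)]
    return [c.real for c in idft(log_mag)]


def spectral_envelope(frame, n_keep):
    """Smoothed log spectrum: keep only the lowest n_keep quefrencies.

    n_keep is a hypothetical tuning knob, not a value from the slides.
    """
    n = len(frame)
    cep = real_cepstrum(frame)
    # The cepstrum of a real frame is symmetric, so keep both ends.
    liftered = [
        c if (q < n_keep or q > n - n_keep) else 0.0
        for q, c in enumerate(cep)
    ]
    return [v.real for v in dft(liftered)]
```

With `n_keep` equal to the frame length nothing is removed and the log spectrum comes back unchanged; with small values only the slowly varying shape remains, which is the envelope.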

There is also good news. Nickolay provided the critical insight that splitting the audio data into chunks by phrase is enough for training, and that more fine-grained markup of the text is not needed. This fine-grained markup is called phonetic segmentation and is done automatically.
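One possible way to get phrase-level chunks is to cut the recording at long pauses. The sketch below is only an assumption about how that could be done, not what Nickolay or RHVoice actually use: `split_on_silence`, the amplitude `threshold`, and the `min_silence` duration are hypothetical names and values, and the samples are assumed to be floats in the range -1.0 to 1.0.

```python
def split_on_silence(samples, rate, threshold=0.02, min_silence=0.3):
    """Return (start, end) sample index pairs for the non-silent chunks.

    A chunk ends when no sample reaches `threshold` for longer than
    `min_silence` seconds. `end` is exclusive.
    """
    min_gap = int(min_silence * rate)
    chunks = []
    start = None      # first loud sample of the current chunk
    last_loud = None  # most recent loud sample seen so far
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            elif i - last_loud > min_gap:
                # The pause was long enough: close the chunk, open a new one.
                chunks.append((start, last_loud + 1))
                start = i
            last_loud = i
    if start is not None:
        chunks.append((start, last_loud + 1))
    return chunks
```

Each returned pair could then be matched with the corresponding phrase of the transcript by hand or by order of appearance.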

yauhen-info · 2015-12-12T20:24:07Z

Nice. Btw, the tool for working with audio signals that is used in the presentation is Praat [http://www.fon.hum.uva.nl/praat/]
