
Not only the whole story, but also which character is currently speaking, what place and mood they are in, whether a line is sarcasm or irony, and many more aspects.

However, in my opinion it would be a huge benefit if this kind of metadata were put into the ebook file in some way, so that it could be extracted instead of having to be detected. I think it would be enough to give each character an ID and tag a gender and a mood in the book together with the quoted speech, so that you could use different speech models, and therefore different voices, for different characters.
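To make that a bit more concrete, here is a rough sketch of what such tagging could look like and how a reader app might use it. The JSON layout, the field names (`characters`, `segments`, `speaker`, `mood`) and the voice names are all made up for illustration; as far as I know nothing like this is part of EPUB or any other ebook standard.

```python
import json

# Hypothetical annotation layout: none of these field names come from EPUB
# or any other real standard; they only illustrate the idea.
BOOK_METADATA = json.loads("""
{
  "characters": {
    "c1": {"name": "Alice", "gender": "female"},
    "c2": {"name": "Bob", "gender": "male"}
  },
  "segments": [
    {"speaker": null, "mood": "neutral", "text": "The door creaked open."},
    {"speaker": "c1", "mood": "sarcastic", "text": "Oh, great timing."},
    {"speaker": "c2", "mood": "nervous", "text": "I can explain."}
  ]
}
""")

# Hypothetical mapping from character ID to a speech model / voice name.
# The key None stands for narration (JSON null) with no tagged speaker.
VOICES = {None: "narrator", "c1": "voice_a", "c2": "voice_b"}


def plan_synthesis(metadata, voices, default_voice="narrator"):
    """Return (voice, mood, text) triples, one per tagged segment."""
    plan = []
    for seg in metadata["segments"]:
        # Unknown speakers fall back to the default voice.
        voice = voices.get(seg["speaker"], default_voice)
        plan.append((voice, seg["mood"], seg["text"]))
    return plan


for voice, mood, text in plan_synthesis(BOOK_METADATA, VOICES):
    print(f"[{voice} / {mood}] {text}")
```

The point is only that with explicit speaker and mood tags, picking a voice becomes a dictionary lookup instead of a detection problem.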

I wrote a little tool called voicebuilder (which I will open source next year). It's a "sentence splitter" that can extract an LJSpeech training dataset from an audio file and the matching epub file using length matching. It works pretty accurately for now, although the extracted model still needs manual polish. Still way faster than doing the whole thing by hand.
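For anyone wondering what that looks like in principle, here is a minimal sketch. It is not voicebuilder's code; it assumes that "length matching" means giving each sentence a share of the audio proportional to its character count, which is only one possible reading. The function names and the `clip` prefix are invented. The only fixed part is the LJSpeech `metadata.csv` row format: `id|transcription|normalized transcription`.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def estimate_spans(sentences, audio_seconds):
    """Give each sentence a (start, end) span proportional to its length.

    Assumption: speaking time scales with character count. Real audio
    needs proper alignment or manual correction on top of this.
    """
    total_chars = sum(len(s) for s in sentences)
    spans, cursor = [], 0.0
    for s in sentences:
        duration = audio_seconds * len(s) / total_chars
        spans.append((cursor, cursor + duration))
        cursor += duration
    return spans


def ljspeech_rows(sentences, prefix="clip"):
    """Format LJSpeech metadata.csv rows: id|text|normalized text."""
    rows = []
    for i, s in enumerate(sentences, start=1):
        clean = s.replace("|", " ")  # the pipe is the field separator
        # No real text normalization here; the raw text is reused as-is.
        rows.append(f"{prefix}-{i:04d}|{clean}|{clean}")
    return rows


chapter = "It was late. Who would call at this hour? She picked up anyway."
sentences = split_sentences(chapter)
spans = estimate_spans(sentences, audio_seconds=6.0)
for row, (start, end) in zip(ljspeech_rows(sentences), spans):
    print(f"{start:5.2f}-{end:5.2f}s  {row}")
```

Each span would then be cut out of the audio file and saved under the matching ID, which is exactly the step where manual polish is still needed.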

This way you can build speech sets of your favorite narrators, and although you would never be allowed to publish them, I think they are great for private use!


