ETRO VUB

ETRO Events

A list of events ETRO is organizing or participating in.

PhD Defense
High-Quality Personalized Text-to-Speech Synthesis for Belgian Standard Dutch

Presenter

Mr Lukas Latacz

Abstract

Human speech is quite diverse: there exist more than 7000 languages in the world, even more regional language variants, and humans apply – partly unconsciously – different styles of speaking according to the situation in which the speech is being used.
In many applications the direct use of human speech is too inconvenient or too costly to be economically feasible, and synthetic speech is used instead. Synthetic speech is typically generated from an input text. People with communicative disabilities often depend on synthetic speech, for example in a speaking device for people who can no longer speak properly or in a reading device for people with reduced eyesight or dyslexia.
The number of commercially available synthetic voices is still rather small. This is especially true for medium-sized languages such as Dutch and for specific language variants such as Belgian standard Dutch (also sometimes referred to as Flemish). Each synthetic voice is typically spoken in a neutral speaking style, which is not appropriate in every situation.
This thesis focuses on how to build more appropriate high-quality synthetic voices without requiring significant effort and expert knowledge from the voice-builder. The lack of high-quality Belgian standard Dutch synthetic voices available for research purposes inspired us to construct a new high-quality speech synthesizer at the Vrije Universiteit Brussel, the DSSP synthesizer, able to synthesize Belgian standard Dutch and English speech.
This work is structured into three main parts. The first part describes how the recordings of a speaker are used to create new synthetic voices. Our synthesizer supports the two dominant speech synthesis techniques: unit selection synthesis and statistical parametric synthesis. The latter uses a flexible statistical parametric model of speech, but sounds less natural than unit selection synthesis, which selects small speech units from the recordings and concatenates their waveforms. The second part describes the language-specific aspects of our Belgian standard Dutch synthetic voices and how the speaking style of a speaker can be captured by modeling speaker-specific pronunciations, prosodic phrase breaks, silences, accented words and prominent syllables. Finally, we look at some use cases of the DSSP synthesizer to create personalized high-quality synthetic speech.

Logistics

Date: 14.10.2015

Time: 16:30 - 18:30

Location: Room D.2.01, Building D

Contact

ETRO Department

Tel: +32 2 629 29 30

©2024 • Vrije Universiteit Brussel • ETRO Dept. • Pleinlaan 2 • 1050 Brussels • Tel: +32 2 629 2930 (secretariat) • Fax: +32 2 629 2883