Existing multilingual text-to-speech (TTS) systems can be switched between several language modes, but in general each language is handled by an independent subsystem and synthesized with a language-specific voice. These systems are therefore not usable for mixed-lingual texts, that is, texts in one base language that contain expressions from other languages. Multilingual TTS systems must consequently be extended to truly mixed-lingual ones, which will be called polyglot systems hereafter.
The possibility of synthesizing a single voice speaking the four most important languages in Switzerland (French, German, Italian, and English) has already been investigated and demonstrated by TIK/ETHZ (cf. project POSSY/TTS'99).
The aim of this project was to identify and solve the problems of another major part of a polyglot TTS system, namely the word and sentence analysis and phonetic transcription of mixed-lingual texts. The first phase of the project investigated the frequency and the syntactic nature of foreign expressions in French and German texts as well as the degree of assimilation of the pronunciation of foreign expressions to the base language.
The second phase of the project was devoted to the question of how a lexical and syntactic analyzer for mixed-lingual texts can be built using the linguistic knowledge of existing text analyzers for French, German, English and Italian. The result of the project is a prototype mixed-lingual text-analysis module which will be used in the polyglot version of our SVOX TTS system.
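To make the notion of a mixed-lingual text concrete, the following minimal sketch tags each word of a sentence with a language and groups consecutive words into language spans. It is an illustration only and not the analyzer developed in the project: it relies on plain lexicon lookup, whereas the project combines the lexical and syntactic knowledge of full analyzers. The toy lexicons and the names `LEXICONS`, `tag_languages` and `group_spans` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicons (assumption); a real system would draw on the
# full lexicons and grammars of the monolingual text analyzers.
LEXICONS: Dict[str, Set[str]] = {
    "de": {"er", "hat", "das", "heruntergeladen", "und", "gelesen"},
    "en": {"update", "manual", "the"},
    "fr": {"rendezvous", "le"},
}


def tag_languages(
    tokens: List[str],
    base_lang: str,
    lexicons: Dict[str, Set[str]] = LEXICONS,
) -> List[Tuple[str, str]]:
    """Assign a language code to every token.

    A token found in the base-language lexicon keeps the base language.
    Otherwise the first other language (in alphabetical order) whose
    lexicon contains it is used; unknown tokens default to the base language.
    """
    tagged = []
    for token in tokens:
        word = token.lower()
        if word in lexicons[base_lang]:
            lang = base_lang
        else:
            candidates = (
                code
                for code in sorted(lexicons)
                if code != base_lang and word in lexicons[code]
            )
            lang = next(candidates, base_lang)
        tagged.append((token, lang))
    return tagged


def group_spans(tagged: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Merge consecutive tokens with the same language into spans."""
    spans: List[Tuple[str, List[str]]] = []
    for token, lang in tagged:
        if spans and spans[-1][0] == lang:
            spans[-1][1].append(token)
        else:
            spans.append((lang, [token]))
    return spans


if __name__ == "__main__":
    sentence = "Er hat das Update heruntergeladen".split()
    for lang, words in group_spans(tag_languages(sentence, base_lang="de")):
        print(lang, " ".join(words))
```

Word-by-word lookup of this kind cannot resolve words that exist in several languages, nor foreign stems carrying base-language inflections, which is one reason a syntactic analysis across language boundaries is needed.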
This project required expertise in both speech synthesis and multilingual text analysis and was therefore defined as a joint project of LATL/UNIGE (specialists in multilingual text analysis) and TIK/ETHZ (specialists in speech synthesis). The achievements of the project are summarized in report [PW02].
Supported by: The financial support for this project came mainly from the Swiss National Science Foundation.
In collaboration with: This was a collaboration project of the University of Geneva (LATL) and ETH Zürich.