Technical Program

Oral and Poster Sessions

Tuesday, September 13

Keynote Session 1 (KN1)

Tuesday, September 13, 9:30 - 10:30

Large-scale finite element simulations of the physics of voice

Oriol Guasch

Oral Session 1

Tuesday, September 13

Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features

Mahsa Sadat Elyasi Langarani, Jan van Santen

Synthesising Filled Pauses: Representation and Datamixing

Rasmus Dall, Marcus Tomalin, Mirjam Wester

Emphasis recreation for TTS using intonation atoms

Pierre-Edouard Honnet, Philip N. Garner

Prediction of Emotions from Text using Sentiment Analysis for Expressive Speech Synthesis

Eva Vanmassenhove, João P. Cabral, Fasih Haider

Poster Session 1

Tuesday, September 13

Non-filter waveform generation from cepstrum using spectral phase reconstruction

Yasuhiro Hamada, Nobutaka Ono, Shigeki Sagayama

Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis

Alexandros Lazaridis, Milos Cernak, Pierre-Edouard Honnet, Philip N. Garner

Multidimensional scaling of systems in the Voice Conversion Challenge 2016

Mirjam Wester, Zhizheng Wu, Junichi Yamagishi

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity

Dong-Yan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian,
Shaofei Zhang, Chuang Ding, Mei Li, Quy Hy Nguyen, Minghui Dong, Haizhou Li

Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring

Yusuke Tajiri, Tomoki Toda

Prosodic and Spectral iVectors for Expressive Speech Synthesis

Igor Jauk, Antonio Bonafonte

Development of a statistical parametric synthesis system for operatic singing in German

Michael Pucher, Fernando Villavicencio, Junichi Yamagishi

DNN-based Speech Synthesis for Indian Languages from ASCII text

Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King

Experiments with Cross-lingual Systems for Synthesis of Code-Mixed Text

Sunayana Sitaram, Sai Krishna Rallabandi, Shruti Rijhwani, Alan W. Black

Jerk Minimization for Acoustic-To-Articulatory Inversion

Avni Rajpal, Hemant A. Patil

How to select a good voice for TTS

Sunhee Kim

WikiSpeech – enabling open source text-to-speech for Wikipedia

John Andersson, Sebastian Berlin, André Costa, Harald Berthelsen, Hanna Lindgren,
Nikolaj Lindberg, Jonas Beskow, Jens Edlund, Joakim Gustafson

Wednesday, September 14

Keynote Session 2 (KN2)

Wednesday, September 14, 9:30 - 10:30

Siri’s voice gets deep learning

Alex Acero

Oral Session 2

Wednesday, September 14

Parallel and cascaded deep neural networks for text-to-speech synthesis

Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi

Temporal modeling in neural network based statistical parametric speech synthesis

Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku

Multi-output RNN-LSTM for multiple speaker speech synthesis with α-interpolation model

Santiago Pascual, Antonio Bonafonte

A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora

Xin Wang, Shinji Takaki, Junichi Yamagishi

Poster Session 2

Wednesday, September 14

Non-intrusive Quality Assessment of Synthesized Speech using Spectral Features and Support Vector Regression

Meet H. Soni, Hemant A. Patil

Novel Pre-processing using Outlier Removal in Voice Conversion

Sushant V. Rao, Nirmesh J Shah, Hemant A. Patil

Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform

Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech

Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi

Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis

Shinji Takaki, SangJin Kim, Junichi Yamagishi

Mandarin Prosodic Phrase Prediction based on Syntactic Trees

Zhengchen Zhang, Fuxiang Wu, Chenyu Yang, Minghui Dong, Fugen Zhou

Investigating Very Deep Highway Networks for Parametric Speech Synthesis

Xin Wang, Shinji Takaki, Junichi Yamagishi

Contextual Representation using Recurrent Neural Network Hidden State for Statistical Parametric Speech Synthesis

Sivanand Achanta, Rambabu Banoth, Ayushi Pandey, Anandaswarup Vadapalli, Suryakanth V Gangashetty

Wide Passband Design for Cosine-Modulated Filter Banks in Sinusoidal Speech Synthesis

Nobuyuki Nishizawa, Tomonori Yazaki

Utterance Selection Techniques for TTS Systems Using Found Speech

Pallavi Baljekar, Alan W. Black

Open-Source Consumer-Grade Indic Text To Speech

Andrew Wilkinson, Alok Parlikar, Sunayana Sitaram, Tim White, Alan W. Black, Suresh Bazaj

On the impact of phoneme alignment in DNN-based speech synthesis

Mei Li, Zhizheng Wu, Lei Xie

Merlin: An Open Source Neural Network Speech Synthesis System

Zhizheng Wu, Oliver Watts, Simon King

Thursday, September 15

Keynote Session 3 (KN3)

Thursday, September 15, 9:30 - 10:30

End-to-end Learning for Text and Speech

Quoc V. Le

Oral Session 3

Thursday, September 15

A hybrid harmonics-and-bursts modelling approach to speech synthesis

Jonas Beskow, Harald Berthelsen

A Pulse Model in Log-domain for a Uniform Synthesizer

Gilles Degottex, Pierre Lanchantin, Mark Gales

Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis

Hideki Kawahara, Yannis Agiomyrgiannakis, Heiga Zen

Wideband Harmonic Model: Alignment and Noise Modeling for High Quality Speech Synthesis

Slava Shechtman, Alex Sorin

Latest news

Keynote talks

We are pleased to announce Oriol Guasch, Alex Acero, and Quoc V. Le as keynote speakers for SSW9.

August 26th 2016

Presentation Guidelines

The poster and oral presentation guidelines have been published.

August 4th 2016

Call for Demos

If you would like to give a demo presentation, please see the Call for Demos.

July 25th 2016

Oral and Poster Sessions

The oral and poster sessions have been published.

July 22nd 2016

Accepted Papers

The list of accepted papers has been published.
The deadline for camera-ready papers is July 26th.

July 15th 2016

Registration

The registration system is open. Early registration ends on August 10th.

July 9th 2016

International Speech Communication Association.

SynSIG: promoting the study of Speech Synthesis

9th ISCA Speech Synthesis Workshop

September 13th - 15th, 2016, Sunnyvale, CA, USA
