Information Technology Society within VDE (ITG) (Ed.)

ITG-Fb. 267: Speech Communication

12. ITG-Fachtagung Sprachkommunikation, 5–7 October 2016 in Paderborn

ITG-Fachberichte

2016, 394 pages, Slimlinebox, CD-ROM
ISBN 978-3-8007-4275-2
Personal VDE Members are entitled to a 10% discount on this title

Foreword

Held every two years, this conference has grown into the largest scientific meeting on the machine processing of spoken language in the German-speaking region. It also has international visibility, underlined by English as the conference language and by the publication of the manuscripts via IEEE Xplore.

This year's conference comprises sessions on the following topics:
• Iterative Algorithms for Enhancement and Recognition – Machine Learning for Speech Enhancement
• Selected and Emerging Topics in Speech Processing
• Speech Processing for Ear-Mounted Devices
• Quality Evaluation
• Speech Enhancement in Dynamic Acoustic Scenarios
• Efficient Modeling in ASR
• Speech and Diagnostics

The last of these topics, Speech and Diagnostics, is a new focus that underlines the growing importance of speech technology for medical applications.

The ITG is the national association of all those working in information technology in industry, administration, teaching, research, and science. Its goals are to promote the scientific and technical advancement and the assessment of information technology in theory and practice. Founded in 1954 as the Nachrichtentechnische Gesellschaft, it is the oldest technical society within the VDE.

1

EXIT Charts for Turbo Automatic Speech Recognition: A Case Study

Authors:
Lohrenz, Timo; Receveur, Simon; Fingscheidt, Tim

2

Introducing Block-Wise Processing into Turbo Viterbi ASR

Authors:
Receveur, Simon; Lohrenz, Timo; Fingscheidt, Tim

3

Noise-Presence-Probability-Based Noise PSD Estimation by Using DNNs

Authors:
Chinaev, Aleksej; Heymann, Jahn; Drude, Lukas; Haeb-Umbach, Reinhold

4

Iterative Harmonic Speech Enhancement

Authors:
Stahl, Johannes; Mowlaee, Pejman

5

Factor Graph Decoding for Speech Presence Probability Estimation

Authors:
Glarner, Thomas; Momenzadeh, Mohammad Mahdi; Drude, Lukas; Haeb-Umbach, Reinhold

6

New Insights into Turbo-Decoding-Based AVSR with Dynamic Stream Weights

Authors:
Gergen, Sebastian; Zeiler, Steffen; Abdelaziz, Ahmed Hussen; Kolossa, Dorothea

7

Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering

Authors:
Boenninghoff, Benedikt T.; Nickel, Robert M.; Zeiler, Steffen; Kolossa, Dorothea

8

9

10

11

General Detection of Speech Signals in the Time-Frequency Plane

Authors:
Urrigshardt, Sebastian; Kreuzer, Sebastian; Kurth, Frank

12

13

Head-Orientation-Based Device Selection: Are You Talking to Me?

Authors:
Müller, Menno; Par, Steven van de; Bitzer, Joerg

14

Voice Activity Detection Based on Modulation-Phase Differences

Authors:
Graf, Simon; Herbig, Tobias; Buck, Markus; Schmidt, Gerhard

15

A Method to Analyze the Spatial Response of Informed Spatial Filters

Authors:
Chakrabarty, Soumitro; Thiergart, Oliver; Habets, Emanuel A. P.

16

17

18

19

“Listen, Follow me”: The Transformational Leadership Corpus (TLC)

Authors:
Hsu, Chia-Chun; Krajewski, Jarek; Felfe, Joerg; Mrnka, Joachim; Wiggerich, Andre; Schnieder, Sebastian

20

Towards Opaque Audio Features for Privacy in Acoustic Sensor Networks

Authors:
Nelus, Alexandru; Gergen, Sebastian; Taghia, Jalal; Martin, Rainer

21

The Fraunhofer IAIS Audio Mining System: Current State and Future Directions

Authors:
Schmidt, Christoph; Stadtschnitzer, Michael; Koehler, Joachim

22

Personalized News Event Retrieval for Small Talk in Social Dialog Systems

Authors:
Bechberger, Lucas; Schmidt, Maria; Waibel, Alex; Federico, Marcello

23

Using Tweets as "Ice-Breaking" Sentences in a Social Dialog System

Authors:
Andonov, Aleksandar; Schmidt, Maria; Niehues, Jan; Waibel, Alex

24

25

26

A Model-Based Placement Strategy for a Nearby External Microphone for Speech Enhancement in Hearing Aids

Authors:
Yee, Dianna; Kamkar-Parsi, Homayoun; Martin, Rainer; Puder, Henning

27

On the Use of Beamforming Approaches for Binaural Speaker Localization

Authors:
Zohourian, Mehdi; Enzner, Gerald; Martin, Rainer

28

29

Development of a Sound Coding Strategy based on a Deep Recurrent Neural Network for Monaural Source Separation in Cochlear Implants

Authors:
Nogueira, Waldo; Gajecki, Tom; Krueger, Benjamin; Janer, Jordi; Buechner, Andreas

30

On the Impact of Quantization on Binaural MVDR Beamforming

Authors:
Amini, Jamal; Hendriks, Richard C.; Heusdens, Richard; Guo, Meng; Jensen, Jesper

31

A Robust Null-Steering Beamformer for Acoustic Feedback Cancellation for a Multi-Microphone Earpiece

Authors:
Schepker, Henning; Tran, Linh T. T.; Nordholm, Sven; Doclo, Simon

32

33

Non-Intrusive Estimation Model for the Speech-Quality Dimension Loudness

Authors:
Koester, Friedemann; Cercos-Llombart, Victor; Mittag, Gabriel; Moeller, Sebastian

34

Predicting the quality of processed speech by combining modulation-based features and model trees

Authors:
Cauchi, Benjamin; Santos, Joao F.; Siedenburg, Kai; Falk, Tiago H.; Naylor, Patrick A.; Doclo, Simon; Goetze, Stefan

35

36

Objective Assessment of Artificial Speech Bandwidth Extension Approaches

Authors:
Abel, Johannes; Kaniewska, Magdalena; Guillaume, Cyril; Tirry, Wouter; Fingscheidt, Tim

37

38

39

Towards VoIP quality testing with real-life devices and degradations

Authors:
Soloducha, Michal; Raake, Alexander; Kettler, Frank; Rohrer, Nils; Parotat, Eva; Waeltermann, Marcel; Trevisany, Sven; Voigt, Peter

40

41

Emotion Intelligibility within Codec-Compressed and Reduced Bandwidth Speech

Authors:
Siegert, Ingo; Lotz, Alicia Flores; Maruschke, Michael; Jokisch, Oliver; Wendemuth, Andreas

42

Voice and Speech Assessment From Telephone Recordings Using Prosodic Analysis Based on μ-Law-Companded Features

Authors:
Haderlein, Tino; Schuetzenberger, Anne; Doellinger, Michael; Noeth, Elmar

43

Evaluation of Communication Systems for Full-Face Firefighter Masks

Authors:
Brodersen, Michael; Jüngling, Thorben Moritz; Schmidt, Gerhard

44

A Bag-of-Audio-Words Approach for Snore Sounds’ Excitation Localisation

Authors:
Schmitt, Maximilian; Janott, Christoph; Pandit, Vedhas; Qian, Kun; Heiser, Clemens; Hemmert, Werner; Schuller, Bjoern

45

Wavelet-Based Time-Frequency Representations for Automatic Recognition of Emotions from Speech

Authors:
Vasquez-Correa, J. C.; Arias-Vergara, T.; Orozco-Arroyave, J. R.; Vargas-Bonilla, J. F.; Noeth, E.

46

47

Parkinson-Speech Analysis: Methods and Aims

Authors:
Baasch, Christin; Schmidt, Gerhard; Heute, Ulrich; Nebel, Adelheid; Deuschl, Günther

48

Large Sleepy Reading Corpus (LSRC): Applying Read Speech for Detecting Sleepiness

Authors:
Krajewski, Jarek; Schnieder, Sebastian; Monschau, Christopher; Titt, Raphael; Sommer, David; Golz, Martin

49

An Analysis of Perplexity to Reveal the Effects of Alzheimer’s Disease on Language

Authors:
Wankerl, Sebastian; Noeth, Elmar; Evert, Stefan

50

Gender-dependent GMM-UBM for tracking Parkinson’s disease progression from speech

Authors:
Arias-Vergara, Tomas; Vasquez-Correa, Juan Camilo; Orozco-Arroyave, Juan Rafael; Vargas-Bonilla, Jesus Francisco; Haderlein, Tino; Noeth, Elmar

51

Towards Cross-lingual Automatic Diagnosis of Autism Spectrum Condition in Children’s Voices

Authors:
Schmitt, Maximilian; Marchi, Erik; Ringeval, Fabien; Schuller, Bjoern

52

53

Non-invasive photoglottography for use in the lab and the field

Authors:
Suthau, Eike; Birkholz, Peter; Mainka, Alexander; Simpson, Adrian P.

54

55

Time Domain Approach for Listening Enhancement in Noisy Environments

Authors:
Niermann, Markus; Thierfeld, Christian; Jax, Peter; Vary, Peter

56

Multiframe Echo Suppression Based on Orthogonal Signal Decompositions

Authors:
Huang, Hai; Hofmann, Christian; Kellermann, Walter; Chen, Jingdong; Benesty, Jacob

57

Combined Single-Microphone Wiener and MVDR Filtering based on Speech Interframe Correlations and Speech Presence Probability

Authors:
Fischer, Doerte; Doclo, Simon; Habets, Emanuel A. P.; Gerkmann, Timo

58

A Priori SNR Estimation Using Weibull Mixture Model

Authors:
Chinaev, Aleksej; Heitkaemper, Jens; Haeb-Umbach, Reinhold

59

60

Kurtosis-Controlled Babble Noise Suppression

Authors:
Graf, Simon; Herbig, Tobias; Buck, Markus; Schmidt, Gerhard

61

Combined Linear and Nonlinear Residual Echo Suppression Using a Deficient Distortion Model – A Proof of Concept

Authors:
Schalk-Schupp, Ingo; Faubel, Friedrich; Buck, Markus; Wendemuth, Andreas

62

63

Spectral Envelope Statistics for Source Modeling in Speech Enhancement

Authors:
Das, Sneha; Craciun, Alexandra; Jaehnel, Tobias; Baeckstroem, Tom

64

A Practical Beamformer-Postfilter System for Microphone Arrays on Seat Belts

Authors:
Krini, Mohammed; Mirza, Zafar Baig; Rodemer, Klaus

65

66

Acoustic Feedback Compensation with Reverb-based Stepsize Control for In-Car Communication Systems

Authors:
Bulling, Philipp; Linhard, Klaus; Wolf, Arthur; Schmidt, Gerhard

67

Noise Reduction in the Time Domain Using ARMA Filtering

Authors:
Heese, Florian; Steinbiss, Richard; Jax, Peter; Vary, Peter

68

Robust Online Multi-Channel Speech Recognition

Authors:
Kitza, Markus; Zeyer, Albert; Schlueter, Ralf; Heymann, Jahn; Haeb-Umbach, Reinhold

69

70

Language Feature Vectors for Resource Constraint Speech Recognition

Authors:
Mueller, Markus; Stueker, Sebastian; Waibel, Alex

71

Uncertainty Decoding Using a Sampling Strategy Based on the Eigenvalue Decomposition

Authors:
Huemmer, Christian; Stadter, Philipp; Kellermann, Walter

72

Growing a Deep Neural Network Acoustic Model with Singular Value Decomposition

Authors:
Kilgour, Kevin; Tseyzer, Igor; Nguyen, Thai Son; Stueker, Sebastian; Waibel, Alex

73

74

Phoneme Boundary Detection using Deep Bidirectional LSTMs

Authors:
Franke, Joerg; Mueller, Markus; Hamlaoui, Fatima; Stueker, Sebastian; Waibel, Alex

75

Training Deep Neural Networks for Reverberation Robust Speech Recognition

Authors:
Ritter, Marvin; Mueller, Markus; Stueker, Sebastian; Metze, Florian; Waibel, Alex