The Fraunhofer IAIS Audio Mining System: Current State and Future Directions

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
5 to 7 October 2016 in Paderborn, Germany

Proceedings: ITG-Fb. 267: Speech Communication

Pages: 5
Language: English
Type: PDF

Schmidt, Christoph; Stadtschnitzer, Michael; Koehler, Joachim (Fraunhofer IAIS, Schloss Birlinghoven, 53757 Sankt Augustin, Germany)

Archivists, journalists and content hosts often face the problem of dealing with vast amounts of audio-visual data. These media files are usually accompanied by only sparse metadata, such as title and topic, and search algorithms can often search only this metadata. Consequently, metadata has to be annotated manually, or the content cannot be found in the archive later. The Fraunhofer IAIS Audio Mining System alleviates this issue by providing state-of-the-art multimedia analytics that enable full-text as well as keyword search based on the spoken words. It thus opens up archive content that has no manual annotations. In this paper, we give a detailed description of the current state of the system, including its analysis algorithms and user interface. We also outline future directions of development for the system, which is currently in productive use in the public media industry.