Hebbian Principal Component Clustering for Information Retrieval on a Crowdsourcing Platform
Conference: NDES 2012 - Nonlinear Dynamics of Electronic Systems
07/11/2012 - 07/13/2012 at Wolfenbüttel, Germany
Proceedings: NDES 2012
Pages: 4
Language: English
Type: PDF
Niederberger, Thomas; Stoop, Norbert; Ott, Thomas (ZHAW Zurich University of Applied Sciences, Switzerland)
Christen, Markus (University of Zurich, Switzerland & University of Notre Dame, USA)
Crowdsourcing, a distributed process in which tasks are outsourced to a network of people, is increasingly used by companies to generate solutions to problems of various kinds. Thousands of people thereby contribute a large amount of text data, which must be structured while ideas are still being generated in order to avoid repetitions and to maximize the solution space. This is a hard information retrieval problem because the texts are very short and have little predefined structure. We present a solution that involves three steps: text data preprocessing, clustering, and visualization. In this contribution, we focus on clustering and visualization and present a Hebbian network approach that learns the principal components of the data while the data set continuously grows in size. We compare our approach to standard clustering applications and demonstrate its superiority with respect to classification reliability on a real-world example.
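To illustrate how a Hebbian unit can learn a principal component from data that arrives one sample at a time, the following is a minimal sketch. The abstract does not state which learning rule the authors use, so the sketch assumes Oja's rule, a standard normalized Hebbian update; the function names, the learning rate, and the synthetic two-dimensional data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def oja_update(w, x, lr):
    """One Hebbian step with Oja's normalization term.

    Assumption: Oja's rule stands in for the paper's (unspecified) rule.
    """
    y = w @ x                        # output of the linear unit
    return w + lr * y * (x - y * w)  # Hebbian growth minus decay


def learn_first_component(stream, dim, lr=0.002, seed=0):
    """Process samples one at a time, as for a data set that keeps growing."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)           # start from a random unit vector
    for x in stream:
        w = oja_update(w, x, lr)
    return w


def demo():
    """Synthetic, zero-mean 2-D data stretched along the (1, 1) diagonal."""
    rng = np.random.default_rng(42)
    direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
    ortho = np.array([1.0, -1.0]) / np.sqrt(2.0)
    n = 5000
    data = (rng.normal(scale=3.0, size=(n, 1)) * direction
            + rng.normal(scale=0.5, size=(n, 1)) * ortho)
    w = learn_first_component(data, dim=2)
    return w, direction


if __name__ == "__main__":
    w, direction = demo()
    print("learned weight vector:", w)
    print("|cosine| to true direction:", abs(w @ direction))
```

Because each update touches only the current sample, new texts can be absorbed without recomputing a covariance matrix over the whole collection. Extracting several components (for example with Sanger's generalized Hebbian algorithm) and clustering the resulting projections would be separate steps that this sketch does not cover.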