Clustering Audiology Data

Muhammad Anwar, Michael Oakes, Stefan Wermter, Stefan Heinrich

Conference: Proceedings of the 19th Annual Belgian-Dutch Conference on Machine Learning (BeneLearn2010), Leuven, Belgium, May 2010

Abstract: In this paper we describe new results of statistical and neural data mining of audiology patient records, with the ultimate aim of identifying factors that influence which patients would benefit most from being fitted with a hearing aid. We describe how a combination of neural and statistical techniques can usefully subdivide a set of patients into clusters, based on their hearing thresholds at six different frequencies, and then label the clusters with meaningful text labels. In our first experiment, we cluster the patients by the similarity of their audiograms using k-means clustering, resulting in two main clusters. We then use the chi-squared test to label each cluster with the keywords, selected from the text comment, diagnosis and hearing aid type associated with each patient, that are most typical (and atypical) of that cluster. In our second experiment, we again cluster the patients by the similarity of their audiograms, but this time using a self-organizing map (SOM). Here the locations in the resulting map, which correspond to individual patients, are labeled with the type of hearing aid selected for each patient. We demonstrate that this automatic textual labeling copes well with the heterogeneous character of medical audiology records, which consist of numeric, structured and free-text data.
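
The first experiment described in the abstract (k-means clustering of audiograms, followed by chi-squared keyword labeling) can be illustrated with a minimal sketch. It is not the authors' implementation. The six-value threshold vectors, the keyword sets, and the helper names `kmeans`, `chi_squared_2x2` and `keyword_scores` are all invented for illustration, and the chi-squared statistic is computed on a simple 2x2 keyword-by-cluster table without any continuity correction. The SOM experiment is not covered here.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def keyword_scores(labels, keywords, cluster):
    """Map each keyword to (chi-squared, is_typical) for the given cluster."""
    n = len(labels)
    scores = {}
    for word in set().union(*keywords):
        a = sum(1 for l, ks in zip(labels, keywords) if l == cluster and word in ks)
        b = sum(1 for l, ks in zip(labels, keywords) if l == cluster and word not in ks)
        c = sum(1 for l, ks in zip(labels, keywords) if l != cluster and word in ks)
        d = n - a - b - c
        expected = (a + b) * (a + c) / n  # expected count under independence
        scores[word] = (chi_squared_2x2(a, b, c, d), a > expected)
    return scores

# Invented toy data: hearing thresholds (dB HL) at six frequencies per patient.
audiograms = [
    [10, 10, 15, 15, 20, 20], [15, 10, 10, 20, 20, 25], [10, 15, 15, 15, 25, 20],
    [40, 45, 55, 60, 70, 75], [45, 50, 55, 65, 70, 80], [40, 50, 60, 60, 75, 75],
]
# Invented keywords standing in for comment, diagnosis and hearing aid type.
keywords = [
    {"tinnitus"}, {"tinnitus", "review"}, {"tinnitus"},
    {"bte"}, {"bte", "review"}, {"bte"},
]

labels, centroids = kmeans(audiograms, k=2)
scores = keyword_scores(labels, keywords, cluster=labels[0])
for word, (chi, typical) in sorted(scores.items(), key=lambda kv: -kv[1][0]):
    print(f"{word}: chi2={chi:.2f} ({'typical' if typical else 'atypical or neutral'})")
```

Keywords with a high statistic and an observed count above the expected count would serve as typical labels for the cluster; those with a high statistic and a count below expectation would be atypical. A keyword spread evenly across clusters scores near zero and carries no labeling information.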