17 Apr 2016

Visual Data Exploration of Temporal Cluster Changes Using Self-Organizing Maps

Denny, March 2013
The Australian National University

Abstract
Discovering clustering changes in real-life datasets is important in many contexts, such as fraud detection and customer attrition analysis. Organizations can use knowledge of these changes to adapt business strategies in response to changing circumstances. To understand what has changed, analysts must be able to relate knowledge acquired from a newer dataset to knowledge acquired from an earlier one. This PhD thesis presents a comprehensive visual-interactive temporal clustering analysis framework that uses the Self-Organizing Map (SOM) to identify and analyze changes in both clustering structure and cluster membership.

The key contributions of this research are as follows. Population-based real-life datasets often contain clusters of unusual and particularly interesting sub-populations, called hot spots. The first contribution is a SOM-based methodology to identify hot spots, to rank the attributes that distinguish a selected hot spot, and to drill down into the selected hot spot in a single snapshot dataset. Second, this research contributes a new visualization method called Relative Density Self-Organizing Map (ReDSOM) to compare clustering structures from two snapshot datasets. This visualization provides a means for analysts to visually identify and analyze various changes in the clustering structure, such as emerging clusters, disappearing clusters, splitting clusters, and merging clusters. After interactively analyzing clustering changes at the macro level (structural changes), analysts often desire deeper insight into changes at the micro level (entity migration). As the third contribution, this research develops a method to visualize migration paths and to analyze the attributes that have changed significantly among the members of a cluster.
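
To make the macro-level and micro-level comparisons concrete, the following is a minimal Python sketch and not the thesis's actual ReDSOM formulation. It assumes a fixed set of map prototype vectors (for example, taken from a SOM trained on the earlier snapshot), approximates density as the share of records mapped to each prototype, and uses a log ratio of those shares as a stand-in for relative density. The function names (`best_matching_unit`, `unit_density`, `relative_density`, `migration_counts`) and the `eps` smoothing term are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def best_matching_unit(record, prototypes):
    """Return the index of the prototype closest to record (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((r - p) ** 2 for r, p in zip(record, prototypes[i])),
    )


def unit_density(records, prototypes):
    """Return, for each prototype, the fraction of records mapped to it."""
    counts = Counter(best_matching_unit(r, prototypes) for r in records)
    return [counts[i] / len(records) for i in range(len(prototypes))]


def relative_density(old_records, new_records, prototypes, eps=1e-9):
    """Log2 ratio of new to old density per prototype (illustrative assumption).

    Positive values suggest a region gaining records (a possibly emerging cluster);
    negative values suggest a region losing records (a possibly disappearing cluster).
    eps avoids division by zero for empty units.
    """
    old = unit_density(old_records, prototypes)
    new = unit_density(new_records, prototypes)
    return [math.log2((n + eps) / (o + eps)) for o, n in zip(old, new)]


def migration_counts(old_by_id, new_by_id, prototypes):
    """Count (source unit, destination unit) moves for entities present in both snapshots."""
    moves = Counter()
    for entity_id in old_by_id.keys() & new_by_id.keys():
        src = best_matching_unit(old_by_id[entity_id], prototypes)
        dst = best_matching_unit(new_by_id[entity_id], prototypes)
        moves[(src, dst)] += 1
    return moves
```

In this sketch, the per-prototype log ratios would be the values colored on the map for the structural comparison, while the migration counts would supply the paths drawn between map units for entities tracked across both snapshots.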

These contributions have been evaluated using synthetic datasets, as well as real-life datasets from the World Bank, the Australian Taxation Office, and various international organizations. The results from these real-life datasets demonstrate that the changes identified by this method can be related to actual changes.
