Hinted Star Coordinates for Mixed Data

Bibliographic Details
Published in: Computer Graphics Forum Vol. 39; no. 1; pp. 117-133
Main Authors: Matute, J., Linsen, L.
Format: Journal Article
Language: English
Published: Oxford: Blackwell Publishing Ltd, 01-02-2020
Description
Summary: Mixed data sets containing numerical and categorical attributes are nowadays ubiquitous. Converting them to one attribute type may lead to a loss of information. We present an approach for handling numerical and categorical attributes in a holistic view. For data sets with many attributes, dimensionality reduction (DR) methods can help to generate visual representations involving all attributes. While automatic DR for mixed data sets is possible using weighted combinations, the impact of each attribute on the resulting projection is difficult to measure. Interactive support allows the user to understand the impact of data dimensions in the formation of patterns. Star Coordinates is a well‐known interactive linear DR technique for multi‐dimensional numerical data sets. We propose to extend Star Coordinates and its initial configuration schemes to mixed data sets. In conjunction with analysing numerical attributes, our extension allows for exploring the impact of categorical dimensions and individual categories on the structure of the entire data set. The main challenge when interacting with Star Coordinates is typically to find a good configuration of the attribute axes. We propose a guided mixed data analysis based on maximizing projection quality measures by the use of recommended transformations, named hints, in order to find a proper configuration of the attribute axes.
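Illustrative note: the summary describes Star Coordinates as a linear projection controlled by a configuration of attribute axes. The sketch below shows only the classic, numerical-only form of that idea, in which each attribute gets a 2D axis vector and a record is placed at the sum of its normalized values times those vectors. It is not the authors' mixed-data extension or their hint mechanism, which this record does not specify. The function names (`star_axes`, `min_max_normalize`, `project`), the evenly spaced initial layout, and the min-max normalization are assumptions chosen for illustration.

```python
import math

def star_axes(n_dims, radius=1.0):
    """Assumed initial configuration: axes of equal length, evenly spaced on a circle."""
    return [(radius * math.cos(2 * math.pi * i / n_dims),
             radius * math.sin(2 * math.pi * i / n_dims))
            for i in range(n_dims)]

def min_max_normalize(rows):
    """Scale every numerical column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)]
            for row in rows]

def project(rows, axes):
    """Place each record at the sum of (normalized value * axis vector)."""
    points = []
    for row in min_max_normalize(rows):
        x = sum(v * ax for v, (ax, _) in zip(row, axes))
        y = sum(v * ay for v, (_, ay) in zip(row, axes))
        points.append((x, y))
    return points

# Hypothetical records with four numerical attributes.
records = [[0, 0, 0, 0], [2, 0, 0, 0], [2, 5, 1, 3]]
axes = star_axes(4)
print(project(records, axes))

# Interaction in this setting amounts to editing an axis vector:
# lengthening the first axis gives the first attribute more influence.
axes[0] = (2.0 * axes[0][0], 2.0 * axes[0][1])
print(project(records, axes))
```

Because the mapping is linear in the axis vectors, scaling or rotating one axis moves each point in proportion to its value on that attribute, which is why the choice of axis configuration determines which patterns become visible.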
ISSN: 0167-7055, 1467-8659
DOI: 10.1111/cgf.13666