Common and distinct components in data fusion

Bibliographic Details
Published in: Journal of Chemometrics, Vol. 31, no. 7
Main Authors: Smilde, Age K., Måge, Ingrid, Næs, Tormod, Hankemeier, Thomas, Lips, Mirjam Anne, Kiers, Henk A. L., Acar, Evrim, Bro, Rasmus
Format: Journal Article
Language: English
Published: Chichester Wiley Subscription Services, Inc 01-07-2017
Description
Summary: In many areas of science, multiple sets of data are collected pertaining to the same system. Examples are food products characterized by different sets of variables, bioprocesses sampled online with different instruments, and biological systems for which different genomic measurements are obtained. Data fusion is concerned with analyzing such data sets simultaneously to arrive at a global view of the system under study. One of the emerging areas of data fusion is exploring whether the data sets have something in common. This gives insight into the common and distinct variation in each data set, thereby facilitating understanding of the relationships between the data sets. Unfortunately, research on methods to distinguish common and distinct components is fragmented, both in terminology and in methods: there is no common ground, which hampers comparing methods and understanding their relative merits. This paper provides a unifying framework for this subfield of data fusion by using rigorous arguments from linear algebra. The most frequently used methods for distinguishing common and distinct components are explained in this framework, and some practical examples of these methods are given in the areas of medical biology and food science. This paper presents a general mathematical framework for defining common and distinct components in data fusion. It places the currently most used methods in this framework and derives new properties of those methods. Some of the methods are illustrated with two real-life examples. It also discusses unsolved problems in the area and hints at possible new directions of research.
ISSN: 0886-9383, 1099-128X
DOI: 10.1002/cem.2900