In-situ visual exploration over big raw data
Published in: | Information systems (Oxford) Vol. 95; p. 101616 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: | Oxford; Elsevier Ltd; 01-01-2021; Elsevier Science Ltd |
Subjects: | |
Summary: | Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., JSON, CSV) generated by scientific experiments, using commodity hardware and without being overwhelmed by the tedious processes of data loading, indexing and query optimization. In this paper, we present our work on enabling efficient query processing over large raw data files for interactive visual exploration and analytics. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly from the first user query over a raw file and progressively adapted based on user interaction. We evaluate the performance of a prototype implementation against three alternatives and show that our method outperforms them in terms of response time, disk accesses and memory consumption. In particular, during an exploration scenario the proposed method is in most cases about 5-10× faster than existing solutions and requires significantly less memory.
• Progressive and Adaptive processing for in-situ visualization and analytics.
• Visual user interactions over raw data as data-access operations.
• A main-memory index, constructed on-the-fly based on the first user interaction.
• User-driven techniques that progressively adapt index structure during exploration.
• Improvement in terms of execution time, I/O operations, and memory consumption. |
---|---|
ISSN: | 0306-4379 1873-6076 |
DOI: | 10.1016/j.is.2020.101616 |
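
The summary describes an in-memory tile-based index that is built on the first user query over a raw file. The sketch below illustrates that general idea only and is not the VALINOR implementation: the class name `TileIndex`, its parameters, the fixed-size grid, and the naive comma splitting (no quoted fields) are all assumptions made for illustration. Each tile keeps the axis values and the byte offset of the row, so other attributes can be read from the raw file on demand. The progressive adaptation of tiles mentioned in the summary is omitted.

```python
from collections import defaultdict


class TileIndex:
    """Toy grid index over two numeric columns of a raw CSV file (hypothetical)."""

    def __init__(self, path, x_col, y_col, tile_size):
        self.path = path
        self.x_col = x_col
        self.y_col = y_col
        self.tile_size = tile_size
        # (tile_x, tile_y) -> list of (x, y, byte_offset_of_row)
        self.tiles = defaultdict(list)
        self.header = None
        self.built = False

    def _build(self):
        # One sequential pass over the raw file, triggered by the first query.
        with open(self.path, "rb") as f:
            self.header = f.readline().decode("utf-8").rstrip("\r\n").split(",")
            xi = self.header.index(self.x_col)
            yi = self.header.index(self.y_col)
            while True:
                offset = f.tell()  # byte position where this row starts
                line = f.readline()
                if not line:
                    break
                fields = line.decode("utf-8").rstrip("\r\n").split(",")
                if len(fields) <= max(xi, yi):
                    continue  # skip blank or short lines
                x, y = float(fields[xi]), float(fields[yi])
                key = (int(x // self.tile_size), int(y // self.tile_size))
                self.tiles[key].append((x, y, offset))
        self.built = True

    def query(self, x_min, x_max, y_min, y_max, attr=None):
        """Return points inside the window; optionally fetch one extra attribute."""
        if not self.built:
            self._build()
        ts = self.tile_size
        hits = []
        # Visit only the tiles that overlap the query window.
        for tx in range(int(x_min // ts), int(x_max // ts) + 1):
            for ty in range(int(y_min // ts), int(y_max // ts) + 1):
                for x, y, off in self.tiles.get((tx, ty), ()):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append((x, y, off))
        if attr is None:
            return [(x, y) for x, y, _ in hits]
        # Non-axis attributes are not held in memory: seek into the raw file.
        ai = self.header.index(attr)
        results = []
        with open(self.path, "rb") as f:
            for x, y, off in hits:
                f.seek(off)
                fields = f.readline().decode("utf-8").rstrip("\r\n").split(",")
                results.append((x, y, fields[ai]))
        return results
```

In this sketch a window query such as `TileIndex("data.csv", "x", "y", 5.0).query(0, 6, 0, 3)` pays the full file scan once; later queries touch only the overlapping tiles and, when an extra attribute is requested, only the matching rows of the file.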