Constellation: A science graph network for scalable data and knowledge discovery in extreme-scale scientific collaborations

Bibliographic Details
Published in: 2016 IEEE International Conference on Big Data (Big Data), pp. 3052-3061
Main Authors: Vazhkudai, Sudharshan S.; Harney, John; Gunasekaran, Raghul; Stansberry, Dale; Lim, Seung-Hwan; Barron, Tom; Nash, Andrew; Ramanathan, Arvind
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2016
Description
Summary: Constellation's overarching goal is the federation of information from resources within an extreme-scale scientific collaboration to enable the scalable discovery of data and new knowledge pathways. The resource fabric comprises petascale supercomputers and storage systems, users, jobs, datasets and lifecycle artifacts. For an extreme-scale supercomputing center, normal operations can generate hundreds of millions of data products and metadata entries describing the resource fabric. Constellation federates the information extracted from the resources using a custom, transformative science graph network; constructs rich metadata indexes and higher-order derived metadata from the extracted information; and conducts scalable graph analytics to unravel hidden data pathways. Our implementation and deployment for a production supercomputing facility show that the graph can scale to more than 750 million vertices, its domain-agnostic indexing can answer interesting science queries, and its analytics can aid in structural, topological and temporal analysis to identify usage hotspots.
DOI: 10.1109/BigData.2016.7840959
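
Illustrative sketch: the summary describes federating users, jobs and datasets into one graph and analyzing it to find usage hotspots. The toy Python example below shows that idea in miniature. It is not the authors' implementation: the class `ScienceGraph`, the relation names (`submitted`, `read`, `wrote`) and the sample records are all hypothetical, and ranking by in-degree is a simple stand-in assumed here for the paper's structural analytics.

```python
from collections import defaultdict


class ScienceGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.vertices = {}                 # vertex id -> attribute dict
        self.edges = defaultdict(set)      # source id -> {(relation, target id)}
        self.in_degree = defaultdict(int)  # target id -> number of incoming edges

    def add_vertex(self, vid, kind, **attrs):
        # "kind" distinguishes users, jobs and datasets in one shared graph.
        self.vertices[vid] = {"kind": kind, **attrs}

    def add_edge(self, src, relation, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        if (relation, dst) not in self.edges[src]:
            self.edges[src].add((relation, dst))
            self.in_degree[dst] += 1

    def hotspots(self, kind, top=3):
        """Rank vertices of one kind by in-degree (ties broken by id)."""
        candidates = [v for v, a in self.vertices.items() if a["kind"] == kind]
        ranked = sorted(candidates, key=lambda v: (-self.in_degree[v], v))
        return [(v, self.in_degree[v]) for v in ranked[:top]]


def build_example():
    """Federate a few made-up user, job and dataset records into one graph."""
    g = ScienceGraph()
    for user in ("alice", "bob"):
        g.add_vertex(user, "user")
    for job in ("job1", "job2", "job3"):
        g.add_vertex(job, "job")
    for dataset in ("dsA", "dsB"):
        g.add_vertex(dataset, "dataset")

    g.add_edge("alice", "submitted", "job1")
    g.add_edge("alice", "submitted", "job2")
    g.add_edge("bob", "submitted", "job3")
    g.add_edge("job1", "read", "dsA")
    g.add_edge("job2", "read", "dsA")
    g.add_edge("job3", "read", "dsA")
    g.add_edge("job3", "wrote", "dsB")
    return g


graph = build_example()
print(graph.hotspots("dataset"))
```

In this made-up data, `dsA` is touched by all three jobs and so ranks first among datasets. A production system at the scale the summary reports (more than 750 million vertices) would need a distributed graph store rather than Python dictionaries.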