Search Results - "2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)"

  1.

    Fast Error-Bounded Lossy HPC Data Compression with SZ by Sheng Di, Cappello, Franck

    “…Today's HPC applications are producing extremely large amounts of data, thus it is necessary to use an efficient compression before storing them to parallel…”
    Conference Proceeding
  2.

    Parallel Tensor Compression for Large-Scale Scientific Data by Austin, Woody, Ballard, Grey, Kolda, Tamara G.

    “…As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a…”
    Conference Proceeding
  3.

    Disruptive Research and Innovation by Kai Li

    “…Ever since Clayton Christensen coined the terms "disruptive technologies" and "disruptive innovations" in 1990s, researchers and entrepreneurs love the word…”
    Conference Proceeding
  4.

    A Medium-Grained Algorithm for Sparse Tensor Factorization by Smith, Shaden, Karypis, George

    “…Modeling multi-way data can be accomplished using tensors, which are data structures indexed along three or more dimensions. Tensors are increasingly used to…”
    Conference Proceeding
  5.

    On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems by Yildiz, Orcun, Dorier, Matthieu, Ibrahim, Shadi, Ross, Rob, Antoniu, Gabriel

    “…As we move toward the exascale era, performance variability in HPC systems remains a challenge. I/O interference, a major cause of this variability, is…”
    Conference Proceeding
  6.

    Rabbit Order: Just-in-Time Parallel Reordering for Fast Graph Analysis by Arai, Junya, Shiokawa, Hiroaki, Yamamuro, Takeshi, Onizuka, Makoto, Iwamura, Sotetsu

    “…Ahead-of-time data layout optimization by vertex reordering is a widely used technique to improve memory access locality in graph analysis. While reordered…”
    Conference Proceeding
  7.

    Mystic: Predictive Scheduling for GPU Based Cloud Servers Using Machine Learning by Ukidave, Yash, Xiangyu Li, Kaeli, David

    “…GPUs have become the primary choice of accelerators for high-end data centers and cloud servers, which can host thousands of disparate applications. With the…”
    Conference Proceeding
  8.

    Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication by Koanantakool, Penporn, Azad, Ariful, Buluc, Aydin, Morozov, Dmitriy, Sang-Yun Oh, Oliker, Leonid, Yelick, Katherine

    “…Multiplication of a sparse matrix with a dense matrix is a building block of an increasing number of applications in many areas such as machine learning and…”
    Conference Proceeding
  9.

    ARCHER: Effectively Spotting Data Races in Large OpenMP Applications by Atzeni, Simone, Gopalakrishnan, Ganesh, Rakamaric, Zvonimir, Ahn, Dong H., Laguna, Ignacio, Schulz, Martin, Lee, Gregory L., Protze, Joachim, Muller, Matthias S.

    “…OpenMP plays a growing role as a portable programming model to harness on-node parallelism, yet, existing data race checkers for OpenMP have high overheads and…”
    Conference Proceeding
  10.

    Parallel Graph Coloring for Manycore Architectures by Deveci, Mehmet, Boman, Erik G., Devine, Karen D., Rajamanickam, Sivasankaran

    “…Graph algorithms are challenging to parallelize on manycore architectures due to complex data dependencies and irregular memory access. We consider the well…”
    Conference Proceeding
  11.

    MEMTUNE: Dynamic Memory Management for In-Memory Data Analytic Platforms by Xu, Luna, Min Li, Li Zhang, Butt, Ali R., Yandong Wang, Hu, Zane Zhenhua

    “…Memory is a crucial resource for big data processing frameworks such as Spark and M3R, where the memory is used both for computation and for caching…”
    Conference Proceeding
  12.

    OpenACC to FPGA: A Framework for Directive-Based High-Performance Reconfigurable Computing by Seyong Lee, Jungwon Kim, Vetter, Jeffrey S.

    “…This paper presents a directive-based, high-level programming framework for high-performance reconfigurable computing. It takes a standard, portable OpenACC C…”
    Conference Proceeding
  13.

    Mitigation of Denial of Service Attack with Hardware Trojans in NoC Architectures by Boraten, Travis, Kodi, Avinash Karanth

    “…As Multiprocessor System-on-Chips (MPSoCs) continue to scale, security for Network-on-Chips (NoCs) is a growing concern as rogue agents threaten to infringe on…”
    Conference Proceeding
  14.

    Analyzing Network Health and Congestion in Dragonfly-Based Supercomputers by Bhatele, Abhinav, Jain, Nikhil, Livnat, Yarden, Pascucci, Valerio, Bremer, Peer-Timo

    “…The dragonfly topology is a popular choice for building high-radix, low-diameter, hierarchical networks with high-bandwidth links. On Cray installations of the…”
    Conference Proceeding
  15.

    Are Static Schedules so Bad? A Case Study on Cholesky Factorization by Agullo, Emmanuel, Beaumont, Olivier, Eyraud-Dubois, Lionel, Kumar, Suraj

    “…Our goal is to provide an analysis and comparison of static and dynamic strategies for task graph scheduling on platforms consisting of heterogeneous and…”
    Conference Proceeding
  16.

    High Performance Parallel Stochastic Gradient Descent in Shared Memory by Sallinen, Scott, Satish, Nadathur, Smelyanskiy, Mikhail, Sury, Samantika S., Re, Christopher

    “…Stochastic Gradient Descent (SGD) is a popular optimization method used to train a variety of machine learning models. Most of SGD work to-date has…”
    Conference Proceeding
  17.

    PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures by Patwary, Md Mostofa Ali, Satish, Nadathur Rajagopalan, Sundaram, Narayanan, Jialin Liu, Sadowski, Peter, Racah, Evan, Byna, Suren, Tull, Craig, Bhimji, Wahid, Prabhat, Dubey, Pradeep

    “…Computing k-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although…”
    Conference Proceeding
  18.

    GraphPad: Optimized Graph Primitives for Parallel and Distributed Platforms by Anderson, Michael J., Sundaram, Narayanan, Satish, Nadathur, Patwary, Md Mostofa Ali, Willke, Theodore L., Dubey, Pradeep

    “…The duality between graphs and matrices means that many common graph analyses can be expressed with primitives such as generalized sparse matrix-vector…”
    Conference Proceeding
  19.

    Online Algorithm-Based Fault Tolerance for Cholesky Decomposition on Heterogeneous Systems with GPUs by Jieyang Chen, Xin Liang, Zizhong Chen

    “…Extensive researches have been done on developing and optimizing algorithm-based fault tolerance (ABFT) schemes for systolic arrays and general purpose…”
    Conference Proceeding
  20.

    On First Fit Bin Packing for Online Cloud Server Allocation by Xueyan Tang, Yusen Li, Runtian Ren, Wentong Cai

    “…Cloud-based systems often face the problem of dispatching a stream of jobs to run on cloud servers in an online manner. Each job has a size that defines the…”
    Conference Proceeding