Search Results - "Acun, Bilge"

  1. 1

    Beyond Efficiency: Scaling AI Sustainably by Wu, Carole-Jean, Acun, Bilge, Raghavendra, Ramya, Hazelwood, Kim

    Published in IEEE MICRO (01-09-2024)
    “…Barroso’s seminal contributions in energy-proportional warehouse-scale computing launched an era where modern data centers have become more energy efficient…”
    Journal Article
  2. 2

    Understanding Training Efficiency of Deep Learning Recommendation Models at Scale by Acun, Bilge, Murphy, Matthew, Wang, Xiaodong, Nie, Jade, Wu, Carole-Jean, Hazelwood, Kim

    “…The use of GPUs has proliferated for machine learning workflows and is now considered mainstream for many deep learning models. Meanwhile, when training…”
    Conference Proceeding
  3. 3

    Datacenter-Scale Analysis and Optimization of GPU Machine Learning Workloads by Wesolowski, Lukasz, Acun, Bilge, Andrei, Valentin, Aziz, Adnan, Dankel, Gisle, Gregg, Christopher, Meng, Xiaoqiao, Meurillon, Cyril, Sheahan, Denis, Tian, Lei, Yang, Janet, Yu, Peifeng, Hazelwood, Kim

    Published in IEEE MICRO (01-09-2021)
    “…In this article, we present a system to collectively optimize efficiency in a very large scale deployment of GPU servers for machine learning workloads at…”
    Journal Article
  4. 4

    MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel, Golden, Alicia, Acun, Bilge, Ardalani, Newsha, DeVito, Zachary, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    “…Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high…”
    Conference Proceeding
  5. 5

    Towards realizing the potential of malleable jobs by Gupta, Abhishek, Acun, Bilge, Sarood, Osman, Kale, Laxmikant V.

    “…Malleable jobs are those which can dynamically shrink or expand the number of processors on which they are executing at runtime in response to an external…”
    Conference Proceeding
  6. 6

    Power, Reliability, and Performance: One System to Rule them All by Acun, Bilge, Langer, Akhil, Meneses, Esteban, Menon, Harshitha, Sarood, Osman, Totoni, Ehsan, Kale, Laxmikant V.

    Published in Computer (Long Beach, Calif.) (01-10-2016)
    “…In a design based on the Charm++ parallel programming framework, an adaptive runtime system dynamically interacts with a datacenter's resource manager to…”
    Journal Article
  7. 7
  8. 8

    Support for Power Efficient Proactive Cooling Mechanisms by Acun, Bilge, Lee, Eun Kyung, Park, Yoonho, Kale, Laxmikant V.

    “…Increasing scale of data centers and the density of server nodes pose significant challenges in producing power and energy efficient cooling infrastructures…”
    Conference Proceeding
  9. 9

    Mitigating Variability in HPC Systems and Applications for Performance and Power Efficiency by Acun, Bilge

    Published 01-01-2017
    “…Power consumption and process variability are two important, interconnected, challenges of future generation large-scale High Performance Computing (HPC) data…”
    Dissertation
  10. 10

    SecNDP: Secure Near-Data Processing with Untrusted Memory by Xiong, Wenjie, Ke, Liu, Jankov, Dimitrije, Kounavis, Michael, Wang, Xiaochen, Northup, Eric, Yang, Jie Amy, Acun, Bilge, Wu, Carole-Jean, Peter Tang, Ping Tak, Edward Suh, G., Zhang, Xuan, Lee, Hsien-Hsin S.

    “…Today's data-intensive applications increasingly suffer from significant performance bottlenecks due to the limited memory bandwidth of the classical von…”
    Conference Proceeding
  11. 11

    Thermal aware automated load balancing for HPC applications by Menon, Harshitha, Acun, Bilge, De Gonzalo, Simon Garcia, Sarood, Osman, Kale, Laxmikant

    “…As we move towards the exascale era, power and energy have become major challenges. Some of the supercomputers draw more than 10 megawatts, leading to high…”
    Conference Proceeding
  12. 12

    Beyond Efficiency: Scaling AI Sustainably by Wu, Carole-Jean, Acun, Bilge, Raghavendra, Ramya, Hazelwood, Kim

    Published 07-06-2024
    “…Barroso's seminal contributions in energy-proportional warehouse-scale computing launched an era where modern datacenters have become more energy efficient and…”
    Journal Article
  13. 13

    Fine-Grained Energy Efficiency Using Per-Core DVFS with an Adaptive Runtime System by Acun, Bilge, Chandrasekar, Kavitha, Kale, Laxmikant V.

    “…Dynamic voltage and frequency scaling (DVFS) is a well-known technique to reduce the power and/or energy consumption of various applications. While most…”
    Conference Proceeding
  14. 14

    Parallel programming with migratable objects: charm++ in practice by Acun, Bilge, Gupta, Abhishek, Jain, Nikhil, Langer, Akhil, Menon, Harshitha, Mikida, Eric, Ni, Xiang, Robson, Michael, Sun, Yanhua, Totoni, Ehsan, Wesolowski, Lukasz, Kale, Laxmikant

    “…The advent of petascale computing has introduced new challenges (e.g. heterogeneity, system failure) for programming scalable parallel applications. Increased…”
    Conference Proceeding
  15. 15

    Unlocking the Potential of Renewable Energy Through Curtailment Prediction by Acun, Bilge, Morgan, Brent, Richardson, Henry, Steinsultz, Nat, Wu, Carole-Jean

    Published 28-05-2024
    “…A significant fraction (5-15%) of renewable energy generated goes into waste in the grids around the world today due to oversupply issues and transmission…”
    Journal Article
  16. 16

    CHAI: Clustered Head Attention for Efficient LLM Inference by Agarwal, Saurabh, Acun, Bilge, Hosmer, Basil, Elhoushi, Mostafa, Lee, Yejin, Venkataraman, Shivaram, Papailiopoulos, Dimitris, Wu, Carole-Jean

    Published 12-03-2024
    “…Large Language Models (LLMs) with hundreds of billions of parameters have transformed the field of machine learning. However, serving these models at inference…”
    Journal Article
  17. 17

    Carbon Responder: Coordinating Demand Response for the Datacenter Fleet by Xing, Jiali, Acun, Bilge, Sundarrajan, Aditya, Brooks, David, Chakkaravarthy, Manoj, Avila, Nikky, Wu, Carole-Jean, Lee, Benjamin C

    Published 14-11-2023
    “…The increasing integration of renewable energy sources results in fluctuations in carbon intensity throughout the day. To mitigate their carbon footprint,…”
    Journal Article
  18. 18

    MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel, Golden, Alicia, Acun, Bilge, Ardalani, Newsha, DeVito, Zachary, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    Published 04-10-2023
    “…Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high…”
    Journal Article
  19. 19

    TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models by Yin, Chunxing, Acun, Bilge, Liu, Xing, Wu, Carole-Jean

    Published 25-01-2021
    “…The memory capacity of embedding tables in deep learning recommendation models (DLRMs) is increasing dramatically from tens of GBs to TBs across the industry…”
    Journal Article
  20. 20

    Generative AI Beyond LLMs: System Implications of Multi-Modal Generation by Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    “…As the development of large-scale Generative AI models evolve beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial…”
    Conference Proceeding