Search Results - "Roma, Nuno"

  1.

    A Compute Cache System for Signal Processing Applications by Vieira, João, Roma, Nuno, Falcao, Gabriel, Tomás, Pedro

    Published in Journal of signal processing systems (01-10-2021)
    “…Nowadays, processing systems are constrained by the low efficiency of their memory subsystems. Although memories evolved into faster and more efficient devices…”
    Journal Article
  2.

    A Reconfigurable Posit Tensor Unit with Variable-Precision Arithmetic and Automatic Data Streaming by Neves, Nuno, Tomás, Pedro, Roma, Nuno

    Published in Journal of signal processing systems (01-12-2021)
    “…The increased adoption of DNN applications drove the emergence of dedicated tensor computing units to accelerate multi-dimensional matrix multiplication…”
    Journal Article
  3.

    NDPmulator: Enabling Full-System Simulation for Near-Data Accelerators From Caches to DRAM by Vieira, João, Roma, Nuno, Falcao, Gabriel, Tomás, Pedro

    Published in IEEE access (2024)
    “…The accurate simulation and performance assessment of Near-Data Accelerators (NDAccs) is a complex challenge as it must consider the operation of the entire…”
    Journal Article
  4.

    GPU Static Modeling Using PTX and Deep Structured Learning by Guerreiro, Joao, Ilic, Aleksandar, Roma, Nuno, Tomas, Pedro

    Published in IEEE access (2019)
    “…In the quest for exascale computing, energy-efficiency is a fundamental goal in high-performance computing systems, typically achieved via dynamic voltage and…”
    Journal Article
  5.

    Efficient Hybrid DCT-Domain Algorithm for Video Spatial Downscaling by Roma, Nuno, Sousa, Leonel

    “…A highly efficient video downscaling algorithm for any arbitrary integer scaling factor performed in a hybrid pixel transform domain is proposed. This algorithm…”
    Journal Article
  6.

    Efficient Hybrid DCT-Domain Algorithm for Video Spatial Downscaling by Roma, Nuno, Sousa, Leonel

    “…A highly efficient video downscaling algorithm for any arbitrary integer scaling factor performed in a hybrid pixel transform domain is proposed. This…”
    Journal Article
  7.

    Compiler-Assisted Data Streaming for Regular Code Structures by Neves, Nuno, Tomas, Pedro, Roma, Nuno

    Published in IEEE transactions on computers (01-03-2021)
    “…The performance of modern processors is often limited by execution stalls resulting from long memory access latencies. Compile-time optimizations, deep cache…”
    Journal Article
  8.

    Unified Posit/IEEE-754 Vector MAC Unit for Transprecision Computing by Crespo, Luis, Tomas, Pedro, Roma, Nuno, Neves, Nuno

    “…Transprecision computing targets energy-efficiency with multiple floating-point modules with different precisions to suit application requirements…”
    Journal Article
  9.

    Modeling and Decoupling the GPU Power Consumption for Cross-Domain DVFS by Guerreiro, Joao, Ilic, Aleksandar, Roma, Nuno, Tomas, Pedro

    “…Dynamic voltage and frequency scaling (DVFS) is a popular technique to improve the energy-efficiency of high-performance computing systems. It allows placing…”
    Journal Article
  10.

    A tutorial overview on the properties of the discrete cosine transform for encoded image and video processing by Roma, Nuno, Sousa, Leonel

    Published in Signal processing (01-11-2011)
    “…Discrete trigonometric transforms, such as the discrete cosine transform (DCT) and the discrete sine transform (DST), have been extensively used in signal…”
    Journal Article
  11.

    Decoupling GPGPU voltage-frequency scaling for deep-learning applications by Mendes, Francisco, Tomás, Pedro, Roma, Nuno

    “…• GPUs may be safely undervoltage, allowing for non-conventional DVFS configurations. • A benchmark suit characterizes GPU components regarding undervoltage…”
    Journal Article
  12.

    Flying tourist problem: Flight time and cost minimization in complex routes by Marques, Rafael, Russo, Luís, Roma, Nuno

    Published in Expert systems with applications (15-09-2019)
    “…• The NP-hard Flying Tourist Problem, a model for multi-city flight requests. • An efficient solution of the problem, based on a meta-heuristic methodology. • High…”
    Journal Article
  13.

    Adaptive In-Cache Streaming for Efficient Data Management by Neves, Nuno, Tomas, Pedro, Roma, Nuno

    “…The design of adaptive architectures is frequently focused on the sole adaptation of the processing blocks, often neglecting the power/performance impact of…”
    Journal Article
  14.

    Compiling for Vector Extensions With Stream-Based Specialization by Neves, Nuno, Domingos, Joao Mario, Roma, Nuno, Tomas, Pedro, Falcao, Gabriel

    Published in IEEE MICRO (01-09-2022)
    “…To overcome the current performance wall, data streaming and data-flow computing paradigms have been gradually making their way into the general-purpose…”
    Journal Article
  15.

    GPGPU Power Modeling for Multi-domain Voltage-Frequency Scaling by Guerreiro, Joao, Ilic, Aleksandar, Roma, Nuno, Tomas, Pedro

    “…Dynamic Voltage and Frequency Scaling (DVFS) on Graphics Processing Units (GPUs) components is one of the most promising power management strategies, due to…”
    Conference Proceeding
  16.

    Special issue on real-time energy-aware circuits and systems for HEVC and for its 3D and SVC extensions by Sousa, Leonel, Roma, Nuno

    Published in Journal of real-time image processing (01-03-2017)
    “…Since its approval, in 2013, the high-efficiency video coding (HEVC) standard [1, 2] has established as the new state-of-the-art on video compression…”
    Journal Article
  17.

    DVFS-aware application classification to improve GPGPUs energy efficiency by Guerreiro, João, Ilic, Aleksandar, Roma, Nuno, Tomás, Pedro

    Published in Parallel computing (01-04-2019)
    “…• Exploring the effects of core and memory DVFS on the execution of GPU applications. • GPU characterization scheme validated on multiple GPU devices. • Classes of…”
    Journal Article
  18.

    Positnn: Training Deep Neural Networks with Mixed Low-Precision Posit by Raposo, Goncalo, Tomas, Pedro, Roma, Nuno

    “…Low-precision formats have proven to be an efficient way to reduce not only the memory footprint but also the hardware resources and power consumption of deep…”
    Conference Proceeding
  19.

    Trading Performance, Power, and Area on Low-Precision Posit MAC Units for CNN Training by Crespo, Luis, Tomas, Pedro, Roma, Nuno, Neves, Nuno

    “…The recently proposed Posit number system has been regarded as a particularly well-suited floating-point format to optimize the throughput and efficiency of…”
    Conference Proceeding
  20.

    gem5-ndp: Near-Data Processing Architecture Simulation From Low Level Caches to DRAM by Vieira, Joao, Roma, Nuno, Falcao, Gabriel, Tomas, Pedro

    “…Unlike standard accelerators, the performance of Near-Data Processing (NDP) devices highly depends on the operation of the surrounding system, namely, the…”
    Conference Proceeding