Search Results - "Feliu, Josué"

  1.

    Speculative inter-thread store-to-load forwarding in SMT architectures by Feliu, Josué, Ros, Alberto, Acacio, Manuel E., Kaxiras, Stefanos

    “…Applications running on out-of-order cores have benefited for decades of store-to-load forwarding which accelerates communication of store values to loads of…”
    Journal Article
  2.

    DeepP: Deep Learning Multi-Program Prefetch Configuration for the IBM POWER 8 by Lurbe, Manel, Feliu, Josue, Petit, Salvador, Gomez, Maria E., Sahuquillo, Julio

    Published in IEEE transactions on computers (01-10-2022)
    “…Current multi-core processors implement sophisticated hardware prefetchers, that can be configured by application (PID), to improve the system performance…”
    Journal Article
  3.

    Perf&Fair: A Progress-Aware Scheduler to Enhance Performance and Fairness in SMT Multicores by Feliu, Josue, Sahuquillo, Julio, Petit, Salvador, Duato, Jose

    Published in IEEE transactions on computers (01-05-2017)
    “…Nowadays, high performance multicore processors implement multithreading capabilities. The processes running concurrently on these processors are continuously…”
    Journal Article
  4.

    Cloud White: Detecting and Estimating QoS Degradation of Latency-Critical Workloads in the Public Cloud by Pons, Lucía, Feliu, Josué, Sahuquillo, Julio, Gómez, María E., Petit, Salvador, Pons, Julio, Huang, Chaoyi

    Published in Future generation computer systems (01-01-2023)
    “…The increasing popularity of cloud computing has forced cloud providers to build economies of scale to meet the growing demand. Nowadays, data-centers include…”
    Journal Article
  5.

    Effect of Hyper-Threading in Latency-Critical Multithreaded Cloud Applications and Utilization Analysis of the Major System Resources by Pons, Lucía, Feliu, Josué, Puche, José, Huang, Chaoyi, Petit, Salvador, Pons, Julio, Gómez, María E., Sahuquillo, Julio

    Published in Future generation computer systems (01-06-2022)
    “…Multithreaded latency-critical applications represent an important subset of workloads running on public cloud systems. Most of these systems deploy powerful…”
    Journal Article
  6.

    Designing lab sessions focusing on real processors for computer architecture courses: A practical perspective by Feliu, Josué, Sahuquillo, Julio, Petit, Salvador

    “…Computer architecture courses typically include lab sessions to reinforce, from a practical perspective, concepts and architectural mechanisms studied in…”
    Journal Article
  7.

    Improving IBM POWER8 Performance Through Symbiotic Job Scheduling by Feliu, Josue, Eyerman, Stijn, Sahuquillo, Julio, Petit, Salvador, Eeckhout, Lieven

    “…Symbiotic job scheduling, i.e., scheduling applications that co-run well together on a core, can have a considerable impact on the performance of processors…”
    Journal Article
  8.

    Thread-to-Core Allocation in ARM Processors Building Synergistic Pairs by Navarro, Marta, Feliu, Josue, Petit, Salvador, Gomez, Maria E., Sahuquillo, Julio

    “…Simultaneous multithreading (SMT) processors can present significant throughput improvements over single-threaded (ST) processors thanks to sharing internal…”
    Conference Proceeding
  9.

    Cache-Hierarchy Contention-Aware Scheduling in CMPs by Feliu, Josue, Petit, Salvador, Sahuquillo, Julio, Duato, Jose

    “…To improve chip multiprocessor (CMP) performance, recent research has focused on scheduling strategies to mitigate main memory bandwidth contention. Nowadays,…”
    Journal Article
  10.

    Rebasing Microarchitectural Research with Industry Traces by Feliu, Josue, Perais, Arthur, Jimenez, Daniel A., Ros, Alberto

    “…Microarchitecture research relies on performance models with various degrees of accuracy and speed. In the past few years, one such model, ChampSim, has…”
    Conference Proceeding
  11.

    Thread Isolation to Improve Symbiotic Scheduling on SMT Multicore Processors by Feliu, Josue, Sahuquillo, Julio, Petit, Salvador, Eeckhout, Lieven

    “…Resource sharing is a critical issue in simultaneous multithreading (SMT) processors as threads running simultaneously on an SMT core compete for shared…”
    Journal Article
  12.

    VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors by Feliu, Josue, Naithani, Ajeya, Sahuquillo, Julio, Petit, Salvador, Qureshi, Moinuddin, Eeckhout, Lieven

    Published in IEEE transactions on computers (01-06-2022)
    “…Modern-day graph workloads operate on huge graphs through pointer chasing which leads to high last-level cache (LLC) miss rates and limited memory-level…”
    Journal Article
  13.

    Bandwidth-Aware Dynamic Prefetch Configuration for IBM POWER8 by Navarro, Carlos, Feliu, Josue, Petit, Salvador, Gomez, Maria E., Sahuquillo, Julio

    “…Advanced hardware prefetch engines are being integrated in current high-performance processors. Prefetching can boost the performance of most applications,…”
    Journal Article
  14.

    SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors by Navarro, Marta, Feliu, Josué, Petit, Salvador, Gómez, María E, Sahuquillo, Julio

    Published 19-10-2023
    “…Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from…”
    Journal Article
  15.

    Bandwidth-Aware On-Line Scheduling in SMT Multicores by Feliu, Josue, Sahuquillo, Julio, Petit, Salvador, Duato, Jose

    Published in IEEE transactions on computers (01-02-2016)
    “…The memory hierarchy plays a critical role on the performance of current chip multiprocessors. Main memory is shared by all the running processes, which can…”
    Journal Article
  16.

    Precise Runahead Execution by Naithani, Ajeya, Feliu, Josue, Adileh, Almutaz, Eeckhout, Lieven

    “…Runahead execution improves processor performance by accurately prefetching long-latency memory accesses. When a long-latency load causes the instruction…”
    Conference Proceeding
  17.

    CELLO: Compiler-Assisted Efficient Load-Load Ordering in Data-Race-Free Regions by Singh, Sawan, Feliu, Josue, Acacio, Manuel E., Jimborean, Alexandra, Ros, Alberto

    “…Efficient Total Store Order (TSO) implementations allow loads to execute speculatively out-of-order. To detect order violations, the load queue (LQ) holds all…”
    Conference Proceeding
  18.

    Precise Runahead Execution by Naithani, Ajeya, Feliu, Josue, Adileh, Almutaz, Eeckhout, Lieven

    Published in IEEE computer architecture letters (01-01-2019)
    “…Runahead execution improves processor performance by accurately prefetching long-latency memory accesses. When a long-latency load causes the instruction…”
    Journal Article
  19.

    Understanding Cache Hierarchy Contention in CMPs to Improve Job Scheduling by Feliu, J., Sahuquillo, J., Petit, S., Duato, J.

    “…In order to improve CMP performance, recent research has focused on scheduling to mitigate contention produced by the limited memory bandwidth. Nowadays,…”
    Conference Proceeding
  20.

    Understanding Cloud Workloads Performance in a Production like Environment by Pons, Lucia, Feliu, Josué, Puche, José, Huang, Chaoyi, Petit, Salvador, Pons, Julio, Gómez, María E, Sahuquillo, Julio

    Published 10-10-2020
    “…Understanding inter-VM interference is of paramount importance to provide a sound knowledge and understand where performance degradation comes from in the…”
    Journal Article