Search Results - "Toyama, Daniel"

  • Showing results 1 - 9 of 9
  1.

    Not All LLM Reasoners Are Created Equal by Hosseini, Arian, Sordoni, Alessandro, Toyama, Daniel, Courville, Aaron, Agarwal, Rishabh

    Published 02-10-2024
    “…We study the depth of grade-school math (GSM) problem-solving capabilities of LLMs. To this end, we evaluate their performance on pairs of existing math word…”
    Journal Article
  2.

    Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning by Comanici, Gheorghe, Glaese, Amelia, Gergely, Anita, Toyama, Daniel, Ahmed, Zafarali, Jackson, Tyler, Hamel, Philippe, Precup, Doina

    Published 21-04-2022
    “…Hierarchical Reinforcement Learning (HRL) allows interactive agents to decompose complex problems into a hierarchy of sub-tasks. Higher-level tasks can invoke…”
    Journal Article
  3.

    AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents by Rawles, Christopher, Clinckemaillie, Sarah, Chang, Yifan, Waltz, Jonathan, Lau, Gabrielle, Fair, Marybeth, Li, Alice, Bishop, William, Li, Wei, Campbell-Ajala, Folawiyo, Toyama, Daniel, Berry, Robert, Tyamagundlu, Divya, Lillicrap, Timothy, Riva, Oriana

    Published 23-05-2024
    “…Autonomous agents that execute human tasks by controlling computers can enhance human productivity and application accessibility. However, progress in this…”
    Journal Article
  5.

    AndroidEnv: A Reinforcement Learning Platform for Android by Toyama, Daniel, Hamel, Philippe, Gergely, Anita, Comanici, Gheorghe, Glaese, Amelia, Ahmed, Zafarali, Jackson, Tyler, Mourad, Shibl, Precup, Doina

    Published 27-05-2021
    “…We introduce AndroidEnv, an open-source platform for Reinforcement Learning (RL) research built on top of the Android ecosystem. AndroidEnv allows RL agents to…”
    Journal Article
  6.

    RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning by Ramos, Sabela, Girgin, Sertan, Hussenot, Léonard, Vincent, Damien, Yakubovich, Hanna, Toyama, Daniel, Gergely, Anita, Stanczyk, Piotr, Marinier, Raphael, Harmsen, Jeremiah, Pietquin, Olivier, Momchev, Nikola

    Published 04-11-2021
    “…We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of…”
    Journal Article
  8.

    The Option Keyboard: Combining Skills in Reinforcement Learning by Barreto, André, Borsa, Diana, Hou, Shaobo, Comanici, Gheorghe, Aygün, Eser, Hamel, Philippe, Toyama, Daniel, Hunt, Jonathan, Mourad, Shibl, Silver, David, Precup, Doina

    Published 24-06-2021
    “…The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended…”
    Journal Article
  9.

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher by Rae, Jack W, Borgeaud, Sebastian, Cai, Trevor, Millican, Katie, Hoffmann, Jordan, Song, Francis, Aslanides, John, Henderson, Sarah, Ring, Roman, Young, Susannah, Rutherford, Eliza, Hennigan, Tom, Menick, Jacob, Cassirer, Albin, Powell, Richard, Driessche, George van den, Hendricks, Lisa Anne, Rauh, Maribeth, Huang, Po-Sen, Glaese, Amelia, Welbl, Johannes, Dathathri, Sumanth, Huang, Saffron, Uesato, Jonathan, Mellor, John, Higgins, Irina, Creswell, Antonia, McAleese, Nat, Wu, Amy, Elsen, Erich, Jayakumar, Siddhant, Buchatskaya, Elena, Budden, David, Sutherland, Esme, Simonyan, Karen, Paganini, Michela, Sifre, Laurent, Martens, Lena, Li, Xiang Lorraine, Kuncoro, Adhiguna, Nematzadeh, Aida, Gribovskaya, Elena, Donato, Domenic, Lazaridou, Angeliki, Mensch, Arthur, Lespiau, Jean-Baptiste, Tsimpoukelli, Maria, Grigorev, Nikolai, Fritz, Doug, Sottiaux, Thibault, Pajarskas, Mantas, Pohlen, Toby, Gong, Zhitao, Toyama, Daniel, d'Autume, Cyprien de Masson, Li, Yujia, Terzi, Tayfun, Mikulik, Vladimir, Babuschkin, Igor, Clark, Aidan, Casas, Diego de Las, Guy, Aurelia, Jones, Chris, Bradbury, James, Johnson, Matthew, Hechtman, Blake, Weidinger, Laura, Gabriel, Iason, Isaac, William, Lockhart, Ed, Osindero, Simon, Rimell, Laura, Dyer, Chris, Vinyals, Oriol, Ayoub, Kareem, Stanway, Jeff, Bennett, Lorrayne, Hassabis, Demis, Kavukcuoglu, Koray, Irving, Geoffrey

    Published 08-12-2021
    “…Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and…”
    Journal Article