Search Results - "Toyama, Daniel" :: Catalog Search

1
Not All LLM Reasoners Are Created Equal by Hosseini, Arian, Sordoni, Alessandro, Toyama, Daniel, Courville, Aaron, Agarwal, Rishabh

Published 02-10-2024
“…We study the depth of grade-school math (GSM) problem-solving capabilities of LLMs. To this end, we evaluate their performance on pairs of existing math word…”

Journal Article

2
Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning by Comanici, Gheorghe, Glaese, Amelia, Gergely, Anita, Toyama, Daniel, Ahmed, Zafarali, Jackson, Tyler, Hamel, Philippe, Precup, Doina

Published 21-04-2022
“…Hierarchical Reinforcement Learning (HRL) allows interactive agents to decompose complex problems into a hierarchy of sub-tasks. Higher-level tasks can invoke…”

Journal Article

3
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents by Rawles, Christopher, Clinckemaillie, Sarah, Chang, Yifan, Waltz, Jonathan, Lau, Gabrielle, Fair, Marybeth, Li, Alice, Bishop, William, Li, Wei, Campbell-Ajala, Folawiyo, Toyama, Daniel, Berry, Robert, Tyamagundlu, Divya, Lillicrap, Timothy, Riva, Oriana

Published 23-05-2024
“…Autonomous agents that execute human tasks by controlling computers can enhance human productivity and application accessibility. However, progress in this…”

Journal Article

4
Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search by Mehrabian, Abbas, Anand, Ankit, Kim, Hyunjik, Sonnerat, Nicolas, Balog, Matej, Comanici, Gheorghe, Berariu, Tudor, Lee, Andrew, Ruoss, Anian, Bulanova, Anna, Toyama, Daniel, Blackwell, Sam, Paredes, Bernardino Romera, Veličković, Petar, Orseau, Laurent, Lee, Joonkyung, Naredla, Anurag Murty, Precup, Doina, Wagner, Adam Zsolt

Published 06-11-2023
“…This work studies a central extremal graph theory problem inspired by a 1975 conjecture of Erdős, which aims to find graphs with a given size (number of…”

Journal Article

5
AndroidEnv: A Reinforcement Learning Platform for Android by Toyama, Daniel, Hamel, Philippe, Gergely, Anita, Comanici, Gheorghe, Glaese, Amelia, Ahmed, Zafarali, Jackson, Tyler, Mourad, Shibl, Precup, Doina

Published 27-05-2021
“…We introduce AndroidEnv, an open-source platform for Reinforcement Learning (RL) research built on top of the Android ecosystem. AndroidEnv allows RL agents to…”

Journal Article

6
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning by Ramos, Sabela, Girgin, Sertan, Hussenot, Léonard, Vincent, Damien, Yakubovich, Hanna, Toyama, Daniel, Gergely, Anita, Stanczyk, Piotr, Marinier, Raphael, Harmsen, Jeremiah, Pietquin, Olivier, Momchev, Nikola

Published 04-11-2021
“…We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of…”

Journal Article

7
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning by Mathieu, Michaël, Ozair, Sherjil, Srinivasan, Srivatsan, Gulcehre, Caglar, Zhang, Shangtong, Jiang, Ray, Paine, Tom Le, Powell, Richard, Żołna, Konrad, Schrittwieser, Julian, Choi, David, Georgiev, Petko, Toyama, Daniel, Huang, Aja, Ring, Roman, Babuschkin, Igor, Ewalds, Timo, Bordbar, Mahyar, Henderson, Sarah, Colmenarejo, Sergio Gómez, Oord, Aäron van den, Czarnecki, Wojciech Marian, de Freitas, Nando, Vinyals, Oriol

Published 07-08-2023
“…StarCraft II is one of the most challenging simulated reinforcement learning environments; it is partially observable, stochastic, multi-agent, and mastering…”

Journal Article

8
The Option Keyboard: Combining Skills in Reinforcement Learning by Barreto, André, Borsa, Diana, Hou, Shaobo, Comanici, Gheorghe, Aygün, Eser, Hamel, Philippe, Toyama, Daniel, Hunt, Jonathan, Mourad, Shibl, Silver, David, Precup, Doina

Published 24-06-2021
“…The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended…”

Journal Article

9
Scaling Language Models: Methods, Analysis & Insights from Training Gopher by Rae, Jack W, Borgeaud, Sebastian, Cai, Trevor, Millican, Katie, Hoffmann, Jordan, Song, Francis, Aslanides, John, Henderson, Sarah, Ring, Roman, Young, Susannah, Rutherford, Eliza, Hennigan, Tom, Menick, Jacob, Cassirer, Albin, Powell, Richard, Driessche, George van den, Hendricks, Lisa Anne, Rauh, Maribeth, Huang, Po-Sen, Glaese, Amelia, Welbl, Johannes, Dathathri, Sumanth, Huang, Saffron, Uesato, Jonathan, Mellor, John, Higgins, Irina, Creswell, Antonia, McAleese, Nat, Wu, Amy, Elsen, Erich, Jayakumar, Siddhant, Buchatskaya, Elena, Budden, David, Sutherland, Esme, Simonyan, Karen, Paganini, Michela, Sifre, Laurent, Martens, Lena, Li, Xiang Lorraine, Kuncoro, Adhiguna, Nematzadeh, Aida, Gribovskaya, Elena, Donato, Domenic, Lazaridou, Angeliki, Mensch, Arthur, Lespiau, Jean-Baptiste, Tsimpoukelli, Maria, Grigorev, Nikolai, Fritz, Doug, Sottiaux, Thibault, Pajarskas, Mantas, Pohlen, Toby, Gong, Zhitao, Toyama, Daniel, d'Autume, Cyprien de Masson, Li, Yujia, Terzi, Tayfun, Mikulik, Vladimir, Babuschkin, Igor, Clark, Aidan, Casas, Diego de Las, Guy, Aurelia, Jones, Chris, Bradbury, James, Johnson, Matthew, Hechtman, Blake, Weidinger, Laura, Gabriel, Iason, Isaac, William, Lockhart, Ed, Osindero, Simon, Rimell, Laura, Dyer, Chris, Vinyals, Oriol, Ayoub, Kareem, Stanway, Jeff, Bennett, Lorrayne, Hassabis, Demis, Kavukcuoglu, Koray, Irving, Geoffrey

Published 08-12-2021
“…Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and…”

Journal Article
