Search Results - "Hong, Joey"

  1. Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions by Hong, Joey, Sapp, Benjamin, Philbin, James

    “…We focus on the problem of predicting future states of entities in complex, real-world driving scenarios. Previous research has approached this problem via…”
    Conference Proceeding
  2. Wearing the Witch Identity as a Way of Becoming in Shirley Jackson's We Have Always Lived in the Castle by Hong, Joey Junsu

    Published 01-01-2022
    “…This thesis aims to analyze Shirley Jackson’s last novel We Have Always Lived in the Castle (1962) through the lens of 17th-century New England history and…”
    Dissertation
  3. Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning by Hong, Joey, Dragan, Anca, Levine, Sergey

    Published 07-11-2024
    “…Value-based reinforcement learning (RL) can in principle learn effective policies for a wide range of multi-turn problems, from games to dialogue to robotic…”
    Journal Article
  4. Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations by Hong, Joey, Levine, Sergey, Dragan, Anca

    Published 09-11-2023
    “…Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks. However, many of the most important applications of…”
    Journal Article
  5. Offline RL with Observation Histories: Analyzing and Improving Sample Complexity by Hong, Joey, Dragan, Anca, Levine, Sergey

    Published 31-10-2023
    “…Offline reinforcement learning (RL) can in principle synthesize more optimal behavior from a dataset consisting only of suboptimal trials. One way that this…”
    Journal Article
  6. Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations by Hong, Joey, Lin, Jessica, Dragan, Anca, Levine, Sergey

    Published 07-11-2024
    “…Recent progress on large language models (LLMs) has enabled dialogue agents to generate highly naturalistic and plausible text. However, current LLM language…”
    Journal Article
  7. Learning to Influence Human Behavior with Offline Reinforcement Learning by Hong, Joey, Levine, Sergey, Dragan, Anca

    Published 03-03-2023
    “…When interacting with people, AI agents do not just influence the state of the world -- they also influence the actions people take in response to the agent,…”
    Journal Article
  8. On the Sensitivity of Reward Inference to Misspecified Human Models by Hong, Joey, Bhatia, Kush, Dragan, Anca

    Published 09-12-2022
    “…Inferring reward functions from human behavior is at the center of value alignment - aligning AI objectives with what we, humans, actually want. But doing so…”
    Journal Article
  9. Confidence-Conditioned Value Functions for Offline Reinforcement Learning by Hong, Joey, Kumar, Aviral, Levine, Sergey

    Published 08-12-2022
    “…Offline reinforcement learning (RL) promises the ability to learn effective policies solely using existing, static datasets, without any costly online…”
    Journal Article
  10. Strategically Conservative Q-Learning by Shimizu, Yutaka, Hong, Joey, Levine, Sergey, Tomizuka, Masayoshi

    Published 06-06-2024
    “…Offline reinforcement learning (RL) is a compelling paradigm to extend RL's practical utility by leveraging pre-collected, static datasets, thereby avoiding…”
    Journal Article
  11. Multi-Task Off-Policy Learning from Bandit Feedback by Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad

    Published 09-12-2022
    “…Many practical applications, such as recommender systems and learning to rank, involve solving multiple similar tasks. One example is learning of…”
    Journal Article
  12. When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? by Kumar, Aviral, Hong, Joey, Singh, Anikait, Levine, Sergey

    Published 12-04-2022
    “…Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing previously collected experience, without any online interaction. It…”
    Journal Article
  13. Compositional Generalization and Decomposition in Neural Program Synthesis by Shi, Kensen, Hong, Joey, Zaheer, Manzil, Yin, Pengcheng, Sutton, Charles

    Published 07-04-2022
    “…When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to…”
    Journal Article
  14. LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models by Abdulhai, Marwa, White, Isadora, Snell, Charlie, Sun, Charles, Hong, Joey, Zhai, Yuexiang, Xu, Kelvin, Levine, Sergey

    Published 29-11-2023
    “…Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional…”
    Journal Article
  15. Deep Hierarchy in Bandits by Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad

    Published 03-02-2022
    “…Mean rewards of actions are often correlated. The form of these correlations may be complex and unknown a priori, such as the preferences of a user for…”
    Journal Article
  16. Hierarchical Bayesian Bandits by Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad

    Published 12-11-2021
    “…Meta-, multi-task, and federated learning can be all viewed as solving similar tasks, drawn from a distribution that reflects task similarities. We provide a…”
    Journal Article
  17. ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis by Shi, Kensen, Hong, Joey, Deng, Yinlin, Yin, Pengcheng, Zaheer, Manzil, Sutton, Charles

    Published 25-07-2023
    “…When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to…”
    Journal Article
  18. Thompson Sampling with a Mixture Prior by Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad, Boutilier, Craig

    Published 10-06-2021
    “…We study Thompson sampling (TS) in online decision making, where the uncertain environment is sampled from a mixture distribution. This is relevant in…”
    Journal Article
  19. Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions by Hong, Joey, Sapp, Benjamin, Philbin, James

    Published 21-06-2019
    The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8454-8462
    “…We focus on the problem of predicting future states of entities in…”
    Journal Article
  20. Latent Programmer: Discrete Latent Codes for Program Synthesis by Hong, Joey, Dohan, David, Singh, Rishabh, Sutton, Charles, Zaheer, Manzil

    Published 01-12-2020
    “…In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output…”
    Journal Article