Search Results - "Hong, Joey"

1
Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions by Hong, Joey, Sapp, Benjamin, Philbin, James

Published in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2019)
“…We focus on the problem of predicting future states of entities in complex, real-world driving scenarios. Previous research has approached this problem via…”

Conference Proceeding

2
Wearing the Witch Identity as a Way of Becoming in Shirley Jackson's We Have Always Lived in the Castle by Hong, Joey Junsu

Published 01-01-2022
“…This thesis aims to analyze Shirley Jackson’s last novel We Have Always Lived in the Castle (1962) through the lens of 17th-century New England history and…”

Dissertation

3
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning by Hong, Joey, Dragan, Anca, Levine, Sergey

Published 07-11-2024
“…Value-based reinforcement learning (RL) can in principle learn effective policies for a wide range of multi-turn problems, from games to dialogue to robotic…”

Journal Article

4
Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations by Hong, Joey, Levine, Sergey, Dragan, Anca

Published 09-11-2023
“…Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks. However, many of the most important applications of…”

Journal Article

5
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity by Hong, Joey, Dragan, Anca, Levine, Sergey

Published 31-10-2023
“…Offline reinforcement learning (RL) can in principle synthesize more optimal behavior from a dataset consisting only of suboptimal trials. One way that this…”

Journal Article

6
Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations by Hong, Joey, Lin, Jessica, Dragan, Anca, Levine, Sergey

Published 07-11-2024
“…Recent progress on large language models (LLMs) has enabled dialogue agents to generate highly naturalistic and plausible text. However, current LLM language…”

Journal Article

7
Learning to Influence Human Behavior with Offline Reinforcement Learning by Hong, Joey, Levine, Sergey, Dragan, Anca

Published 03-03-2023
“…When interacting with people, AI agents do not just influence the state of the world -- they also influence the actions people take in response to the agent,…”

Journal Article

8
On the Sensitivity of Reward Inference to Misspecified Human Models by Hong, Joey, Bhatia, Kush, Dragan, Anca

Published 09-12-2022
“…Inferring reward functions from human behavior is at the center of value alignment - aligning AI objectives with what we, humans, actually want. But doing so…”

Journal Article

9
Confidence-Conditioned Value Functions for Offline Reinforcement Learning by Hong, Joey, Kumar, Aviral, Levine, Sergey

Published 08-12-2022
“…Offline reinforcement learning (RL) promises the ability to learn effective policies solely using existing, static datasets, without any costly online…”

Journal Article

10
Strategically Conservative Q-Learning by Shimizu, Yutaka, Hong, Joey, Levine, Sergey, Tomizuka, Masayoshi

Published 06-06-2024
“…Offline reinforcement learning (RL) is a compelling paradigm to extend RL's practical utility by leveraging pre-collected, static datasets, thereby avoiding…”

Journal Article

11
Multi-Task Off-Policy Learning from Bandit Feedback by Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad

Published 09-12-2022
“…Many practical applications, such as recommender systems and learning to rank, involve solving multiple similar tasks. One example is learning of…”

Journal Article

12
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? by Kumar, Aviral, Hong, Joey, Singh, Anikait, Levine, Sergey

Published 12-04-2022
“…Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing previously collected experience, without any online interaction. It…”

Journal Article

13
Compositional Generalization and Decomposition in Neural Program Synthesis by Shi, Kensen, Hong, Joey, Zaheer, Manzil, Yin, Pengcheng, Sutton, Charles

Published 07-04-2022
“…When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to…”

Journal Article

14
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models by Abdulhai, Marwa, White, Isadora, Snell, Charlie, Sun, Charles, Hong, Joey, Zhai, Yuexiang, Xu, Kelvin, Levine, Sergey

Published 29-11-2023
“…Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional…”

Journal Article

15
Deep Hierarchy in Bandits by Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad

Published 03-02-2022
“…Mean rewards of actions are often correlated. The form of these correlations may be complex and unknown a priori, such as the preferences of a user for…”

Journal Article

16
Hierarchical Bayesian Bandits by Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad

Published 12-11-2021
“…Meta-, multi-task, and federated learning can be all viewed as solving similar tasks, drawn from a distribution that reflects task similarities. We provide a…”

Journal Article

17
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis by Shi, Kensen, Hong, Joey, Deng, Yinlin, Yin, Pengcheng, Zaheer, Manzil, Sutton, Charles

Published 25-07-2023
“…When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to…”

Journal Article

18
Thompson Sampling with a Mixture Prior by Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad, Boutilier, Craig

Published 10-06-2021
“…We study Thompson sampling (TS) in online decision making, where the uncertain environment is sampled from a mixture distribution. This is relevant in…”

Journal Article

19
Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions by Hong, Joey, Sapp, Benjamin, Philbin, James

Published 21-06-2019
“…The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8454-8462. We focus on the problem of predicting future states of entities in…”

Journal Article

20
Latent Programmer: Discrete Latent Codes for Program Synthesis by Hong, Joey, Dohan, David, Singh, Rishabh, Sutton, Charles, Zaheer, Manzil

Published 01-12-2020
“…In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output…”

Journal Article