Universal Reinforcement Learning

We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence future observations and costs. The goal is to minimize the long-term average cost. We propose a novel algorithm, known as the ac...

Bibliographic Details
Published in:	IEEE Transactions on Information Theory, Vol. 56, no. 5, pp. 2441-2454
Main Authors:	Farias, Vivek F.; Moallemi, Ciamac C.; Van Roy, Benjamin; Weissman, Tsachy
Format:	Journal Article
Language:	English
Published:	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-05-2010
Subjects:	Active control; Algorithms; Applied sciences; Australia; Coding, codes; Context tree; Cost engineering; Cost function; Data compression; Dynamic programming; Engineering management; Exact sciences and technology; Games; History; Information systems; Information theory; Information, signal and communications theory; Iterative method; Learning; Lempel-Ziv; Lempel-Ziv algorithm; Optimal control; Optimization; Reinforcement; Reinforcement learning; Signal and communications theory; Technology management; Telecommunications and information theory; Value iteration