Reinforcement learning in episodic non-stationary Markovian environments

Samuel Ping Man Choi, Nevin L. Zhang, Dit Yan Yeung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Reinforcement learning in non-stationary environments is generally regarded as a very difficult problem. Without any prior knowledge about the environment, this problem can be unsolvable in the worst case. In this paper, we attempt to partially address this grand challenge by formalizing a broad class of non-stationary Markovian environments in which the state space, action space, transition function, and reward (or cost) function may change over time, but with some regularity. We call these environments episodic non-stationary Markovian environments (ENME); they form a fairly common class of non-stationary environments for characterizing many real-world decision problems. We begin with a special subclass of ENMEs called periodic non-stationary Markovian environments (PNME) and then generalize this subclass to more general and realistic forms. Afterwards, we show how the episodic property can be exploited to make the problems solvable by combining conventional reinforcement learning algorithms with the state augmentation method.
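The abstract's last sentence can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not reproduce; it only assumes the simplest periodic case, where the period is known in advance, and pairs ordinary tabular Q-learning with a state augmented by the phase `t % PERIOD`. The toy environment and all names here (`PERIOD`, `reward`, `step`, `q_learning`, `greedy_policy`) are hypothetical.

```python
import random

PERIOD = 2      # assumption: the period of the toy environment is known
N_STATES = 1    # assumption: a single underlying state, to keep the sketch small
N_ACTIONS = 2


def reward(state, action, t):
    """Hypothetical periodic reward: the rewarding action alternates with the phase."""
    return 1.0 if action == t % PERIOD else 0.0


def step(state, action, t):
    """Toy dynamics: the underlying state never changes; only the reward is periodic."""
    return state, reward(state, action, t)


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over the augmented state (state, t % PERIOD)."""
    rng = random.Random(seed)
    q = {(s, phase): [0.0] * N_ACTIONS
         for s in range(N_STATES) for phase in range(PERIOD)}
    state = 0
    for t in range(steps):
        aug = (state, t % PERIOD)  # state augmentation: append the phase
        if rng.random() < epsilon:
            action = rng.randrange(N_ACTIONS)  # explore
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[aug][a])  # exploit
        next_state, r = step(state, action, t)
        next_aug = (next_state, (t + 1) % PERIOD)
        # Standard Q-learning update, applied to the augmented state.
        q[aug][action] += alpha * (r + gamma * max(q[next_aug]) - q[aug][action])
        state = next_state
    return q


def greedy_policy(q):
    """Greedy action for every augmented state."""
    return {aug: max(range(N_ACTIONS), key=lambda a: values[a])
            for aug, values in q.items()}


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

Because the phase is part of the state, the learner sees a stationary problem and can pick a different action in each phase. A learner that saw only the underlying state would have a single Q-value per action and could not represent that phase-dependent policy.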

Original language: English
Title of host publication: Proceedings of the International Conference on Artificial Intelligence, IC-AI'04 and Proceedings of the International Conference on Machine Learning; Models, Technologies and Applications, MLMTA'04
Editors: H.R. Arabnia, M. Youngsong
Pages: 752-758
Number of pages: 7
Publication status: Published - 2004
Externally published: Yes
Event: Proceedings of the International Conference on Artificial Intelligence, IC-AI'04 - Las Vegas, NV, United States
Duration: 21 Jun 2004 - 24 Jun 2004

Publication series

Name: Proceedings of the International Conference on Artificial Intelligence, IC-AI'04
Volume: 2

Conference

Conference: Proceedings of the International Conference on Artificial Intelligence, IC-AI'04
Country/Territory: United States
City: Las Vegas, NV
Period: 21/06/04 - 24/06/04
