ICAPS Workshop on

POMDP, Classification and Regression: Relationships and Joint Utilization

June 7, 2006
Ambleside, The English Lake District, U.K.


Schedule

9:00 - 9:30 A Brief Tutorial on the Partially Observable Markov Decision Process and Its Applications   [Slides]
Lawrence Carin
9:30 - 10:00 Optimal Sensor Scheduling via Classification Reduction of Policy Search (CROPS)   [Paper]   [Slides]
Doron Blatt and Alfred O. Hero
      Abstract   The problem of sensor scheduling in multi-modal sensing systems is formulated as the sequential choice of experiments problem and solved via reinforcement learning methods. The sequential choice of experiments problem is a partially observable Markov decision process (POMDP) in which the underlying state of nature is the system’s state and the sensors’ data are noisy state observations. The goal is to find a policy that sequentially determines, based on past data, the best sensor to deploy so as to maximize a given utility function while minimizing the deployment cost. Several examples are considered in which the exact model of the measurements given the state of nature is unknown but a generative model (a simulation or an experiment) is available. The problem is formulated as a reinforcement learning problem and solved via a reduction to a sequence of supervised classification subproblems. Finally, a simulation and an experiment with real data demonstrate the promise of our approach.
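The classification reduction mentioned in the abstract can be pictured with a small, self-contained sketch of a single scheduling stage (this is an assumed toy problem, not the authors' CROPS implementation): rollouts from a generative model label each situation with the sensor whose Monte Carlo utility estimate is highest, and a simple classifier trained on those labels becomes the scheduling policy. The sensors, costs, scalar feature summarizing past data, and nearest-centroid classifier below are all illustrative assumptions.

    # Illustrative sketch, not the paper's CROPS algorithm: reduce one stage of
    # sensor scheduling to a supervised classification problem.
    import random

    ACTIONS = ["EMI", "GPR"]                 # candidate sensors to deploy next
    random.seed(0)

    def simulate_utility(feature, action):
        """Generative model (assumed): noisy utility of deploying `action` when the
        data gathered so far are summarized by a scalar `feature` in [0, 1]."""
        if action == "EMI":
            noise, cost = 0.2 + feature, 1.0     # EMI degrades as `feature` grows
        else:
            noise, cost = 0.3, 3.0               # GPR is robust but more expensive
        measurement_error = abs(random.gauss(0.0, noise))
        return -measurement_error - 0.1 * cost

    # 1. Roll out every candidate action from sampled situations; label each
    #    situation with the action whose Monte Carlo utility estimate is highest.
    dataset = []
    for _ in range(1000):
        feature = random.uniform(0.0, 1.0)
        score = {a: sum(simulate_utility(feature, a) for _ in range(30)) / 30
                 for a in ACTIONS}
        dataset.append((feature, max(score, key=score.get)))

    # 2. Solve the resulting supervised classification subproblem with a trivial
    #    nearest-centroid classifier; the learned rule is the scheduling policy.
    centroid = {a: sum(f for f, lab in dataset if lab == a) /
                   max(1, sum(1 for _, lab in dataset if lab == a))
                for a in ACTIONS}

    def learned_policy(feature):
        return min(ACTIONS, key=lambda a: abs(feature - centroid[a]))

    print("policy(0.1) ->", learned_policy(0.1))   # cheap EMI should suffice here
    print("policy(0.9) ->", learned_policy(0.9))   # worth paying for GPR here

In the paper's multi-stage setting this labeling-and-fitting step is repeated over a sequence of decision stages; the sketch shows only the core reduction from policy search to classification.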
10:00 - 10:30 Adaptation of the Simulated Risk Disambiguation Protocol to a Discrete Setting   [Paper]   [Slides]
Al Aksakalli, Donniell E. Fishkind, and Carey E. Priebe
      Abstract   Suppose a spatial arrangement of possibly hazardous regions needs to be speedily and safely traversed, and there is a dynamic capability of discovering the true nature of each hazard when in close proximity to it; the traversal may enter the associated region only if it is revealed to be nonhazardous. The problem of identifying an optimal policy for where and when to execute disambiguations so as to minimize the expected length of the traversal can be cast both as a completely observed Markov decision process (MDP) and a partially observed Markov decision process (POMDP) and has been proven intractable in many broad settings. In this manuscript, we adapt the basic strategy of the simulated risk disambiguation protocol of Fishkind et al. (2006) to a different, discretized setting (a Canadian Traveller Problem with dependent edge probabilities), and we compare the performance of this adapted policy against the performance of the optimal policy on a class of instances that are small enough for the optimal policy to be computed. On randomly generated instances of this class, the adapted simulated risk disambiguation protocol performed nearly as well as the optimal protocol, and used significantly fewer computational resources.
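For readers unfamiliar with the simulated-risk idea, the sketch below shows one way to act in a small Canadian-Traveller-style graph: inflate the length of each unresolved, possibly blocked edge in proportion to its blockage probability, plan a shortest path on the inflated lengths, disambiguate on contact, and replan whenever an edge is revealed to be blocked. The graph, costs, and risk weight are assumptions for illustration; this is not the protocol variant or the instance class evaluated in the paper.

    # Illustrative sketch of a risk-weighted replanning traverser (assumed toy graph).
    import heapq, random

    # edge -> (length, probability that the edge is actually blocked)
    EDGES = {
        ("s", "a"): (1.0, 0.0), ("a", "g"): (1.0, 0.6),   # short but risky route
        ("s", "b"): (2.0, 0.0), ("b", "g"): (2.5, 0.0),   # longer but safe route
    }
    DISAMBIGUATION_COST = 0.2
    RISK_WEIGHT = 4.0   # how strongly a blockage probability inflates an edge length

    def neighbors(node, blocked):
        """Yield (next_node, edge, length, p_blocked) for usable edges at `node`."""
        for (u, v), (length, p) in EDGES.items():
            if (u, v) in blocked:
                continue
            if u == node:
                yield v, (u, v), length, p
            elif v == node:
                yield u, (u, v), length, p

    def risk_shortest_path(start, goal, blocked, resolved):
        """Dijkstra on risk-inflated lengths; resolved open edges keep true lengths."""
        dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
        while heap:
            d, node = heapq.heappop(heap)
            if node == goal:                       # reconstruct [(node, edge), ...]
                path, n = [], goal
                while n != start:
                    pred, edge = prev[n]
                    path.append((n, edge))
                    n = pred
                return list(reversed(path))
            if d > dist.get(node, float("inf")):
                continue
            for nxt, edge, length, p in neighbors(node, blocked):
                w = length if edge in resolved else length + RISK_WEIGHT * p
                if d + w < dist.get(nxt, float("inf")):
                    dist[nxt], prev[nxt] = d + w, (node, edge)
                    heapq.heappush(heap, (dist[nxt], nxt))
        return None

    def traverse(start="s", goal="g", seed=1):
        random.seed(seed)
        node, cost, blocked, resolved = start, 0.0, set(), set()
        while node != goal:
            nxt, edge = risk_shortest_path(node, goal, blocked, resolved)[0]
            length, p = EDGES[edge]
            if p > 0.0 and edge not in resolved:   # ambiguous edge: disambiguate first
                cost += DISAMBIGUATION_COST
                resolved.add(edge)
                if random.random() < p:            # revealed to be blocked: replan
                    blocked.add(edge)
                    continue
            cost += length                         # edge is traversable: take it
            node = nxt
        return cost

    print("traversal length including disambiguation costs:", round(traverse(), 2))

Raising RISK_WEIGHT makes the traverser more conservative (it prefers the safe route outright); lowering it makes the traverser gamble on short, uncertain edges and fall back on replanning when a disambiguation reveals a blockage.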
10:30 - 11:00 Coffee Break
11:00 - 11:30 Application of Partially Observable Markov Decision Processes to Robot Navigation in a Minefield   [Paper]   [Slides]
Lihan He, Shihao Ji, and Lawrence Carin
      Abstract   We consider the problem of a robotic sensing system navigating in a minefield, with the goal of detecting potential mines at low false alarm rates. Two types of sensors are used, namely, electromagnetic induction (EMI) and ground-penetrating radar (GPR). A partially observable Markov decision process (POMDP) is used as the decision framework for the minefield problem. The POMDP model is trained with physics-based features of various mines and clutters of interest. The training data are assumed sufficient to produce a reasonably good model. We give a detailed description of the POMDP formulation for the minefield problem and provide example results based on measured EMI and GPR data.
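As background for the talk, the minimal sketch below shows the measurement-only belief update a POMDP framework performs when fusing EMI and GPR observations over hidden target types (a Bayes-rule update with no state transition). The confusion matrices, prior, and declaration threshold are hypothetical assumptions; they are not the physics-based model trained in the paper.

    # Illustrative discrete belief update for a two-sensor POMDP (assumed numbers).
    STATES = ["mine", "clutter"]
    PRIOR = {"mine": 0.3, "clutter": 0.7}

    # P(observation | state) for each sensor; each row sums to 1 over observations.
    OBS_MODEL = {
        "EMI": {"mine":    {"metal": 0.8, "no_metal": 0.2},
                "clutter": {"metal": 0.4, "no_metal": 0.6}},
        "GPR": {"mine":    {"anomaly": 0.9, "clear": 0.1},
                "clutter": {"anomaly": 0.2, "clear": 0.8}},
    }

    def belief_update(belief, sensor, observation):
        """Bayes rule: b'(s) is proportional to P(o | s, sensor) * b(s)."""
        unnormalized = {s: OBS_MODEL[sensor][s][observation] * belief[s]
                        for s in STATES}
        z = sum(unnormalized.values())
        return {s: unnormalized[s] / z for s in STATES}

    belief = PRIOR
    for sensor, obs in [("EMI", "metal"), ("GPR", "anomaly")]:
        belief = belief_update(belief, sensor, obs)
        print(f"after {sensor} reports {obs!r}: P(mine) = {belief['mine']:.3f}")

    # A simple declaration rule on top of the belief (threshold is an assumption):
    print("declare mine" if belief["mine"] > 0.9 else "continue sensing")

In a full POMDP policy, the choice of which sensor to deploy next (or whether to declare and move on) is itself optimized over such beliefs, trading detection performance against deployment cost.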
11:30 - 12:30 Panel (Open) Discussions
Topics (Questions)
  • What are emerging applications where POMDP theory can have impact (military and civilian) and what kinds of models are appropriate for these?
  • Common datasets for evaluating POMDP algorithms for sensing applications.
  • What are the main bottlenecks in POMDP policy search, and how can they be addressed?
  • In what types of problems have POMDP approaches worked, i.e., outperformed myopic approaches, and how can such problems be identified?
  • Reinforcement learning approaches for approximating optimal POMDP policies: are they practical?
We will add to this list of questions/topics over time. If you have interesting questions or topics relevant to the workshop theme, you are welcome to send them to us and we will post them here.