AEMS

CI codecov.io

Implements anytime error minimization search (AEMS) solver for POMDPs. This algorithm was originally described in

Ross, Stéphane, and Brahim Chaib-Draa. "AEMS: An Anytime Online Search Algorithm for Approximate Policy Refinement in Large POMDPs." IJCAI. 2007.

Specifically, this solver is AEMS2, which outperforms AEMS1 in nearly all published experiments.

Installation

using Pkg
Pkg.add("AEMS")

Quick Use

using POMDPs, POMDPTools, AEMS, POMDPModels

pomdp = BabyPOMDP()
solver = AEMSSolver()
planner = solve(solver, pomdp)      # planner is of type AEMSPlanner

# once you have a belief b
a = action(planner, b)

Solver Options

The following keyword options are available; for examle, AEMSSolver(max_iterations = 100, max_time = 0.1).

  • max_iterations Maximum number of fringe expansions during one action. Defaults to 1000.
  • max_time Maximum time (in seconds) to spend on one action. Defaults to 1 second.
  • updater The updater used to propagate beliefs in the tree. Defaults to a discrete updater.
  • lower_bound Defaults to a fixed-action policy.
  • upper_bound Defaults to a policy generated by FIB.
  • root_manager Determines how the root changes once an action is taken and an observation is received. This process is further described below. Allowed values are :clear, :belief, :user. Defaults to :clear.
  • action_selector Determines if action with best upper or lower bound should be selected after tree expansion. Defaults to :U to select action with best upper bound; use :L to select action with best lower bound.

Bounds

The upper and lower bounds must be subtypes of the Policy type and have the value function implemented. This function is then used to estimate the bound values at new beliefs.

Root Management

Once an action is taken and an observation is received, the root of the search tree becomes the updated belief. However, due to the structure of POMDPs.jl, the planner is not made aware of the resulting observation. Therefore we provide three different ways to manage the root.

1. Clearing the tree after each action

This solves the root issue by making a new tree with the current belief as the node, but it's probably less efficient to throw away the tree and start from scratch for each call to action. Tree clearing is the default behavior, but if you want to be explicit you can call AEMSSolver with the option root_manager = :clear.

2. Searching through child beliefs of previous root

The updated belief should be equal to one of the child beliefs of the previous root. We can set the root to whichever of the children match the updated belief. However, this requires == to be defined for the belief type you are using. Further, the beliefs must exactly match (no tolerance for slight numerical differences). If none of the child nodes match the belief provided to action, an error is thrown (perhaps it should just clear the tree and start over). Note that multiple child beliefs can be identical. In this case, the root is set to the first matching belief (perhaps it should select the one which has the tightest bounds). To use this method, call AEMSSolver with the option root_manager = :belief.

3. The user can provide the planner with the action taken and observation received.

This option is probably the most efficient, because the tree can be reused and no searching over beliefs must be performed. The planner can then follow the action and observation down the tree to the next belief node. To provide this information, the user can call update_root(planner, a, o), where a and o are the action and observtion. To use this method, call AEMSSolver with the option root_manager = :user. Note that update_root will throw an error if the root_manager is not set to :user; don't mess with the root unless you specify that it's the user's job.

Unfortunately, the user can't personally call update_root after each action/observation pair during simulation. Nor do existing simulators take care to update the planner (only the belief). Therefore, AEMS.jl has re-implemented some of the simulators from POMDPToolbox.jl for the AEMSPlanner type (so far only TestSimulator). The only changes are the following three lines of code after generating a new observation:

if planner.root_manager == :user
    update_root(planner, a, o)
end

Obviously, this solution is horrifying. Any simulation that wants to take advantage of the update_root function must be specifically made for the AEMSPlanner type. While the other root management methods, like belief-searching, don't require any changes to simulations, I'm not sure they are the optimal solution. It can be expensive if there are many beliefs and each belief is very large. There is also the issue of beliefs that should be the same but have small numerical differences; these will not be recognized as the same belief and the root will go un-updated. In the long-term, I think POMDPs.jl should implement a update(policy, a, o) function that defaults to doing nothing for all policies.

Visualization

Once you have a planner and have called action, you can use the following code to bring up an interactive tree in a Chrome browser window. Click a node to expand/unexpand it.

using D3Trees

tree = D3Tree(planner)      # creates a visualization tree
inchrome(tree)              # opens chrome tab to show visualization tree