DESPOTSolver(<keyword arguments>)

Implementation of the ARDESPOT solver that closely follows the pseudocode of:

Each field may be set via keyword argument. The fields that correspond to algorithm parameters match the definitions in the paper exactly.


  • epsilon_0
  • xi
  • K
  • D
  • lambda
  • T_max
  • max_trials
  • bounds
  • default_action
  • rng
  • random_source
  • bounds_warnings
  • tree_in_info

Further information can be found in the field docstrings (e.g. ?DESPOTSolver.xi).
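
As a rough illustration of setting fields by keyword, the sketch below builds a solver for the Tiger problem. It assumes the POMDPs, POMDPModels, and ARDESPOT packages are installed and that the POMDPs.jl version in use provides initialstate; the numeric values for bounds, K, D, and T_max are arbitrary illustrative choices, not recommendations.

```julia
using POMDPs        # solve, action, actions, initialstate
using POMDPModels   # TigerPOMDP (example problem, an assumption of this sketch)
using ARDESPOT      # DESPOTSolver, IndependentBounds

pomdp = TigerPOMDP()

# Every field listed above can be passed as a keyword argument.
# The values here are illustrative only.
solver = DESPOTSolver(
    bounds = IndependentBounds(-20.0, 0.0),  # constant lower and upper bounds
    K = 100,                                 # number of scenarios
    D = 20,                                  # maximum tree depth
    T_max = 0.5,                             # planning time limit in seconds
)

planner = solve(solver, pomdp)

# Query the planner for an action at the initial belief.
a = action(planner, initialstate(pomdp))
```

Fields that are not passed keep their defaults, which can be inspected through the field docstrings.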

DefaultPolicyLB(policy; max_depth=nothing, final_value=(m,x)->0.0)
DefaultPolicyLB(solver; max_depth=nothing, final_value=(m,x)->0.0)

A lower bound calculated by running a default policy on the scenarios in a belief.

Keyword Arguments

  • max_depth::Union{Nothing,Int}=nothing: maximum depth of the simulation. The depth of the belief is subtracted automatically, so simulations for the bound run for max_depth-b.depth steps. If nothing, the solver's max depth is used.
  • final_value=(m,x)->0.0: a function (or callable object) that specifies an additional value to be added at the end of the simulation when max_depth is reached. This function is called with two arguments, a POMDP and a ScenarioBelief. It is not called when the states in the belief are terminal.
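
The depth budget and the role of final_value can be sketched in plain Julia. The code below does not use ARDESPOT; ToyBelief, step_toy, isterminal_toy, and rollout_lower_bound are hypothetical names invented for this illustration. It is also a simplification: it rolls out each scenario independently and calls final_value on a single state, whereas the documented final_value receives a POMDP and a ScenarioBelief.

```julia
# Hypothetical stand-in for a scenario belief: one state per scenario,
# plus the depth of the belief node in the search tree.
struct ToyBelief
    states::Vector{Int}
    depth::Int
end

const GAMMA = 0.9                    # discount factor of the toy problem

isterminal_toy(s) = s >= 5           # states 5 and above are terminal
step_toy(s) = (s + 1, -1.0)          # default policy: move right, reward -1

function rollout_lower_bound(b::ToyBelief; max_depth::Int, final_value = s -> 0.0)
    steps = max_depth - b.depth      # the belief depth is subtracted from max_depth
    total = 0.0
    for s0 in b.states
        s, disc, ret, n = s0, 1.0, 0.0, 0
        while n < steps && !isterminal_toy(s)
            s, r = step_toy(s)
            ret += disc * r
            disc *= GAMMA
            n += 1
        end
        # final_value is added only when the rollout stopped at a
        # non-terminal state, i.e. because the depth budget ran out.
        if !isterminal_toy(s)
            ret += disc * final_value(s)
        end
        total += ret
    end
    return total / length(b.states)  # average over scenarios
end

b = ToyBelief([0, 3], 2)
lb_default = rollout_lower_bound(b; max_depth = 4)
lb_bonus = rollout_lower_bound(b; max_depth = 4, final_value = s -> 10.0)
```

With max_depth = 4 and a belief at depth 2, each scenario is simulated for two steps. The scenario that starts at state 3 reaches a terminal state, so final_value contributes only to the other scenario.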

IndependentBounds(lower, upper; check_terminal=false, consistency_fix_thresh=0.0)

Specify lower and upper bounds that are independent of each other (the most common case).

Keyword Arguments

  • check_terminal::Bool=false: if true and all the states in the belief are terminal, the upper and lower bounds are overridden and set to 0.
  • consistency_fix_thresh::Float64=0.0: if upper < lower and upper >= lower-consistency_fix_thresh, then upper is raised to lower. Larger violations are left unchanged.
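
The two keyword rules above can be written out as a small standalone function. This is only a sketch of the documented behavior, not the package's implementation: adjust_bounds and its all_terminal flag are hypothetical names, and the order in which the two rules are applied is an assumption.

```julia
# Hypothetical helper that applies the two documented rules to a pair of
# already computed bound values. `all_terminal` says whether every state
# in the belief is terminal.
function adjust_bounds(lower::Float64, upper::Float64;
                       all_terminal::Bool = false,
                       check_terminal::Bool = false,
                       consistency_fix_thresh::Float64 = 0.0)
    # Rule 1: terminal beliefs get both bounds set to 0.
    if check_terminal && all_terminal
        return (0.0, 0.0)
    end
    # Rule 2: a small inconsistency (upper slightly below lower) is
    # repaired by raising upper to lower.
    if upper < lower && upper >= lower - consistency_fix_thresh
        upper = lower
    end
    return (lower, upper)
end

small_gap = adjust_bounds(1.0, 0.95; consistency_fix_thresh = 0.1)
large_gap = adjust_bounds(1.0, 0.5; consistency_fix_thresh = 0.1)
terminal = adjust_bounds(-3.0, 2.0; all_terminal = true, check_terminal = true)
```

Here the first call repairs the bounds because the gap of 0.05 is within the threshold, while the second leaves them inconsistent because the gap of 0.5 exceeds it.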

Return a vector of lower bounds L of length lenb+lenba, with b nodes first followed by ba nodes.


Fill all the elements of the cache for b and the children of b, and return L[b].