Docstrings · ARDESPOT.jl

DESPOTSolver(<keyword arguments>)

Implementation of the ARDESPOT solver, written to closely match the pseudocode in:

http://bigbird.comp.nus.edu.sg/m2ap/wordpress/wp-content/uploads/2017/08/jair14.pdf

Each field may be set via keyword argument. The fields that correspond to algorithm parameters match the definitions in the paper exactly.

Fields

Further information can be found in the field docstrings (e.g. ?DESPOTSolver.xi)

DefaultPolicyLB(policy; max_depth=nothing, final_value=(m,x)->0.0)
DefaultPolicyLB(solver; max_depth=nothing, final_value=(m,x)->0.0)

A lower bound calculated by running a default policy on the scenarios in a belief.

Keyword Arguments

max_depth::Union{Nothing,Int}=nothing: maximum depth to which the simulation is run. The depth of the belief is subtracted automatically, so simulations for the bound run for max_depth-b.depth steps. If nothing, the solver's max depth is used.
final_value=(m,x)->0.0: a function (or callable object) that specifies an additional value to be added at the end of the simulation when max_depth is reached. This function will be called with two arguments, a POMDP, and a ScenarioBelief. It will not be called when the states in the belief are terminal.
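
The depth accounting described above can be sketched in plain Julia. This is only an illustration of the documented rule, not ARDESPOT.jl's own code: the helper name rollout_steps and its arguments are hypothetical, and it assumes that the belief depth is also subtracted when the solver's max depth is used as the fallback, and that the step count is clamped at zero.

```julia
# Hypothetical helper, not part of ARDESPOT.jl: how many steps a
# lower-bound rollout would run for a belief at depth `belief_depth`.
function rollout_steps(max_depth::Union{Nothing,Int},
                       solver_max_depth::Int,
                       belief_depth::Int)
    # Fall back to the solver's max depth when max_depth is nothing.
    limit = max_depth === nothing ? solver_max_depth : max_depth
    # Subtract the depth already consumed by the belief.
    # Clamping at zero is an assumption made for this sketch.
    return max(limit - belief_depth, 0)
end

rollout_steps(nothing, 90, 3)  # uses the solver's max depth
rollout_steps(10, 90, 3)       # uses the explicit max_depth
```

Under these assumptions, a belief that is already deeper than max_depth gets a zero-step rollout, so only final_value would contribute to the bound.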

IndependentBounds(lower, upper; check_terminal=false, consistency_fix_thresh=0.0)

Specify lower and upper bounds that are independent of each other (the most common case).

Keyword Arguments

check_terminal::Bool=false: if true and all the states in the belief are terminal, the upper and lower bounds are overridden and set to 0.
consistency_fix_thresh::Float64=0.0: if upper < lower and upper >= lower-consistency_fix_thresh, then upper will be bumped up to lower.
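
The consistency_fix_thresh rule can be written out as a small standalone function. This is a sketch of the condition as stated above, not the package's implementation; the name fix_upper is hypothetical.

```julia
# Hypothetical helper mirroring the documented rule; not ARDESPOT.jl's own code.
function fix_upper(lower::Float64, upper::Float64, thresh::Float64)
    # Only repair small inconsistencies: upper is below lower,
    # but by no more than the threshold.
    if upper < lower && upper >= lower - thresh
        return lower
    end
    # Otherwise leave the upper bound untouched.
    return upper
end

fix_upper(1.0, 0.95, 0.1)  # gap within the threshold: upper is bumped to lower
fix_upper(1.0, 0.5, 0.1)   # gap exceeds the threshold: upper is left as is
fix_upper(1.0, 2.0, 0.1)   # already consistent: upper is left as is
```

With the default threshold of 0.0, the condition can never hold, so the upper bound is never adjusted.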

Return a vector of lower bounds L of length lenb+lenba, with b nodes first followed by ba nodes.

Fill all the elements of the cache for b and the children of b, and return L[b].