BasicPOMCP.POMCPSolver — Type
POMCPSolver(#=keyword arguments=#)
Partially Observable Monte Carlo Planning Solver.
Keyword Arguments
max_depth::Int
Rollouts and tree expansion will stop when this depth is reached. default: 20
c::Float64
UCB exploration constant; specifies how much the solver should explore. default: 1.0
tree_queries::Int
Number of iterations during each action() call. default: 1000
max_time::Float64
Maximum time for planning in each action() call. default: Inf
tree_in_info::Bool
If true, returns the tree in the info dict when action_info is called. default: false
estimate_value::Any
Function, object, or number used to estimate the value at the leaf nodes. default: RolloutEstimator(RandomSolver(rng))
- If this is a function f, f(pomdp, s, h::BeliefNode, steps) will be called to estimate the value.
- If this is an object o, estimate_value(o, pomdp, s, h::BeliefNode, steps) will be called.
- If this is a number, the value will be set to that number.
Note: In many cases, the simplest way to estimate the value is to do a rollout on the fully observable MDP with a policy that is a function of the state. To do this, use FORollout(policy).
default_action::Any
Function, action, or Policy used to determine the action if POMCP fails with exception ex. default: ExceptionRethrow()
- If this is a Function f, f(pomdp, belief, ex) will be called.
- If this is a Policy p, action(p, belief) will be called.
- If it is an object a, default_action(a, pomdp, belief, ex) will be called, and if this method is not implemented, a will be returned directly.
rng::AbstractRNG
Random number generator. default: Random.GLOBAL_RNG
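
The keyword arguments above can be combined as in the following minimal sketch. It assumes the POMDPModels package is installed for its TigerPOMDP example problem and that POMDPTools provides action_info; the specific values (c = 10.0, the seed, the query count) are illustrative choices, not recommendations from this documentation.

```julia
using POMDPs        # solve, initialstate, actions
using POMDPTools    # action_info (assumed to live here in current releases)
using POMDPModels   # TigerPOMDP, used only as an example problem
using BasicPOMCP    # POMCPSolver
using Random        # MersenneTwister

pomdp = TigerPOMDP()

# Every keyword is optional; unspecified ones keep the defaults listed above.
solver = POMCPSolver(
    max_depth = 20,            # stop rollouts and tree expansion at this depth
    c = 10.0,                  # UCB exploration constant (illustrative value)
    tree_queries = 1000,       # iterations per action() call
    tree_in_info = true,       # ask for the tree in the info dict
    rng = MersenneTwister(1),  # fixed seed for repeatable runs
)
planner = solve(solver, pomdp)

# Plan from the initial belief and inspect what came back.
a, info = action_info(planner, initialstate(pomdp))
println("chosen action: ", a)
println("tree returned: ", haskey(info, :tree))
```

Because tree_in_info is true, the search tree should be available under the :tree key of the returned info dict.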
BasicPOMCP.extract_belief — Function
extract_belief(rollout_updater::POMDPs.Updater, node::BeliefNode)
Return a belief compatible with the rollout_updater from the belief in node.
When a rollout simulation is started, this function is used to create the initial belief (compatible with rollout_updater) based on the appropriate BeliefNode at the edge of the tree. By overriding this, a belief can be constructed based on the entire tree or entire observation-action history.
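
As a rough illustration of overriding extract_belief, the sketch below adds a method for a custom updater type. FixedPriorUpdater is a hypothetical type invented for this example, and the method deliberately ignores the node and returns a stored prior; a realistic override would inspect the node or its tree instead.

```julia
using POMDPs
using BasicPOMCP

# Hypothetical updater (not part of BasicPOMCP) that carries a fixed prior belief.
struct FixedPriorUpdater{B} <: POMDPs.Updater
    prior::B
end

# New method following the signature documented above: whatever node the
# rollout starts from, hand the rollout updater the stored prior.
function BasicPOMCP.extract_belief(up::FixedPriorUpdater, node::BasicPOMCP.BeliefNode)
    return up.prior
end
```

Dispatching on the updater type keeps the override local to rollouts that use this updater and leaves the package's existing methods untouched.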
BasicPOMCP.rollout — Method
Perform a rollout simulation to estimate the value.
MCTS.estimate_value — Function
estimate_value(estimator, problem::POMDPs.POMDP, start_state, h::BeliefNode, steps::Int)
Return an initial unbiased estimate of the value at belief node h.
By default this runs a rollout simulation.
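
The default rollout can be replaced by passing a plain function as the estimate_value keyword of POMCPSolver, using the f(pomdp, s, h, steps) form described earlier. The sketch below assumes POMDPModels is available for TigerPOMDP; cost_per_step is a hypothetical heuristic made up for illustration and is not tuned to that problem's rewards.

```julia
using POMDPs
using POMDPModels   # TigerPOMDP, used only as an example problem
using BasicPOMCP

pomdp = TigerPOMDP()

# Hypothetical leaf-value heuristic: charge one unit of cost per remaining step.
# The arguments mirror the documented call f(pomdp, s, h::BeliefNode, steps).
cost_per_step(pomdp, s, h, steps) = -1.0 * steps

solver = POMCPSolver(estimate_value = cost_per_step, tree_queries = 200)
planner = solve(solver, pomdp)

a = action(planner, initialstate(pomdp))
println("chosen action: ", a)
```

A function estimator like this avoids the cost of simulating a rollout at every leaf, at the price of a cruder value estimate.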