DiscreteValueIteration.DiscreteValueIteration — Module

This module implements a value iteration solver that uses the interface defined in POMDPs.jl.
DiscreteValueIteration.ValueIterationPolicy — Type

ValueIterationPolicy <: Policy
The policy type. Contains the Q-matrix, the utility function, and an array of indices corresponding to the optimal actions. There are three ways to initialize the policy type:
`policy = ValueIterationPolicy(mdp)`
`policy = ValueIterationPolicy(mdp, utility_array)`
`policy = ValueIterationPolicy(mdp, qmatrix)`
The Q-matrix is n×m, where n is the number of states and m is the number of actions.
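As a minimal sketch of the third form (assuming `mdp` is any model implementing the POMDPs.jl MDP interface; the zero matrix below is only a placeholder for precomputed Q values), the dimensions follow the convention above:

```julia
using POMDPs, DiscreteValueIteration

# `mdp` is assumed to be a discrete MDP implementing the POMDPs.jl interface
n = length(states(mdp))    # number of states
m = length(actions(mdp))   # number of actions

# n×m Q-matrix with rows indexed by state and columns by action;
# zeros stand in for values computed elsewhere
qmat = zeros(n, m)

policy = ValueIterationPolicy(mdp, qmat)
```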
Fields

- qmat: Q-matrix storing the Q(s,a) values
- util: the value function V(s)
- policy: policy array, maps state index to action index
- action_map: maps the action index to the concrete action type
- include_Q: flag for including the Q-matrix
- mdp: uses the model for indexing in the action function
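After solving (see the solver example below), these fields can be inspected directly. A sketch, where `si` and `ai` are hypothetical state and action indices used only for illustration:

```julia
# `policy` is a ValueIterationPolicy returned by solve
v      = policy.util[si]            # V(s) for state index si
q      = policy.qmat[si, ai]        # Q(s,a) for state index si, action index ai
ai_opt = policy.policy[si]          # index of the optimal action at state si
a      = policy.action_map[ai_opt]  # the corresponding concrete action
```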
DiscreteValueIteration.ValueIterationSolver — Type

ValueIterationSolver <: Solver
The solver type. It contains the following parameters, which can be passed as keyword arguments to the constructor (see the example after this list):
- max_iterations::Int64, the maximum number of iterations value iteration runs for (default 100)
- belres::Float64, the Bellman residual (default 1e-3)
- verbose::Bool, if set to true, the Bellman residual and the time per iteration will be printed to STDOUT (default false)
- include_Q::Bool, if set to true, the solver outputs the Q values in addition to the utility and the policy (default true)
- init_util::Vector{Float64}, provides a custom initialization of the utility vector (the utility is initialized to zero by default)
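A minimal end-to-end sketch, assuming SimpleGridWorld from POMDPModels.jl as the example model (any discrete MDP implementing the POMDPs.jl interface works); the keyword values are illustrative:

```julia
using POMDPs, POMDPModels, DiscreteValueIteration

mdp = SimpleGridWorld()

# keyword arguments correspond to the parameters listed above
solver = ValueIterationSolver(max_iterations=500, belres=1e-6, verbose=true, include_Q=true)

# run value iteration; returns a ValueIterationPolicy
policy = solve(solver, mdp)

# query the result through the standard POMDPs.jl interface
s = first(states(mdp))
action(policy, s)   # optimal action at state s
value(policy, s)    # estimated value V(s)
```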