Docstrings · DiscreteValueIteration.jl

DiscreteValueIteration.DiscreteValueIteration — Module

This module implements a value iteration solver that uses the interface defined in POMDPs.jl.
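A minimal usage sketch may help here. It assumes the POMDPModels.jl package is installed and uses its `SimpleGridWorld` problem as a stand-in MDP; neither is named in this docstring. It also assumes the usual POMDPs.jl interface functions `solve`, `action`, `value`, and `states`.

```julia
using POMDPs                  # interface functions: solve, action, value, states
using POMDPModels             # assumption: supplies the SimpleGridWorld example MDP
using DiscreteValueIteration  # ValueIterationSolver, ValueIterationPolicy

# Any discrete MDP implementing the POMDPs.jl interface should work here.
mdp = SimpleGridWorld()

# Configure the solver through keyword arguments and solve the MDP.
solver = ValueIterationSolver(max_iterations=100, belres=1e-6, verbose=false)
policy = solve(solver, mdp)

# Query the resulting policy for one state.
s = first(states(mdp))
a = action(policy, s)   # greedy action in state s
v = value(policy, s)    # estimated utility of state s
```

The solver and the returned policy are described in the type docstrings on this page.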

DiscreteValueIteration.ValueIterationPolicy — Type

ValueIterationPolicy <: Policy

The policy type. Contains the Q-matrix, the utility function, and an array of indices corresponding to the optimal actions. There are three ways to initialize the policy type:

`policy = ValueIterationPolicy(mdp)` 
`policy = ValueIterationPolicy(mdp, utility_array)`
`policy = ValueIterationPolicy(mdp, qmatrix)`

The Q-matrix is n × m, where n is the number of states and m is the number of actions.

Fields

- qmat, the Q-matrix storing the Q(s,a) values
- util, the value function V(s)
- policy, the policy array, which maps a state index to an action index
- action_map, maps an action index to the concrete action type
- include_Q, flag for including the Q-matrix
- mdp, the model, used for indexing in the action function
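To illustrate how these fields relate to one another, here is a small self-contained sketch in plain Julia. The numbers and the `action_map` symbols are made up for illustration, and the code mirrors the field names only; it does not construct an actual `ValueIterationPolicy`.

```julia
# Hypothetical Q-matrix for 3 states (rows) and 2 actions (columns).
qmat = [1.0 2.0;
        0.5 0.1;
        3.0 3.5]

# V(s) = max over a of Q(s, a): one utility value per state.
util = vec(maximum(qmat, dims=2))

# Greedy policy: for each state index, the index of the best action.
policy = [argmax(qmat[s, :]) for s in axes(qmat, 1)]

# Hypothetical action map from action index to a concrete action.
action_map = [:left, :right]
best_actions = action_map[policy]
```

Under these assumptions, `util` holds the row-wise maxima of `qmat` and `policy` holds the matching column indices, which is why a policy can be rebuilt from either a utility array or a Q-matrix.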

DiscreteValueIteration.ValueIterationSolver — Type

ValueIterationSolver <: Solver

The solver type. Contains the following parameters, which can be passed as keyword arguments to the constructor:

- max_iterations::Int64, the maximum number of iterations value iteration runs for (default 100)
- belres::Float64, the Bellman residual threshold; iteration stops once the residual falls below this value (default 1e-3)
- verbose::Bool, if set to true, the Bellman residual and the time per iteration will be printed to stdout (default false)
- include_Q::Bool, if set to true, the solver outputs the Q values in addition to the utility and the policy (default true)
- init_util::Vector{Float64}, provides a custom initialization of the utility vector. (initializes utility to 0 by default)
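The role of these parameters can be sketched with a small, self-contained value iteration loop in plain Julia. This is an illustration under stated assumptions, not the package's implementation: the function name `value_iteration_sketch`, the dense arrays `T` and `R`, and the `discount` argument are all hypothetical, and the sketch updates utilities in place during each sweep.

```julia
# T[s, a, sp] is the probability of moving from state s to sp under action a.
# R[s, a] is the immediate reward. Both are hypothetical dense representations.
function value_iteration_sketch(T::Array{Float64,3}, R::Matrix{Float64}, discount::Float64;
                                max_iterations::Int=100, belres::Float64=1e-3,
                                init_util::Vector{Float64}=Float64[])
    ns, na = size(R)
    # Start from zeros unless a custom initialization is supplied.
    util = isempty(init_util) ? zeros(ns) : copy(init_util)
    qmat = zeros(ns, na)
    iterations = 0
    for i in 1:max_iterations
        iterations = i
        residual = 0.0
        for s in 1:ns
            for a in 1:na
                # Bellman backup for Q(s, a).
                qmat[s, a] = R[s, a] + discount * sum(T[s, a, sp] * util[sp] for sp in 1:ns)
            end
            new_u = maximum(qmat[s, :])
            # Bellman residual: largest change in utility during this sweep.
            residual = max(residual, abs(new_u - util[s]))
            util[s] = new_u
        end
        # Stop early once the residual drops below the threshold.
        residual < belres && break
    end
    policy = [argmax(qmat[s, :]) for s in 1:ns]
    return qmat, util, policy, iterations
end

# Two-state example: state 2 pays reward 1 forever; state 1 can stay or move to 2.
T = zeros(2, 2, 2)
T[1, 1, 1] = 1.0   # state 1, action 1: stay in state 1
T[1, 2, 2] = 1.0   # state 1, action 2: move to state 2
T[2, 1, 2] = 1.0   # state 2 is absorbing under both actions
T[2, 2, 2] = 1.0
R = [0.0 0.0;
     1.0 1.0]

qmat, util, policy, iterations = value_iteration_sketch(T, R, 0.9;
                                                        max_iterations=1000, belres=1e-6)
```

In this sketch a smaller `belres` trades more sweeps for a tighter value estimate, while `max_iterations` caps the work when the threshold is not reached.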