BeliefGridValueIteration.BeliefGridValueIterationPolicyType
BeliefGridValueIterationPolicy{A}

A Policy object containing a belief grid and the value at each belief points.

Fields

  • m::Int64 The granularity of the belief grid
  • Vmap::Dict{Vector{Int}, Int} The mapping from vertices in the grid (freudenthal space) to indices in the value vector
  • val::Vector{Float64} The values for each belief points
  • pol::Vector{A} The action at each belief point.
BeliefGridValueIteration.BeliefGridValueIterationSolverType
BeliefGridValueIterationSolver

An offline POMDP solver from "Computationally Feasible Bounds for Partially Observed Markov Decision Processes" (1991), by W. S. Lovejoy. It computes an upper bound on the value function by performing value iteration on a discretized belief space.

Options

  • m::Int64 = 1 Granularity of the belief grid for the triangulation
  • precision::Float64 = 0.0 The solver stops when the desired convergence precision is reached
  • max_iterations::Int64 = 100 Number of iteration of value iteration
  • verbose::Bool = false whether or not the solver prints information
BeliefGridValueIteration.barycentric_coordinatesMethod
barycentric_coordinates(x::Vector{Int64}, V::Vector{Vector{Int64}})

Given a point x and its simplex V in the Freudenthal grid, returns the barycentric coordinates of x in the grid. V must be in the same order as provided by the output of freudenthal_simplex

BeliefGridValueIteration.freudenthal_matrix_invMethod
freudenthal_matrix_inv(n::Int64, m::Int64)

returns the inverse of the matrix used to switch from Freudenthal space to belief space. Let IFM = freudenthal_matrix_inv(n, m), then x = IFM * b

BeliefGridValueIteration.freudenthal_verticesMethod
freudenthal_vertices(n::Int64, m::Int64)

Construct the list of Freudenthal vertices in an n dimensional space with grid resolution m. The vertices are represented by a list of n dimensional vectors.

BeliefGridValueIteration.lovejoy_upper_boundFunction
lovejoy_upper_bound(pomdp, m, ϵ, k_max, verbose=false)

Construct the belief grid and perform some preprocessing operations first (see lovejoyupperbound_data). Then run vectorized value iteration over the belief grid. Returns the value at each belief point, the associated best action, and a mapping between belief points (in the freudenthal space) and their index.

BeliefGridValueIteration.lovejoy_upper_bound_dataFunction
lovejoy_upper_bound_data(pomdp::SparseTabularPOMDP, m::Int64, verbose=false)

Precompute useful quantities before computing the upper bound:

  • Construct the belief space triangulation
  • Store mapping from vertices to index
  • Compute the next belief points
  • Compute the coordinates of the next belief points in the grid
  • Precompute R(b, a)
  • Precompute Pr(o | b, a)
BeliefGridValueIteration.to_beliefMethod
to_belief(x, m)

Transform a point x in the Freudenthal space to a point in the belief space. m is the resolution of the Freudenthal grid.

BeliefGridValueIteration.to_freudenthalMethod
to_freudenthal(b, m::Int64)

Transform a point b in the belief space to a point in the Freudenthal space. m is the resolution of the Freudenthal grid.

BeliefGridValueIteration.update_batchMethod
update_batch(B::Vector{Vector{R}}, pomdp::SparseTabularPOMDP) where R <: Real

Compute the next belief point starting from B, for every action and observation in the POMDP. Return a 4 dimensional array of size n_states x n_belief_points x n_actions x n_observations

BeliefGridValueIteration.update_single!Method
update_single!(OT::AbstractArray, b::Vector{R}, bp::AbstractArray) where R<:Real

update a single belief point using a precomputed observation * transition matrix for a given o,a pair