BeliefGridValueIteration.BeliefGridValueIterationPolicy — Type
BeliefGridValueIterationPolicy{A}
A Policy object containing a belief grid and the value at each belief point.
Fields
m::Int64
The granularity of the belief grid.
Vmap::Dict{Vector{Int}, Int}
The mapping from vertices in the grid (Freudenthal space) to indices in the value vector.
val::Vector{Float64}
The value at each belief point.
pol::Vector{A}
The action at each belief point.
BeliefGridValueIteration.BeliefGridValueIterationSolver — Type
BeliefGridValueIterationSolver
An offline POMDP solver from "Computationally Feasible Bounds for Partially Observed Markov Decision Processes" (1991), by W. S. Lovejoy. It computes an upper bound on the value function by performing value iteration on a discretized belief space.
Options
m::Int64 = 1
Granularity of the belief grid for the triangulation.
precision::Float64 = 0.0
The solver stops when the desired convergence precision is reached.
max_iterations::Int64 = 100
Maximum number of iterations of value iteration.
verbose::Bool = false
Whether or not the solver prints information.
BeliefGridValueIteration.barycentric_coordinates — Method
barycentric_coordinates(x::Vector{Int64}, V::Vector{Vector{Int64}})
Given a point x and its simplex V in the Freudenthal grid, returns the barycentric coordinates of x in the grid. V must be in the same order as provided by the output of freudenthal_simplex.
BeliefGridValueIteration.freudenthal_matrix_inv — Method
freudenthal_matrix_inv(n::Int64, m::Int64)
Returns the inverse of the matrix used to switch from Freudenthal space to belief space. Let IFM = freudenthal_matrix_inv(n, m); then x = IFM * b, where b is a point in belief space and x is the corresponding point in Freudenthal space.
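As a rough illustration of the relation x = IFM * b, here is a minimal sketch. It assumes the common Lovejoy convention x[i] = m * sum(b[i:end]), which the text above does not spell out, and the helper name belief_to_freudenthal_matrix is hypothetical rather than part of the package.

```julia
using LinearAlgebra

# Hypothetical helper (not the package's implementation): n x n matrix that maps
# a belief b to Freudenthal coordinates x, assuming x[i] = m * sum(b[i:end]).
belief_to_freudenthal_matrix(n::Int, m::Int) = [j >= i ? m : 0 for i in 1:n, j in 1:n]

n, m = 3, 2
IFM = belief_to_freudenthal_matrix(n, m)   # upper triangular, every entry on or above the diagonal is m
b = [0.5, 0.0, 0.5]
x = IFM * b                                # Freudenthal coordinates of b
FM = inv(IFM)                              # maps Freudenthal coordinates back to beliefs
b_back = FM * x
```

Under this assumed convention the first Freudenthal coordinate always equals m, because the belief sums to one.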
BeliefGridValueIteration.freudenthal_simplex — Method
freudenthal_simplex(x::Vector{Int64})
Returns the list of vertices of the simplex of point x in the Freudenthal grid.
BeliefGridValueIteration.freudenthal_simplex_and_coords! — Method
freudenthal_simplex_and_coords!(x::AbstractArray{Int64}, V::Vector{Vector{Int64}}, λ::Vector{Float64})
Fills V and λ with the simplex points in the Freudenthal space and the associated coordinates, respectively.
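To make the simplex and its barycentric coordinates concrete, the sketch below follows the textbook Freudenthal construction for a real-valued point. It is not the package's code: the function name simplex_and_coords_sketch is hypothetical, it allocates its outputs instead of filling them in place, and it accepts real-valued input, which is an assumption that differs from the integer signatures listed above.

```julia
# Hypothetical sketch of the textbook Freudenthal construction (not the package's code).
# Returns the n + 1 simplex vertices V and weights λ such that x ≈ sum(λ[k] * V[k]).
function simplex_and_coords_sketch(x::AbstractVector{<:Real})
    n = length(x)
    base = floor.(Int, x)          # lower corner of the unit cube containing x
    d = x .- base                  # fractional parts, each in [0, 1)
    p = sortperm(d, rev=true)      # directions ordered by decreasing fractional part
    V = Vector{Vector{Int}}(undef, n + 1)
    V[1] = copy(base)
    for k in 1:n
        V[k + 1] = copy(V[k])
        V[k + 1][p[k]] += 1        # step one unit along the k-th direction
    end
    λ = zeros(n + 1)
    λ[1] = 1 - d[p[1]]
    for k in 2:n
        λ[k] = d[p[k - 1]] - d[p[k]]
    end
    λ[n + 1] = d[p[n]]
    return V, λ
end

x = [2.0, 1.6, 0.3]
V, λ = simplex_and_coords_sketch(x)
x_rebuilt = sum(λ[k] .* V[k] for k in eachindex(V))   # should reproduce x
```

The weights are non-negative and sum to one, so a value function known only at the vertices can be interpolated at x as sum(λ[k] * value_at(V[k])), where value_at stands for whatever lookup the caller uses.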
BeliefGridValueIteration.freudenthal_vertices — Method
freudenthal_vertices(n::Int64, m::Int64)
Construct the list of Freudenthal vertices in an n-dimensional space with grid resolution m. The vertices are represented by a list of n-dimensional vectors.
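One way to picture the vertex list is as all integer vectors that start at m and never increase. The sketch below enumerates them under that assumption (v[1] == m and v[1] >= v[2] >= ... >= v[n] >= 0); the convention and the name vertices_sketch are illustrative and not taken from the package.

```julia
# Hypothetical sketch (not the package's code): enumerate integer vectors v of
# length n with v[1] == m and v[1] >= v[2] >= ... >= v[n] >= 0.
function vertices_sketch(n::Int, m::Int)
    V = Vector{Vector{Int}}()
    v = fill(0, n)
    v[1] = m
    function fill_from!(i)
        if i > n
            push!(V, copy(v))      # a complete vertex has been built
            return
        end
        for k in 0:v[i - 1]        # entry i may not exceed entry i - 1
            v[i] = k
            fill_from!(i + 1)
        end
    end
    fill_from!(2)
    return V
end

V = vertices_sketch(3, 2)
```

Under this convention the number of vertices is binomial(m + n - 1, n - 1), which is why the grid grows quickly with both the number of states and the resolution.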
BeliefGridValueIteration.lovejoy_upper_bound — Function
lovejoy_upper_bound(pomdp, m, ϵ, k_max, verbose=false)
Construct the belief grid and perform some preprocessing operations first (see lovejoy_upper_bound_data). Then run vectorized value iteration over the belief grid. Returns the value at each belief point, the associated best action, and a mapping between belief points (in the Freudenthal space) and their index.
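The overall loop can be sketched for the special case of a two-state POMDP, where a belief is [p, 1 - p] and the triangulation reduces to linear interpolation between neighboring grid points. Everything here is an assumption made for illustration: the function name grid_vi_sketch, the dense array layout (R[s, a], T[a][s, sp], O[a][sp, o]), the explicit discount argument γ, and the non-vectorized loops. It is not the package's implementation.

```julia
# Hypothetical 2-state sketch of value iteration on a belief grid (not the package's code).
function grid_vi_sketch(R, T, O, γ, m; ϵ=1e-6, k_max=1000)
    nA, nO = size(R, 2), size(O[1], 2)
    grid = [[k / m, 1 - k / m] for k in 0:m]   # grid[i] has p = (i - 1) / m
    V = zeros(m + 1)
    pol = ones(Int, m + 1)
    # Linear interpolation of grid values at the belief [p, 1 - p].
    function interp(vals, p)
        t = clamp(p * m, 0, m)
        lo = min(floor(Int, t), m - 1)
        return (1 - (t - lo)) * vals[lo + 1] + (t - lo) * vals[lo + 2]
    end
    for _ in 1:k_max
        Vnew = similar(V)
        for (i, b) in enumerate(grid)
            best = -Inf
            for a in 1:nA
                q = b' * R[:, a]                       # expected reward R(b, a)
                for o in 1:nO
                    u = O[a][:, o] .* (T[a]' * b)      # unnormalized next belief
                    po = sum(u)                        # Pr(o | b, a)
                    po > 0 && (q += γ * po * interp(V, u[1] / po))
                end
                if q > best
                    best, pol[i] = q, a
                end
            end
            Vnew[i] = best
        end
        residual = maximum(abs.(Vnew .- V))
        V = Vnew
        residual <= ϵ && break                         # stop once the update is small
    end
    return V, pol, grid
end

# Made-up model used only to exercise the sketch.
R = [1.0 0.0; -1.0 0.5]
T = [[0.9 0.1; 0.2 0.8], [0.5 0.5; 0.5 0.5]]
O = [[0.8 0.2; 0.3 0.7], [0.5 0.5; 0.5 0.5]]
V, pol, grid = grid_vi_sketch(R, T, O, 0.9, 2)
```

The three return values mirror the description above only loosely: values and actions per grid point, plus the grid itself in place of a vertex-to-index mapping.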
BeliefGridValueIteration.lovejoy_upper_bound_data — Function
lovejoy_upper_bound_data(pomdp::SparseTabularPOMDP, m::Int64, verbose=false)
Precompute useful quantities before computing the upper bound:
- Construct the belief space triangulation
- Store mapping from vertices to index
- Compute the next belief points
- Compute the coordinates of the next belief points in the grid
- Precompute R(b, a)
- Precompute Pr(o | b, a)
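The last two items of the list above reduce to matrix products once the beliefs are stacked as columns. The sketch below uses a made-up dense model; the array layout (R[s, a], T[a][s, sp], O[a][sp, o]) and the names Rba and Poba are assumptions for illustration and do not reflect how SparseTabularPOMDP stores its data.

```julia
# Made-up dense model for 2 states, 2 actions, 2 observations (layout is an assumption).
R = [1.0 0.0; -1.0 0.5]                        # R[s, a]
T = [[0.9 0.1; 0.2 0.8], [0.5 0.5; 0.5 0.5]]   # T[a][s, sp]
O = [[0.8 0.2; 0.3 0.7], [0.5 0.5; 0.5 0.5]]   # O[a][sp, o]

B = [1.0 0.5 0.0; 0.0 0.5 1.0]                 # one belief per column

# Rba[i, a] = R(b_i, a) = sum over s of b_i(s) * R(s, a)
Rba = B' * R

# Poba[a][i, o] = Pr(o | b_i, a) = sum over s, sp of b_i(s) * T(sp | s, a) * O(o | a, sp)
Poba = [B' * T[a] * O[a] for a in 1:2]
```

Each row of Poba[a] is a probability distribution over observations, which is a convenient sanity check on a precomputed table.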
BeliefGridValueIteration.to_belief — Method
to_belief(x, m)
Transform a point x in the Freudenthal space to a point in the belief space. m is the resolution of the Freudenthal grid.
BeliefGridValueIteration.to_freudenthal — Method
to_freudenthal(b, m::Int64)
Transform a point b in the belief space to a point in the Freudenthal space. m is the resolution of the Freudenthal grid.
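The two transforms are inverses of each other, which a short round trip makes visible. The sketch assumes the convention x[i] = m * sum(b[i:end]); both function names are hypothetical stand-ins and the package's own implementation may differ.

```julia
# Hypothetical sketches (not the package's code), assuming x[i] = m * sum(b[i:end]).
to_freudenthal_sketch(b::AbstractVector{<:Real}, m::Int) = m .* reverse(cumsum(reverse(b)))

function to_belief_sketch(x::AbstractVector{<:Real}, m::Int)
    shifted = vcat(x[2:end], 0)    # x[i + 1], with the entry past the end taken as 0
    return (x .- shifted) ./ m     # b[i] = (x[i] - x[i + 1]) / m
end

b = [0.25, 0.25, 0.5]
x = to_freudenthal_sketch(b, 4)    # belief on the m = 4 grid maps to integer coordinates
b_back = to_belief_sketch(x, 4)
```

A belief whose entries are multiples of 1/m lands exactly on a grid vertex; any other belief lands strictly inside a simplex and has to be interpolated.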
BeliefGridValueIteration.to_freudenthal_batch — Method
to_freudenthal_batch(B::AbstractArray, m::Int64)
Given a batch of belief points B, returns the corresponding points in the Freudenthal space.
BeliefGridValueIteration.update_batch — Method
update_batch(B::Vector{Vector{R}}, pomdp::SparseTabularPOMDP) where R <: Real
Compute the next belief points starting from B, for every action and observation in the POMDP. Returns a 4-dimensional array of size n_states x n_belief_points x n_actions x n_observations.
BeliefGridValueIteration.update_single! — Method
update_single!(OT::AbstractArray, b::Vector{R}, bp::AbstractArray) where R<:Real
Update a single belief point using a precomputed observation * transition matrix for a given (o, a) pair.
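For one fixed (o, a) pair, the Bayes update is a matrix-vector product followed by a normalization. The sketch below is a non-mutating illustration: the name update_sketch, the layout OT[sp, s] = O(o | a, sp) * T(sp | s, a), and the choice to return the unnormalized vector when the observation has zero probability are all assumptions rather than the package's behavior.

```julia
# Hypothetical single-pair Bayes update (not the package's code).
# OT[sp, s] = O(o | a, sp) * T(sp | s, a) for one fixed (o, a) pair.
function update_sketch(OT::AbstractMatrix, b::AbstractVector)
    unnormalized = OT * b
    total = sum(unnormalized)      # equals Pr(o | b, a)
    return total > 0 ? unnormalized ./ total : unnormalized
end

T = [0.9 0.1; 0.2 0.8]             # T[s, sp] for the chosen action (made-up numbers)
Oo = [0.8, 0.3]                    # O(o | a, sp) for the chosen observation
OT = Oo .* T'                      # scale row sp of T' by O(o | a, sp)
bp = update_sketch(OT, [0.5, 0.5])
```

Precomputing OT once per (o, a) pair keeps the per-belief work to a single product, which matters when the same update is applied to every grid point.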