API Documentation

Contents

Index

Types/Functors

CompressedBeliefMDPs.CompressedBeliefMDPType
CompressedBeliefMDP{B, A}

The CompressedBeliefMDP struct is a generalization of the compressed belief-state MDP presented in Exponential Family PCA for Belief Compression in POMDPs.

Type Parameters

  • B: The type of compressed belief states.
  • A: The type of actions.

Fields

  • bmdp::GenerativeBeliefMDP: The generative belief-state MDP.
  • compressor::Compressor: The compressor used to compress belief states.
  • ϕ::Bijection: A bijection representing the mapping from uncompressed belief states to compressed belief states. See notes.

Constructors

CompressedBeliefMDP(pomdp::POMDP, updater::Updater, compressor::Compressor)
CompressedBeliefMDP(pomdp::POMDP, sampler::Sampler, updater::Updater, compressor::Compressor)

Constructs a CompressedBeliefMDP using the specified POMDP, updater, and compressor.

Warning

The 4-argument constructor is a quality-of-life constructor that calls fit! on the given compressor.

Example Usage

pomdp = TigerPOMDP()
updater = DiscreteUpdater(pomdp)
compressor = PCACompressor(1)
mdp = CompressedBeliefMDP(pomdp, updater, compressor)

For continuous POMDPs, see ParticleFilters.jl.

Notes

  • While compressions aren't usually injective, we cache beliefs and their compressions on a first-come, first-served basis, so we can effectively use a bijection without loss of generality.
CompressedBeliefMDPs.CompressedBeliefPolicyType
CompressedBeliefPolicy

Maps a base policy for the compressed belief-state MDP to a policy for the true POMDP.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_policy::Policy: The base policy used for decision-making in the compressed belief-state MDP.

Constructors

CompressedBeliefPolicy(m::CompressedBeliefMDP, base_policy::Policy)

Constructs a CompressedBeliefPolicy using the specified compressed belief-state MDP and base policy.

Example Usage

policy = solve(solver, pomdp)
s = initialstate(pomdp)
a = action(policy, s) # returns the approximately optimal action for state s
v = value(policy, s)  # returns the approximately optimal value for state s
CompressedBeliefMDPs.CompressedBeliefSolverType
CompressedBeliefSolver

The CompressedBeliefSolver struct represents a solver for compressed belief-state MDPs. It combines a compressed belief-state MDP with a base solver to approximate the value function.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_solver::Solver: The base solver used to solve the compressed belief-state MDP.

Constructors

CompressedBeliefSolver(pomdp::POMDP, base_solver::Solver; updater::Updater=DiscreteUpdater(pomdp), sampler::Sampler=BeliefExpansionSampler(pomdp), compressor::Compressor=PCACompressor(1))
CompressedBeliefSolver(pomdp::POMDP; updater::Updater=DiscreteUpdater(pomdp), sampler::Sampler=BeliefExpansionSampler(pomdp), compressor::Compressor=PCACompressor(1), interp::Union{Nothing, LocalFunctionApproximator}=nothing, k::Int=1, verbose::Bool=false, max_iterations::Int=1000, n_generative_samples::Int=10, belres::Float64=1e-3)

Constructs a CompressedBeliefSolver using the specified POMDP, base solver, updater, sampler, and compressor. Alternatively, you can omit the base solver in which case a LocalApproximationValueIterationSolver(https://github.com/JuliaPOMDP/LocalApproximationValueIteration.jl) will be created instead. For example, different base solvers are needed if the POMDP state and action space are continuous.

Example Usage

julia> pomdp = TigerPOMDP();
julia> solver = CompressedBeliefSolver(pomdp; verbose=true, max_iterations=10);
julia> solve(solver, pomdp);
[Iteration 1   ] residual:       8.51 | iteration runtime:    635.870 ms, (     0.636 s total)
[Iteration 2   ] residual:       3.63 | iteration runtime:      0.504 ms, (     0.636 s total)
[Iteration 3   ] residual:       10.1 | iteration runtime:      0.445 ms, (     0.637 s total)
[Iteration 4   ] residual:       15.2 | iteration runtime:      0.494 ms, (     0.637 s total)
[Iteration 5   ] residual:       6.72 | iteration runtime:      0.432 ms, (     0.638 s total)
[Iteration 6   ] residual:       7.38 | iteration runtime:      0.508 ms, (     0.638 s total)
[Iteration 7   ] residual:       6.03 | iteration runtime:      0.495 ms, (     0.639 s total)
[Iteration 8   ] residual:       5.73 | iteration runtime:      0.585 ms, (     0.639 s total)
[Iteration 9   ] residual:       4.02 | iteration runtime:      0.463 ms, (      0.64 s total)
[Iteration 10  ] residual:       7.28 | iteration runtime:      0.576 ms, (      0.64 s total)

Functions

CompressedBeliefMDPs.make_cacheFunction
make_cache(B, B̃)

Helper function that creates a cache that maps each unique belief from the set B to its corresponding compressed representation in .

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • B̃::Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the beliefs in B.

Returns

  • Dict{<:Any, Vector{Float64}}: A dictionary mapping each unique belief in B to its corresponding compressed representation in .

Example Usage

B = [belief1, belief2, belief3]
B̃ = [compressed_belief1; compressed_belief2; compressed_belief3]
ϕ = make_cache(B, B̃)
CompressedBeliefMDPs.make_numericalFunction
make_numerical(B, pomdp)

Helper function that converts a set of beliefs B into a numerical matrix representation suitable for processing by numerical algorithms/compressors.

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • pomdp::POMDP: The POMDP model associated with the beliefs.

Returns

  • Matrix{Float64}: A matrix where each row corresponds to a numerical representation of a belief in B.

Example Usage

B = [belief1, belief2, belief3]
B_numerical = make_numerical(B, pomdp)
CompressedBeliefMDPs.compress_POMDPFunction
compress_POMDP(pomdp, sampler, updater, compressor)

Creates a compressed belief-state MDP by sampling, compressing, and caching beliefs from the given POMDP.

Arguments

  • pomdp::POMDP: The POMDP model to be compressed.
  • sampler::Sampler: A sampler to generate a set of beliefs from the POMDP.
  • updater::Updater: An updater to initialize beliefs from states.
  • compressor::Compressor: A compressor to reduce the dimensionality of the beliefs.

Returns

  • CompressedBeliefMDP: The constructed compressed belief-state MDP.
  • Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the sampled beliefs.

Example Usage

```julia pomdp = TigerPOMDP() sampler = BeliefExpansionSampler(pomdp) updater = DiscreteUpdater(pomdp) compressor = PCACompressor(2) m, B̃ = compress_POMDP(pomdp, sampler, updater, compressor)