Docstrings · DeepQLearning.jl

DeepQLearning.batch_trajectories — Method

batch_trajectories(s::AbstractArray, traj_length::Int64, batch_size::Int64)

Converts a multidimensional array into batches of trajectories to be processed by a Flux recurrent model. It takes as input an array of dimensions state_dim... x traj_length x batch_size.
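The layout change can be illustrated without Flux. The sketch below is not the package's implementation: `split_time_steps` is a hypothetical helper, and it assumes the target layout is one array per time step, each holding the whole batch.

```julia
# Hypothetical helper, not part of DeepQLearning.jl.
# Splits an array of size state_dim... x traj_length x batch_size into a
# vector of traj_length arrays, each of size state_dim... x batch_size.
function split_time_steps(s::AbstractArray, traj_length::Int, batch_size::Int)
    time_axis = ndims(s) - 1
    @assert size(s, time_axis) == traj_length
    @assert size(s, ndims(s)) == batch_size
    # selectdim fixes index t on the time axis and keeps every other axis.
    return [copy(selectdim(s, time_axis, t)) for t in 1:traj_length]
end

# state_dim = 2, traj_length = 3, batch_size = 4
s = reshape(collect(1.0:24.0), 2, 3, 4)
steps = split_time_steps(s, 3, 4)
```

Here `steps[t]` holds the states of every trajectory in the batch at time step `t`, which is the shape a recurrent model consumes when it is called once per step.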

DeepQLearning.evaluation — Function

evaluation(eval_policy, policy, env, obs, global_step, rng)
Returns the average reward of the current policy. The user can supply their own function f (the eval_policy argument) to carry out the evaluation;
a default, basic_evaluation, is provided and simply performs a rollout.

DeepQLearning.exploration — Method

exploration(exp_policy, policy, env, obs, global_step, rng)
Returns an action following an exploration policy.
The user can provide their own exp_policy function.
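As an illustration of the kind of rule an exploration function might apply, here is a minimal epsilon-greedy sketch with linear decay. It does not follow the exp_policy signature above: `eps_greedy_sketch`, the `q_values` vector, and the keyword arguments are all hypothetical names chosen for this example.

```julia
using Random

# Hypothetical sketch, not the package's exploration policy.
# q_values stands in for the action values the network would produce.
function eps_greedy_sketch(q_values::AbstractVector, global_step::Int, rng::AbstractRNG;
                           eps_start = 1.0, eps_end = 0.01, decay_steps = 1000)
    # Linearly anneal epsilon from eps_start to eps_end over decay_steps steps.
    frac = min(global_step / decay_steps, 1.0)
    epsilon = eps_start + frac * (eps_end - eps_start)
    if rand(rng) < epsilon
        return rand(rng, 1:length(q_values))   # explore: uniform random action
    else
        return argmax(q_values)                # exploit: greedy action
    end
end

rng = MersenneTwister(1)
# With eps_end = 0 and the decay finished, the choice is always greedy.
a = eps_greedy_sketch([0.1, 0.9, 0.3], 10_000, rng; eps_end = 0.0)
```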

DeepQLearning.flattenbatch — Method

flattenbatch(x::AbstractArray)

Flattens a multidimensional array, keeping only the last dimension. It returns a 2-dimensional array of size (flatten_dim, batch_size).
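The reshaping can be sketched in one line of base Julia. `flatten_sketch` is a hypothetical name used for illustration and may differ from the package's own code.

```julia
# Hypothetical equivalent: collapse every axis except the last one.
flatten_sketch(x::AbstractArray) = reshape(x, :, size(x, ndims(x)))

x = rand(Float32, 4, 5, 3, 8)   # the last axis is the batch, of size 8
y = flatten_sketch(x)
size(y)                         # (60, 8), since 4 * 5 * 3 = 60
```

Because `reshape` shares memory with its argument, no data is copied; column `i` of `y` is the flattened `i`-th batch element.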

DeepQLearning.getnetwork — Function

getnetwork(policy)
Returns the value network of the policy.

DeepQLearning.globalnorm — Method

globalnorm(p::Params, gs::Flux.Zygote.Grads)

Returns the maximum absolute value in the gradients gs of the parameters p.
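Despite the name, the quantity described is a maximum absolute value (an infinity norm) rather than an L2 norm. A minimal sketch over plain arrays, assuming that a missing gradient is represented by `nothing` and should be skipped; `max_abs_gradient` is a hypothetical name:

```julia
# Hypothetical sketch over a plain collection of gradient arrays.
# Assumption: entries equal to `nothing` mean "no gradient" and are skipped.
function max_abs_gradient(grads)
    gmax = 0.0
    for g in grads
        g === nothing && continue
        gmax = max(gmax, maximum(abs, g))
    end
    return gmax
end

grads = [[0.1 -2.5; 0.3 0.0], nothing, [1.0, -0.5]]
max_abs_gradient(grads)   # 2.5
```

A value like this is typically logged during training to monitor gradient magnitudes.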

DeepQLearning.hiddenstates — Method

hiddenstates(m)

Returns the hidden states of all the recurrent layers of a model.

DeepQLearning.huber_loss — Method

huber_loss(x)

Compute the Huber Loss (from ReinforcementLearning.jl)
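For reference, the standard Huber loss is quadratic for small errors and linear for large ones. The sketch below assumes a threshold `δ` with default 1; both the threshold parameter and the name `huber_sketch` are assumptions of this example, not taken from the package.

```julia
# Standard Huber loss with threshold δ (assumed default of 1).
function huber_sketch(x::Real; δ::Real = 1.0)
    a = abs(x)
    # Quadratic inside the threshold, linear outside; the two branches
    # agree in value and slope at |x| = δ.
    return a <= δ ? 0.5 * x^2 : δ * (a - 0.5 * δ)
end

huber_sketch(0.5)   # 0.125 (quadratic branch)
huber_sketch(3.0)   # 2.5   (linear branch)
```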

DeepQLearning.isrecurrent — Method

isrecurrent(m)

Returns true if m contains a recurrent layer.

DeepQLearning.resetstate! — Function

resetstate!(policy)

Resets the hidden states of a policy.

DeepQLearning.sethiddenstates! — Method

sethiddenstates!(m, hs)

Given a list of hidden states, sets the hidden state of each recurrent layer of the model m to the corresponding entry in the list. The order of the list must match the order of the recurrent layers in the model.
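The save-and-restore pattern, including the ordering requirement, can be sketched with toy types. Everything below (`ToyRecurrent`, `ToyDense`, `toy_hiddenstates`, `toy_sethiddenstates!`) is hypothetical and stands in for Flux layers and the package functions.

```julia
# Hypothetical stand-ins for a recurrent layer and a stateless layer.
mutable struct ToyRecurrent
    state::Vector{Float64}
end
struct ToyDense end

# Collect a copy of each recurrent layer's state, in model order.
toy_hiddenstates(layers) = [copy(l.state) for l in layers if l isa ToyRecurrent]

# Write the saved states back; hs[i] goes to the i-th recurrent layer.
function toy_sethiddenstates!(layers, hs)
    rec = [l for l in layers if l isa ToyRecurrent]
    @assert length(rec) == length(hs)
    for (l, h) in zip(rec, hs)
        l.state = copy(h)
    end
    return layers
end

model = (ToyRecurrent([0.0, 0.0]), ToyDense(), ToyRecurrent([1.0]))
saved = toy_hiddenstates(model)     # snapshot before the states change
model[1].state .= 9.0               # simulate the state drifting during a rollout
toy_sethiddenstates!(model, saved)  # restore the snapshot
```

Stateless layers are skipped in both directions, so the list has one entry per recurrent layer and not one per layer of the model.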