API Reference

Core Functions

control_dependencies(f, ops::Union{Array{PyObject}, PyObject})

Executes all operations in ops before any operations created inside the block.

op1 = tf.print("print op1")
op3 = tf.print("print op3")
control_dependencies(op1) do
    global op2 = tf.print("print op2")
end
run(sess, [op2,op3])

In this example, op1 must be executed before op2. But there is no guarantee when op3 will be executed. There are several possible outputs of the program such as

print op3
print op1
print op2

or

print op1
print op3
print op2
ADCME.get_collectionFunction.
get_collection(name::Union{String, Missing})

Returns the collection with name name. If name is missing, returns all the trainable variables.
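A minimal sketch of both usages (whether name acts as a filter on variable names is an assumption here; the variable name below is illustrative):

w = Variable(rand(10), name = "nn_weight")
b = Variable(1.0)
nn_vars  = get_collection("nn_weight")   # variables whose names match "nn_weight" (assumed behavior)
all_vars = get_collection(missing)       # all trainable variables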

ADCME.has_gpuMethod.
has_gpu()

Checks if GPU is available.

Note

ADCME will use the GPU automatically if one is available. To disable GPU usage, set the environment variable ENV["CUDA_VISIBLE_DEVICES"]="" before importing ADCME.

ADCME.if_elseMethod.
if_else(condition::Union{PyObject,Array,Bool}, fn1, fn2, args...;kwargs...)
  • If condition is a scalar boolean, it outputs fn1 or fn2 (a function with no input argument or a tensor) based on whether condition is true or false.
  • If condition is a boolean array, it returns condition .* fn1 + (1 - condition) .* fn2
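A minimal sketch of both usages (assuming a session sess exists):

a = constant(2.0)
b = constant(3.0)
c = if_else(constant(true), a, b)         # scalar condition: evaluates to a
d = if_else([true; false], constant([1.0; 1.0]), constant([2.0; 2.0]))  # elementwise: [1.0, 2.0]
run(sess, [c, d])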
Info

If you encounter an error like this:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value

It probably means that your code within if_else is not valid.

ADCME.independentMethod.
independent(o::PyObject, args...; kwargs...)

Returns o but when computing the gradients, the top gradients will not be back-propagated into dependent variables of o.
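A minimal sketch of the intended effect (the gradient value in the comment follows from the description above):

x = Variable(2.0)
y = independent(x)      # same value as x, but gradients do not flow back through y
l = x * y
g = gradients(l, x)     # only the explicit x factor contributes, so g evaluates to the value of y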

reset_default_graph()

Resets the graph by removing all the operators.

ADCME.tensorMethod.
tensor(s::String)

Returns the tensor with name s. See tensorname.

ADCME.tensornameMethod.
tensorname(o::PyObject)

Returns the name of the tensor. See tensor.
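A minimal sketch combining tensorname and tensor (the printed name depends on the actual graph):

a  = constant(1.0)
nm = tensorname(a)    # e.g. "Const:0"
b  = tensor(nm)       # retrieves the same tensor by its name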

ADCME.while_loopMethod.
while_loop(condition::Union{PyObject,Function}, body::Function, loop_vars::Union{PyObject, Array{Any}, Array{PyObject}};
    parallel_iterations::Int64=10, kwargs...)

Loops over loop_vars while condition is true. This operator only creates one extra node to mark the loops in the computational graph.

Example

The following script computes

\[\sum_{i=1}^{10} i\]
function condition(i, ta)
    i <= 10
end
function body(i, ta)
    u = read(ta, i-1)
    ta = write(ta, i, u + cast(Float64, i))  # ta[i] = 1 + 2 + ... + i
    i+1, ta
end
ta = TensorArray(10)
ta = write(ta, 1, constant(1.0))
i = constant(2, dtype=Int32)
_, out = while_loop(condition, body, [i, ta])
summation = stack(out)[10]
ADCME.run_profileMethod.
run_profile(args...;kwargs...)

Runs the session with tracing information.

ADCME.save_profileFunction.
save_profile(filename::String="default_timeline.json")

Saves the timeline information to the file filename.

  • Open Chrome and navigate to chrome://tracing
  • Load the timeline file
Base.bindMethod.
bind(op::PyObject, ops...)

Adds operations ops to the dependencies of op. The function is useful when we want to execute ops but ops are not in the dependencies of the final output. For example, to print i each time i is evaluated:

i = constant(1.0)
op = tf.print(i)
i = bind(i, op)

Variables

ADCME.TensorArrayFunction.
TensorArray(size_::Int64=0, args...;kwargs...)

Constructs a tensor array for while_loop.

ADCME.VariableMethod.
Variable(initial_value;kwargs...)

Constructs a trainable tensor from initial_value.

ADCME.cellMethod.
cell(arr::Array, args...;kwargs...)

Constructs a cell tensor.

Example

julia> r = cell([[1.],[2.,3.]])
julia> run(sess, r[1])
1-element Array{Float32,1}:
 1.0
julia> run(sess, r[2])
2-element Array{Float32,1}:
 2.0
 3.0
ADCME.constantMethod.
constant(value; kwargs...)

Constructs a non-trainable tensor from value.

convert_to_tensor(o::Union{PyObject, Number, Array{T}, Missing, Nothing}; dtype::Union{Type, Missing}=missing) where T<:Number
convert_to_tensor(os::Array, dtypes::Array)

Converts the input o to tensor. If o is already a tensor and dtype (if provided) is the same as that of o, the operator does nothing. Otherwise, convert_to_tensor converts the numerical array to a constant tensor or casts the data type. convert_to_tensor also accepts multiple tensors.

Example

convert_to_tensor([1.0, constant(rand(2)), rand(10)], [Float32, Float64, Float32])
ADCME.get_variableMethod.
get_variable(o::Union{PyObject, Bool, Array{<:Number}}; 
    name::Union{String, Missing} = missing, 
    scope::String = "")

Creates a new variable with initial value o. If name exists, get_variable returns the existing variable instead of creating a new one.

ADCME.get_variableMethod.
get_variable(dtype::Type;
shape::Union{Array{<:Integer}, NTuple{N, <:Integer}}, 
name::Union{Missing,String} = missing,
scope::String = "")

Creates a new variable of type dtype and shape shape. If name exists, get_variable returns the existing variable instead of creating a new one.

gradient_checkpointing(type::String="speed")

Uses checkpointing scheme for gradients.

  • 'speed': checkpoints all outputs of convolutions and matmuls. These ops are usually the most expensive, so checkpointing them maximizes the running speed (this is a good option if nonlinearities, concats, batchnorms, etc. are taking up a lot of memory).
  • 'memory': tries to minimize the memory usage (currently using a very simple strategy that identifies a number of bottleneck tensors in the graph to checkpoint).
  • 'collection': looks for a TensorFlow collection named 'checkpoints', which holds the tensors to checkpoint.
gradient_magnitude(l::PyObject, o::Union{Array, PyObject})

Returns the overall gradient magnitude

\[\sqrt{\sum_{i=1}^n \|\frac{\partial l}{\partial o_i}\|^2}\]

This function is useful for debugging training.
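A minimal sketch (assuming a session sess exists):

x = Variable(rand(10))
l = sum(x^2)
gm = gradient_magnitude(l, x)   # a scalar, equal to the norm of ∂l/∂x, i.e. ‖2x‖
run(sess, gm)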

ADCME.gradientsMethod.
gradients(ys::PyObject, xs::PyObject; kwargs...)

Computes the gradients of ys w.r.t xs.

  • If ys is a scalar, gradients returns the gradients with the same shape as xs.
  • If ys is a vector, gradients returns the Jacobian $\frac{\partial y}{\partial x}$
Note

The second usage is not recommended since ADCME adopts reverse-mode automatic differentiation. Although gradients cleverly switches to forward-mode automatic differentiation when ys is a vector and xs is a scalar, this requires that second-order gradients are implemented for the relevant operators.
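A minimal sketch of the first (scalar ys) usage:

x = Variable(rand(5))
y = sum(x^2)
g = gradients(y, x)    # same shape as x, equal to 2x
sess = Session(); init(sess)
run(sess, g)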

ADCME.hessianMethod.

hessian computes the Hessian of a scalar function f with respect to vector inputs xs.

ADCME.ones_likeMethod.
ones_like(o::Union{PyObject,Real, Array{<:Real}}, args...; kwargs...)

Returns an all-one tensor with the same size as o.

Example

a = rand(100,10)
b = ones_like(a)
@assert run(sess, b)≈ones(100,10)
ADCME.placeholderMethod.
placeholder(dtype::Type; kwargs...)

Creates a placeholder of the type dtype.

Example

a = placeholder(Float64, shape=[20,10])
b = placeholder(Float64, shape=[]) # a scalar 
c = placeholder(Float64, shape=[nothing]) # a vector
ADCME.placeholderMethod.
placeholder(o::Union{Number, Array, PyObject}; kwargs...)

Creates a placeholder of the same type and size as o. o is the default value.

ADCME.tensorMethod.
tensor(v::Array{T,2}; dtype=Float64, sparse=false) where T

Converts a generic array v to a tensor. For example,

v = [0.0 constant(1.0) 2.0
    constant(2.0) 0.0 1.0]
u = tensor(v)

u will be a $2\times 3$ tensor.

Note

This function is expensive. Use with caution.

ADCME.zeros_likeMethod.
zeros_like(o::Union{PyObject,Real, Array{<:Real}}, args...; kwargs...)

Returns an all-zero tensor with the same size as o.

Example

a = rand(100,10)
b = zeros_like(a)
@assert run(sess, b)≈zeros(100,10)
Base.copyMethod.
copy(o::PyObject)

Creates a tensor that has the same value that is currently stored in a variable.

Note

The output is a graph node that will have that value when evaluated. Any time you evaluate it, it will grab the current value of o.

Base.readMethod.
read(ta::PyObject, i::Union{PyObject,Integer})

Reads data from TensorArray at index i.

Base.writeMethod.
write(ta::PyObject, i::Union{PyObject,Integer}, obj)

Writes data obj to TensorArray at index i.

Random Variables

ADCME.categoricalMethod.

categorical(n::Union{PyObject, Integer}; kwargs...)

kwargs has a keyword argument logits, a 2-D Tensor with shape [batch_size, num_classes]. Each slice [i, :] represents the unnormalized log-probabilities for all classes.

ADCME.choiceMethod.

choice(inputs::Union{PyObject, Array}, n_samples::Union{PyObject, Integer};replace::Bool=false)

Chooses n_samples samples from inputs with or without replacement.
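A minimal sketch (assumed usage; samples are drawn along the leading dimension):

x = constant(rand(10))
s = choice(x, 3)       # three entries of x, drawn without replacement
run(sess, s)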

ADCME.logpdfMethod.
logpdf(dist::T, x) where T<:ADCMEDistribution

Returns the log-probability of x under the distribution dist.

Sparse Matrix

SparseTensor

A sparse matrix object. It has two fields

  • o: internal data structure

  • _diag: true if the sparse matrix is marked as "diagonal".

ADCME.SparseTensorMethod.
SparseTensor(A::SparseMatrixCSC)
SparseTensor(A::Array{Float64, 2})

Creates a SparseTensor from numerical arrays.

ADCME.SparseTensorMethod.
SparseTensor(I::Union{PyObject,Array{T,1}}, J::Union{PyObject,Array{T,1}}, V::Union{Array{Float64,1}, PyObject}, m::Union{S, PyObject, Nothing}=nothing, n::Union{S, PyObject, Nothing}=nothing) where {T<:Integer, S<:Integer}

Constructs a sparse tensor. Examples:

ii = [1;2;3;4]
jj = [1;2;3;4]
vv = [1.0;1.0;1.0;1.0]
s = SparseTensor(ii, jj, vv, 4, 4)
s = SparseTensor(sprand(10,10,0.3))
ADCME.SparseAssemblerFunction.
SparseAssembler(handle::Union{PyObject, <:Integer}, n::Union{PyObject, <:Integer}, tol::Union{PyObject, <:Real}=0.0)

Creates a SparseAssembler for accumulating row, col, val for sparse matrices.

  • handle: an integer handle for creating a sparse matrix. If the handle already exists, SparseAssembler returns the existing sparse matrix handle. If you are creating different sparse matrices, the handles should be different.
  • n: number of rows of the sparse matrix.
  • tol (optional): tolerance. SparseAssembler treats any value less than tol as zero.

Example 1

handle = SparseAssembler(100, 5, 1e-8)
op1 = accumulate(handle, 1, [1;2;3], [1.0;2.0;3.0])
op2 = accumulate(handle, 2, [1;2;3], [1.0;2.0;3.0])
J = assemble(5, 5, [op1;op2])

J will be a SparseTensor object.

Example 2

handle = SparseAssembler(0, 5)
op1 = accumulate(handle, 1, [1;2;3], ones(3))
op2 = accumulate(handle, 1, [3], [1.])
op3 = accumulate(handle, 2, [1;3], ones(2))
J = assemble(5, 5, [op1;op2;op3]) # op1, op2, op3 are parallel
Array(run(sess, J))≈[1.0  1.0  2.0  0.0  0.0
                1.0  0.0  1.0  0.0  0.0
                0.0  0.0  0.0  0.0  0.0
                0.0  0.0  0.0  0.0  0.0
                0.0  0.0  0.0  0.0  0.0]
ADCME.assembleMethod.
assemble(m::Union{PyObject, <:Integer}, n::Union{PyObject, <:Integer}, ops::PyObject)

Assembles the sparse matrix from the ops created by accumulate. ops is either a single output from accumulate, or concatenated from several ops:

op1 = accumulate(handle, 1, [1;2;3], [1.0;2.0;3.0])
op2 = accumulate(handle, 2, [1;2;3], [1.0;2.0;3.0])
op = [op1;op2] # equivalent to `vcat([op1, op2]...)`

m and n are rows and columns of the sparse matrix.

See SparseAssembler for an example.

ADCME.findMethod.
find(s::SparseTensor)

Returns the row, column and values for sparse tensor s.

ADCME.scatter_addMethod.
scatter_add(A::Union{SparseTensor, SparseMatrixCSC{Float64,Int64}},
i1::Union{Integer, Colon, UnitRange{T}, PyObject,Array{S,1}},
i2::Union{Integer, Colon, UnitRange{T}, PyObject,Array{T,1}},
B::Union{SparseTensor, SparseMatrixCSC{Float64,Int64}})  where {S<:Real,T<:Real}

Adds B to a subblock of a sparse matrix A. Equivalently,

A[i1, i2] += B
scatter_update(A::Union{SparseTensor, SparseMatrixCSC{Float64,Int64}},
i1::Union{Integer, Colon, UnitRange{T}, PyObject,Array{S,1}},
i2::Union{Integer, Colon, UnitRange{T}, PyObject,Array{T,1}},
B::Union{SparseTensor, SparseMatrixCSC{Float64,Int64}})  where {S<:Real,T<:Real}

Updates a subblock of a sparse matrix by B. Equivalently,

A[i1, i2] = B
ADCME.solveMethod.
solve(A_factorized::Tuple{SparseTensor, PyObject}, rhs::Union{Array{Float64,1}, PyObject})

Solves the equation A_factorized * x = rhs using the factorized sparse matrix. See factorize.

ADCME.spdiagMethod.
spdiag(n::Int64)

Constructs a sparse identity matrix of size $n\times n$, which is equivalent to spdiag(n, 0=>ones(n))

ADCME.spdiagMethod.
spdiag(m::Integer, pair::Pair...)

Constructs a square $m\times m$ SparseTensor from pairs of the form

offset => array 

Example

Suppose we want to construct a $10\times 10$ tridiagonal matrix, where the lower off-diagonal entries are all -2, the diagonal entries are all 2, and the upper off-diagonal entries are all 3. The corresponding Julia code is

spdiag(10, -1=>-2*ones(9), 0=>2*ones(10), 1=>3ones(9))
ADCME.spdiagMethod.
spdiag(o::PyObject)

Constructs a sparse diagonal matrix where the diagonal entries are o, which is equivalent to spdiag(length(o), 0=>o)

ADCME.spzeroFunction.
spzero(m::Int64, n::Union{Missing, Int64}=missing)

Constructs an empty sparse matrix of size $m\times n$. If n is missing, n = m.

Base.accumulateMethod.
accumulate(handle::PyObject, row::Union{PyObject, <:Integer}, cols::Union{PyObject, Array{<:Integer}}, vals::Union{PyObject, Array{<:Real}})

Accumulates values into the row-th row of the sparse matrix. It adds the values to the sparse matrix as follows:

for k = 1:length(cols)
    A[row, cols[k]] += vals[k]
end

handle is the handle created by SparseAssembler.

See SparseAssembler for an example.

Note

The function accumulate returns an op::PyObject. Only when op is executed are the nonzero values populated into the sparse matrix.

factorize(A::Union{SparseTensor, SparseMatrixCSC}, max_cache_size::Int64 = 999999)

Factorizes $A$ for sparse matrix solutions. max_cache_size specifies the maximum cache size in the C++ kernels, which determines the maximum number of factorized matrices. The function returns the factorized matrix, which is essentially a Tuple{SparseTensor, PyObject}.

Example

A = sprand(10,10,0.7)
Afac = factorize(A) # factorizing the matrix
run(sess, Afac\rand(10)) # no factorization, solving the equation
run(sess, Afac\rand(10)) # no factorization, solving the equation
Base.:\Function.
\(A::SparseTensor, o::PyObject, method::String="SparseLU")

Solves the linear equation $A x = o$

Method

For square matrices $A$, one of the following methods is available

  • auto: using the solver specified by ADCME.options.sparse.solver
  • SparseLU
  • SparseQR
  • SimplicialLDLT
  • SimplicialLLT
Note

In the case where o is 2-dimensional, \ is understood as a "batched solve": o must have size $n_{b} \times m$, and $A$ has size $m\times n$. It returns the solution matrix of size $n_b \times n$

\[s_{i,:} = A^{-1} o_{i,:}\]
Base.:\Method.
Base.:\(A_factorized::Tuple{SparseTensor, PyObject}, rhs::Union{Array{Float64,1}, PyObject})

A convenient overload for solve. See factorize.

Operations

ADCME.argsortMethod.
argsort(o::PyObject; 
stable::Bool = false, rev::Bool=false, dims::Integer=-1, name::Union{Nothing,String}=nothing)

Returns the indices of a tensor that give its sorted order along the given axis.

ADCME.batch_matmulMethod.
batch_matmul(o1::PyObject, o2::PyObject)

Computes o1[i,:,:] * o2[i, :] or o1[i,:,:] * o2[i, :, :] for each index i.
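A minimal sketch of the batched matrix-vector case (shapes follow the description above):

o1 = constant(rand(5, 3, 4))
o2 = constant(rand(5, 4))
y  = batch_matmul(o1, o2)    # y[i, :] = o1[i, :, :] * o2[i, :], so y has shape (5, 3)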

ADCME.clipMethod.
clip(o::Union{Array{Any}, Array{PyObject}}, vmin, vmax, args...;kwargs...)

Clips the values of o to the range [vmin, vmax]

Example

a = constant(3.0)
a = clip(a, 1.0, 2.0)
b = constant(rand(3))
b = clip(b, 0.5, 1.0)
ADCME.cvecMethod.
cvec(o::PyObject; kwargs...)

Vectorizes the tensor o to a column vector, assuming column major.

ADCME.padMethod.
pad(o::PyObject, paddings::Array{Int64, 2}, args...; kwargs...)

Pads o with values on the boundary.

Example

o = constant(rand(3,3))
o = pad(o, [1 4      # first dimension
             2 3])   # second dimension
run(sess, o)

Expected:

8×8 Array{Float64,2}:
 0.0  0.0  0.0       0.0       0.0       0.0  0.0  0.0
 0.0  0.0  0.250457  0.666905  0.823611  0.0  0.0  0.0
 0.0  0.0  0.23456   0.625145  0.646713  0.0  0.0  0.0
 0.0  0.0  0.552415  0.226417  0.67802   0.0  0.0  0.0
 0.0  0.0  0.0       0.0       0.0       0.0  0.0  0.0
 0.0  0.0  0.0       0.0       0.0       0.0  0.0  0.0
 0.0  0.0  0.0       0.0       0.0       0.0  0.0  0.0
 0.0  0.0  0.0       0.0       0.0       0.0  0.0  0.0
ADCME.pmapMethod.
pmap(fn::Function, o::Union{Array{PyObject}, PyObject})

Parallel for loop. There should be no data dependency between different iterations.

Example

x = constant(ones(10))
y1 = pmap(x->2.0*x, x)
y2 = pmap(x->x[1]+x[2], [x,x])
y3 = pmap(1:10, x) do z
    i = z[1]
    xi = z[2]
    xi + cast(Float64, i)
end
run(sess, y1)
run(sess, y2)
run(sess, y3)
ADCME.rvecMethod.
rvec(o::PyObject; kwargs...)

Vectorizes the tensor o to a row vector, assuming column major.

ADCME.scatter_addMethod.
scatter_add(A::PyObject, 
    xind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    yind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})
A[xind, yind] += updates
ADCME.scatter_addMethod.
scatter_add(a::PyObject, 
    indices::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})

Updates array ref

a[indices] += updates
ADCME.scatter_subMethod.
scatter_sub(A::PyObject, 
    xind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    yind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})
A[xind, yind] -= updates
ADCME.scatter_subMethod.
scatter_sub(a::PyObject, 
    indices::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})

Updates array ref

a[indices] -= updates
scatter_update(A::PyObject, 
    xind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    yind::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})
A[xind, yind] = updates
scatter_update(a::PyObject, 
    indices::Union{Colon, Int64, Array{Int64}, BitArray{1}, Array{Bool,1}, UnitRange{Int64}, StepRange{Int64, Int64}, PyObject},
    updates::Union{Array{<:Real}, Real, PyObject})

Updates array ref

a[indices] = updates
ADCME.set_shapeMethod.
set_shape(o::PyObject, s::Union{Array{<:Integer}, Tuple{Vararg{<:Integer, N}}}) where N
set_shape(o::PyObject, s::Integer...)

Sets the shape of o to s. s must be the actual shape of o. This function is used to convert a tensor with unknown dimensions to a tensor with concrete dimensions.

Example

a = placeholder(Float64, shape=[nothing, 10])
b = set_shape(a, 3, 10)
run(sess, b, a=>rand(3,10)) # OK 
run(sess, b, a=>rand(5,10)) # Error
run(sess, b, a=>rand(10,3)) # Error
ADCME.solve_batchMethod.
solve_batch(A::Union{PyObject, Array{<:Real, 2}}, rhs::Union{PyObject, Array{<:Real,2}})

Solves $Ax = b$ for a batch of right hand sides.

  • A: an $m\times n$ matrix, where $m\geq n$
  • rhs: an $n_b\times m$ matrix. Each row is a new right hand side to solve.

The returned value is a $n_b\times n$ matrix.

Example

a = rand(10,5)
b = rand(100, 10)
sol = solve_batch(a, b)
@assert run(sess, sol) ≈ (a\b')'
Note

Internally, the matrix $A$ is factorized first and then the factorization is used to solve multiple right hand sides.

ADCME.stackMethod.
stack(o::PyObject)

Converts a TensorArray o to a normal tensor. The leading dimension is the size of the tensor array.

ADCME.topkFunction.
topk(o::PyObject, k::Union{PyObject,Integer}=1;
    sorted::Bool=true, name::Union{Nothing,String}=nothing)

Finds values and indices of the k largest entries for the last dimension. If sorted=true the resulting k elements will be sorted by the values in descending order.
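A minimal sketch (it is assumed here that topk returns the values together with their indices):

a = constant([1.0, 5.0, 3.0])
v, idx = topk(a, 2)      # v ≈ [5.0, 3.0]; idx holds their positions in a
run(sess, [v, idx])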

ADCME.vectorMethod.
vector(i::Union{Array{T}, PyObject, UnitRange, StepRange}, v::Union{Array{Float64},PyObject},s::Union{Int64,PyObject})

Returns a vector V with length s such that

V[i] = v
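A minimal sketch (entries not indexed by i are assumed to be zero):

V = vector([2; 3], [1.0; 2.0], 5)
run(sess, V)    # expected ≈ [0.0, 1.0, 2.0, 0.0, 0.0]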
Base.adjointMethod.
adjoint(o::PyObject; kwargs...)

Returns the conjugate transpose (adjoint) of o. When the dimension of o is greater than 2, only the last two dimensions are permuted, i.e., permutedims(o, [1,2,...,n,n-1])

Base.vecMethod.
vec(o::PyObject;kwargs...)

Vectorizes the tensor o assuming column major.

LinearAlgebra.svdMethod.
svd(o::PyObject, args...; kwargs...)

Returns a TFSVD structure which holds the following data structures

S::PyObject
U::PyObject
V::PyObject
Vt::PyObject

We have the equality $o = USV'$

Example

A = rand(10,20)
r = svd(constant(A))
A2 = r.U*diagm(r.S)*r.Vt # The value of `A2` should be equal to `A`
Base.mapMethod.
map(fn::Function, o::Union{Array{PyObject},PyObject};
kwargs...)

Applies fn to each element of o.

  • o ∈ Array{PyObject}: returns [fn(x) for x in o]
  • o ∈ PyObject: splits o according to the first dimension and then applies fn.

Example

a = constant(rand(10,5))
b = map(x->sum(x), a) # equivalent to `sum(a, dims=2)`
Note

If fn is a multivariate function, we need to specify the output type using dtype keyword. For example,

a = constant(ones(10))
b = constant(ones(10))
fn = x->x[1]+x[2]
c = map(fn, [a, b], dtype=Float64)
Base.reshapeMethod.
reshape(o::PyObject, s::Union{Array{<:Integer}, Tuple{Vararg{<:Integer, N}}}) where N 
reshape(o::PyObject, s::Integer; kwargs...)
reshape(o::PyObject, m::Integer, n::Integer; kwargs...)
reshape(o::PyObject, ::Colon, n::Integer)
reshape(o::PyObject, n::Integer, ::Colon)

Reshapes the tensor according to row major if the "TensorFlow style" syntax is used; otherwise reshaping according to column major is assumed.

Example

reshape(a, [10,5]) # row major 
reshape(a, 10, 5) # column major 
Base.reverseMethod.
reverse(o::PyObject, kwargs...)

Reverses the tensor o along the dimension(s) given by dims.

Example

a = rand(10,2)
A = constant(a)
@assert run(sess, reverse(A, dims=1)) == reverse(a, dims=1)
@assert run(sess, reverse(A, dims=2)) == reverse(a, dims=2)
@assert run(sess, reverse(A, dims=-1)) == reverse(a, dims=2)
Base.sortMethod.
Base.:sort(o::PyObject; 
rev::Bool=false, dims::Integer=-1, name::Union{Nothing,String}=nothing)

Sorts a multidimensional tensor o along the given dimension.

  • rev: true for DESCENDING and false (default) for ASCENDING
  • dims: -1 for last dimension.
Base.splitMethod.
split(o::PyObject, 
    num_or_size_splits::Union{Integer, Array{<:Integer}, PyObject}; kwargs...)

Splits o according to num_or_size_splits

Example 1

a = constant(rand(10,8,6))
split(a, 5)

Expected output:

5-element Array{PyCall.PyObject,1}:
 PyObject <tf.Tensor 'split_5:0' shape=(2, 8, 6) dtype=float64>
 PyObject <tf.Tensor 'split_5:1' shape=(2, 8, 6) dtype=float64>
 PyObject <tf.Tensor 'split_5:2' shape=(2, 8, 6) dtype=float64>
 PyObject <tf.Tensor 'split_5:3' shape=(2, 8, 6) dtype=float64>
 PyObject <tf.Tensor 'split_5:4' shape=(2, 8, 6) dtype=float64>

Example 2

a = constant(rand(10,8,6))
split(a, [4,3,1], dims=2)

Expected output:

3-element Array{PyCall.PyObject,1}:
 PyObject <tf.Tensor 'split_6:0' shape=(10, 4, 6) dtype=float64>
 PyObject <tf.Tensor 'split_6:1' shape=(10, 3, 6) dtype=float64>
 PyObject <tf.Tensor 'split_6:2' shape=(10, 1, 6) dtype=float64>

Example 3

a = constant(rand(10,8,6))
split(a, 3, dims=3)

Expected output:

3-element Array{PyCall.PyObject,1}:
 PyObject <tf.Tensor 'split_7:0' shape=(10, 8, 2) dtype=float64>
 PyObject <tf.Tensor 'split_7:1' shape=(10, 8, 2) dtype=float64>
 PyObject <tf.Tensor 'split_7:2' shape=(10, 8, 2) dtype=float64>

IO

ADCME.DiaryType.
Diary(suffix::Union{String, Nothing}=nothing)

Creates a diary at a temporary directory path. It returns a writer and the corresponding directory path

ADCME.activateFunction.
activate(sw::Diary, port::Int64=6006)

Runs the Diary at http://localhost:port.

ADCME.loadFunction.
load(sess::PyObject, file::String, vars::Union{PyObject, Nothing, Array{PyObject}}=nothing, args...; kwargs...)

Loads the values of variables to the session sess from the file file. If vars is nothing, it loads values to all the trainable variables. See also save, load

ADCME.loadMethod.
load(sw::Diary, dirp::String)

Loads Diary from dirp.

ADCME.loggingMethod.
logging(file::Union{Nothing,String}, o::PyObject...; summarize::Int64 = 3, sep::String = " ")

Logs o to file. This operator must be used with bind.
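A minimal sketch mirroring the bind example above (the file name is illustrative):

i  = constant(1.0)
op = logging("log.txt", i)   # writes the value of i to log.txt when executed
i  = bind(i, op)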

ADCME.ploadMethod.
pload(file::String)

Loads a Python object from file. See also psave

ADCME.psaveMethod.
psave(o::PyObject, file::String)

Saves a Python object o to file. See also pload

ADCME.saveFunction.
save(sess::PyObject, file::String, vars::Union{PyObject, Nothing, Array{PyObject}}=nothing, args...; kwargs...)

Saves the values of vars in the session sess. The result is written into file as a dictionary. If vars is nothing, it saves all the trainable variables. See also save, load

ADCME.saveMethod.
save(sw::Diary, dirp::String)

Saves Diary to dirp.

ADCME.scalarFunction.
scalar(o::PyObject, name::String)

Returns a scalar summary object.

Base.writeMethod.
write(sw::Diary, step::Int64, cnt::Union{String, Array{String}})

Writes to Diary.

Optimization

ADCME.BFGS!Function.
BFGS!(value_and_gradients_function::Function, initial_position::Union{PyObject, Array{Float64}}, max_iter::Int64=50, args...;kwargs...)

Applies the BFGS optimizer to value_and_gradients_function

ADCME.BFGS!Function.
BFGS!(sess::PyObject, loss::PyObject, max_iter::Int64=15000; 
vars::Array{PyObject}=PyObject[], callback::Union{Function, Nothing}=nothing, kwargs...)

BFGS! is a simplified interface for BFGS optimizer. See also ScipyOptimizerInterface. callback is a callback function with signature

callback(vs::Array, iter::Int64, loss::Float64)

vars is an array consisting of tensors and its values will be the input to vs.

Example 1

a = Variable(1.0)
loss = (a - 10.0)^2
sess = Session(); init(sess)
BFGS!(sess, loss)

Example 2

θ1 = Variable(1.0)
θ2 = Variable(1.0)
loss = (θ1-1)^2 + (θ2-2)^2
cb = (vs, iter, loss)->begin 
    printstyled("[#iter $iter] θ1=$(vs[1]), θ2=$(vs[2]), loss=$loss\n", color=:green)
end
sess = Session(); init(sess)
cb(run(sess, [θ1, θ2]), 0, run(sess, loss))
BFGS!(sess, loss, 100; vars=[θ1, θ2], callback=cb)

Example 3

Use bounds to specify upper and lower bound of a variable.

x = Variable(2.0)    
loss = x^2
sess = Session(); init(sess)
BFGS!(sess, loss, bounds=Dict(x=>[1.0,3.0]))
ADCME.BFGS!Method.
BFGS!(sess::PyObject, loss::PyObject, grads::Union{Array{T},Nothing,PyObject}, 
vars::Union{Array{PyObject},PyObject}; kwargs...) where T<:Union{Nothing, PyObject}

Runs the BFGS algorithm to solve $\min_{\texttt{vars}} \texttt{loss}(\texttt{vars})$. The gradients grads must be provided. Typically, grads[i] = gradients(loss, vars[i]). grads[i] can exist on different devices (GPU or CPU).

Example 1

import Optim # required
a = Variable(0.0)
loss = (a-1)^2
g = gradients(loss, a)
sess = Session(); init(sess)
BFGS!(sess, loss, g, a)

Example 2

import Optim # required
a = Variable(0.0)
loss = (a^2+a-1)^2
g = gradients(loss, a)
sess = Session(); init(sess)
cb = (vs, iter, loss)->begin 
    printstyled("[#iter $iter] a = $vs, loss=$loss\n", color=:green)
end
BFGS!(sess, loss, g, a; callback = cb)
CustomOptimizer(opt::Function, name::String)

Creates a custom optimizer with struct name name. For example, we can integrate NLopt with ADCME by constructing a new optimizer

CustomOptimizer("Con") do f, df, c, dc, x0, x_L, x_U
    opt = Opt(:LD_MMA, length(x0))
    bd = zeros(length(x0)); bd[end-1:end] = [-Inf, 0.0]
    opt.lower_bounds = bd
    opt.xtol_rel = 1e-4
    opt.min_objective = (x,g)->(g[:]= df(x); return f(x)[1])
    inequality_constraint!(opt, (x,g)->( g[:]= dc(x);c(x)[1]), 1e-8)
    (minf,minx,ret) = NLopt.optimize(opt, x0)
    minx
end

Here

f: a function that returns $f(x)$

df: a function that returns $\nabla f(x)$

c: a function that returns the constraints $c(x)$

dc: a function that returns $\nabla c(x)$

x0: initial guess

nineq: number of inequality constraints

neq: number of equality constraints

x_L: lower bounds of optimizable variables

x_U: upper bounds of optimizable variables

Then we can create an optimizer with

opt = Con(loss, inequalities=[c1], equalities=[c2])

To trigger the optimization, use

minimize(opt, sess)

Note that, thanks to Julia's global variable scope, step_callback and optimizer_kwargs can be passed from the Julia environment directly.

NonlinearConstrainedProblem(f::Function, L::Function, θ::PyObject, u0::Union{PyObject, Array{Float64}}; options::Union{Dict{String, T}, Missing}=missing) where T<:Integer

Computes the gradients $\frac{\partial L}{\partial \theta}$

\[\min \ L(u) \quad \mathrm{s.t.} \ F(\theta, u) = 0\]

u0 is the initial guess for the numerical solution u, see newton_raphson.

Caveats: Assume r, A = f(θ, u) and θ are the unknown parameters; gradients(r, θ) must be defined (i.e., back-propagation works properly).

Returns: a tuple consisting of the loss, the solution u, and the gradients

\[\left(L(u), u, \frac{\partial L}{\partial θ}\right)\]
ScipyOptimizerInterface(loss; method="L-BFGS-B", options=Dict("maxiter"=> 15000, "ftol"=>1e-12, "gtol"=>1e-12), kwargs...)

A simple interface for Scipy Optimizer. See also ScipyOptimizerMinimize and BFGS!.

ScipyOptimizerMinimize(sess::PyObject, opt::PyObject; kwargs...)

Minimizes a scalar Tensor. Variables subject to optimization are updated in-place at the end of optimization.

Note that this method does not just return a minimization Op, unlike minimize; instead it actually performs minimization by executing commands to control a Session (see https://www.tensorflow.org/api_docs/python/tf/contrib/opt/ScipyOptimizerInterface). See also ScipyOptimizerInterface and BFGS!. The keyword arguments are listed below; a usage sketch follows the list.

  • feed_dict: A feed dict to be passed to calls to session.run.
  • fetches: A list of Tensors to fetch and supply to loss_callback as positional arguments.
  • step_callback: A function to be called at each optimization step; arguments are the current values of all optimization variables flattened into a single vector.
  • loss_callback: A function to be called every time the loss and gradients are computed, with evaluated fetches supplied as positional arguments.
  • run_kwargs: kwargs to pass to session.run.
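A minimal usage sketch combining the two interfaces:

x = Variable(2.0)
loss = (x - 1.0)^2
opt = ScipyOptimizerInterface(loss, method = "L-BFGS-B", options = Dict("maxiter" => 100))
sess = Session(); init(sess)
ScipyOptimizerMinimize(sess, opt)
run(sess, x)     # ≈ 1.0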
newton_raphson(func::Function, 
    u0::Union{Array,PyObject}, 
    θ::Union{Missing,PyObject, Array{<:Real}}=missing,
    args::PyObject...) where T<:Real

Newton-Raphson solver for solving a nonlinear equation. func has the signature

  • func(θ::Union{Missing,PyObject}, u::PyObject)->(r::PyObject, A::Union{PyObject,SparseTensor}) (if linesearch is off)
  • func(θ::Union{Missing,PyObject}, u::PyObject)->(fval::PyObject, r::PyObject, A::Union{PyObject,SparseTensor}) (if linesearch is on)

where r is the residual and A is the Jacobian matrix; in the case where linesearch is on, the function value fval must also be supplied.

  • θ are external parameters.
  • u0 is the initial guess for u.
  • args: additional inputs to the func function.
  • kwargs: keyword arguments to func.

The solver can be configured via ADCME.options.newton_raphson

  • max_iter: maximum number of iterations (default=100)
  • rtol: relative tolerance for termination (default=1e-12)
  • tol: absolute tolerance for termination (default=1e-12)
  • LM: a float number, Levenberg-Marquardt modification $x^{k+1} = x^k - (J^k + \mu^k)^{-1}g^k$ (default=0.0)
  • linesearch: whether linesearch is used (default=false)

Currently, the backtracking algorithm is implemented. The parameters for linesearch are supplied via ADCME.options.newton_raphson.linesearch_options

  • c1: stop criterion, $f(x^k) < f(0) + \alpha c_1 f'(0)$
  • ρ_hi: the new step size $\alpha_1\leq \rho_{hi}\alpha_0$
  • ρ_lo: the new step size $\alpha_1\geq \rho_{lo}\alpha_0$
  • iterations: maximum number of iterations for linesearch
  • maxstep: maximum allowable steps
  • αinitial: initial guess for the step size $\alpha$
newton_raphson_with_grad(f::Function, 
u0::Union{Array,PyObject}, 
θ::Union{Missing,PyObject, Array{<:Real}}=missing,
args::PyObject...) where T<:Real

Differentiable Newton-Raphson algorithm. See newton_raphson.

Use ADCME.options.newton_raphson to supply options.

Example

function f(θ, x)
    x^3 - θ, 3spdiag(x^2)
end

θ = constant([2.0^3; 3.0^3; 4.0^3])
x = newton_raphson_with_grad(f, constant(ones(3)), θ)
run(sess, x)≈[2.;3.;4.]
run(sess, gradients(sum(x), θ))

Neural Networks

BatchNormalization(dims::Int64=2; kwargs...)

Creates a batch normalization layer.

Example

b = BatchNormalization(2)
x = rand(10,2)
training = placeholder(true)
y = b(x, training)
run(sess, y)
ADCME.Conv1DType.
Conv1D(filters, kernel_size, strides, activation, args...;kwargs...)
c = Conv1D(32, 3, 1, "relu")
x = rand(100, 6, 128) # 100 samples, 6 timesteps, 128 channels per step
y = c(x) # shape=(100, 4, 32)
ADCME.Conv2DType.
Conv2D(filters, kernel_size, strides, activation, args...;kwargs...)

The arrangement is (samples, rows, cols, channels) (data_format='channels_last')

Conv2D(32, 3, 1, "relu")
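A usage sketch analogous to the Conv1D and Conv3D examples (the output shape assumes valid padding):

c = Conv2D(32, 3, 1, "relu")
x = constant(rand(100, 28, 28, 3))
y = c(x)
# shape = (100, 26, 26, 32)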
ADCME.Conv3DType.
Conv3D(filters, kernel_size, strides, activation, args...;kwargs...)

The arrangement is (samples, rows, cols, channels) (data_format='channels_last')

c = Conv3D(32, 3, 1, "relu")
x = constant(rand(100, 10, 10, 10, 16))
y = c(x)
# shape=(100, 8, 8, 8, 32)
ADCME.DenseType.
Dense(units::Int64, activation::Union{String, Function, Nothing} = nothing,
    args...;kwargs...)

Creates a callable dense neural network.

ADCME.Resnet1DType.
Resnet1D(out_features::Int64, hidden_features::Int64;
    num_blocks::Int64=2, activation::Union{String, Function, Nothing} = "relu", 
    dropout_probability::Float64 = 0.0, use_batch_norm::Bool = false, name::Union{String, Missing} = missing)

Creates a 1D residual network. If name is not missing, Resnet1D does not create a new entity.

Example

resnet = Resnet1D(20)
x = rand(1000,10)
y = resnet(x)

Example: Digit recognition

using MLDatasets
using ADCME

# load data 
train_x, train_y = MNIST.traindata()
train_x = reshape(Float64.(train_x), :, size(train_x,3))'|>Array
test_x, test_y = MNIST.testdata()
test_x = reshape(Float64.(test_x), :, size(test_x,3))'|>Array

# construct loss function 
ADCME.options.training.training = placeholder(true)
x = placeholder(rand(64, 784))
l = placeholder(rand(Int64, 64))
resnet = Resnet1D(10, num_blocks=10)
y = resnet(x)
loss = mean(sparse_softmax_cross_entropy_with_logits(labels=l, logits=y))

# train the neural network 
opt = AdamOptimizer().minimize(loss)
sess = Session(); init(sess)
for i = 1:10000
    idx = rand(1:60000, 64)
    _, loss_ = run(sess, [opt, loss], feed_dict=Dict(l=>train_y[idx], x=>train_x[idx,:]))
    @info i, loss_
end

# test 
for i = 1:10
    idx = rand(1:10000,100)
    y0 = resnet(test_x[idx,:])
    y0 = run(sess, y0, ADCME.options.training.training=>false)
    pred = [x[2]-1 for x in argmax(y0, dims=2)]
    @info "Accuracy = ", sum(pred .== test_y[idx])/100
end

ADCME.aeFunction.
ae(x::PyObject, output_dims::Array{Int64}, scope::String = "default";
    activation::Union{Function,String} = "tanh")

Alias: fc

Creates a neural network with intermediate numbers of neurons output_dims.

ADCME.aeMethod.
ae(x::Union{Array{Float64}, PyObject}, 
    output_dims::Array{Int64}, 
    θ::Union{Array{Array{Float64}}, Array{PyObject}};
    activation::Union{Function,String} = "tanh")

Alias: fc

Constructs a neural network with given weights and biases θ

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
y = ae(x, [20, 20, 5], θ) # 10×5
ADCME.aeMethod.
ae(x::Union{Array{Float64}, PyObject}, output_dims::Array{Int64}, θ::Union{Array{Float64}, PyObject};
activation::Union{Function,String, Nothing} = "tanh")

Alias: fc

Creates a neural network with intermediate numbers of neurons output_dims. The weights are given by θ

Example 1: Explicitly construct weights and biases

x = constant(rand(10,2))
n = ae_num([2,20,20,20,2])
θ = Variable(randn(n)*0.001)
y = ae(x, [20,20,20,2], θ)

Example 2: Implicitly construct weights and biases

θ = ae_init([10,20,20,20,2]) 
x = constant(rand(10,10))
y = ae(x, [20,20,20,2], θ)

See also ae_num, ae_init.

ADCME.ae_initMethod.
ae_init(output_dims::Array{Int64}; T::Type=Float64, method::String="xavier")
fc_init(output_dims::Array{Int64})

Returns the initial weights and bias values, generated by TensorFlow initializers, as a vector. The neural network architecture is

\[o_1 (Input layer) \rightarrow o_2 \rightarrow \cdots \rightarrow o_n (Output layer)\]

Three types of random initializers are provided

  • xavier (default). It is useful for tanh fully connected neural networks.
\[W^l_i \sim \sqrt{\frac{1}{n_{l-1}}}\]
  • xavier_avg. A variant of xavier.
\[W^l_i \sim \sqrt{\frac{2}{n_l + n_{l-1}}}\]
  • he. This is the activation-aware initialization of weights and helps mitigate the problem of vanishing/exploding gradients.

\[W^l_i \sim \sqrt{\frac{2}{n_{l-1}}}\]

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
y = ae(x, [20, 20, 5], θ) # 10×5
ADCME.ae_numMethod.
ae_num(output_dims::Array{Int64})
fc_num(output_dims::Array{Int64})

Estimates the number of weights and biases for the neural network. Note the first dimension should be the feature dimension (this is different from ae since in ae the feature dimension can be inferred), and the last dimension should be the output dimension.

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
@assert ae_num([30, 20, 20, 5])==length(θ)
y = ae(x, [20, 20, 5], θ)
ADCME.ae_to_codeMethod.
ae_to_code(file::String, scope::String; activation::String = "tanh")

Returns the code string from the feed-forward neural network data in file. Usually we can immediately evaluate the code string in a Julia session by

eval(Meta.parse(s))

If activation is not specified, tanh is the default.

ADCME.bnMethod.
bn(args...;center = true, scale=true, kwargs...)

bn accepts a keyword parameter is_training.

Example

bn(inputs, name="batch_norm", is_training=true)
Note

bn should be used with control_dependencies:

update_ops = get_collection(UPDATE_OPS)
control_dependencies(update_ops) do 
    global train_step = AdamOptimizer().minimize(loss)
end 
ADCME.denseMethod.
dense(inputs::Union{PyObject, Array{<:Real}}, units::Int64, args...; 
    activation::Union{String, Function, Nothing} = nothing, kwargs...)

Creates a fully connected layer with the activation function specified by activation
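A minimal sketch:

x = constant(rand(16, 8))
y = dense(x, 32, activation = "relu")   # a 16×32 tensor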

ADCME.dropoutFunction.
dropout(x::Union{PyObject, Real, Array{<:Real}}, 
rate::Union{Real, PyObject}, training::Union{PyObject,Bool} = true; kwargs...)

Randomly drops out entries in x with a rate of rate.

ADCME.fcFunction.
ae(x::PyObject, output_dims::Array{Int64}, scope::String = "default";
    activation::Union{Function,String} = "tanh")

Alias: fc

Creates a neural network with intermediate numbers of neurons output_dims.

ae(x::Union{Array{Float64}, PyObject}, output_dims::Array{Int64}, θ::Union{Array{Float64}, PyObject};
activation::Union{Function,String, Nothing} = "tanh")

Alias: fc

Creates a neural network with intermediate numbers of neurons output_dims. The weights are given by θ

Example 1: Explicitly construct weights and biases

x = constant(rand(10,2))
n = ae_num([2,20,20,20,2])
θ = Variable(randn(n)*0.001)
y = ae(x, [20,20,20,2], θ)

Example 2: Implicitly construct weights and biases

θ = ae_init([10,20,20,20,2]) 
x = constant(rand(10,10))
y = ae(x, [20,20,20,2], θ)

See also ae_num, ae_init.

ae(x::Union{Array{Float64}, PyObject}, 
    output_dims::Array{Int64}, 
    θ::Union{Array{Array{Float64}}, Array{PyObject}};
    activation::Union{Function,String} = "tanh")

Alias: fc

Constructs a neural network with given weights and biases θ

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
y = ae(x, [20, 20, 5], θ) # 10×5
ADCME.fc_initFunction.
ae_init(output_dims::Array{Int64}; T::Type=Float64, method::String="xavier")
fc_init(output_dims::Array{Int64})

Returns the initial weights and bias values, generated by TensorFlow initializers, as a vector. The neural network architecture is

\[o_1 (Input layer) \rightarrow o_2 \rightarrow \cdots \rightarrow o_n (Output layer)\]

Three types of random initializers are provided

  • xavier (default). It is useful for tanh fully connected neural networks.
\[W^l_i \sim \sqrt{\frac{1}{n_{l-1}}}\]

  • xavier_avg. A variant of xavier.
\[W^l_i \sim \sqrt{\frac{2}{n_l + n_{l-1}}}\]

  • he. This is the activation-aware initialization of weights and helps mitigate the problem of vanishing/exploding gradients.

\[W^l_i \sim \sqrt{\frac{2}{n_{l-1}}}\]

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
y = ae(x, [20, 20, 5], θ) # 10×5
ADCME.fc_numFunction.
ae_num(output_dims::Array{Int64})
fc_num(output_dims::Array{Int64})

Estimates the number of weights and biases for the neural network. Note the first dimension should be the feature dimension (this is different from ae since in ae the feature dimension can be inferred), and the last dimension should be the output dimension.

Example

x = constant(rand(10,30))
θ = ae_init([30, 20, 20, 5])
@assert ae_num([30, 20, 20, 5])==length(θ)
y = ae(x, [20, 20, 5], θ)
ADCME.fcxMethod.
fcx(x::Union{Array{Float64,2},PyObject}, output_dims::Array{Int64,1}, 
θ::Union{Array{Float64,1}, PyObject};
activation::String = "tanh")

Creates a fully connected neural network with output dimension o and inputs $x\in \mathbb{R}^{m\times n}$.

\[x \rightarrow o_1 \rightarrow o_2 \rightarrow \ldots \rightarrow o_k\]

θ is the weights and biases of the neural network, e.g., θ = ae_init(output_dims).

fcx outputs two tensors:

  • the output of the neural network: $u\in \mathbb{R}^{m\times o_k}$.

  • the sensitivity of the neural network per sample: $\frac{\partial u}{\partial x}\in \mathbb{R}^{m \times o_k \times n}$
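A minimal sketch (the layer sizes are illustrative):

x = constant(rand(100, 3))
θ = fc_init([3, 20, 20, 2])
u, du = fcx(x, [20, 20, 2], θ)   # u: 100×2, du: 100×2×3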

Generative Neural Nets

ADCME.GANType.
GAN(dat::Union{Array,PyObject}, generator::Function, discriminator::Function,
loss::Union{Missing, Function}=missing; latent_dim::Union{Missing, Int64}=missing,
    batch_size::Int64=32)

Creates a GAN instance.

  • dat $\in \mathbb{R}^{n\times d}$ is the training data for the GAN, where $n$ is the number of training data, and $d$ is the dimension per training data.
  • generator $:\mathbb{R}^{d'} \rightarrow \mathbb{R}^d$ is the generator function, $d'$ is the hidden dimension.
  • discriminator $:\mathbb{R}^{d} \rightarrow \mathbb{R}$ is the discriminator function.
  • loss is the loss function. See klgan, rklgan, wgan, lsgan for examples.
  • latent_dim (default=$d$, the same as the output dimension) is the latent dimension.
  • batch_size (default=32) is the batch size in training.

Example: Constructing a GAN

dat = rand(10000,10)
generator = (z, gan)->10*z
discriminator = (x, gan)->sum(x)
gan = GAN(dat, generator, discriminator, "wgan_stable")

Example: Learning a Gaussian random variable

using ADCME 
using PyPlot
using Distributions
dat = randn(10000, 1) * 0.5 .+ 3.0
function gen(z, gan)
    ae(z, [20,20,20,1], "generator_$(gan.ganid)", activation = "relu")
end
function disc(x, gan)
    squeeze(ae(x, [20,20,20,1], "discriminator_$(gan.ganid)", activation = "relu"))
end
gan = GAN(dat, gen, disc, g->wgan_stable(g, 0.001); latent_dim = 10)

dopt = AdamOptimizer(0.0002, beta1=0.5, beta2=0.9).minimize(gan.d_loss, var_list=gan.d_vars)
gopt = AdamOptimizer(0.0002, beta1=0.5, beta2=0.9).minimize(gan.g_loss, var_list=gan.g_vars)
sess = Session(); init(sess)
for i = 1:5000
    batch_x = rand(1:10000, 32)
    batch_z = randn(32, 10)
    for n_critic = 1:1
        global _, dl = run(sess, [dopt, gan.d_loss], 
                feed_dict=Dict(gan.ids=>batch_x, gan.noise=>batch_z))
    end
    _, gl, gm, dm, gp = run(sess, [gopt, gan.g_loss, 
        gan.STORAGE["g_grad_magnitude"], gan.STORAGE["d_grad_magnitude"], 
        gan.STORAGE["gradient_penalty"]],
        feed_dict=Dict(gan.ids=>batch_x, gan.noise=>batch_z))
    mod(i, 100)==0 && (@info i, dl, gl, gm, dm, gp)
end

hist(run(sess, squeeze(rand(gan,10000))), bins=50, density = true)
nm = Normal(3.0,0.5)
x0 = 1.0:0.01:5.0
y0 = pdf.(nm, x0)
plot(x0, y0, "g")
ADCME.jsganMethod.
jsgan(gan::GAN)

Computes the vanilla GAN loss function.

ADCME.klganMethod.
klgan(gan::GAN)

Computes the KL-divergence GAN loss function.

ADCME.lsganMethod.
lsgan(gan::GAN)

Computes the least square GAN loss function.

ADCME.predictMethod.
predict(gan::GAN, input::Union{PyObject, Array})

Predicts the output of the GAN gan given the input input.

ADCME.rklganMethod.
rklgan(gan::GAN)

Computes the reverse KL-divergence GAN loss function.

ADCME.sampleMethod.
sample(gan::GAN, n::Int64)
rand(gan::GAN, n::Int64)

Samples n instances from gan.

ADCME.wganMethod.
wgan(gan::GAN)

Computes the Wasserstein GAN loss function.

ADCME.wgan_stableFunction.
wgan_stable(gan::GAN, λ::Float64)

Returns the discriminator and generator loss for the Wasserstein GAN loss with penalty parameter $\lambda$

The objective function is

\[L = E_{\tilde x\sim P_g} [D(\tilde x)] - E_{x\sim P_r} [D(x)] + \lambda E_{\hat x\sim P_{\hat x}}[(||\nabla_{\hat x}D(\hat x)||^2-1)^2]\]
ADCME.build!Method.
build!(gan::GAN)

Builds the GAN instances. This function returns gan for convenience.

Tools

ADCME.cleanMethod.
clean()

Cleans up the CustomOps directory.

ADCME.customopFunction.
customop(simple::Bool=false)

Creates a new custom operator. If simple=true, the custom operator only supports CPU and does not have gradients.

Example

julia> customop() # create an editable `custom_op.txt` file
[ Info: Edit custom_op.txt for custom operators
julia> customop() # after editing `custom_op.txt`, call it again to generate interface files.
ADCME.debugMethod.
debug(sess::PyObject, o::PyObject)

In the case a session run yields an error from the TensorFlow backend, this function can help print the exact error. For example, you might encounter InvalidArgumentError() with no detailed error information, and this function can be useful for debugging.

ADCME.doctorMethod.
doctor()

Reports health of the current installed ADCME package. If some components are broken, possible fix is proposed.

ADCME.installMethod.
install(s::String; force::Bool = false)

Install a custom operator via URL. s can be

  • A URL. ADCME will download the directory through git
  • A string. ADCME will search for the associated package on https://github.com/ADCMEMarket
ADCME.install_adeptFunction.
install_adept(force::Bool=false)

Install adept-2 library: https://github.com/rjhogan/Adept-2

ADCME.load_opMethod.
load_op(oplibpath::String, opname::String)

Loads the operator opname from library oplibpath.

load_op_and_grad(oplibpath::String, opname::String; multiple::Bool=false)

Loads the operator opname from library oplibpath; gradients are also imported. If multiple is true, the operator is assumed to have multiple outputs.

ADCME.load_system_opFunction.
load_system_op(s::String, oplib::String, grad::Bool=true)

Loads a custom operator from the CustomOps directory (shipped with ADCME instead of TensorFlow). For example,

s = "SparseOperator"
oplib = "libSO"
grad = true

this will direct Julia to find the library CustomOps/SparseOperator/libSO.dylib on macOS.

ADCME.nnuqMethod.
nnuq(H::Array{Float64,2}, invR::Union{Float64, Array{Float64,2}}, invQ::Union{Float64, Array{Float64,2}})

Returns the variance matrix for the Bayesian inversion.

The negative log likelihood function is

\[l(s) =\frac{1}{2} (y-h(s))^T R^{-1} (y-h(s)) + \frac{1}{2} s^T Q^{-1} s\]

The covariance matrix is computed by first linearizing $h(s)$

\[h(s)\approx h(s_0) + \nabla h(s_0) (s-s_0)\]

and then computing the second order derivative

\[V = \left(\frac{\partial^2 l}{\partial s^T\partial s}\right)^{-1} = (H^T R^{-1} H + Q^{-1})^{-1}\]

Note the result is independent of $s_0$, $y_0$, and only depends on $\nabla h(s_0)$

ADCME.registerMethod.
register(forward::Function, backward::Function; multiple::Bool=false)

Registers a function forward together with its back-propagation rule backward.

  • forward: takes $n$ inputs and outputs $m$ tensors. When $m>1$, the keyword multiple must be true.
  • backward: takes $\tilde m$ top gradients from the float/double output tensors of forward, the $m$ outputs of forward, and the $n$ inputs of forward. backward outputs $n$ gradients, one for each input of forward. When input $i$ of forward is not float/double, backward should return nothing for the corresponding gradient.

Example

forward = x->log(1+exp(x))
backward = (dy, y, x)->dy*(1-1/(1+y))
f = register(forward, backward)

test_jacobian(f::Function, x0::Array{Float64}; scale::Float64 = 1.0)

Tests the gradients of a vector function f: y, J = f(x), where y is a vector output and J is the Jacobian.

ADCME.xavier_initFunction.
xavier_init(size, dtype=Float64)

Returns a matrix of size size whose values are drawn from the Xavier initialization.

ADCME.compileMethod.
compile(s::String; force::Bool=false)

Compiles the library given by the path deps/s. If force is false, compile first checks whether the binary product exists. If it exists, compile returns 2. Otherwise, compile tries to build the binary product and returns 0 if successful; it returns 1 otherwise.

ADCME.test_gpuMethod.
test_gpu()

Tests the GPU utilities.

Base.precompileFunction.
precompile(force::Bool=true)

Compiles all the operators in formulas.txt.

ODE

ADCME.ode45Method.
ode45(y::Union{PyObject, Float64, Array{Float64}}, T::Union{PyObject, Float64}, 
            NT::Union{PyObject,Int64}, f::Function, θ::Union{PyObject, Missing}=missing)

Solves

\[\frac{dy}{dt} = f(y, t, \theta)\]

with a six-stage, fifth-order Runge-Kutta method.
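A minimal sketch, integrating dy/dt = -y on [0, 1] (the returned object is assumed to hold the whole trajectory):

f = (y, t, θ) -> -y
sol = ode45(1.0, 1.0, 100, f)    # y0 = 1.0, T = 1.0, NT = 100
run(sess, sol)[end]              # ≈ exp(-1), assuming sol stores all the intermediate states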

ADCME.rk4Method.
rk4(y::Union{PyObject, Float64, Array{Float64}}, T::Union{PyObject, Float64}, 
            NT::Union{PyObject,Int64}, f::Function, θ::Union{PyObject, Missing}=missing)

Solves

\[\frac{dy}{dt} = f(y, t, \theta)\]

with the fourth-order Runge-Kutta method.

ADCME.αschemeMethod.
αscheme(M::Union{SparseTensor, SparseMatrixCSC}, 
    C::Union{SparseTensor, SparseMatrixCSC}, 
    K::Union{SparseTensor, SparseMatrixCSC}, 
    Force::Union{Array{Float64}, PyObject}, 
    d0::Union{Array{Float64, 1}, PyObject}, 
    v0::Union{Array{Float64, 1}, PyObject}, 
    a0::Union{Array{Float64, 1}, PyObject}, 
    Δt::Array{Float64}; 
    solve::Union{Missing, Function} = missing,
    extsolve::Union{Missing, Function} = missing, 
    ρ::Float64 = 1.0)

Generalized α-scheme. $M u_{tt} + C u_{t} + K u = F$

Force must be an array of size n×p, where d0, v0, and a0 have size p. Δt is an array (variable time steps).

The generalized α scheme solves the equation by the time stepping

\[\begin{aligned} \bf d_{n+1} &= \bf d_n + h\bf v_n + h^2 \left(\left(\frac{1}{2}-\beta_2 \right)\bf a_n + \beta_2 \bf a_{n+1} \right)\\ \bf v_{n+1} &= \bf v_n + h((1-\gamma_2)\bf a_n + \gamma_2 \bf a_{n+1})\\ \bf F(t_{n+1-\alpha_{f_2}}) &= M \bf a _{n+1-\alpha_{m_2}} + C \bf v_{n+1-\alpha_{f_2}} + K \bf{d}_{n+1-\alpha_{f_2}} \end{aligned}\]

where

\[\begin{aligned} \bf d_{n+1-\alpha_{f_2}} &= (1-\alpha_{f_2})\bf d_{n+1} + \alpha_{f_2} \bf d_n\\ \bf v_{n+1-\alpha_{f_2}} &= (1-\alpha_{f_2}) \bf v_{n+1} + \alpha_{f_2} \bf v_n \\ \bf a_{n+1-\alpha_{m_2} } &= (1-\alpha_{m_2}) \bf a_{n+1} + \alpha_{m_2} \bf a_n\\ t_{n+1-\alpha_{f_2}} & = (1-\alpha_{f_2}) t_{n+1 + \alpha_{f_2}} + \alpha_{f_2}t_n \end{aligned}\]

Here the parameters are computed using

\[\begin{aligned} \gamma_2 &= \frac{1}{2} - \alpha_{m_2} + \alpha_{f_2}\\ \beta_2 &= \frac{1}{4} (1-\alpha_{m_2}+\alpha_{f_2})^2 \\ \alpha_{m_2} &= \frac{2\rho_\infty-1}{\rho_\infty+1}\\ \alpha_{f_2} &= \frac{\rho_\infty}{\rho_\infty+1} \end{aligned}\]

  • solve: users can provide a solver function solve(A, rhs) for solving Ax = rhs.
  • extsolve: similar to solve, but the signature has the form

extsolve(A, rhs, i)

This provides the users with more control, e.g., (time-dependent) Dirichlet boundary conditions. See Generalized α Scheme for details.

Note

In the case $u$ has a nonzero essential boundary condition $u_b$, we let $\tilde u=u-u_b$, then $M \tilde u_{tt} + C \tilde u_t + K u = F - K u_b - C \dot u_b$

αscheme_time(Δt::Array{Float64}; ρ::Float64 = 1.0)

Returns the integration time $t_{i+1-\alpha_{f_2}}$ between $[t_i, t_{i+1}]$ using the alpha scheme. If $\Delta t$ has length $n$, the output will also have length $n$.

ADCME.runge_kuttaFunction.
runge_kutta(f::Function, T::Union{PyObject, Float64}, 
            NT::Union{PyObject,Int64}, y::Union{PyObject, Float64, Array{Float64}}, θ::Union{PyObject, Missing}=missing; method::String="rk4")

Solves

\[\frac{dy}{dt} = f(y, t, \theta)\]

with Runge-Kutta method.

For example, the default solver, RK4, has the following numerical scheme per time step

\[\begin{aligned} k_1 &= \Delta t f(t_n, y_n, \theta)\\ k_2 &= \Delta t f(t_n+\Delta t/2, y_n + k_1/2, \theta)\\ k_3 &= \Delta t f(t_n+\Delta t/2, y_n + k_2/2, \theta)\\ k_4 &= \Delta t f(t_n+\Delta t, y_n + k_3, \theta)\\ y_{n+1} &= y_n + \frac{k_1}{6} +\frac{k_2}{3} +\frac{k_3}{3} +\frac{k_4}{6} \end{aligned}\]

Optimal Transport

ADCME.distFunction.
dist(x::Union{PyObject, Array{Float64}}, y::Union{PyObject, Array{Float64}}, order::Union{Int64, PyObject}=2)

Computes the distance function with norm order. dist returns a $n\times m$ matrix, where $x\in \mathbb{R}^{n\times d}$ and $y\in \mathbb{R}^{m\times d}$, and the return $M\in \mathbb{R}^{n\times m}$

\[M_{ij} = ||x_i - y_j||_{o}\]
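A minimal sketch (Euclidean distances between rows):

x = constant(rand(5, 3))
y = constant(rand(7, 3))
M = dist(x, y)       # a 5×7 matrix with M[i,j] = ‖x[i,:] - y[j,:]‖₂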
ADCME.dtwFunction.
dtw(s::Union{PyObject, Array{Float64}}, t::Union{PyObject, Array{Float64}}, 
    use_fast::Bool = false)

Computes the dynamic time warping (DTW) distance between two time series s and t. Returns the distance and the path. use_fast specifies whether the fast algorithm is used. Note the fast algorithm may not be accurate.

empirical_sinkhorn(x::Union{PyObject, Array{Float64}}, y::Union{PyObject, Array{Float64}}, dist::Function;
reg::Union{PyObject,Float64} = 1.0, iter::Int64 = 1000, tol::Float64 = 1e-9, method::String="sinkhorn")

Computes the empirical Wasserstein distance with the Sinkhorn algorithm. The implementation is adapted from https://github.com/rflamary/POT.

ADCME.sinkhornMethod.
sinkhorn(a::Union{PyObject, Array{Float64}}, b::Union{PyObject, Array{Float64}}, M::Union{PyObject, Array{Float64}};
reg::Float64 = 1.0, iter::Int64 = 1000, tol::Float64 = 1e-9, method::String="sinkhorn")

Computes the optimal transport with the Sinkhorn algorithm. The implementation is adapted from https://github.com/rflamary/POT.
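A minimal sketch (it is assumed here that the return value is the regularized transport cost):

a = ones(10) / 10          # source weights
b = ones(20) / 20          # target weights
M = rand(10, 20)           # cost matrix
d = sinkhorn(a, b, M, reg = 1.0)
run(sess, d)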

Misc