Reflection

Because CUDAnative.jl uses a different compilation toolchain, it provides counterparts to the code_ functions from Base:

The docstrings for CUDAnative.code_llvm and CUDAnative.code_ptx are missing from this build of the documentation.

CUDAnative.code_sass - Function
code_sass([io], f, types, cap::VersionNumber)

Prints the SASS code generated for the method matching the given generic function and type signature to io, which defaults to stdout.

The following keyword arguments are supported:

  • cap: which device to generate code for
  • kernel: treat the function as an entry-point kernel
  • verbose: enable verbose mode, which displays code generation statistics

See also: @device_code_sass

Convenience macros

For ease of use, CUDAnative.jl also implements @device_code_ macros that wrap the above reflection functionality. These macros evaluate the expression argument while tracing compilation, and then print or return the code for every CUDA kernel that was invoked. Note that this evaluation can have side effects, unlike the similarly named @code_ macros in Base, which do not evaluate their argument.
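As a minimal sketch of how these macros are typically combined with a kernel launch, the following wraps an @cuda call in @device_code_llvm. The kernel dummy_kernel is a hypothetical placeholder, and the example assumes a working CUDA-capable GPU. It also assumes the kernel has not already been compiled earlier in the session, since only kernels compiled during the evaluation are traced.

```julia
using CUDAnative

# Hypothetical no-op kernel, used only for illustration.
# GPU kernels must return nothing.
function dummy_kernel()
    return nothing
end

# Evaluates the launch (so the kernel really runs) and prints the LLVM IR
# of every kernel that gets compiled while doing so.
@device_code_llvm @cuda dummy_kernel()
```

The other macros described in this section, such as @device_code_typed or @device_code_warntype, should slot into the same position in front of the @cuda call.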

CUDAnative.@device_code_lowered - Macro
@device_code_lowered ex

Evaluates the expression ex and returns the result of InteractiveUtils.code_lowered for every compiled GPU kernel.

See also: InteractiveUtils.@code_lowered

CUDAnative.@device_code_typed - Macro
@device_code_typed ex

Evaluates the expression ex and returns the result of InteractiveUtils.code_typed for every compiled GPU kernel.

See also: InteractiveUtils.@code_typed

CUDAnative.@device_code_warntype - Macro
@device_code_warntype [io::IO=stdout] ex

Evaluates the expression ex and prints the result of InteractiveUtils.code_warntype to io for every compiled GPU kernel.

See also: InteractiveUtils.@code_warntype

CUDAnative.@device_code_llvm - Macro
@device_code_llvm [io::IO=stdout, ...] ex

Evaluates the expression ex and prints the result of InteractiveUtils.code_llvm to io for every compiled GPU kernel. For other supported keywords, see GPUCompiler.code_llvm.

See also: InteractiveUtils.@code_llvm

CUDAnative.@device_code - Macro
@device_code dir::AbstractString=... [...] ex

Evaluates the expression ex and dumps all intermediate forms of code to the directory dir.
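A short sketch of dumping all intermediate code for a launch into a scratch directory. The kernel dummy_kernel is a hypothetical placeholder, a CUDA-capable GPU is assumed, and the names of the files written to the directory are not specified here, so the example only lists whatever was produced.

```julia
using CUDAnative

# Hypothetical no-op kernel, used only for illustration.
function dummy_kernel()
    return nothing
end

# Temporary directory that receives the dumped code.
dir = mktempdir()

# Launches the kernel and writes every intermediate form of the
# compiled code into `dir`.
@device_code dir=dir @cuda dummy_kernel()

# Inspect which files were produced.
println(readdir(dir))
```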

CUDAnative.version - Function
version()

Returns the version of the CUDA toolkit in use.

version(k::HostKernel)

Queries the PTX and SM versions a kernel was compiled for. Returns a named tuple.

CUDAnative.maxthreads - Function
maxthreads(k::HostKernel)

Queries the maximum number of threads a kernel can use in a single block.

CUDAnative.memory - Function
memory(k::HostKernel)

Queries the local, shared and constant memory usage of a compiled kernel in bytes. Returns a named tuple.
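The kernel queries above could be used roughly as follows. This is a sketch under a few assumptions: dummy_kernel is a hypothetical placeholder, a CUDA-capable GPU is available, and the HostKernel object is obtained through cufunction with an empty argument type signature. The functions are called with the module prefix on the assumption that they are not exported.

```julia
using CUDAnative

# Hypothetical no-op kernel, used only for illustration.
function dummy_kernel()
    return nothing
end

# Compile the kernel without launching it; Tuple{} is the (empty)
# argument type signature. The result is a HostKernel object.
k = cufunction(dummy_kernel, Tuple{})

# Query properties of the compiled kernel.
@show CUDAnative.version(k)      # PTX and SM versions, as a named tuple
@show CUDAnative.maxthreads(k)   # maximum threads per block
@show CUDAnative.memory(k)       # local, shared and constant memory, in bytes
```

The value returned by maxthreads can serve as an upper bound when choosing the threads argument of a later launch.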