CpuId.CpuId
— Module
Module CpuId
Query information about and directly from your CPU.
CpuId._cpuid_vendor_id
— Constant
Map vendor string of type 'char[12]' provided by cpuid, eax=0x0, to a Julia symbol.
CpuId.CpuFeature
— Type
Tuple of cpuid leaf in eax, result register and bit, and a descriptive string.
This table is an edited combination of sources from the Wikipedia page on cpuid, sandpile.org, and of course Intel's 4670-page combined Architectures Software Developer Manual.
Expect this table to be incomplete and improvable.
CpuId.__cachesize_level
— Method
Helper function to determine the cache size for a given subleaf sl on Intel or AMD Extended.
CpuId.__datacachesize
— Method
Helper function that performs the actual computation of the cache size with register values retrieved from cpuid on leaf 0x04.
Cache size information on leaf 0x04 is computed with size in bytes = (ways+1) * (partitions+1) * (linesize+1) * (sets+1), where ways = ebx[22:31], partitions = ebx[12:21], linesize = ebx[0:11], and sets = ecx[:].
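The formula above can be sketched in plain Julia as follows. This is a minimal illustration, not the package's implementation: the names bits and cachesize_from_registers are hypothetical, and the register values are made up to describe a cache with 8 ways, 1 partition, 64-byte lines and 64 sets.

```julia
# Extract bits lo..hi (inclusive, 0-based) from a 32-bit register value.
# Hypothetical helper, not part of CpuId.
bits(reg::UInt32, lo::Int, hi::Int) =
    (reg >> lo) & ((UInt32(1) << (hi - lo + 1)) - UInt32(1))

# Cache size in bytes from leaf 0x04 register values, following the formula above.
function cachesize_from_registers(ebx::UInt32, ecx::UInt32)
    ways       = Int(bits(ebx, 22, 31))
    partitions = Int(bits(ebx, 12, 21))
    linesize   = Int(bits(ebx, 0, 11))
    sets       = Int(ecx)
    return (ways + 1) * (partitions + 1) * (linesize + 1) * (sets + 1)
end

# Made-up register values: ways-1 = 7, partitions-1 = 0, linesize-1 = 63, sets-1 = 63.
ebx = 0x01c0003f
ecx = 0x0000003f
println(cachesize_from_registers(ebx, ecx))  # 8 * 1 * 64 * 64 = 32768
```

Each field stores the respective quantity minus one, which is why every factor is incremented before multiplying.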
CpuId._throw_unsupported_leaf
— Method
Helper function, tagged noinline so that it has no detrimental effect on performance.
CpuId.address_size
— Method
address_size()
Determine the maximum virtual address size supported by this CPU as reported by the cpuid instruction.
This information may be used to determine the number of high bits that can be used in a pointer for tagging; viz. sizeof(Int) - address_size() ÷ 8, which on most 64-bit Intel machines gives 2 bytes = 16 bits for other purposes.
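The arithmetic can be checked without querying the CPU. The sketch below assumes a 64-bit Julia session and hard-codes 48 bits as a stand-in for the value address_size() would return; both are assumptions for illustration.

```julia
# Assumed value: 48 stands in for the result of address_size().
virtual_address_bits = 48

# ÷ binds tighter than -, so this is sizeof(Int) - (48 ÷ 8).
spare_bytes = sizeof(Int) - virtual_address_bits ÷ 8
spare_bits  = 8 * spare_bytes

# On 64-bit Julia, sizeof(Int) == 8, leaving 2 bytes = 16 bits.
println((spare_bytes, spare_bits))
```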
CpuId.cacheinclusive
— Function
cacheinclusive()
cacheinclusive(lvl::Integer)
Obtain information on the CPU's data cache inclusiveness. Returns true for a cache that is inclusive of the lower cache levels, and false otherwise.
Determine the data cache inclusiveness for each cache level as reported by the CPU using a set of calls to the cpuid instruction. Returns a tuple with the tuple indices matching the cache levels.
If given an integer, then the data cache inclusiveness of the respective cache level will be returned. This is significantly faster than the tuple version above.
CpuId.cachelinesize
— Method
cachelinesize()
Query the CPU about the L1 data cache line size in bytes. This is typically 64 bytes. Returns zero if cache line size information is not available from the CPU.
CpuId.cachesize
— Function
cachesize()
cachesize(lvl::Integer)
Obtain information on the CPU's data cache sizes.
Determine the data cache size for each cache level as reported by the CPU using a set of calls to the cpuid instruction. Returns a tuple with the tuple indices matching the cache levels; sizes are given in bytes.
If given an integer, then the data cache size of the respective cache level will be returned. This is significantly faster than the tuple version above.
Note that these are total cache sizes; some cache levels are typically shared by multiple CPU cores, and the higher cache levels may include lower levels. To print the cache sizes in kbyte, use e.g. CpuId.cachesize() .÷ 1024.
This function throws an error if cache size detection is not supported.
CpuId.cpu_base_frequency
— Method
cpu_base_frequency()
Determine the CPU nominal base frequency in MHz as reported directly from the CPU through a cpuid instruction call. Returns zero if the CPU doesn't provide base frequency information.
The actual CPU frequency might be lower due to throttling, or higher due to frequency boosting (see cpu_max_frequency).
CpuId.cpu_bus_frequency
— Method
cpu_bus_frequency()
Determine the CPU bus frequency in MHz as reported directly from the CPU through a cpuid instruction call. Returns zero if the CPU doesn't provide bus frequency information.
CpuId.cpu_max_frequency
— Method
cpu_max_frequency()
Determine the maximum CPU frequency in MHz as reported directly from the CPU through a cpuid instruction call. The maximum frequency typically refers to the CPU's boost frequency. Returns zero if the CPU doesn't provide maximum frequency information.
CpuId.cpuarchitecture
— Method
cpuarchitecture()
This function tries to infer the CPU microarchitecture with a call to the cpuid instruction. For now, only Intel CPUs are supported according to the following table. Others are identified as :Unknown.
Table C-1 of Intel's Optimization Reference Manual:
Family_Model | Microarchitecture |
---|---|
06_4EH, 06_5EH | Skylake |
06_3DH, 06_47H, 06_56H | Broadwell |
06_3CH, 06_45H, 06_46H, 06_3FH | Haswell |
06_3AH, 06_3EH | Ivy Bridge |
06_2AH, 06_2DH | Sandy Bridge |
06_25H, 06_2CH, 06_2FH | Westmere |
06_1AH, 06_1EH, 06_1FH, 06_2EH | Nehalem |
06_17H, 06_1DH | Enhanced Intel Core |
06_0FH | Intel Core |
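To make the Family_Model notation concrete, the sketch below decodes the display family and model from the EAX value of cpuid leaf 0x01 and looks the pair up in an excerpt of the table above. It is an illustration under stated assumptions: family_model, MICROARCH_SKETCH and microarchitecture_sketch are hypothetical names, the table is deliberately partial, and the package may organise its lookup differently. The two EAX values are example processor signatures chosen for illustration.

```julia
# Decode (DisplayFamily, DisplayModel) from the EAX value of cpuid leaf 0x01.
function family_model(eax::UInt32)
    family    = Int((eax >> 8)  & 0x0f)
    model     = Int((eax >> 4)  & 0x0f)
    extmodel  = Int((eax >> 16) & 0x0f)
    extfamily = Int((eax >> 20) & 0xff)
    # The extended model applies to families 0x06 and 0x0f only.
    if family == 0x06 || family == 0x0f
        model += extmodel << 4
    end
    # The extended family applies to family 0x0f only.
    if family == 0x0f
        family += extfamily
    end
    return (family, model)
end

# Partial, hypothetical excerpt of the table above, keyed by (family, model).
const MICROARCH_SKETCH = Dict{Tuple{Int,Int},Symbol}(
    (0x06, 0x4e) => :Skylake,
    (0x06, 0x5e) => :Skylake,
    (0x06, 0x3c) => :Haswell,
    (0x06, 0x3a) => :IvyBridge,
    (0x06, 0x2a) => :SandyBridge,
)

microarchitecture_sketch(eax::UInt32) =
    get(MICROARCH_SKETCH, family_model(eax), :Unknown)

println(microarchitecture_sketch(0x000506e3))  # family 0x06, model 0x5e
println(microarchitecture_sketch(0x000306c3))  # family 0x06, model 0x3c
```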
CpuId.cpubrand
— Method
cpubrand()
Determine the CPU brand as a string as provided by the CPU through executing the cpuid instruction. This function throws if no CPU brand information is available from the CPU, which should never be the case on recent hardware.
CpuId.cpucores
— Method
cpucores()
Determine the number of physical cores on the currently executing CPU by invoking a cpuid instruction. On systems with multiple CPUs, this only gives information on the single CPU that is executing the code. Returns zero if querying this feature is not supported, which may also be due to a running hypervisor (as observed on hvvendor() == :Microsoft).
Also, this function does not take logical cores (aka hyperthreading) into account, but determines the true number of physical cores, which typically also share L3 caches and main memory bandwidth.
See also the Julia global variable Base.Sys.CPU_THREADS, which gives the total count of all logical cores on the machine.
CpuId.cpucycle
— Function
cpucycle()
Read the CPU's Time Stamp Counter, TSC, directly with a rdtsc instruction. This counter is increased for every CPU cycle, until reset. This function has, when inlined, practically no overhead and is, thus, probably the fastest method to count how many cycles the CPU has spent working since the last read.
Note, the TSC runs at a constant rate if cpufeature(:TSCINV)==true; otherwise, it is tied to the current CPU clock frequency.
Hint: This function is extremely efficient when inlined into your own code. Convince yourself by typing @code_native CpuId.cpucycle(). To use this for benchmarking, simply subtract the results of two calls. The result is the actual number of CPU clock cycles spent, independent of the current (and possibly non-constant) CPU clock frequency.
CpuId.cpucycle_id
— Function
cpucycle_id()
Read the CPU's Time Stamp Counter, TSC, and the executing CPU id directly with a rdtscp instruction. This function is similar to cpucycle(), but uses an instruction that also allows detecting whether the code has been moved to a different executing CPU. See also the comments for cpucycle(), which equally apply.
CpuId.cpufeature
— Function
cpufeature( feature::Symbol ) ::Bool
cpufeature( feature::CpuFeature ) ::Bool
Query the CPU whether it supports the given feature. For fast checking, directly provide the CpuFeature defined as a global const in CpuId. Explicitly typed CpuFeatures go by the same name as the corresponding symbols. Valid symbols are available from keys(CpuId.CpuFeatureDescription).
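A feature lookup of this kind boils down to testing one bit in one result register of a given cpuid leaf. The sketch below illustrates that idea only: FeatureSketch, hasbit and the two constants are hypothetical stand-ins rather than the package's CpuFeature type, and the register tuple is made up instead of being read from the CPU. The bit positions for SSE2 (leaf 0x01, EDX bit 26) and AVX (leaf 0x01, ECX bit 28) follow the usual cpuid layout.

```julia
# Hypothetical stand-in for a feature descriptor: cpuid leaf, index into the
# (EAX, EBX, ECX, EDX) result tuple, bit position, and a description.
struct FeatureSketch
    leaf::UInt32
    register::Int      # 1 = EAX, 2 = EBX, 3 = ECX, 4 = EDX
    bit::Int
    description::String
end

const SSE2_SKETCH = FeatureSketch(0x0000_0001, 4, 26, "Streaming SIMD Extensions 2")
const AVX_SKETCH  = FeatureSketch(0x0000_0001, 3, 28, "Advanced Vector Extensions")

# Test the feature bit in a register tuple obtained for `f.leaf`.
hasbit(regs::NTuple{4,UInt32}, f::FeatureSketch) =
    ((regs[f.register] >> f.bit) & 0x01) == 0x01

# Made-up leaf 0x01 result in which only EDX bit 26 is set.
regs = (0x0000_0000, 0x0000_0000, 0x0000_0000, UInt32(1) << 26)

println(hasbit(regs, SSE2_SKETCH))  # bit 26 of EDX is set
println(hasbit(regs, AVX_SKETCH))   # bit 28 of ECX is clear
```

Passing a pre-built descriptor avoids a name lookup at run time, which is presumably why the typed constants are the fast path.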
CpuId.cpufeaturedesc
— Method
cpufeaturedesc( feature::Symbol ) ::String
Get the textual description of a CPU feature flag given as a symbol.
CpuId.cpufeatures
— Method
cpufeatures() ::Vector{Symbol}
Get a list of symbols of all features supported by the CPU. The list might be extensive and not exactly useful other than for testing purposes. Also, this implementation is not efficient since each feature is queried independently.
CpuId.cpufeaturetable
— Method
cpufeaturetable() ::MarkdownString
Generate a markdown table of all the detected/available/supported CPU features along with some textual description.
CpuId.cpuinfo
— Method
cpuinfo()
Generate a markdown table with the results of all of the CPU querying functions provided by the module CpuId. Intended to give a quick overview for diagnostic purposes, e.g. in log files.
CpuId.cpumodel
— Method
cpumodel()
Obtain the CPU model information as a Dict of pairs of :Family, :Model, :Stepping, and :CpuType.
CpuId.cpunodes
— Method
cpunodes() -> Int
Determine the number of core complexes, aka nodes, on this processor. This notion was introduced by AMD, where L3 caches are shared among the cores of a complex.
CpuId.cputhreads
— Method
cputhreads()
Determine the number of logical cores on the currently executing CPU by invoking a cpuid instruction. On systems with multiple CPUs, this only gives information on the single CPU that is executing the code. Returns zero if querying this feature is not supported, which may also be due to a running hypervisor (as observed on hvvendor() == :Microsoft).
In contrast to cpucores(), this function also takes logical cores, aka hyperthreading, into account. For practical purposes, only I/O intensive code should make use of this total number of cores; memory or computation bound code will not benefit, but rather experience a detrimental effect.
See also Julia's global variable Base.Sys.CPU_THREADS, which gives the total count of all logical cores on the machine. Thus, Base.Sys.CPU_THREADS ÷ CpuId.cputhreads() should give you the number of CPUs (packages) in your system.
CpuId.cputhreads_per_core
— Method
cputhreads_per_core() -> Int
Determine the number of threads per hardware core on the currently executing CPU. A value larger than one indicates that simultaneous multithreading, aka SMT, aka Hyperthreading, is enabled.
CpuId.cpuvendor
— Method
cpuvendor()
Determine the CPU vendor as a Julia symbol. If the CPU vendor identification is unknown, :Unknown is returned (in that case, consider raising an issue on GitHub).
CpuId.cpuvendorstring
— Method
cpuvendorstring()
Determine the CPU vendor string as provided by the CPU by executing a cpuid instruction. Note, this string has a fixed length of 12 characters. Use cpuvendor() if you prefer getting a parsed Julia symbol.
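The fixed length of 12 characters follows from how the string is delivered: three 32-bit registers holding four bytes each. The sketch below assembles such a string from given register values and maps it to a symbol. The function names, the lookup table and the symbols :Intel and :AMD are assumptions for illustration; only the :Unknown fallback is taken from the description above.

```julia
# Assemble the 12-character vendor string from the EBX, EDX and ECX values
# of cpuid leaf 0x00 (in that register order, lowest byte first).
function vendorstring_sketch(ebx::UInt32, edx::UInt32, ecx::UInt32)
    bytes = UInt8[]
    for reg in (ebx, edx, ecx), shift in 0:8:24
        push!(bytes, (reg >> shift) % UInt8)   # truncate to the low byte
    end
    return String(bytes)
end

# Hypothetical lookup table; the symbols are assumptions for illustration.
const VENDOR_SKETCH = Dict("GenuineIntel" => :Intel, "AuthenticAMD" => :AMD)

# Unrecognised vendor strings fall back to :Unknown.
vendor_sketch(s::AbstractString) = get(VENDOR_SKETCH, s, :Unknown)

# Register values that spell "Genu", "ineI", "ntel".
s = vendorstring_sketch(0x756e6547, 0x49656e69, 0x6c65746e)
println(s, " => ", vendor_sketch(s))
```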
CpuId.has_cpu_frequencies
— Method
has_cpu_frequencies()
Determine whether the CPU provides clock frequency information. If true, then cpu_base_frequency(), cpu_max_frequency() and cpu_bus_frequency() should be expected to return sensible information.
CpuId.hasleaf
— Method
hasleaf(leaf::UInt32) ::Bool
Helper function (not exported) to test whether the CPU claims to provide the given leaf in a cpuid instruction call.
Note: It appears LLVM really knows its gear: If this function is inlined and just-in-time compiled, then this test is eliminated completely if the executing machine does support this feature. Yeah!
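One common way to perform such a test relies on the fact that cpuid reports the highest supported leaf of a range in EAX when queried with the first leaf of that range (0x0000_0000 for basic leaves, 0x8000_0000 for extended ones). The sketch below shows only that comparison; it is not the package's code. fake_max_leaf is a hypothetical stand-in for the real cpuid call and returns made-up values.

```julia
# Hypothetical stand-in for querying cpuid at the base of a leaf range.
# Returns a made-up "highest supported leaf" for the basic and extended ranges.
fake_max_leaf(base::UInt32) =
    base == 0x8000_0000 ? 0x8000_0008 : 0x0000_0016

# A leaf is available if it does not exceed the highest leaf of its range.
function hasleaf_sketch(leaf::UInt32)
    base = leaf & 0xffff_0000      # select the range the leaf belongs to
    return leaf <= fake_max_leaf(base)
end

println(hasleaf_sketch(0x0000_0004))  # within the made-up basic range
println(hasleaf_sketch(0x8000_001e))  # beyond the made-up extended range
```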
CpuId.hvinfo
— Method
hvinfo() ::MarkdownString
Generate a markdown table of all the detected/available/supported tags of a running hypervisor. If there is no hosting hypervisor, an empty markdown string is returned.
CpuId.hvvendor
— Method
hvvendor()
Determine the hypervisor vendor as a Julia symbol, or :NoHypervisor if not running a hypervisor. If the hypervisor vendor identification is unknown, :Unknown is returned (in that case, consider raising an issue on GitHub).
CpuId.hvvendorstring
— Method
hvvendorstring()
Determine the hypervisor vendor string as provided by the CPU by executing a cpuid instruction. Note, this string has a fixed length of 12 characters. Use hvvendor() if you prefer getting a parsed Julia symbol. If the CPU is not running a hypervisor, a string of undefined content will be returned.
CpuId.hvversion
— Method
hvversion()
Get a dictionary with additional information of the running hypervisor. The dictionary is empty if no hypervisor is detected, and only tags that are provided by the hypervisor are included.
Note, the data available is hypervisor vendor dependent.
CpuId.hypervised
— Method
hypervised()
Check whether the CPU reports running in a hypervisor context, that is, whether the current process runs in a virtual machine.
A positive answer may indicate that other information reported by the CPU is fake, such as number of physical and logical cores. This is because the hypervisor is free to decide which information to pass.
CpuId.perf_fix_bits
— Method
perf_fix_bits()
Determine the number of bits of the fixed-function performance counters on the executing CPU.
This information is only available if cpufeature(PDCM) == true.
CpuId.perf_fix_counters
— Method
perf_fix_counters()
Determine the number of fixed-function performance counters on the executing machine.
This information is only available if cpufeature(PDCM) == true.
CpuId.perf_gen_bits
— Method
perf_gen_bits()
Determine the number of bits of the general-purpose performance counters on the executing CPU.
This information is only available if cpufeature(PDCM) == true.
CpuId.perf_gen_counters
— Method
perf_gen_counters()
Determine the number of general-purpose performance counters on the executing CPU. The number of counters is given per logical processor.
This information is only available if cpufeature(PDCM) == true.
CpuId.perf_revision
— Method
perf_revision()
Determine the revision number of the performance monitoring unit.
This information is only available if cpufeature(PDCM) == true.
CpuId.physical_address_size
— Method
physical_address_size()
Determine the maximum physical address size supported by this CPU as reported by the cpuid instruction. Prefer to make use of address_size for practical purposes; use this only for diagnostic purposes, such as determining the theoretical maximum memory size. Also note that this address size may be manipulated by a running hypervisor.
CpuId.simdbits
— Method
simdbits()
Query the CPU on the maximum supported SIMD vector size in bits, or sizeof(Int) in bits if no SIMD capability is reported by the invoked cpuid instruction.
CpuId.simdbytes
— Method
simdbytes()
Query the CPU on the maximum supported SIMD vector size in bytes, or sizeof(Int) if no SIMD capability is reported by the invoked cpuid instruction.
CpuId.CpuInstructions
— Module
Module 'CpuInstructions'
The module 'CpuInstructions' is part of the package 'CpuId', and provides a selection of wrapped low-level assembly functions to diagnose potential computational efficiency issues.
Though primarily intended as a helper module to 'CpuId', the functions may be used directly in other code e.g. for benchmarking purposes. Just include the file directly, or copy & paste.
CpuId.CpuInstructions.cpuid
— Function
cpuid( [leaf], [subleaf]) ::NTuple{4, UInt32}
Invoke the CPU's hardware instruction cpuid with the values of the arguments stored as registers EAX = leaf, ECX = subleaf, respectively. Returns a tuple of the response of registers EAX, EBX, ECX, EDX. Input values may be given as individual UInt32 arguments, or converted from any Integer. Unspecified arguments are assumed zero.
This function is primarily intended as a low-level interface to the CPU.
Note: Expected to work on all CPUs that implement the assembly instruction cpuid, which is at least Intel and AMD.