CheapThreads

CheapThreads.UnsignedIteratorEarlyStop (Type)
UnsignedIteratorEarlyStop(thread_mask[, num_threads = count_ones(thread_mask)])

An iterator that returns pairs (i,t) of type Tuple{UInt32,UInt32}, where i counts 1,2,...,num_threads and t is the thread id to pass to ThreadingUtilities.taskpointer.
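To make the (i,t) pairs concrete, here is a minimal, dependency-free sketch of walking the set bits of a mask and stopping after num_threads of them. The helper name mask_pairs is hypothetical and not part of CheapThreads, and the mapping from bit position to thread id (bit k, counting from 0, stands for id k + 1) is an assumption made only for illustration. Unlike the real iterator, this sketch returns nothing at all for an empty mask.

```julia
# Hypothetical helper, NOT part of CheapThreads: collect the (i, t) pairs for a mask.
# Assumption: bit k (counting from 0) of the mask stands for thread id k + 1.
function mask_pairs(mask::Unsigned, num_threads::Integer = count_ones(mask))
    pairs = Tuple{UInt32,UInt32}[]
    i = zero(UInt32)
    # Stop early once num_threads ids have been produced, or when no bits remain.
    while i < num_threads && mask != zero(mask)
        tz = trailing_zeros(mask)             # position of the lowest set bit
        i += one(UInt32)                      # running counter 1, 2, ...
        push!(pairs, (i, (tz + 1) % UInt32))  # assumed id = bit position + 1
        mask &= mask - one(mask)              # clear the lowest set bit
    end
    return pairs
end
```

For example, mask_pairs(0x16) walks bits 1, 2 and 4 of 0b00010110, while mask_pairs(0x16, 2) stops after the first two.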

Unfortunately, codegen is suboptimal when the iterator is used in the ergonomic form, for (i,tid) ∈ thread_iterator. If you want to micro-optimize, you will get better performance from a manual loop like:

function sumk(u, l = count_ones(u) % UInt32)
    # Sum the thread ids selected by mask u, calling iterate by hand.
    uu = CheapThreads.UnsignedIteratorEarlyStop(u, l)
    s = zero(UInt32); state = CheapThreads.initial_state(uu)
    while true
        iter = iterate(uu, state)
        iter === nothing && break   # iterator exhausted
        (i, t), state = iter        # i: counter, t: thread id
        s += t
    end
    s
end

This iterator always iterates at least once, so check beforehand whether any threads are available and, if not, exit early with a single-threaded version.
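One way to structure that early exit is sketched below. The function dispatch and its serial and threaded callbacks are hypothetical placeholders, not CheapThreads API. The only point is that the thread count is checked before any iterator is constructed.

```julia
# Hypothetical driver, NOT part of CheapThreads.
# `serial` is a zero-argument fallback; `threaded` receives the mask and thread count.
function dispatch(serial, threaded, thread_mask::Unsigned)
    nthreads = count_ones(thread_mask) % UInt32
    # No worker threads in the mask: never build the iterator, run serially instead.
    nthreads == zero(UInt32) && return serial()
    return threaded(thread_mask, nthreads)
end
```

Called as dispatch(() -> :serial, (m, n) -> (:threaded, n), 0x00), this takes the serial branch. With mask 0x05 it hands two threads to the threaded callback.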