OnlineStatsBase.CircBuff — Type
CircBuff(T, b; rev=false)
Create a fixed-length circular buffer of b items of type T.
rev=false: o[1] is the oldest.
rev=true: o[end] is the oldest.
Example
a = CircBuff(Int, 5)
b = CircBuff(Int, 5, rev=true)
fit!(a, 1:10)
fit!(b, 1:10)
a[1] == b[end] == 6   # oldest retained item; 1:5 have been overwritten
a[end] == b[1] == 10
value(o; ordered=false) # Retrieve values (no copy) without ordering
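To illustrate the overwrite-the-oldest behaviour described above, here is a minimal ring buffer in plain Julia. It is only a sketch: `Ring`, `push_item!`, and `ordered` are hypothetical names, and this is not how CircBuff itself is implemented.

```julia
# Hypothetical minimal ring buffer; not the package's CircBuff.
mutable struct Ring{T}
    data::Vector{T}
    capacity::Int
    head::Int      # index of the oldest item once the buffer is full
end
Ring(T, b) = Ring{T}(T[], b, 1)

function push_item!(r::Ring, x)
    if length(r.data) < r.capacity
        push!(r.data, x)                       # still filling up
    else
        r.data[r.head] = x                     # overwrite the oldest item
        r.head = mod1(r.head + 1, r.capacity)  # the next-oldest item becomes the head
    end
    return r
end

# Items from oldest to newest.
ordered(r::Ring) = [r.data[mod1(r.head + i - 1, length(r.data))] for i in 1:length(r.data)]

r = Ring(Int, 5)
foreach(x -> push_item!(r, x), 1:10)
ordered(r)
```

After ten insertions into a buffer of length five, only the last five items remain, with the oldest retained item first.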
OnlineStatsBase.CountMap — Type
CountMap(T::Type)
CountMap(dict::AbstractDict{T, Int})
Track a dictionary that maps each unique value to its number of occurrences. Similar to StatsBase.countmap.
Counts can be incremented by values other than one (and decremented) using the fit!(::CountMap{T}, ::Pair{T,Int}) method, e.g.
o = fit!(CountMap(String), ["A", "B"])
fit!(o, "A" => 5)
fit!(o, "A" => -1)
Example
o = fit!(CountMap(Int), rand(1:10, 1000))
value(o)
OnlineStatsBase.probs(o)
OnlineStats.pdf(o, 1)
collect(keys(o))
sort!(o)
delete!(o, 1)
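The idea of incrementing a count by an arbitrary (possibly negative) amount can be sketched with an ordinary Dict. The helper name `bump!` is hypothetical and this is not the package's implementation.

```julia
# Hypothetical helper: add `by` (possibly negative) to the count stored for `key`.
function bump!(counts::Dict{T,Int}, key::T, by::Int = 1) where {T}
    counts[key] = get(counts, key, 0) + by
    return counts
end

counts = Dict{String,Int}()
foreach(k -> bump!(counts, k), ["A", "B"])   # one occurrence each
bump!(counts, "A", 5)                        # increment by five
bump!(counts, "A", -1)                       # decrement by one
counts
```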
OnlineStatsBase.CountMissing — Type
CountMissing(stat)
Calculate a stat along with the count of missing values.
Example
o = CountMissing(Mean())
fit!(o, [1, missing, 3])
OnlineStatsBase.Counter — Type
Counter(T=Number)
Count the number of items in a data stream with elements of type T.
Example
fit!(Counter(Int), 1:100)
OnlineStatsBase.CovMatrix — Type
CovMatrix(p=0; weight=EqualWeight())
CovMatrix(::Type{T}, p=0; weight=EqualWeight())
Calculate a covariance/correlation matrix of p variables. If the number of variables is unknown, leave the default p=0.
Example
o = fit!(CovMatrix(), randn(100, 4) |> eachrow)
cor(o)
cov(o)
mean(o)
var(o)
OnlineStatsBase.EqualWeight — Type
EqualWeight()
Equally weighted observations.
$γ(t) = 1 / t$
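To see why γ(t) = 1 / t corresponds to equal weighting, the sketch below applies the update μ ← (1 - γ) μ + γ x with that weight and recovers the ordinary arithmetic mean. It is plain Julia with a hypothetical helper name (`running_mean`), not the package's implementation.

```julia
# Hypothetical helper, not part of OnlineStatsBase: one-pass mean using γ(t) = 1 / t.
function running_mean(xs)
    μ = 0.0
    for (t, x) in enumerate(xs)
        γ = 1 / t                  # equal-weight value for the t-th observation
        μ = (1 - γ) * μ + γ * x    # weighted average of the old estimate and the new value
    end
    return μ
end

running_mean([2.0, 4.0, 6.0, 8.0])
```

Because γ(1) == 1, the first observation fully replaces the starting value, so the initial 0.0 never influences the result.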
OnlineStatsBase.ExponentialWeight — Type
ExponentialWeight(λ::Float64)
ExponentialWeight(lookback::Int)
Exponentially weighted observations. Every observation receives the constant weight λ; when constructed from a lookback, λ = 2 / (lookback + 1).
ExponentialWeight does not satisfy the usual assumption that γ(1) == 1. Therefore, some statistics have an implicit starting value.
# E.g. Mean has an implicit starting value of 0.
o = Mean(weight=ExponentialWeight(.1))
fit!(o, 10)
value(o) == 1
$γ(t) = λ$
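The relationship between lookback and λ, and the effect of the implicit starting value, can be sketched in plain Julia. The names `lookback_to_λ` and `ew_mean` are hypothetical, and the starting value of 0 mirrors the Mean example above rather than any documented internals.

```julia
# Assumed helper name; mirrors the formula λ = 2 / (lookback + 1) stated above.
lookback_to_λ(lookback::Int) = 2 / (lookback + 1)

# Exponentially weighted mean with an implicit starting value of 0.
function ew_mean(xs, λ)
    μ = 0.0
    for x in xs
        μ = (1 - λ) * μ + λ * x    # constant weight λ for every observation
    end
    return μ
end

λ = lookback_to_λ(19)      # 2 / 20
ew_mean([10.0], λ)         # a single observation of 10 is pulled toward the starting value 0
```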
OnlineStatsBase.Extrema — Type
Extrema(T::Type = Float64)
Extrema(min_init::T, max_init::T)
Maximum and minimum (and number of occurrences for each) for a data stream of type T.
Example
Extrema(Float64) == Extrema(Inf, -Inf)
o = fit!(Extrema(), rand(10^5))
extrema(o)
maximum(o)
minimum(o)
OnlineStatsBase.ExtremeValues — Type
OnlineStatsBase.FTSeries — Type
Deprecated! See FilterTransform.
FTSeries(stats...; filter=x->true, transform=identity)
Track multiple stats for one data stream that is filtered and transformed before being fitted.
FTSeries(T, stats...; filter, transform)
Create an FTSeries and specify the type T of the pre-transformed values.
Example
o = FTSeries(Mean(), Variance(); transform=abs)
fit!(o, -rand(1000))
# Remove missing values represented as DataValues
using DataValues
y = DataValueArray(randn(100), rand(Bool, 100))
o = FTSeries(DataValue, Mean(); transform=get, filter=!isna)
fit!(o, y)
# Remove missing values represented as Missing
y = [rand(Bool) ? rand() : missing for i in 1:100]
o = FTSeries(Union{Missing,Number}, Mean(); filter=!ismissing)
fit!(o, y)
# Alternatively for Missing:
fit!(Mean(), skipmissing(y))
OnlineStatsBase.FilterTransform — Type
FilterTransform(stat::OnlineStat{S}, T = S; filter = x->true, transform = identity)
FilterTransform(T => filter => transform => stat)
Wrapper around an OnlineStat that filters and transforms its input. Note that, depending on your transformation, you may need to specify the type of a single observation (T).
Examples
o = FilterTransform(Mean(), Union{Missing,Number}, filter=!ismissing)
fit!(o, [1, missing, 3])
o = FilterTransform(String => (x->true) => (x->parse(Int,x)) => Mean())
fit!(o, "1")
OnlineStatsBase.Group — Type
Group(stats::OnlineStat...)
Group(; stats...)
Group(collection)
Create a vector-input stat from several scalar-input stats. For a new observation y, y[i] is sent to stats[i].
Examples
x = randn(100, 2)
fit!(Group(Mean(), Mean()), eachrow(x))
fit!(Group(Mean(), Variance()), eachrow(x))
o = fit!(Group(m1 = Mean(), m2 = Mean()), eachrow(x))
o.stats.m1
o.stats.m2
OnlineStatsBase.GroupBy — Type
GroupBy(T, stat)
Update stat for each group (of type T). A single observation is either a (named) tuple with two elements or a Pair.
Example
x = rand(Bool, 10^5)
y = x .+ randn(10^5)
fit!(GroupBy(Bool, Series(Mean(), Extrema())), zip(x,y))
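The per-group update can be pictured as a dictionary from group key to a running statistic. The following plain-Julia sketch keeps a (count, mean) pair per group; `group_means` is a hypothetical name and this is not the package's implementation.

```julia
# Hypothetical sketch: per-group (count, mean) kept in a Dict, updated one observation at a time.
function group_means(obs)
    stats = Dict{Any,Tuple{Int,Float64}}()
    for (g, y) in obs                    # works for two-element tuples and for Pairs
        n, μ = get(stats, g, (0, 0.0))
        n += 1
        μ += (y - μ) / n                 # equal-weight running mean for this group
        stats[g] = (n, μ)
    end
    return stats
end

res = group_means(zip([true, false, true], [1.0, 10.0, 3.0]))
```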
OnlineStatsBase.HarmonicWeight — Type
HarmonicWeight(a = 10.0)
Weight determined by harmonic series.
$γ(t) = a / (a + t - 1)$
OnlineStatsBase.LearningRate — Type
LearningRate(r = .6)
Slowly decreasing weight. Satisfies the standard stochastic approximation assumption $∑ γ(t) = ∞, ∑ γ(t)^2 < ∞$ if $r ∈ (.5, 1]$.
$γ(t) = inv(t ^ r)$
OnlineStatsBase.LearningRate2 — Type
LearningRate2(c = .5)
Slowly decreasing weight.
$γ(t) = inv(1 + c * (t - 1))$
OnlineStatsBase.McclainWeight — Type
McclainWeight(α = .1)
Weight which decreases into a constant.
$γ(t) = γ(t-1) / (1 + γ(t-1) - α)$
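The weight formulas for HarmonicWeight, LearningRate, LearningRate2, and McclainWeight can be compared side by side with a few plain-Julia functions. All names below are hypothetical, and the McclainWeight recursion is assumed to start at γ(1) = 1, which the formula above does not state explicitly.

```julia
# Plain-Julia versions of the weight formulas listed above (hypothetical names).
harmonic(t; a = 10.0)      = a / (a + t - 1)
learning_rate(t; r = 0.6)  = inv(t ^ r)
learning_rate2(t; c = 0.5) = inv(1 + c * (t - 1))

# The McclainWeight formula is recursive; this sketch assumes γ(1) = 1.
function mcclain(t; α = 0.1)
    γ = 1.0
    for _ in 2:t
        γ = γ / (1 + γ - α)
    end
    return γ
end

# One row per t: (t, harmonic, learning_rate, learning_rate2, mcclain)
[(t, harmonic(t), learning_rate(t), learning_rate2(t), mcclain(t)) for t in (1, 2, 10, 100)]
```

Under that assumption every sequence starts at 1. The first three decay toward 0, while the recursion has γ = α as a fixed point, which matches the description "decreases into a constant".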
OnlineStatsBase.Mean — Type
Mean(T = Float64; weight=EqualWeight())
Track a univariate mean, stored as type T.
Example
@time fit!(Mean(), randn(10^6))
OnlineStatsBase.Moments — Type
Moments(; weight=EqualWeight())
First four non-central moments.
Example
o = fit!(Moments(), randn(1000))
mean(o)
var(o)
std(o)
skewness(o)
kurtosis(o)
OnlineStatsBase.RepeatingRange — Type
RepeatingRange(rng)
Range that repeats forever, e.g.
r = OnlineStatsBase.RepeatingRange(1:2)
r[1:5] == [1, 2, 1, 2, 1]
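One way to picture the repetition is modular indexing into the underlying range. The helper below is a hypothetical sketch, not the package's implementation.

```julia
# Hypothetical helper: index into a range as if it repeated forever.
repeating_getindex(rng, i::Integer) = rng[mod1(i, length(rng))]

[repeating_getindex(1:2, i) for i in 1:5]
```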
OnlineStatsBase.Series — Type
Series(stats)
Series(stats...)
Series(; stats...)
Track a collection of stats for one data stream.
Example
s = Series(Mean(), Variance())
fit!(s, randn(1000))
OnlineStatsBase.SkipMissing — Type
SkipMissing(stat)
Wrapper around an OnlineStat that will skip over missing values.
Example
o = SkipMissing(Mean())
fit!(o, [1, missing, 3])
OnlineStatsBase.Sum — Type
Sum(T::Type = Float64)
Track the overall sum.
Example
fit!(Sum(Int), fill(1, 100))
OnlineStatsBase.TryCatch — Type
TryCatch(stat; error_limit=1000, error_message_limit=90)
Wrap each call to fit! in a try-catch block and track the errors encountered (via CountMap). Only error_limit unique errors will be included in the CountMap. If a new error occurs after error_limit has been reached, it will be included in the CountMap as "Other". Only the first error_message_limit characters of each error message will be recorded.
Example
o = TryCatch(Mean())
fit!(o, [1, missing, 3])
OnlineStatsBase.errors(o)
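The error-tracking idea can be sketched without the package: wrap each update in try/catch and count truncated error messages in a Dict. The function name `fit_with_errors` is hypothetical, and the sketch omits the error_limit / "Other" behaviour for brevity.

```julia
# Hypothetical sketch: apply `f` to each item, counting error messages instead of throwing.
function fit_with_errors(f, xs; error_message_limit = 90)
    errors = Dict{String,Int}()
    for x in xs
        try
            f(x)
        catch err
            # Keep only the first `error_message_limit` characters of the message.
            msg = first(sprint(showerror, err), error_message_limit)
            errors[msg] = get(errors, msg, 0) + 1
        end
    end
    return errors
end

total = Ref(0.0)   # a Float64 accumulator; storing `missing` in it throws
errs = fit_with_errors(x -> (total[] += x), [1, missing, 3])
```

The failing observation is recorded and skipped, and the remaining observations are still processed.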
OnlineStatsBase.Variance — Type
Variance(T = Float64; weight=EqualWeight())
Univariate variance, tracked as type T.
Example
o = fit!(Variance(), randn(10^6))
mean(o)
var(o)
std(o)
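For intuition about how a variance can be tracked in a single pass, the sketch below uses Welford's algorithm, a standard one-pass method. Whether the package uses exactly this update is not stated here, and `running_mean_var` is a hypothetical name.

```julia
# Hypothetical helper: one-pass mean and unbiased variance (Welford's algorithm).
function running_mean_var(xs)
    n, μ, m2 = 0, 0.0, 0.0
    for x in xs
        n += 1
        δ = x - μ              # deviation from the old mean
        μ += δ / n             # updated mean
        m2 += δ * (x - μ)      # accumulate the sum of squared deviations
    end
    return μ, m2 / (n - 1)
end

running_mean_var([1.0, 2.0, 3.0, 4.0])
```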
Base.merge! — Method
merge!(stat1, stat2)
Merge stat2 into stat1 (supported by most OnlineStat types).
Example
a = fit!(Mean(), 1:10)
b = fit!(Mean(), 11:20)
merge!(a, b)
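Merging works because many statistics can be combined from their summaries alone. As a sketch, two equally weighted means can be merged from their counts and values without revisiting the data. The helper `merge_means` is hypothetical and only illustrates the idea.

```julia
# Hypothetical helper: combine two (count, mean) summaries without revisiting the data.
function merge_means(n1, μ1, n2, μ2)
    n = n1 + n2
    μ = μ1 + (n2 / n) * (μ2 - μ1)   # move μ1 toward μ2 in proportion to n2's share
    return n, μ
end

merge_means(10, 5.5, 10, 15.5)   # summaries of 1:10 and 11:20
```

The result matches the mean of 1:20 computed directly.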
OnlineStatsBase.smooth! — Method
smooth!(a, b, γ)
Update a in place by applying the smooth function elementwise with b.
OnlineStatsBase.smooth — Method
smooth(a, b, γ)
Weighted average of a and b with weight γ.
$(1 - γ) * a + γ * b$
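Both smooth and smooth! follow directly from the formula above. The local re-implementations below use hypothetical names (`my_smooth`, `my_smooth!`) so they do not clash with the package, and they are a sketch rather than the package's source.

```julia
# Local re-implementation of the formula for illustration.
my_smooth(a, b, γ) = (1 - γ) * a + γ * b

# In-place, elementwise version: overwrite each a[i] with the smoothed value.
function my_smooth!(a, b, γ)
    for i in eachindex(a, b)
        a[i] = my_smooth(a[i], b[i], γ)
    end
    return a
end

my_smooth(0.0, 10.0, 0.25)          # a quarter of the way from 0 to 10
v = [0.0, 4.0]
my_smooth!(v, [10.0, 8.0], 0.5)     # midpoint of each pair
```

With γ = 0 the result is a, and with γ = 1 the result is b.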
OnlineStatsBase.smooth_syr! — Method
smooth_syr!(A::AbstractMatrix, x, γ::Number)
Weighted average of symmetric rank-1 update. Updates the upper triangle of:
A = (1 - γ) * A + γ * x * x'
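A loop-based sketch of the same update, touching only the upper triangle, might look as follows. `my_smooth_syr!` is a hypothetical name and this is not the package's implementation.

```julia
# Sketch only: smooth A toward x * x' on the upper triangle (including the diagonal).
function my_smooth_syr!(A::AbstractMatrix, x, γ::Number)
    for j in axes(A, 2), i in 1:j
        A[i, j] = (1 - γ) * A[i, j] + γ * x[i] * x[j]
    end
    return A
end

A = zeros(2, 2)
my_smooth_syr!(A, [1.0, 2.0], 0.5)
```

Entries below the diagonal are left untouched, so a caller that needs the full symmetric matrix has to mirror the upper triangle.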
OnlineStatsBase.value — Method
value(stat::OnlineStat)
Calculate the value of stat from its "sufficient statistics".
StatsBase.fit! — Method
fit!(stat1::OnlineStat, stat2::OnlineStat)
Alias for merge!. Merges stat2 into stat1.
Useful for reductions of OnlineStats using fit!.
Example
julia> v = [reduce(fit!, [1, 2, 3], init=Mean()) for _ in 1:3]
3-element Vector{Mean{Float64, EqualWeight}}:
Mean: n=3 | value=2.0
Mean: n=3 | value=2.0
Mean: n=3 | value=2.0
julia> reduce(fit!, v, init=Mean())
Mean: n=9 | value=2.0
StatsBase.fit! — Method
fit!(stat::OnlineStat, data)
Update the "sufficient statistics" of a stat with more data. If typeof(data) is not the type of a single observation for the provided stat, fit! will attempt to iterate through and fit! each item in data. Therefore, fit!(Mean(), 1:10) translates roughly to:
Example
o = Mean()
for x in 1:10
    fit!(o, x)
end