API
OnlineStats.ADAGRAD
OnlineStats.ADAM
OnlineStats.ADAMAX
OnlineStats.Bootstrap
OnlineStats.BoundedEqualWeight
OnlineStats.CovMatrix
OnlineStats.Diff
OnlineStats.EqualWeight
OnlineStats.ExponentialWeight
OnlineStats.Extrema
OnlineStats.FitBeta
OnlineStats.FitCategorical
OnlineStats.FitCauchy
OnlineStats.FitGamma
OnlineStats.FitLogNormal
OnlineStats.FitMultinomial
OnlineStats.FitMvNormal
OnlineStats.FitNormal
OnlineStats.HarmonicWeight
OnlineStats.HyperLogLog
OnlineStats.KMeans
OnlineStats.LearningRate
OnlineStats.LearningRate2
OnlineStats.LinReg
OnlineStats.MAXSPGD
OnlineStats.MMXTX
OnlineStats.MV
OnlineStats.McclainWeight
OnlineStats.Mean
OnlineStats.Moments
OnlineStats.OrderStats
OnlineStats.QuantileISGD
OnlineStats.QuantileMM
OnlineStats.QuantileSGD
OnlineStats.ReservoirSample
OnlineStats.SPGD
OnlineStats.Series
OnlineStats.StatLearn
OnlineStats.StochasticLoss
OnlineStats.Sum
OnlineStats.Variance
LearnBase.value
OnlineStats.maprows
OnlineStats.replicates
OnlineStats.stats
StatsBase.confint
StatsBase.fit!
OnlineStats.Bootstrap
— Type.Bootstrap(s::Series, nreps, d, f = value)
Online Statistical Bootstrapping.
Create nreps replicates of the OnlineStat in Series s. When fit! is called, each of the replicates will be updated rand(d) times. Standard choices for d are Distributions.Poisson(), [0, 2], etc. value(b) returns f mapped to the replicates.
Example
b = Bootstrap(Series(Mean()), 100, [0, 2])
fit!(b, randn(1000))
value(b) # `f` mapped to replicates
mean(value(b)) # mean
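The replicate-update rule described above can be sketched in plain Julia without OnlineStats. This is only an illustration of the idea, not the library's implementation: `RunningMean`, `update!`, and `bootstrap_fit!` are hypothetical names introduced here, and the running mean stands in for whatever OnlineStat the Series holds.

```julia
# Minimal online-bootstrap sketch (hypothetical helpers, not the OnlineStats API).
mutable struct RunningMean
    μ::Float64
    n::Int
end
RunningMean() = RunningMean(0.0, 0)

function update!(o::RunningMean, y)
    o.n += 1
    o.μ += (y - o.μ) / o.n      # equal-weight running mean
    return o
end

# Every replicate sees every observation rand(d) times.
function bootstrap_fit!(reps::Vector{RunningMean}, data, d)
    for y in data, r in reps
        for _ in 1:rand(d)
            update!(r, y)
        end
    end
    return reps
end

reps = [RunningMean() for _ in 1:100]
bootstrap_fit!(reps, randn(1000), [0, 2])   # d = [0, 2]: skip or double-count
replicate_means = [r.μ for r in reps]       # analogous to value(b)
```

The spread of `replicate_means` gives a rough picture of the sampling variability of the mean.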
OnlineStats.Diff
— Type.Diff()
Track the difference and the last value.
Example
s = Series(randn(1000), Diff())
value(s)
OnlineStats.FitCategorical
— Type.FitCategorical(T)
Fit a categorical distribution where the inputs are of type T.
Example
using Distributions
s = Series(rand(1:10, 1000), FitCategorical(Int))
value(s)
vals = ["small", "medium", "large"]
s = Series(rand(vals, 1000), FitCategorical(String))
value(s)
OnlineStats.MV
— Type.MV(p, o)
Track p univariate OnlineStats o.
Example
y = randn(1000, 5)
o = MV(5, Mean())
s = Series(y, o)
OnlineStats.ReservoirSample
— Type.ReservoirSample(k)
ReservoirSample(k, Float64)
Reservoir sample of k items.
Example
o = ReservoirSample(100, Int) # k = 100 items
s = Series(o)
fit!(s, 1:10000)
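For readers unfamiliar with reservoir sampling, here is a minimal sketch of the classic "Algorithm R" in plain Julia. Whether ReservoirSample uses exactly this variant is an assumption. `reservoir_sample` is a hypothetical helper, not part of OnlineStats.

```julia
# Reservoir sampling sketch (Algorithm R); hypothetical helper, plain Julia.
function reservoir_sample(stream, k::Integer)
    sample = Vector{eltype(stream)}()
    for (t, y) in enumerate(stream)
        if t <= k
            push!(sample, y)      # fill the reservoir with the first k items
        else
            j = rand(1:t)         # uniform index in 1..t
            if j <= k
                sample[j] = y     # item t replaces a slot with probability k / t
            end
        end
    end
    return sample
end

res = reservoir_sample(1:10_000, 100)
```

Each item of the stream ends up in the sample with equal probability, and memory use stays fixed at k items.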
OnlineStats.Series
— Type.Series(stats...)
Series(data, stats...)
Series(weight, stats...)
Series(weight, data, stats...)
A Series is a container for a Weight and any number of OnlineStats. Updating the Series with fit!(s, data) will update the OnlineStats it holds according to its Weight.
Examples
Series(randn(100), Mean(), Variance())
Series(ExponentialWeight(.1), Mean())
s = Series(Mean())
fit!(s, randn(100))
s2 = Series(randn(123), Mean())
merge(s, s2)
OnlineStats.StatLearn
— Type.StatLearn(p, loss, penalty, λ, updater)
Fit a statistical learning model of p independent variables for a given loss, penalty, and λ. Arguments are:
loss: any Loss from LossFunctions.jl
penalty: any Penalty from PenaltyFunctions.jl
λ: a Float64 regularization parameter
updater: SPGD(), ADAGRAD(), ADAM(), or ADAMAX()
Example
using LossFunctions, PenaltyFunctions
x = randn(100_000, 10)
y = x * linspace(-1, 1, 10) + randn(100_000)
o = StatLearn(10, L2DistLoss(), L1Penalty(), .1, SPGD())
s = Series(o)
fit!(s, x, y)
coef(o)
predict(o, x)
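To give a feel for what a stochastic proximal gradient update with an L1 penalty does, the following sketch performs one step on a single observation. It assumes a squared-error loss of the form 0.5 * (x'β - y)^2 and a plain step size η; the scaling OnlineStats actually uses may differ. `soft_threshold` and `spgd_step!` are hypothetical names.

```julia
# One proximal gradient step with an L1 penalty (illustrative sketch only).
# Soft-thresholding is the proximal operator of s * |x|.
soft_threshold(x, s) = sign(x) * max(abs(x) - s, 0)

function spgd_step!(β, x, y, η, λ)
    g = (sum(x .* β) - y) .* x       # gradient of 0.5 * (x'β - y)^2 (assumed loss)
    for j in eachindex(β)
        # gradient step, then shrink toward zero by η * λ
        β[j] = soft_threshold(β[j] - η * g[j], η * λ)
    end
    return β
end

β = zeros(2)
spgd_step!(β, [1.0, 0.0], 1.0, 0.5, 0.1)
```

The shrinkage step is what drives small coefficients exactly to zero under an L1 penalty.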
OnlineStats.StochasticLoss
— Type. s = Series(randn(1000), StochasticLoss(QuantileLoss(.7)))
Minimize a loss (from LossFunctions.jl) using stochastic gradient descent.
Example
o1 = StochasticLoss(QuantileLoss(.7)) # approx. .7 quantile
o2 = StochasticLoss(L2DistLoss()) # approx. mean
o3 = StochasticLoss(L1DistLoss()) # approx. median
s = Series(randn(10_000), o1, o2, o3)
OnlineStats.Sum
— Type.Sum()
Track the overall sum.
Example
s = Series(randn(1000), Sum())
value(s)
OnlineStats.ADAGRAD
— Type.ADAGRAD(η)
Adaptive (element-wise learning rate) SPGD with step size η
OnlineStats.ADAM
— Type.ADAM(α1, α2, η)
Adaptive Moment Estimation with step size η and momentum parameters α1, α2
OnlineStats.ADAMAX
— Type.ADAMAX(α1, α2, η)
ADAMAX with step size η and momentum parameters α1, α2
OnlineStats.BoundedEqualWeight
— Type.BoundedEqualWeight(λ::Real = 0.1)
BoundedEqualWeight(lookback::Integer)
Use EqualWeight until threshold λ is hit, then hold constant.
Singleton weight at observation t is γ = max(1 / t, λ)
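The weight rule above can be written out directly. The function name `bounded_equal_weight` is hypothetical and only mirrors the formula given here.

```julia
# γ = max(1 / t, λ): behaves like 1 / t until it reaches λ, then stays at λ.
bounded_equal_weight(t, λ = 0.1) = max(1 / t, λ)

γs = [bounded_equal_weight(t) for t in 1:20]
```

With the default λ = 0.1, the weights fall as 1, 1/2, 1/3, ... for the first ten observations and remain at 0.1 afterwards.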
OnlineStats.CovMatrix
— Type.CovMatrix(d)
Covariance Matrix of d variables.
Example
y = randn(100, 5)
Series(y, CovMatrix(5))
OnlineStats.EqualWeight
— Type.EqualWeight()
Equally weighted observations
Singleton weight at observation t is γ = 1 / t
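To see why γ = 1 / t corresponds to equal weighting, consider a mean updated as μ += γ * (y - μ). That update form is an assumption made here for illustration. With γ = 1 / t it reproduces the ordinary sample mean. `running_mean` is a hypothetical helper.

```julia
# Assumed update rule for a weighted mean: μ += γ * (y - μ).
function running_mean(ys)
    μ = 0.0
    for (t, y) in enumerate(ys)
        γ = 1 / t             # equal weighting
        μ += γ * (y - μ)
    end
    return μ
end

ys = [2.0, 4.0, 6.0, 8.0]
running_mean(ys)              # agrees with sum(ys) / length(ys)
```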
OnlineStats.ExponentialWeight
— Type.ExponentialWeight(λ::Real = 0.1)
ExponentialWeight(lookback::Integer)
Exponentially weighted observations (constant)
Singleton weight at observation t is γ = λ
OnlineStats.Extrema
— Type.Extrema()
Maximum and minimum.
Example
s = Series(randn(100), Extrema())
value(s)
OnlineStats.FitBeta
— Type.FitBeta()
Online parameter estimate of a Beta distribution (Method of Moments)
Example
using Distributions, OnlineStats
y = rand(Beta(3, 5), 1000)
s = Series(y, FitBeta())
Beta(value(s)...)
OnlineStats.FitCauchy
— Type.FitCauchy()
Online parameter estimate of a Cauchy distribution
Example
using Distributions
y = rand(Cauchy(0, 10), 10_000)
s = Series(y, FitCauchy())
Cauchy(value(s)...)
OnlineStats.FitGamma
— Type.FitGamma()
Online parameter estimate of a Gamma distribution (Method of Moments)
Example
using Distributions
y = rand(Gamma(5, 1), 1000)
s = Series(y, FitGamma())
Gamma(value(s)...)
OnlineStats.FitLogNormal
— Type.FitLogNormal()
Online parameter estimate of a LogNormal distribution (MLE)
Example
using Distributions
y = rand(LogNormal(3, 4), 1000)
s = Series(y, FitLogNormal())
LogNormal(value(s)...)
OnlineStats.FitMultinomial
— Type.FitMultinomial(p)
Online parameter estimate of a Multinomial distribution.
Example
using Distributions
y = rand(Multinomial(10, [.2, .2, .6]), 1000)
s = Series(y', FitMultinomial(3))
Multinomial(value(s)...)
OnlineStats.FitMvNormal
— Type.FitMvNormal(d)
Online parameter estimate of a d-dimensional MvNormal distribution (MLE)
Example
using Distributions
y = rand(MvNormal(zeros(3), eye(3)), 1000)
s = Series(y', FitMvNormal(3))
OnlineStats.FitNormal
— Type.FitNormal()
Online parameter estimate of a Normal distribution (MLE)
Example
using Distributions
y = rand(Normal(-3, 4), 1000)
s = Series(y, FitNormal())
OnlineStats.HarmonicWeight
— Type.HarmonicWeight(a = 10.0)
Decreases at a slow rate
Singleton weight at observation t is γ = a / (a + t - 1)
OnlineStats.HyperLogLog
— Type.HyperLogLog(b) # 4 ≤ b ≤ 16
Approximate count of distinct elements.
Example
s = Series(rand(1:10, 1000), HyperLogLog(12))
OnlineStats.KMeans
— Type.KMeans(p, k)
Approximate K-Means clustering of k clusters of p variables
Example
using OnlineStats, Distributions
d = MixtureModel([Normal(0), Normal(5)])
y = rand(d, 100_000, 1)
s = Series(y, LearningRate(.6), KMeans(1, 2))
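As a rough picture of how an online K-Means update can work, the sketch below handles the one-dimensional case: each observation pulls its nearest center toward itself by a decaying step size. This is an assumed simplification rather than the exact algorithm behind KMeans, and `online_kmeans` is a hypothetical helper.

```julia
# Online k-means sketch for scalar data (illustrative only).
function online_kmeans(ys, centers; r = 0.6)
    c = float.(copy(centers))
    for (t, y) in enumerate(ys)
        j = findmin(abs.(c .- y))[2]   # index of the nearest center
        γ = 1 / t^r                    # decaying step size, as in LearningRate(r)
        c[j] += γ * (y - c[j])         # move that center toward y
    end
    return c
end

# Two well-separated groups at 0 and 5, starting from rough guesses.
centers = online_kmeans(repeat([0.0, 5.0], 500), [1.0, 4.0])
```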
OnlineStats.LearningRate
— Type.LearningRate(r = .6, λ = 0.0)
Mainly for stochastic approximation types (QuantileSGD, QuantileMM, etc.)
Decreases at a "slow" rate until threshold λ is reached
Singleton weight at observation t is γ = max(1 / t ^ r, λ)
OnlineStats.LearningRate2
— Type.LearningRate2(c = .5, λ = 0.0)
Mainly for stochastic approximation types (QuantileSGD, QuantileMM, etc.)
Decreases at a "slow" rate until threshold λ is reached
Singleton weight at observation t is γ = max(inv(1 + c * (t - 1)), λ)
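The two learning-rate formulas can be compared side by side in plain Julia. The function names below are hypothetical. The second formula is read as max(inv(1 + c * (t - 1)), λ), with the inner parenthesis closed before the comma.

```julia
# Weight sequences for the two learning-rate rules (defaults as documented).
learning_rate(t, r = 0.6, λ = 0.0)  = max(1 / t^r, λ)
learning_rate2(t, c = 0.5, λ = 0.0) = max(inv(1 + c * (t - 1)), λ)

for t in (1, 10, 100, 1000)
    println(t, "  ", learning_rate(t), "  ", learning_rate2(t))
end
```

Both start at 1 for the first observation. With the default parameters the first rule decays more slowly than the second for large t.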
OnlineStats.LinReg
— Type.LinReg(p)
LinReg(p, λ)
Create a linear regression object with p predictors and optional ridge (L2-regularization) parameter λ.
Example
x = randn(1000, 5)
y = x * linspace(-1, 1, 5) + randn(1000)
o = LinReg(5)
s = Series(o)
fit!(s, x, y)
coef(o)
predict(o, x)
coeftable(o)
vcov(o)
confint(o)
OnlineStats.MAXSPGD
— Type.MAXSPGD(η)
SPGD where only the largest gradient element is used to update the parameter.
OnlineStats.MMXTX
— Type.MMXTX(c)
Online MM algorithm via quadratic approximation. Approximates Lipschitz constant with x'x * c * I.
OnlineStats.McclainWeight
— Type.McclainWeight(ᾱ = 0.1)
"smoothed" version of
BoundedEqualWeight. Weights asymptotically approach ᾱ.
Singleton weight at observation t is γ(t-1) / (1 + γ(t-1) - ᾱ)
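The recursion above can be iterated to see the weights settle near ᾱ. The sketch assumes the sequence starts at γ(1) = 1, which the text does not state. `mcclain_weights` and `abar` are hypothetical names.

```julia
# McClain-style weight recursion: γ(t) = γ(t-1) / (1 + γ(t-1) - abar).
# Assumption: the first weight is γ(1) = 1.
function mcclain_weights(n, abar = 0.1)
    γ = 1.0
    γs = [γ]
    for t in 2:n
        γ = γ / (1 + γ - abar)
        push!(γs, γ)
    end
    return γs
end

γs = mcclain_weights(200)
```

The fixed point of the recursion is γ = abar, so the sequence decreases smoothly from 1 toward 0.1 here instead of hitting a hard floor.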
OnlineStats.Mean
— Type.Mean()
Univariate mean.
Example
s = Series(randn(100), Mean())
value(s)
OnlineStats.Moments
— Type.Moments()
First four non-central moments.
Example
s = Series(randn(1000), Moments())
value(s)
OnlineStats.OrderStats
— Type.OrderStats(b)
Average order statistics with batches of size b.
Example
s = Series(randn(1000), OrderStats(10))
value(s)
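"Average order statistics" can be read as: sort each batch of b observations, then average the sorted batches position by position. The sketch below follows that reading with equal weight per batch. It drops any incomplete trailing batch, which is an assumption. `average_order_stats` is a hypothetical helper.

```julia
# Average the sorted values of consecutive batches of size b (sketch).
function average_order_stats(ys, b::Integer)
    avg = zeros(b)
    nbatches = 0
    for start in 1:b:(length(ys) - b + 1)
        batch = sort(ys[start:start + b - 1])
        nbatches += 1
        avg .+= (batch .- avg) ./ nbatches   # running mean of the sorted batches
    end
    return avg
end

average_order_stats([3.0, 1.0, 2.0, 6.0, 4.0, 5.0], 3)   # batches [1,2,3] and [4,5,6]
```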
OnlineStats.QuantileISGD
— Type.QuantileISGD()
Approximate quantiles via implicit stochastic gradient descent.
Example
s = Series(randn(1000), LearningRate(.7), QuantileISGD())
value(s)
OnlineStats.QuantileMM
— Type.QuantileMM()
Approximate quantiles via an online MM algorithm.
Example
s = Series(randn(1000), LearningRate(.7), QuantileMM())
value(s)
OnlineStats.QuantileSGD
— Type.QuantileSGD()
Approximate quantiles via stochastic gradient descent.
Example
s = Series(randn(1000), LearningRate(.7), QuantileSGD())
value(s)
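A stochastic-gradient quantile estimate can be sketched in a few lines. The code below takes subgradient steps on the quantile ("pinball") loss for a single level τ with step size 1 / t^r. It is a generic illustration, not the exact update used by QuantileSGD, and `quantile_sgd` is a hypothetical helper.

```julia
# SGD on the quantile loss for level τ (illustrative sketch).
function quantile_sgd(ys, τ; r = 0.7)
    q = 0.0
    for (t, y) in enumerate(ys)
        γ = 1 / t^r                  # decaying step size
        q -= γ * ((y < q) - τ)       # subgradient: (1 - τ) if y < q, else -τ
    end
    return q
end

quantile_sgd(randn(10_000), 0.5)     # rough estimate of the median of N(0, 1)
```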
OnlineStats.SPGD
— Type.SPGD(η)
Stochastic Proximal Gradient Descent with step size η
OnlineStats.Variance
— Type.Variance()
Univariate variance.
Example
s = Series(randn(100), Variance())
value(s)
LearnBase.value
— Method.Map value
to the stats field of a Series.
OnlineStats.maprows
— Method.maprows(f::Function, b::Integer, data...)
Map rows of data in batches of size b. Most usage is done through do blocks.
Example
s = Series(Mean())
maprows(10, randn(100)) do yi
    fit!(s, yi)
    info("nobs: $(nobs(s))")
end
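The batching pattern behind a do-block call like the one above can be sketched for a plain vector. `map_batches` is a hypothetical stand-in, not maprows itself, and passing a shorter final batch through unchanged is an assumption.

```julia
# Apply f to consecutive batches of at most b elements (sketch).
function map_batches(f, b::Integer, data::AbstractVector)
    for start in 1:b:length(data)
        stop = min(start + b - 1, length(data))
        f(data[start:stop])          # the do-block body receives each batch
    end
    return nothing
end

batch_sizes = Int[]
map_batches(10, collect(1:25)) do chunk
    push!(batch_sizes, length(chunk))
end
```

Because the function argument comes first, Julia's do-block syntax supplies it as an anonymous function, which is why the call reads naturally as a loop over batches.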
OnlineStats.replicates
— Method.replicates(b)
Return the vector of replicates from Bootstrap b
OnlineStats.stats
— Method.Return the stats
field of a Series.
StatsBase.confint
— Function.confint(b, coverageprob = .95)
Return a confidence interval for a Bootstrap b.
StatsBase.fit!
— Method.fit!(s, y)
fit!(s, y, w)
Update a Series s with more data y and optional weighting w.
Examples
y = randn(100)
w = rand(100)
s = Series(Mean())
fit!(s, y[1]) # one observation: use Series weight
fit!(s, y[1], w[1]) # one observation: override weight
fit!(s, y) # multiple observations: use Series weight
fit!(s, y, w[1]) # multiple observations: override each weight with w[1]
fit!(s, y, w) # multiple observations: y[i] uses weight w[i]