Welcome!
OnlineStats provides statistics and data visualization for big and streaming data using online algorithms. Each algorithm:
- processes data one observation at a time.
- uses O(1) memory.
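To make these two properties concrete, here is a minimal sketch of an online mean in plain Julia. It does not use OnlineStats: the RunningMean type and update! function are hypothetical names chosen for illustration, and the package's own implementation may differ.

```julia
# Hypothetical illustration type, not part of OnlineStats.
mutable struct RunningMean
    n::Int         # number of observations seen so far
    mean::Float64  # current estimate of the mean
end

RunningMean() = RunningMean(0, 0.0)

# Fold a single observation into the state. Only the count and the
# current mean are kept, so memory use does not grow with the data.
function update!(s::RunningMean, x::Real)
    s.n += 1
    s.mean += (x - s.mean) / s.n
    return s
end

s = RunningMean()
for x in (2.0, 4.0, 6.0)
    update!(s, x)   # one observation at a time
end
println(s.mean)  # 4.0
```

The state is two numbers regardless of how many observations arrive, which is what O(1) memory means here.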
Basics
1) Creating
- Stats are subtypes of OnlineStat{T}, where T is the type of a single observation.
julia> using OnlineStats
julia> m = Mean()
Mean: n=0 | value=0.0
julia> supertype(Mean)
OnlineStat{Number}
2) Updating
- Stats can be updated with single or multiple observations, e.g. fit!(m, 1) and fit!(m, [1,2,3]).
julia> y = randn(100);
julia> fit!(m, y)
Mean: n=100 | value=0.0802305
julia> value(m)
0.08023048634351752
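As a small check of the two calling styles, the sketch below fits one Mean observation by observation and another in a single call, then compares them. It assumes the OnlineStats package is installed and that nobs is available alongside fit! and value; the variable names a and b are only for illustration.

```julia
using OnlineStats

# Update with one observation per call.
a = Mean()
for x in [1, 2, 3]
    fit!(a, x)
end

# Update with a whole collection in one call.
b = fit!(Mean(), [1, 2, 3])

println(value(a) ≈ value(b))  # both routes should agree
println(nobs(a))              # number of observations seen by `a`
```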
3) Merging
- Stats of the same type can be merged with merge!, which folds the second stat into the first.
julia> y2 = randn(100);
julia> m2 = fit!(Mean(), y2)
Mean: n=100 | value=-0.0581861
julia> merge!(m, m2)
Mean: n=200 | value=0.0110222
Some OnlineStats are not analytically mergeable. In these cases, you will see a warning that either no merging occurred or that the merge is approximate.
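The mean is one case where an exact merge exists: two (count, mean) summaries combine into the summary of the pooled data by a count-weighted average. The sketch below shows that idea in plain Julia. The merge_means helper is a hypothetical name for illustration and is not part of OnlineStats, whose internals may differ.

```julia
# Hypothetical helper, not part of OnlineStats:
# combine two (count, mean) summaries into one.
function merge_means(n1::Integer, mean1::Real, n2::Integer, mean2::Real)
    n = n1 + n2
    n == 0 && return (0, 0.0)   # nothing observed on either side
    return (n, (n1 * mean1 + n2 * mean2) / n)
end

y1 = [1.0, 2.0]
y2 = [3.0, 4.0, 5.0]

# Summarize each chunk separately, then merge the summaries.
n, mu = merge_means(length(y1), sum(y1) / length(y1),
                    length(y2), sum(y2) / length(y2))
println((n, mu))  # (5, 3.0)
```

The merged result matches the mean of all five values computed directly, without revisiting the raw data from either chunk.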