OnlineStats provides statistics and data visualization for big or streaming data using online algorithms. Each algorithm:

  1. processes data one observation at a time.
  2. uses O(1) memory.
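
To make the two properties above concrete, here is a minimal sketch of an online mean in plain Julia. The `RunningMean` type and `update!` function are hypothetical names used only for illustration; they are not part of OnlineStats and are not meant to mirror its internals. The state is two numbers no matter how many observations arrive, and each observation is folded in once and then discarded.

```julia
# Hypothetical type for illustration only; not part of OnlineStats.
mutable struct RunningMean
    n::Int         # number of observations seen so far
    mean::Float64  # current estimate of the mean
end

RunningMean() = RunningMean(0, 0.0)

# Fold one observation into the estimate without storing past data.
function update!(s::RunningMean, x::Real)
    s.n += 1
    s.mean += (x - s.mean) / s.n  # incremental mean update
    return s
end

s = RunningMean()
for x in (1.0, 2.0, 3.0, 4.0)
    update!(s, x)
end
println(s.mean)  # prints 2.5
```

The same pattern (fixed-size state plus a per-observation update rule) is what allows a stat to run over data that does not fit in memory.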


1) Creating

  • Stats are subtypes of OnlineStat{T} where T is the type of a single observation.
julia> using OnlineStats

julia> m = Mean()
Mean: n=0 | value=0.0

julia> supertype(Mean)

2) Updating

  • Stats can be updated with a single observation or with multiple observations, e.g. fit!(m, 1) and fit!(m, [1,2,3]).
julia> y = randn(100);

julia> fit!(m, y)
Mean: n=100 | value=0.0530936

julia> value(m)

3) Merging

  • Stats can be merged.
julia> y2 = randn(100);

julia> m2 = fit!(Mean(), y2)
Mean: n=100 | value=-0.0307929

julia> merge!(m, m2)
Mean: n=200 | value=0.0111503

Some OnlineStats are not analytically mergeable. In these cases, you will see a warning that either no merging occurred or that the merge is approximate.
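
As an illustration of what "analytically mergeable" means, a mean can be merged exactly because the combined value depends only on each side's count and mean. The sketch below shows that arithmetic in plain Julia. `merge_means` is a hypothetical helper written for this example, not an OnlineStats function, and it is not presented as the package's internal implementation.

```julia
using Statistics  # standard library; used only for the reference `mean`

# Hypothetical helper: combine two (count, mean) summaries into one.
function merge_means(n1, μ1, n2, μ2)
    n = n1 + n2
    μ = μ1 + (n2 / n) * (μ2 - μ1)  # count-weighted average of the two means
    return n, μ
end

a = randn(100)
b = randn(50)

# Summarize each chunk separately, then merge the summaries.
n, μ = merge_means(length(a), mean(a), length(b), mean(b))

# The merged summary matches a single pass over all the data.
println(μ ≈ mean(vcat(a, b)))
```

Statistics whose state cannot be combined in closed form like this are the ones where a merge may be approximate or unavailable.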