Here is the Python function I used to calculate the mean and variance of the stream. I couldn't find any existing implementations, so I wrote my own using exponential smoothing.
def streaming_average_and_variance(new_value, existing_aggregate, exponential_decay_over_n_events=1000):
    (old_mean, old_var) = existing_aggregate
    new_mean = ((old_mean * (exponential_decay_over_n_events - 1)) + new_value) / exponential_decay_over_n_events
    new_var = ((old_var * (exponential_decay_over_n_events - 1)) + pow(new_value - new_mean, 2)) / exponential_decay_over_n_events
    return new_mean, new_var
It is applied to stream_data, an iterable of numeric values, like this:
streaming_statistics = list()
streaming_aggregate = (0, 0)
for value in stream_data:
    streaming_aggregate = streaming_average_and_variance(value, streaming_aggregate, 1000)
    streaming_statistics.append(streaming_aggregate)
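For anyone who wants to try this out, here is a minimal self-contained sketch of the same update rule run against synthetic data. The Gaussian test stream (mean 5, standard deviation 2, 20,000 samples, fixed seed) is an assumption made purely for illustration and is not the real stream; the short local name n is also just a convenience.

```python
import random


def streaming_average_and_variance(new_value, existing_aggregate, exponential_decay_over_n_events=1000):
    # Same exponentially smoothed update as in the function above.
    old_mean, old_var = existing_aggregate
    n = exponential_decay_over_n_events
    new_mean = (old_mean * (n - 1) + new_value) / n
    new_var = (old_var * (n - 1) + (new_value - new_mean) ** 2) / n
    return new_mean, new_var


# Hypothetical test stream: Gaussian samples with mean 5 and standard deviation 2.
random.seed(0)
stream_data = [random.gauss(5.0, 2.0) for _ in range(20000)]

streaming_statistics = []
streaming_aggregate = (0, 0)
for value in stream_data:
    streaming_aggregate = streaming_average_and_variance(value, streaming_aggregate, 1000)
    streaming_statistics.append(streaming_aggregate)

early_mean, early_var = streaming_statistics[99]   # estimate after 100 events
final_mean, final_var = streaming_statistics[-1]   # estimate after 20,000 events

print("after 100 events:  ", early_mean, early_var)
print("after 20000 events:", final_mean, final_var)  # should land near 5 and 4
```

Because the aggregate starts at (0, 0), the early estimates are pulled toward zero and only approach the true values after several multiples of exponential_decay_over_n_events. If that matters, one option is to seed the aggregate with the first observed value instead of zero.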
David