
Accurately computing running variance

http://www.johndcook.com/standard_deviation.html

The most direct way of computing sample variance or standard deviation can have severe numerical problems. [...] There is a way to compute variance that is more accurate and is guaranteed to always give positive results. Furthermore, the method computes a running variance. That is, the method computes the variance as the x's arrive one at a time. The data do not need to be saved for a second pass.
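The numerical problem mentioned above can be reproduced with a minimal sketch. It assumes "the most direct way" means the textbook one-pass formula that accumulates the sum and the sum of squares; the function name `naive_variance` and the sample data are made up for illustration.

```python
def naive_variance(xs):
    """Textbook one-pass formula: (sum(x^2) - (sum x)^2 / n) / (n - 1)."""
    n = 0
    sum_x = 0.0
    sum_sq = 0.0
    for x in xs:
        n += 1
        sum_x += x
        sum_sq += x * x  # squares near 1e18 lose their low-order digits
    return (sum_sq - sum_x * sum_x / n) / (n - 1)


# Illustrative data: small spread on top of a large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(naive_variance(data))
```

With these particular values and IEEE 754 double precision, the subtraction cancels almost every significant digit and the result comes out negative, so taking its square root with `math.sqrt` would raise an error.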

"This better way of computing variance goes back to a 1962 paper by B. P. Welford and is presented in Donald Knuth's Art of Computer Programming, Vol 2, page 232, 3rd edition. Although this solution has been known for decades, not enough people know about it. Most people are probably unaware that computing sample variance can be difficult until the first time they compute a standard deviation and get an exception for taking the square root of a negative number. It is not obvious that the method is correct even in exact arithmetic. It's even less obvious that the method has superior numerical properties, but it does."
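A minimal sketch of the running update usually attributed to Welford may make the description concrete. The class name `RunningStats` and its method names are hypothetical, and the choice to return 0.0 for fewer than two samples is an assumption, not something the quoted article is confirmed to specify.

```python
import math


class RunningStats:
    """Running mean and sample variance using a Welford-style update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n      # updated mean
        self.m2 += delta * (x - self.mean)  # old and new deviations share a sign

    def variance(self):
        """Sample variance (n - 1 denominator); 0.0 if fewer than two values."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

    def std_dev(self):
        return math.sqrt(self.variance())


stats = RunningStats()
for x in [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]:
    stats.push(x)
print(stats.mean, stats.variance(), stats.std_dev())
```

Each term added to `m2` is a product of two deviations with the same sign, which is why the accumulated value should not go negative, and no data needs to be stored between calls to `push`.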

A simple way to compute the running sample variance and standard deviation.

Computing mean, variance and standard deviation on a stream of data.