
What is Variance, Standard Deviation and Spread

The standard deviation (SD) is the most commonly used measure of the spread of values in a distribution. SD is calculated as the square root of the variance (the average squared deviation from the mean). Variance in a population is:

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{n} \]

[x is a value from the population, μ is the mean of all x, n is the number of x in the population, Σ is the summation]

Variance is usually estimated from a sample drawn from a population. The unbiased estimate of population variance calculated from a sample is:

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]

[x_i is the ith observation from a sample of the population, x-bar is the sample mean, n is the sample size, n - 1 is the degrees of freedom, Σ is the summation]
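The two formulas differ only in the divisor. The following is a minimal sketch of both calculations, assuming a small made-up data set (the values are hypothetical and chosen only so the arithmetic is easy to follow); it also cross-checks the results against Python's standard `statistics` module.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(values)
mean = sum(values) / n
sum_sq_dev = sum((x - mean) ** 2 for x in values)

# Population variance: divide by n (treats `values` as the whole population).
pop_var = sum_sq_dev / n
# Sample variance: divide by n - 1 (treats `values` as a sample).
sample_var = sum_sq_dev / (n - 1)

# SD is the square root of the corresponding variance.
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

# The standard library agrees with the hand calculation.
assert math.isclose(pop_var, statistics.pvariance(values))
assert math.isclose(sample_var, statistics.variance(values))

print(pop_var, pop_sd)  # 4.0 2.0
print(sample_var, sample_sd)
```

The sample estimate is always somewhat larger than the population formula applied to the same numbers, because the divisor n - 1 is smaller than n.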

The spread of a distribution is also referred to as dispersion and variability. All three terms mean the extent to which values in a distribution differ from one another.

SD is the best measure of spread of an approximately normal distribution. This is not the case when there are extreme values in a distribution or when the distribution is skewed; in these situations the interquartile range or semi-interquartile range are preferred measures of spread. The interquartile range is the difference between the 25th and 75th centiles. The semi-interquartile range is half of that difference. For any symmetrical (not skewed) distribution, half of its values lie within one semi-interquartile range either side of the median, i.e. in the interquartile range. When distributions are approximately normal, SD is a better measure of spread because it is less susceptible to sampling fluctuation than the (semi-)interquartile range.
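The effect of an extreme value on each measure can be illustrated with a small sketch. The data are hypothetical, and the quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`; other centile conventions exist and can give slightly different quartiles for the same data.

```python
import statistics

def iqr(data):
    """Interquartile range: 75th centile minus 25th centile."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1

def semi_iqr(data):
    """Semi-interquartile range: half of the interquartile range."""
    return iqr(data) / 2

# Hypothetical data: the second list replaces the largest value
# with an extreme one.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

# The interquartile range is unchanged by the extreme value ...
print(iqr(base), iqr(with_outlier))  # 4.0 4.0
# ... whereas the standard deviation increases roughly tenfold.
print(statistics.stdev(base), statistics.stdev(with_outlier))
```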

If a variable y is a linear transformation of x (y = a + bx), then the variance of y is b² times the variance of x and the standard deviation of y is |b| times the standard deviation of x. The constant a shifts every value equally and so has no effect on spread.
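A quick numerical check of the scaling rule, which is that variance scales by b² and standard deviation by the absolute value of b. This is only a sketch; the data and the constants a and b are arbitrary, and a negative b is used deliberately to show why the absolute value matters.

```python
import math
import statistics

# Hypothetical data and arbitrary transformation constants.
x = [1.0, 2.0, 4.0, 7.0]
a, b = 10.0, -3.0
y = [a + b * xi for xi in x]

var_x, var_y = statistics.variance(x), statistics.variance(y)
sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)

# Variance scales by b squared; the shift a drops out.
assert math.isclose(var_y, b ** 2 * var_x)
# SD scales by |b| (an SD is never negative, even when b is).
assert math.isclose(sd_y, abs(b) * sd_x)

print(var_x, var_y)  # 7.0 63.0
```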

The standard error of the mean is the standard deviation of the means of repeated samples of the same size drawn from a population. It is estimated from a single sample as:

\[ SE = \frac{s}{\sqrt{n}} \]

[s is the sample standard deviation, n is the sample size]
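The sketch below estimates the standard error from one sample as the sample standard deviation divided by the square root of the sample size. It then checks the underlying idea by simulation. The sample values are hypothetical, and the simulation assumes a normal population with mean 0 and SD 1, chosen purely for illustration.

```python
import math
import random
import statistics

# Hypothetical sample, for illustration only.
sample = [4.0, 8.0, 6.0, 5.0, 3.0, 7.0, 9.0, 6.0]
n = len(sample)
s = statistics.stdev(sample)  # sample SD (n - 1 divisor)
sem = s / math.sqrt(n)        # estimated standard error of the mean

# Simulation check under an assumed population: normal, mean 0, SD 1.
random.seed(0)
sample_size = 25
means = [
    sum(random.gauss(0.0, 1.0) for _ in range(sample_size)) / sample_size
    for _ in range(2000)
]
# SD of the simulated sample means; theory predicts 1 / sqrt(25) = 0.2.
sd_of_means = statistics.pstdev(means)

print(s, sem)
print(sd_of_means)
```

Because the standard error shrinks with the square root of n, quadrupling the sample size only halves it.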

Skewness describes the asymmetry of a distribution. A skewed distribution therefore has one tail longer than the other.

A positively skewed distribution has a longer tail to the right.

A negatively skewed distribution has a longer tail to the left.

A distribution with no skew (e.g. a normal distribution) is symmetrical.

In a perfectly symmetrical (non-skewed) distribution with a single peak, the mean, median and mode are equal. As a distribution becomes more skewed, the differences between these measures of central tendency get larger, with the mean pulled furthest towards the longer tail.
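A small sketch of how the three measures separate under skew, using two hypothetical data sets: one symmetrical and one with a longer right tail.

```python
import statistics

# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 3, 3, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 4, 5, 6, 11]

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(
        name,
        statistics.mean(data),
        statistics.median(data),
        statistics.mode(data),
    )
```

In the symmetrical set all three measures are 3. In the right-skewed set the mode is 2, the median 3 and the mean 4, so the mean lies furthest towards the long tail.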

Positively skewed distributions are more common than negatively skewed ones.

A coefficient of skewness for a sample is calculated by StatsDirect as:

where x_i is a sample observation, x-bar is the sample mean and n is the sample size.
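One widely used coefficient that involves only these symbols is the moment coefficient of skewness. It is the third central moment divided by the second central moment raised to the power 3/2, with both moments using divisor n. The sketch below implements that coefficient. It is an assumption that this matches the StatsDirect formula exactly, since some packages apply a small-sample bias correction. The data are hypothetical.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5, moments with divisor n.

    Assumed form; a given statistics package may apply a bias correction.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets.
right_tailed = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 10.0]
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
left_tailed = [-x for x in right_tailed]  # mirror image of right_tailed

print(skewness(right_tailed))  # positive: longer tail to the right
print(skewness(symmetric))     # 0.0: no skew
print(skewness(left_tailed))   # negative: longer tail to the left
```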

Skewed distributions can sometimes be “normalized” by transformation.
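A logarithmic transformation is one common choice for positively skewed, strictly positive data. The sketch below assumes the moment coefficient of skewness (third central moment over the second central moment to the power 3/2) and uses hypothetical data that double at each step, so their logarithms are evenly spaced and therefore symmetrical. Real data will rarely normalize this cleanly.

```python
import math

def skewness(data):
    """Moment coefficient of skewness (assumed form, divisor n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical positively skewed data: each value is double the previous one.
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [math.log(x) for x in raw]

print(skewness(raw))     # clearly positive
print(skewness(logged))  # approximately zero after the log transformation
```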