Mean and variance are data descriptors. The mean
(μ), the average of an observed population, is computed as:
$$ \mu = \frac{\sum X}{N} $$
where Σ means “the sum of”, X denotes the individual items in the group,
and N is the number of items in the group.
Variance (σ²) is a measure of spread within the data: it quantifies how far
the numbers in the set lie from the mean. It is computed by taking the difference between
each number in the population and the mean, squaring these differences, and dividing the sum
of the squares by the number of values in the set, i.e.:
$$ \sigma^2 = \frac{\sum (X-\mu)^2}{N} $$
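The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `mean` and `population_variance` and the sample values in `data` are chosen here for the example and do not come from the text.

```python
def mean(values):
    """Arithmetic mean: the sum of the items divided by their count N."""
    return sum(values) / len(values)


def population_variance(values):
    """Mean squared deviation from the mean, dividing by N."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


# Illustrative values only (not one of the populations discussed in the text).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(mean(data))                 # 5.0
print(population_variance(data))  # 4.0
```

The squared deviations of `data` from its mean of 5 sum to 32, and dividing by N = 8 gives the variance of 4.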
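Consider the following populations and their descriptive statistics. Note that the variances
listed here correspond to dividing by N - 1 (the sample variance) rather than by N as in the
formula above; dividing by N gives variances of 8.25, 825 and 2 respectively.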
Population I: {1,2,3,4,5,6,7,8,9,10}
- mean (μ) = 5.5,
- variance (σ²) = 9.17,
- standard deviation (σ) = 3.03,
- relative standard deviation (σ') = σ/μ = 3.03/5.5 = 0.55
Population II: {10,20,30,40,50,60,70,80,90,100}
- μ = 55,
- σ² = 917,
- σ = 30.28,
- σ' = 0.55
Population III: {11,12,13,14,15}
- μ = 13,
- σ² = 2.5,
- σ = 1.58,
- σ' = 0.12
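The listed statistics can be checked with Python's standard `statistics` module. The sketch below is only an illustration, and the helper `describe` and its dictionary keys are names invented here. It computes the variance with both denominators, because the figures listed above (9.17, 917, 2.5) match the N - 1 form (`statistics.variance`), whereas dividing by N (`statistics.pvariance`) gives 8.25, 825 and 2.

```python
import statistics

populations = {
    "I": list(range(1, 11)),         # 1, 2, ..., 10
    "II": list(range(10, 101, 10)),  # 10, 20, ..., 100
    "III": [11, 12, 13, 14, 15],
}


def describe(values):
    """Return descriptive statistics for a list of numbers."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)  # square root of the N - 1 variance
    return {
        "mean": mu,
        "var_n": statistics.pvariance(values),         # divides by N
        "var_n_minus_1": statistics.variance(values),  # divides by N - 1
        "sd": sd,
        "relative_sd": sd / mu,
    }


for name, values in populations.items():
    stats = describe(values)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```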
The relative standard deviation (σ'), also known as the relative standard
error (RSE) and the coefficient of variation, is a scale-invariant measure of variability.
Note that each value in population II is 10 times the corresponding value in population I.
The data in these populations might be the same, but recorded in different units, such as
millimetres and centimetres. Due to the difference in scale, the variance and the standard
deviation of these populations differ substantially, but the relative standard deviation is
exactly the same. This is why, in the context of sampling, the relative standard deviation is
a more meaningful measure of spread.
Population III, on the other hand, exhibits a much lower relative standard
deviation, indicating a lower spread within this dataset.
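The scale invariance described above can be illustrated with a short sketch. It assumes, purely for illustration, that population II is recorded in millimetres and that the same measurements are re-expressed in centimetres. The variable names and the helper `relative_sd` are hypothetical, and the N - 1 denominator of `statistics.stdev` is used to match the listed values.

```python
import statistics


def relative_sd(values):
    """Standard deviation divided by the mean (N - 1 denominator)."""
    return statistics.stdev(values) / statistics.mean(values)


# Assumed units for illustration: the same measurements in mm and in cm.
population_mm = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
population_cm = [x / 10 for x in population_mm]
population_iii = [11, 12, 13, 14, 15]

# The standard deviations differ by the unit factor of 10 ...
print(statistics.stdev(population_mm) / statistics.stdev(population_cm))

# ... while the relative standard deviations agree.
print(relative_sd(population_mm), relative_sd(population_cm))

# Population III has a smaller relative spread than either.
print(relative_sd(population_iii))
```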
Note that if a variable Y = βX, then the variance Var(Y) = β²Var(X).
Therefore, if X ∈ Population I and Y ∈ Population II, then β = 10, and the
computed variances differ by a factor of β² = 100.
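This scaling rule follows directly from the definitions of the mean and the variance. A short derivation, writing μ_X and μ_Y for the means of X and Y (subscripts introduced here only for illustration):

$$ \mu_Y = \frac{\sum \beta X}{N} = \beta \mu_X $$

$$ \mathrm{Var}(Y) = \frac{\sum (\beta X - \beta \mu_X)^2}{N} = \beta^2 \, \frac{\sum (X - \mu_X)^2}{N} = \beta^2 \, \mathrm{Var}(X) $$

Taking square roots, the standard deviation scales by |β| while the mean scales by β, so for β > 0 the ratio σ/μ is unchanged. The same argument holds if the sum of squares is divided by N - 1 instead of N.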