Basic Statistics

ANOVA

Conceptually similar to the t-test, ANOVA (analysis of variance) tests whether the means of three or more groups are equal.

The test statistic is the F-ratio, which is essentially:

$$ F=\frac{\text{variance between groups}}{\text{variance within groups}} $$

If the between-groups variation is large compared to the within-groups variation (F-ratio ≫ 1), the data provide stronger evidence that the population means of the groups differ.

The test is called the F-test, and the p-value is the upper-tail probability of the observed F-ratio under the F-distribution.

Note: A calculator that converts an F-ratio to a p-value is provided on this webpage. The Data Analysis add-in in Excel also provides an easy-to-use facility for conducting the F-test.

H₀: μ₁ = μ₂ = μ₃ = μ₄ = … = μ_k

H_A: Not all population means are equal

α = 5% (usually)

k: number of groups

Group Means: x̄₁, x̄₂, x̄₃, x̄₄, …, x̄_k

Group Variances: s₁², s₂², s₃², s₄², …, s_k²

Group Sample Sizes: n₁, n₂, n₃, n₄, …, n_k

Total Sample Size: n_T = n₁ + n₂ + n₃ + n₄ + … + n_k

$$ \text{Average of All Samples: } \bar{\bar{x}} = \frac{\sum_{j=1}^k n_j \bar x_j}{\sum_{j=1}^k n_j} $$

$$ F=\frac{\text{variance between groups}}{\text{variance within groups}}=\frac{MS_{between}}{MS_{within}}=\frac{SS_{between}/(k-1)}{SS_{within}/(n_T-k)} $$

Here MS is the mean square, SS is the sum of squares, and k − 1 and n_T − k are the between-groups and within-groups degrees of freedom, respectively.

$$ SS_{between}=\sum_{j=1}^k n_j (\bar x_j - \bar{\bar{x}})^2 $$

$$ SS_{within}=\sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ji} - \bar x_j)^2 = \sum_{j=1}^k(n_j-1)s_j^2 $$

Low	Mid	Low Upper	High Upper
8	10	13	17
10	12	15	19
12	14	17	21
mean = 10	mean = 12	mean = 15	mean = 19

Exhibit 34.23 Household consumption of wine in litres/year, for 12 respondents, 3 in each income group — low, mid, low upper and high upper.
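The summary statistics defined earlier (group means, group variances, sample sizes and the average of all samples) can be computed directly from the exhibit. The following is a minimal Python sketch; the variable names (`consumption`, `grand_mean` and so on) are illustrative choices, not part of the source.

```python
# Wine consumption (litres/year) from Exhibit 34.23, keyed by income group.
consumption = {
    "Low": [8, 10, 12],
    "Mid": [10, 12, 14],
    "Low Upper": [13, 15, 17],
    "High Upper": [17, 19, 21],
}

k = len(consumption)                                  # number of groups
sizes = {g: len(x) for g, x in consumption.items()}   # n_j
means = {g: sum(x) / len(x) for g, x in consumption.items()}  # group means

# Sample variance s_j^2, with n_j - 1 in the denominator.
variances = {
    g: sum((v - means[g]) ** 2 for v in x) / (len(x) - 1)
    for g, x in consumption.items()
}

n_total = sum(sizes.values())  # n_T
# Average of all samples: group means weighted by group size.
grand_mean = sum(means[g] * sizes[g] for g in consumption) / n_total

print(k, n_total, grand_mean)
print(means)
print(variances)
```

For this data set every group has a sample variance of 4, and the average of all samples is 14.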

Example: Household consumption of wine in litres/year across various income groups is provided in Exhibit 34.23. With equal group sizes, the average of all samples is (10 + 12 + 15 + 19)/4 = 14.

$$ SS_{between}=\sum_{j=1}^k n_j (\bar x_j - \bar{\bar{x}})^2 $$
$$ \qquad=3\times(10-14)^2+3\times(12-14)^2+3\times(15-14)^2+3\times(19-14)^2 $$
$$ \qquad=48+12+3+75=138 $$

$$ SS_{within}=\sum_{j=1}^k (n_j-1)s_j^2,\quad s_j^2=\frac{1}{n_j-1} \sum_{i=1}^{n_j}(x_{ji}-\bar x_j)^2 $$
$$ SS_{within}= \sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ji} - \bar x_j)^2 $$
$$ \qquad=(8-10)^2+(10-10)^2+(12-10)^2 $$
$$ \qquad+(10-12)^2+(12-12)^2+(14-12)^2 $$
$$ \qquad+(13-15)^2+(15-15)^2+(17-15)^2 $$
$$ \qquad+(17-19)^2+(19-19)^2+(21-19)^2 $$
$$ \qquad= 32 $$

$$ F= \frac {138/(4-1)}{32/(12-4)}=\frac{46}{4}=11.5 $$
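The arithmetic of the worked example can be checked with a few lines of Python. This is a sketch only: the function name `one_way_anova` and its return layout are assumptions made for illustration, and it uses nothing beyond the sums-of-squares formulas given above.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, F) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between groups: size-weighted squared deviations of group means.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within groups: squared deviations of observations from their group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ss_within, ms_between / ms_within


# Exhibit 34.23: low, mid, low upper, high upper.
groups = [[8, 10, 12], [10, 12, 14], [13, 15, 17], [17, 19, 21]]
ss_between, ss_within, f_ratio = one_way_anova(groups)
print(ss_between, ss_within, f_ratio)  # 138.0 32.0 11.5
```

Where SciPy is installed, `scipy.stats.f_oneway` performs the same test on raw samples and also returns the p-value.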

For F = 11.5 with 3 and 8 degrees of freedom, p-value = 0.003 < α = 5%. Reject the null hypothesis.
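The p-value is the upper-tail area of the F-distribution beyond the observed F-ratio. The sketch below approximates it with the standard library only. It relies on the identity P(F ≥ f) = I_x(d₂/2, d₁/2) with x = d₂/(d₂ + d₁f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_upper_tail` and the step count are assumptions for illustration, and the routine assumes at least 2 within-groups degrees of freedom.

```python
import math


def f_upper_tail(f_ratio, df_between, df_within, steps=10_000):
    """Approximate P(F >= f_ratio) for an F(df_between, df_within) variable.

    Assumes f_ratio > 0 and df_within >= 2, so the integrand is finite at 0.
    """
    a, b = df_within / 2, df_between / 2
    x = df_within / (df_within + df_between * f_ratio)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, x]; steps must be even.
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    incomplete_beta = total * h / 3

    # Complete beta function B(a, b), used to normalise the integral.
    complete_beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return incomplete_beta / complete_beta


p_value = f_upper_tail(11.5, df_between=3, df_within=8)
print(round(p_value, 3))  # 0.003
```

In practice a library routine such as `scipy.stats.f.sf(11.5, 3, 8)` is the more usual choice. The hand-rolled integral is shown only to make the definition concrete.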

The data suggest that mean wine consumption differs significantly across household income levels.

Incidentally, regression analysis with dummy variables could be used instead of ANOVA to determine the size and direction of the differences in the mean values. For instance, for the previous example:

$$ Consumption = \alpha + \beta \times Income \, Class, $$

where α is the intercept and the coefficient β quantifies the effect size.
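One way to make this concrete is to code the four income classes as three 0/1 dummy variables, with the low-income group as the baseline. This coding, and the symbols D and β₁ to β₃, are assumptions introduced here for illustration rather than the author's stated model. Because such a model has one parameter per group, its least-squares estimates reproduce the group means of Exhibit 34.23:

$$ Consumption = \alpha + \beta_1 D_{mid} + \beta_2 D_{low\,upper} + \beta_3 D_{high\,upper} $$

$$ \hat\alpha = 10, \quad \hat\beta_1 = 12 - 10 = 2, \quad \hat\beta_2 = 15 - 10 = 5, \quad \hat\beta_3 = 19 - 10 = 9 $$

Each coefficient is the difference between that group's mean and the baseline mean, which gives both the size and the direction of the effect. The overall F-test of this regression (all β coefficients equal to zero) is equivalent to the one-way ANOVA F-test.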


Contact | Privacy Statement | Disclaimer: Opinions and views expressed on www.ashokcharan.com are the author’s personal views, and do not represent the official views of the National University of Singapore (NUS) or the NUS Business School | © Copyright 2013-2025 www.ashokcharan.com. All Rights Reserved.
