Sample Size — Comparative Studies

A comparative study compares two or more groups to establish whether there are statistically significant differences between the groups with respect to some parameter(s).

Examples include control versus test groups (controlled store tests), users versus non-users of a brand, and before-and-after analyses.

The study tests the null hypothesis (H0) that the groups exhibit no differences in responses. The alternative or research hypothesis (HA) represents a difference between groups, i.e. rejection of the status quo.

This section focuses on sampling: given the non-null difference that the analyst wants to detect, what sample size is required to reach a conclusion at the specified levels of accuracy (α and β)?

Comparison of two means (independent): In addition to α and β (type I and type II errors), the sample size estimate depends on the standard deviation of the population and the smallest effect or non-null difference that the analyst wants to detect.

The formula for sample size for two groups of equal sample sizes:

$$ n = \frac {2 (z_α + z_β)^2 }{(δ/σ)^2} $$

δ = |μ0 − μ1| is the detectable difference in the mean.

σ: population standard deviation.

zα: standardized value associated with α, the level of significance.

zβ: standardized value associated with β, the probability of a type II error.

For α = .05 (two-sided), zα = 1.96; for β = .20, zβ = 0.84.

$$ M \;(\text{multiplier}) = 2 (z_α+z_β)^2 = 2(1.96 + 0.84)^2 = 15.68 ≈ 16 $$

$$ n=\frac{M}{∆^2},\;\;\;\text{where}\; ∆ =\frac{|μ_0- μ_1|}{σ}=δ/σ $$

Δ is the standardized difference between means, or the effect size (ES). It is measured in units of the standard deviation and represents the magnitude of difference in means.
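The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names `multiplier` and `n_per_group` are hypothetical, and the code assumes a two-sided test, which is what zα = 1.96 at α = .05 corresponds to. Because it uses unrounded z-values, the multiplier comes out near 15.70 rather than the 15.68 obtained from 1.96 and 0.84.

```python
import math
from statistics import NormalDist


def multiplier(alpha=0.05, power=0.80, groups=2):
    """M = groups * (z_alpha + z_beta)^2, assuming a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = .80
    return groups * (z_alpha + z_beta) ** 2


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Sample size per group for two independent means: n = M / (delta/sigma)^2."""
    effect_size = abs(delta) / sigma  # standardized difference
    return math.ceil(multiplier(alpha, power) / effect_size ** 2)


# Illustrative inputs (assumed): detect half a standard deviation.
print(round(multiplier(), 2))
print(n_per_group(delta=0.5, sigma=1.0))
```

Rounding the multiplier up to 16, as the text does, gives a slightly larger and therefore more cautious sample size.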

The multiplier, M, varies with power as depicted in Exhibit 33.7.

[Table: power (1 – β); multiplier for one sample; multiplier for two samples]

Exhibit 33.7 Multiplier for α =.05, for different power settings.
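Multipliers of this kind can be recomputed from the formula M = 2(zα + zβ)² for two samples, and half that for one sample. The sketch below assumes a two-sided test at α = .05; the function name `multipliers` and the power settings in the loop are illustrative choices, not necessarily those shown in the exhibit.

```python
from statistics import NormalDist


def multipliers(power, alpha=0.05):
    """Return (one-sample, two-sample) multipliers for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    one_sample = (z_alpha + z_beta) ** 2
    return one_sample, 2 * one_sample


# Illustrative power settings (assumed, not taken from the exhibit).
for power in (0.50, 0.80, 0.90, 0.95):
    one, two = multipliers(power)
    print(f"power={power:.2f}  one sample={one:5.2f}  two sample={two:5.2f}")
```

At power 0.80 the values round to 8 and 16, the conventional multipliers used in the rest of this section.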

The detectable difference, |μ0 − μ1|= σ√(M/n)

For the conventional setting of power = 0.8, M = 16:

$$ n=\frac{16}{∆^2}, \quad |μ_0- μ_1|=4σ/\sqrt{n} $$

This is the sample size for each of the two groups. If there is only one sample, the multiplier is 8, and n = 8/Δ².

Taking the earlier example of the weight of fresh army recruits, if the smallest difference of practical relevance is 400 g and the standard deviation is 3.2 kg, then:

$$ ∆=\frac{400}{3200}=0.125; \;n=\frac{8}{∆^2} =512 $$

We need a sample of 512 recruits to detect a difference of 400 g in weight at α = .05 and power 0.8.
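The recruit example can be checked numerically, along with the inverse calculation of the detectable difference σ√(M/n). This is a small sketch using the rounded one-sample multiplier of 8; the function names are hypothetical.

```python
import math


def one_sample_n(delta, sigma, multiplier=8):
    """n = M / (delta/sigma)^2, with M = 8 for one sample at power 0.8."""
    effect_size = delta / sigma
    return math.ceil(multiplier / effect_size ** 2)


def detectable_difference(n, sigma, multiplier=8):
    """Smallest detectable difference for a given n: sigma * sqrt(M / n)."""
    return sigma * math.sqrt(multiplier / n)


# Both weights are expressed in grams so that the units cancel.
n = one_sample_n(delta=400, sigma=3200)
diff = detectable_difference(n=512, sigma=3200)
print(n, diff)  # 512 400.0
```

The second function runs the formula in reverse: with 512 recruits and a standard deviation of 3.2 kg, the smallest detectable difference is 400 g.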

Comparison of two means (dependent): For paired observations such as before and after, the sample size formula is:

$$ n = \frac{(z_α + z_β)^2}{(δ/σ)^2} $$

Note the multiplier, M, has been halved, and, importantly, σ is the standard deviation of the differences within pairs, which is rarely known in advance.
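Since σ for the within-pair differences is rarely known, one common approach is to estimate it from a small pilot study. The sketch below illustrates that idea under stated assumptions: the pilot measurements are hypothetical values made up for illustration, the function name `paired_n` is not from the original, and a two-sided test is assumed.

```python
import math
from statistics import NormalDist, stdev


def paired_n(delta, sigma_diff, alpha=0.05, power=0.80):
    """Number of pairs: n = (z_alpha + z_beta)^2 / (delta/sigma_diff)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test assumed
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 / (delta / sigma_diff) ** 2)


# Hypothetical pilot data: the same six units measured before and after.
before = [52, 48, 55, 50, 47, 53]
after = [54, 49, 58, 50, 50, 55]
differences = [a - b for a, b in zip(after, before)]

sigma_diff = stdev(differences)  # sample standard deviation of the differences
n_pairs = paired_n(delta=1.0, sigma_diff=sigma_diff)
print(round(sigma_diff, 3), n_pairs)
```

A pilot this small gives only a rough estimate of σ, so the resulting sample size is best treated as a starting point.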

Comparison of two proportions: The sample size required to compare two proportions, p0 and p1:

$$ n = \frac{M p(1 - p)}{(p_0 - p_1)^2} \;\;\;\text{where } M \text{ is the multiplier and } p = (p_0+ p_1)/2 $$

Taking the conservative estimate p = 0.5 results in an upper limit on the required sample size:

$$n = \frac {M}{4×(p_0 - p_1)^2} $$

Furthermore, if power is set at the conventional level of 0.8, the multiplier M becomes 16:

$$n = \frac {4}{(p_0 - p_1)^2} $$

For small proportions (p < .05), use: n = 4/(√p0 − √p1)²
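The three proportion formulas can be compared side by side. The following is a minimal sketch using the rounded multiplier M = 16 (α = .05, power 0.8); the function names and the example proportions are assumptions chosen for illustration.

```python
import math


def n_two_proportions(p0, p1, multiplier=16):
    """n = M * p(1 - p) / (p0 - p1)^2, where p is the average of p0 and p1."""
    p_bar = (p0 + p1) / 2
    return math.ceil(multiplier * p_bar * (1 - p_bar) / (p0 - p1) ** 2)


def n_conservative(p0, p1, multiplier=16):
    """Upper limit obtained by setting p = 0.5: n = M / (4 * (p0 - p1)^2)."""
    return math.ceil(multiplier / (4 * (p0 - p1) ** 2))


def n_small_proportions(p0, p1):
    """Square-root form for small proportions: n = 4 / (sqrt(p0) - sqrt(p1))^2."""
    return math.ceil(4 / (math.sqrt(p0) - math.sqrt(p1)) ** 2)


# Illustrative proportions (assumed): 25% versus 50%.
print(n_two_proportions(0.25, 0.50))  # 60
print(n_conservative(0.25, 0.50))     # 64
```

The conservative estimate is never smaller than the general formula, because p(1 − p) peaks at 0.25 when p = 0.5.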
