Covariance and Correlation

Covariance is a measure of the linear relationship between two variables. It tells us how much two variables vary together, and is computed as follows:

$$Cov(X,Y)=E[(X-μ_X)(Y-μ_Y)]=\sum_i \sum_j P(X=x_i,Y=y_j)(x_i-μ_X)(y_j-μ_Y)$$
$$=E(XY)-μ_X E(Y)-μ_Y E(X)+μ_X μ_Y$$
$$\mathbf{Cov(X,Y)=E(XY)-μ_X μ_Y}$$
$$Cov(X,X)=E[(X-μ_X)^2]=Var(X)$$
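As a rough illustration of the formulas above, the following Python sketch computes the covariance of two discrete variables in two ways: directly from the definition, and from the shortcut E(XY) − μ_X μ_Y. The joint distribution `joint_pmf` and the helper `expectation` are hypothetical, made up purely for this example.

```python
# Hypothetical joint distribution of two discrete variables X and Y.
# Keys are (x, y) outcomes; values are joint probabilities P(X=x, Y=y).
joint_pmf = {
    (1, 2): 0.25,
    (1, 4): 0.25,
    (3, 2): 0.0,
    (3, 4): 0.5,
}


def expectation(pmf, g):
    """Expected value of g(X, Y) under the joint distribution pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


# Means of X and Y.
mu_x = expectation(joint_pmf, lambda x, y: x)
mu_y = expectation(joint_pmf, lambda x, y: y)

# Covariance from the definition: E[(X - mu_X)(Y - mu_Y)].
cov_definition = expectation(joint_pmf, lambda x, y: (x - mu_x) * (y - mu_y))

# Covariance from the shortcut: E(XY) - mu_X * mu_Y.
cov_shortcut = expectation(joint_pmf, lambda x, y: x * y) - mu_x * mu_y

# Cov(X, X) reduces to Var(X) = E[(X - mu_X)^2].
var_x = expectation(joint_pmf, lambda x, y: (x - mu_x) ** 2)

print(cov_definition, cov_shortcut, var_x)
```

For this made-up distribution, both routes should give a covariance of 0.5, and the variance of X should come out to 1.0.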

Covariance is difficult to interpret because it is scale dependent: a larger covariance does not necessarily mean a stronger relationship. For clearer interpretation, we remove the scale dependency by standardizing the covariance, that is, dividing it by the product of the two standard deviations. The resulting measure is referred to as correlation: $$\mathbf{Corr(X,Y)=\frac{Cov(X,Y)}{σ_X σ_Y}}$$

Correlation indicates whether a linear relationship exists between two variables, and it provides a measure of the strength of that relationship. Points to note:

Correlation is unit‐free.
Corr(X, Y) is always between −1.0 and 1.0, inclusive.
Corr(X, Y) > 0: X and Y are positively correlated. Both variables tend to move in tandem.
Corr(X, Y) < 0: X and Y are negatively correlated. An increase in one variable is associated with a decrease in the other.
Corr(X,Y) = 1.0: perfect positive linear relationship.
Corr(X,Y) = 0: no linear relationship between X and Y.
Corr(X,Y) = −1.0: perfect negative linear relationship.
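The points above can be checked numerically. The sketch below is a minimal illustration on made-up data, assuming population-style formulas (dividing by n rather than n − 1); the function names `covariance` and `correlation` and the data values are hypothetical. It shows that changing the units of X changes the covariance but not the correlation, and that an exact decreasing linear relationship gives a correlation of −1.0.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    # covariance(v, v) is the variance of v, so the square root of the
    # product of the two variances equals sigma_X * sigma_Y.
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up observations of two variables.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

# Same X values expressed in a unit 100 times smaller (e.g. metres to centimetres).
xs_rescaled = [100.0 * x for x in xs]

# Y values that are an exact decreasing linear function of X.
ys_linear = [-2.0 * x + 3.0 for x in xs]

print(covariance(xs, ys), correlation(xs, ys))
print(covariance(xs_rescaled, ys), correlation(xs_rescaled, ys))
print(correlation(xs, ys_linear))
```

With these values, rescaling X multiplies the covariance by 100 (from 1.6 to 160) while the correlation stays at 0.8, which is why correlation is the easier of the two measures to compare across data sets.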
