The choice of variables is crucial to the validity and utility of a regression model.
Omitting a relevant independent variable can severely distort the
findings, particularly when that variable represents a distinct influence yet coincides (correlates)
with some other independent variable.
For example, in Singapore, consumer promotions of some soft drink brands occur mainly
during the Hungry Ghost festival. If the festive seasonality is not incorporated into the regression
model, the discount price elasticity of these brands is greatly exaggerated, because the discount
variable soaks up the impact of the festive season.
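
The festive-season example can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not real sales figures: the variable names, the assumed true effects (0.30 for the discount, 0.50 for the festival) and the promotion frequencies are all hypothetical, chosen only to show how the discount coefficient inflates when the festive indicator is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of store-week observations

# Hypothetical festive indicator: 1 in festive weeks, 0 otherwise
festive = (rng.random(n) < 0.15).astype(float)

# Discounts run mostly, but not only, during festive weeks
discount = np.where(festive == 1,
                    rng.random(n) < 0.8,
                    rng.random(n) < 0.1).astype(float)

# Assumed true effects on log sales (illustrative values only)
TRUE_DISCOUNT, TRUE_FESTIVE = 0.30, 0.50
log_sales = (2.0 + TRUE_DISCOUNT * discount + TRUE_FESTIVE * festive
             + rng.normal(0.0, 0.1, n))

def ols(y, *cols):
    """Ordinary least squares with an intercept; returns the coefficients."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

coef_omitted = ols(log_sales, discount)        # festive season left out
coef_full = ols(log_sales, discount, festive)  # festive season included

print(f"discount effect, festive omitted:  {coef_omitted[1]:.2f}")
print(f"discount effect, festive included: {coef_full[1]:.2f}")
```

In this setup the model that omits the festive indicator attributes part of the festive uplift to the discount, so its discount coefficient comes out well above the assumed true value, while the full model recovers it closely.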
The inclusion of irrelevant variables is also a concern. It reduces the model’s
parsimony and may also mask or replace the effects of more useful variables.
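
One way an irrelevant variable can mask a useful one is by being correlated with it, which inflates the uncertainty around the useful variable's coefficient. The following is a rough sketch on synthetic data under that assumption; the names x_useful and x_irrelevant and all numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# x_useful truly drives y; x_irrelevant has no effect but tracks x_useful closely
x_useful = rng.normal(size=n)
x_irrelevant = x_useful + 0.3 * rng.normal(size=n)
y = 1.0 + 0.5 * x_useful + rng.normal(size=n)

def ols_with_se(y, *cols):
    """OLS with an intercept; returns coefficients and their standard errors."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se

coef_lean, se_lean = ols_with_se(y, x_useful)
coef_padded, se_padded = ols_with_se(y, x_useful, x_irrelevant)

print(f"std. error of useful variable, lean model:   {se_lean[1]:.3f}")
print(f"std. error of useful variable, padded model: {se_padded[1]:.3f}")
```

The padded model's standard error for the useful variable is several times larger, so a real effect can appear statistically weak once the irrelevant variable is added.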
A separate issue pertains to errors in the measurement of variables, such as missing data points.
Where it is necessary to include the data, analysts use a variety of techniques to
clean the data and fill in missing information.
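
As one illustration of such a technique, missing readings in a regularly spaced series can be filled by linear interpolation between neighbouring observations. This is only a sketch of one possible approach, not necessarily the method an analyst would choose in practice, and the weekly sales figures are made up.

```python
import numpy as np

# Hypothetical weekly sales with two missing readings (NaN)
sales = np.array([100.0, 104.0, np.nan, 112.0, np.nan, 120.0])
weeks = np.arange(len(sales))
observed = ~np.isnan(sales)

# Fill each gap by linear interpolation between the nearest observed weeks
filled = sales.copy()
filled[~observed] = np.interp(weeks[~observed], weeks[observed], sales[observed])

# Keep a flag so imputed points can be told apart from real observations
was_imputed = ~observed

print(filled)
print(was_imputed)
```

Keeping the was_imputed flag alongside the filled series lets the analyst check later whether the imputed points materially affect the regression results.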