Data mining is the process of scouring and analysing large datasets to extract patterns from the data. Data mining techniques combine methods from statistics, machine learning and database management to predict behaviours and trends. Data mining allows marketers to make proactive, knowledge-driven decisions. Application areas include:
Tools used for data mining include neural networks, decision trees, association rule learning, rule induction, genetic algorithms, nearest neighbour, cluster analysis, classification, and regression. Some of these tools are described below.
Rule induction is an area of machine learning in which formal rules are extracted from a set of observations. The rules extracted may represent a full scientific model of the data, or merely represent local patterns in the data.
The rules are usually stated as expressions of the form: if (conditions) then (conclusion).
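As a rough illustration of how if-then rules might be extracted from a set of observations, the sketch below uses OneR, a very simple rule induction method that is not named in the text and is chosen here only for brevity. The observations, attribute names and outcome labels are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations (attributes, outcome), invented for illustration.
OBSERVATIONS = [
    ({"age": "young", "member": "yes"}, "buys"),
    ({"age": "young", "member": "no"}, "skips"),
    ({"age": "old", "member": "yes"}, "buys"),
    ({"age": "old", "member": "no"}, "skips"),
    ({"age": "old", "member": "yes"}, "buys"),
]


def one_r(observations):
    """Return (attribute, rules, errors) for the single attribute whose
    value -> majority-outcome rules misclassify the fewest observations."""
    best = None
    for attr in observations[0][0].keys():
        # Count outcomes separately for each value of this attribute.
        counts = defaultdict(Counter)
        for features, outcome in observations:
            counts[features[attr]][outcome] += 1
        # One rule per value: predict the most frequent outcome for that value.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Observations that do not match their value's majority outcome are errors.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


attr, rules, errors = one_r(OBSERVATIONS)
for value, outcome in rules.items():
    print(f"if {attr} = {value} then outcome = {outcome}")
```

On this toy data the rules describe only a local pattern (membership predicts purchase); whether such rules generalise depends on the data they are induced from.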
Association rule learning is a method for discovering interesting relationships, called association rules, among variables in databases. It deploys a range of algorithms to identify strong rules using different measures of “interestingness”. For example, shopping basket analysis of loyalty panel data is used to discover interesting relationships between products, such as {cheese, bread} ⇒ {wine} (i.e. shoppers who buy cheese and bread also tend to buy wine). Information of this nature may be used for merchandising (e.g. special displays) and promotional activities.
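To make the idea of “interestingness” concrete, the sketch below computes support, confidence and lift for the cheese-and-bread-to-wine rule. These three measures are common choices but are an assumption here, since the text does not name specific measures. The baskets are hypothetical, and the function assumes the antecedent and consequent each appear in at least one basket.

```python
# Hypothetical shopping baskets, invented for illustration.
BASKETS = [
    {"cheese", "bread", "wine"},
    {"cheese", "bread", "wine", "olives"},
    {"cheese", "bread"},
    {"bread", "milk"},
    {"wine", "olives"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    both = support(antecedent | consequent, baskets)
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = both / support(antecedent, baskets)
    # Lift: confidence relative to how often the consequent occurs anyway.
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


s, c, l = rule_metrics({"cheese", "bread"}, {"wine"}, BASKETS)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 suggests the products are bought together more often than they would be if purchases were independent, which is one way a rule might qualify as strong.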
Association rule learning is also used in a variety of other applications including web usage mining, intrusion detection, continuous production, and bioinformatics.
Genetic algorithms are optimization techniques based on the concepts of genetic combination, mutation, and natural selection. Potential solutions are encoded as “chromosomes” that can combine and mutate. Survival within a modelled “environment” depends on the fitness, or performance, of each individual chromosome in the population. These “evolutionary” algorithms are well suited to solving nonlinear problems. Examples of applications include speech recognition, robotics, planning and scheduling, and portfolio optimization.
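The combine, mutate and select cycle can be sketched on a toy problem: evolving a bit string with as many 1s as possible. Everything below is an illustrative assumption, including the fitness function, the parameter values, and the use of tournament selection with elitism; it is one minimal variant, not a definitive implementation.

```python
import random

GENES = 20            # chromosome length (hypothetical)
POP_SIZE = 30         # individuals per generation (hypothetical)
GENERATIONS = 40
MUTATION_RATE = 0.02  # per-gene probability of flipping a bit


def fitness(chromosome):
    """Toy fitness: the number of 1s in the chromosome."""
    return sum(chromosome)


def select(population, rng):
    """Tournament selection: the fitter of two random individuals survives."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(mother, father, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, GENES)
    return mother[:point] + father[point:]


def mutate(chromosome, rng):
    """Flip each gene independently with probability MUTATION_RATE."""
    return [1 - g if rng.random() < MUTATION_RATE else g for g in chromosome]


def evolve(seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    history = []  # best fitness at the start of each generation, then the final best
    for _ in range(GENERATIONS):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = []
        for _ in range(POP_SIZE - 1):
            child = crossover(select(population, rng), select(population, rng), rng)
            children.append(mutate(child, rng))
        # Elitism: carry the best individual over unchanged.
        population = [elite] + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print("initial best:", history[0], "final best:", fitness(best))
```

Because the elite individual is copied unchanged into each new generation, the best fitness can never decrease from one generation to the next.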
Classification techniques identify the category to which a new observation belongs, based on a set of variables and a training data set containing observations whose category membership is known. The classification rules are derived from the training data set, and the algorithm is referred to as a classifier. Applications include classifying an email as “spam” or “non-spam”, and predicting customer behaviour in terms of purchasing, consumption, churn and so on.
Because they use training sets, classification techniques are described as supervised learning. Cluster analysis, on the other hand, is unsupervised learning.
Nearest neighbour is a technique that classifies each record in a database based on its similarity to other records: a record is assigned the class of the records most similar to it.
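A minimal sketch of nearest neighbour classification follows, assuming Euclidean distance as the similarity measure and a majority vote among the k = 3 closest training records. The records, variable names and labels are hypothetical, and in practice the variables would usually be scaled so that no single one dominates the distance.

```python
import math
from collections import Counter

# Hypothetical training records: ((monthly_visits, average_spend), known label).
TRAINING = [
    ((1.0, 10.0), "churn"),
    ((2.0, 12.0), "churn"),
    ((1.5, 8.0), "churn"),
    ((8.0, 40.0), "stay"),
    ((9.0, 45.0), "stay"),
    ((7.5, 38.0), "stay"),
]


def knn_classify(point, training, k=3):
    """Label a new record by majority vote among its k closest training records."""
    nearest = sorted(training, key=lambda record: math.dist(point, record[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_classify((2.0, 9.0), TRAINING))   # near the low-visit, low-spend records
print(knn_classify((8.5, 42.0), TRAINING))  # near the high-visit, high-spend records
```

Since the labels of the training records are known in advance, this is an instance of supervised learning.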
Cluster analysis is a statistical technique used to group objects with similar characteristics into clusters (segments). In cluster analysis the variables used for clustering are known in advance. Refer to the chapter Segmentation for details on the application of cluster analysis for market segmentation.
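As one possible illustration, the sketch below groups points with k-means, a common clustering algorithm that the text does not name and that is assumed here for simplicity. The data points, the choice of two clusters, and the starting centroids are all hypothetical.

```python
import math

# Hypothetical objects described by two variables, invented for illustration.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Starting centroids are picked by hand here; real implementations
# typically choose them randomly or with a seeding heuristic.
centroids, clusters = k_means(POINTS, centroids=[POINTS[0], POINTS[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are supplied: the groups emerge from the similarity of the objects alone, which is what makes the approach unsupervised.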