Market Segmentation
Pavel Brusilovsky
Objectives
There appear to be more algorithms for clustering data than data to analyze
•Cluster analysis is by nature data exploration, conducted iteratively.
Clustering is not a single grouping but a process of arriving at well-interpretable groups of
the objects under consideration.
•Supervised classification, for example, Discriminant Analysis, Naïve Bayes Classifier, Support
Vector Machines, etc.
–Have class label information
•Simple segmentation
–Doctors’ segmentation by specialty, assuming that each doctor’s specialty is known
–Customer segmentation by sex, education level, geography and response rate (assuming that
these customer attributes are known)
•Results of a query (groupings are the outcome of an external specification)
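Simple segmentation by a known attribute, as in the doctors example above, amounts to a plain grouping operation: the groups are fixed by the attribute values and nothing is learned from the data. A minimal sketch, assuming hypothetical records with `name` and `specialty` fields (these names and values are illustrative, not from the source):

```python
from collections import defaultdict

def segment_by(records, key):
    """Group record names by the value of a known attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["name"])
    return dict(groups)

# Hypothetical data: each doctor's specialty is already known.
doctors = [
    {"name": "Dr. A", "specialty": "cardiology"},
    {"name": "Dr. B", "specialty": "oncology"},
    {"name": "Dr. C", "specialty": "cardiology"},
]

segments = segment_by(doctors, "specialty")
print(segments)
```

The segments here come from an external specification (the `specialty` field), which is what separates this from cluster analysis.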
Supervised vs. Unsupervised
Cluster analysis is a product of at least two different quantitative fields: statistics and machine
learning
Machine learning
- Unsupervised learning is learning from raw data (no examples of correct classification). In other
words, class label information is unavailable.
- No measure of success
- Heuristic arguments for judgments
- Lots of methods developed
- Supervised learning is learning from data where the correct classification of examples is given
(class label information is available)
Market segmentation
–The better the segments chosen for targeting by a particular organization, the more successful
the organization is in the marketplace. The objectives are to accurately predict the needs of
customers and to improve profitability.
•Demographics
–Age
–Gender
–Education
–Income
–Home ownership, etc.
•Psychographics
–Lifestyle
–Attitude
–Beliefs
–Personality
–Buying motives, etc.
•Brand Loyalty
•Geography
–State
–ZIP
–City size
–Rural vs. Urban, etc.
•Help marketers discover distinct groups in their customer bases, and then use this knowledge to
develop targeted marketing programs
•The underlying definition of cluster analysis procedures mimics the goals of market segmentation:
–to identify groups of respondents that minimize differences among members of the same group
•highly internally homogeneous groups
–while maximizing differences between different groups
•highly externally heterogeneous groups
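One common way to quantify internal homogeneity and external heterogeneity is with within-cluster and between-cluster sums of squares. The sketch below is an illustration under that assumption (the source does not name a specific criterion), using made-up two-dimensional points:

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def within_ss(clusters):
    """Sum of squared distances of points to their own cluster centroid."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)

def between_ss(clusters):
    """Size-weighted squared distances of cluster centroids to the grand centroid."""
    grand = centroid([p for c in clusters for p in c])
    return sum(len(c) * sq_dist(centroid(c), grand) for c in clusters)

# The same four illustrative points, grouped two different ways.
good = [[(1.0, 1.0), (1.0, 2.0)], [(8.0, 8.0), (9.0, 8.0)]]
bad = [[(1.0, 1.0), (8.0, 8.0)], [(1.0, 2.0), (9.0, 8.0)]]

for name, grouping in (("good", good), ("bad", bad)):
    print(name, within_ss(grouping), between_ss(grouping))
```

For a fixed data set the two quantities add up to the same total, so a grouping with a smaller within-cluster sum necessarily has a larger between-cluster sum.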
•Identifiability
–Can we see clear differences between segments?
•Substantiality
–Are the segments large enough to warrant separate marketing targeting?
•Accessibility
–Can we reach our customers?
•Stability
–Are our segments stable over a certain period of time?
•Responsiveness
–Is the response to our marketing effort segment specific?
•Actionability
–Does the segmentation provide direction for marketing efforts?
Types of Clustering
•Partitional clustering
–A division of objects into non-overlapping subsets (clusters) such that each object is in exactly
one cluster
•Hierarchical clustering
–A set of nested clusters organized as a hierarchical tree
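The two types can be contrasted with a small sketch. It uses k-means as one example of a partitional method and single-linkage agglomeration as one example of a hierarchical method. Both choices, the naive initialization, and the sample points are assumptions made for illustration, not methods named in the source:

```python
def sq_dist(a, b):
    """Squared Euclidean distance (preserves the ordering of true distances)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, k, iters=20):
    """Partitional: returns exactly one cluster label per point."""
    centers = list(points[:k])  # assumption: naive deterministic initialization
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean(members)
    return labels

def single_linkage(points):
    """Hierarchical: returns the sequence of merged clusters, bottom-up."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(sq_dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] | clusters[b]
        merges.append(merged)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return merges

points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.9), (8.3, 7.9), (0.9, 1.1)]
labels = kmeans(points, k=2)      # one label per point: a flat partition
merges = single_linkage(points)   # nested sets of point indices: a tree
print(labels)
print([sorted(m) for m in merges])
```

The k-means result is a flat assignment, while each single-linkage merge contains the earlier merges it was built from, which is what makes the result a tree.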
•Optimal number of clusters is determined as a result of LCCA, using rigorous statistical tests
•No decisions have to be made about the scaling of the observed variables
•Variables may be continuous, nominal, ordinal, count, or any combination of these
- Is clustering a theory?
- A theory could be true or false
Unlike a theory, a clustering is neither true nor false, and should be judged largely on the
interpretability and usefulness of results
- No measure of success
However, a clustering may be useful for suggesting a theory, which could then be tested
References
- Leonard Kaufman and Peter Rousseeuw (2005), Finding Groups in Data: An Introduction to
Cluster Analysis, Wiley Series in Probability and Statistics, 337 p.
- Mark Aldenderfer and Roger Blashfield (1984), Cluster Analysis (Quantitative Applications in the
Social Sciences), SAGE Publications, Inc., 90 p.
- Brian Everitt, Sabine Landau and Morven Leese (2001) Cluster Analysis, Oxford University
Press, 248 p.
- Marketing Segmentation (http://www.beckmanmarketing8e.nelson.com/ppt/chapter03.pps)
•Producer and distributor of health and beauty products launched a new product. The product can
be ordered only on the website.
•Six months later, an internet survey was conducted. Only three simple questions were asked:
–How many adults are in your household?
–How many of them adopted the product?
–How many of them did not adopt the product?
•When the total number of adopters and non-adopters is less than the number of adults in a
household, the difference is treated as the number of unknowns. There are other situations
in which it makes sense to introduce a number of unknowns.
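The unknown count described above can be derived directly from the three survey answers. The sketch below also converts the counts to household-level proportions, on the assumption that this is the kind of "proportional data" the study clusters. The function name and the handling of inconsistent answers are hypothetical:

```python
def household_proportions(adults, adopters, non_adopters):
    """Return (adopter, non-adopter, unknown) shares of a household's adults.

    Returns None when the answers are inconsistent (no adults, or more
    adopters plus non-adopters than adults). How such cases were actually
    treated is not stated in the source, so this is only a placeholder.
    """
    unknown = adults - adopters - non_adopters
    if adults <= 0 or unknown < 0:
        return None
    return (adopters / adults, non_adopters / adults, unknown / adults)

# A household of 4 adults reporting 1 adopter and 2 non-adopters
# leaves 1 adult unaccounted for.
print(household_proportions(4, 1, 2))
```

Each valid household maps to three shares that sum to one, so households of different sizes become comparable.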
•The client asked us to analyze the survey data (obviously, it is not the most
informative survey BI Solutions has dealt with).
•The objective of the study was to extract as much useful information as possible from the
survey data in order to understand the distribution and usage of the product among
households, associate with each household a corresponding likelihood of adoption, and develop
a methodology for using this information in marketing programs.
•Methodology: synergy of cluster analysis of proportional data and intuitive
segmentation.
Clustering of households
Next steps
•Customer profiling
–Data enrichment
•Data enrichment (ZIP level census data)
•Estimation of the likelihood of product adoption by predictive data mining analysis / scoring
households with unknown purchasing behaviour
•Identifying customers with a high likelihood of product adoption for targeting