You’ve already conducted surveys and collected data about your customers’ preferences (see the ‘Smart Surveys’ post). At first sight, the data looks scattered, and each survey response differs from the others in some way. You now face a new challenge: to find groups of customers with similar preferences for certain product attributes, and to define your market segments.

Market segmentation is one of the tools to succeed in the market and match your competitors, who have also started to recover from the crisis. Luckily, using analytics, you can identify the groups of customers that desire your unique product or service. All markets have segments. Even commodity products, like laundry detergent, have segments: travelers prefer small packages for portability, while people with budget constraints prefer larger sizes to benefit from cheaper per-ounce prices. Identifying your market segment helps you focus your core competencies on the relevant markets and use your company resources more efficiently.

You can do market segmentation in various ways. I’ll describe the most exciting method, which can help you in a way that no other method can. I’m talking about clustering. The greatest thing about clustering is that you can discover market segments without knowing anything about your market or any fine points of the business. In clustering, all you need is a database, such as survey results that someone has already collected, or even unsorted records from a supermarket cash register. In this so-called ‘post-hoc’ market segmentation, you start with little knowledge about the type and number of segments in a particular market. The post-hoc approach determines the segments after the research has been conducted and the data collected. For example, a new product or service could require post-hoc segmentation techniques, because there has been no earlier consumer experience of the product.

There are two main families of clustering methods: hierarchical and partitioning. In addition, hierarchical clustering can be agglomerative or divisive. Think of agglomerative clustering as sticking together pieces of baker’s dough. Divisive clustering is tearing the dough apart. The baker’s dough itself is a metaphor for the database. Whether divisive or agglomerative, you have to decide what minimal cluster size is acceptable to you. Usually the answer comes by itself when you see the first results.

Partitioning clustering is simpler in this sense. In partitioning, you have to specify the number of clusters you want to find in the data. Much like in hierarchical clustering, you can change the number of clusters as many times as you like, until the cluster sizes satisfy you.

There are plenty of clustering methods, and they differ in speed and accuracy. Since data volumes have grown with the development of computers, new methods keep appearing on the scene. For example, a hierarchical clustering method called Ward’s has been known for a long time, but for basic marketing analysis IBM promoted its own hierarchical method, called TwoStep. Overall, hierarchical methods are slower than partitioning methods, except for TwoStep, which is implemented in IBM’s SPSS software. The TwoStep clustering method can create clusters from a single data scan. The method has certain limitations (for example, it can be sensitive to the order of cases in the data), but it is very easy to use.

Ward’s method is a popular example of agglomerative hierarchical clustering. The result of the clustering is a hierarchical structure similar to a family tree, called a dendrogram. In dendrograms, numerous tree branches merge into fewer and thicker branches until the last two merge to form a tree root. In Ward’s, we start with individual elements and merge them together into clusters. During the merging process, our goal is to lose as little information as possible. For example, if we group two individuals into a cluster, the cluster will not describe either of them as precisely as each individual’s own data does. The information lost during clustering is sometimes referred to as the merging cost. To keep the merging cost low, at each step Ward’s method merges the pair of clusters that causes the smallest increase in the error sum of squares.
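To make the merging cost concrete, here is a minimal sketch in plain Python. It assumes each respondent is a tuple of numbers, and the function names `ward_cost` and `ward_clustering` are made up for this illustration. It uses the standard expression for the increase in the error sum of squares when two clusters are merged: the product of their sizes, divided by the combined size, times the squared distance between their centers. The search over pairs is deliberately naive, so treat it as a teaching aid rather than something to run on a large database.

```python
def ward_cost(a, b):
    """Increase in the error sum of squares if clusters a and b are merged.

    Each cluster is a list of equal-length tuples of numbers.
    """
    center_a = [sum(col) / len(a) for col in zip(*a)]
    center_b = [sum(col) / len(b) for col in zip(*b)]
    sq_dist = sum((p - q) ** 2 for p, q in zip(center_a, center_b))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(points, n_clusters):
    """Naive agglomerative clustering: keep merging the cheapest pair."""
    clusters = [[p] for p in points]  # start with one cluster per respondent
    history = []                      # merging cost paid at each step
    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        # Pick the pair whose merge loses the least information.
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        history.append(ward_cost(clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, history


# Made-up data: two respondents who like one thing, two who like another.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8)]
clusters, history = ward_clustering(points, n_clusters=2)
print(clusters)
print(history)
```

The `history` list is what a dendrogram draws as branch heights: a sudden jump in the merging cost suggests that the last merge glued together two groups that were better left apart.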

The most popular partitioning clustering method is K-means. In K-means, you simply specify K, the number of final clusters you expect, and run the algorithm. The algorithm consists of two steps, which repeat until a stable solution is reached. This is the moment when individual survey respondents settle into specific groups and cease to change groups. First, K-means determines the so-called ‘centroid coordinates’ that define the center of each segment as the mean of its members’ coordinates, similar to finding a center of mass in physics. The algorithm then calculates the distance from each individual object to every centroid, and assigns the object to the group with the nearest centroid. When the clusters, or market segments, form, it is up to you to accept the segmentation results or repeat the analysis using other parameters. There’s no silver bullet here.
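The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `kmeans`, the random choice of starting centroids, and the sample numbers are all assumptions made for this example, and each respondent is assumed to be a tuple of numbers.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length tuples of numbers."""
    rng = random.Random(seed)
    # Assumption: start from k respondents picked at random as centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each respondent joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stable solution: nobody changed groups since the last pass.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up respondents described by two scores each.
points = [(0.9, 0.1), (0.8, 0.2), (1.0, 0.15), (0.1, 0.9), (0.2, 0.8), (0.15, 1.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Because the starting centroids are random, different seeds can give different segments on messier data. Running the algorithm several times and comparing the results is a cheap sanity check.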

Imagine you decided to be in the coffee-machine business. You conducted a market survey and did some conjoint analysis. You ended up with a database full of part-worths from individual survey respondents. Now is the time to apply clustering and to find your market segments. You designed the survey to reveal the preferences of females and males who work at home or in an office, weighted against the price, speed, and size of the coffee machine. The clustering algorithm, applied to part-worths (or willingness to pay), found two distinct sets of respondents: females working in an office, with a high willingness to pay for the speed of the coffee machine and low part-worths for the price itself; and males working from home, with high part-worths for the machine volume and a low willingness to pay for an expensive product. Using this knowledge, you can identify your market segments: fast, premium coffee machines for women at work, and high-capacity, budget machines for men working at home.
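Once every respondent carries a segment label, describing a segment is a matter of averaging part-worths within it. The sketch below assumes the labels already exist; the attribute names and all the numbers are invented purely for illustration and do not come from any real survey, and `segment_profiles` is a hypothetical helper name.

```python
from collections import defaultdict


def segment_profiles(rows):
    """Average each part-worth within each segment.

    rows is a list of (segment_label, {attribute: part_worth}) pairs.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for label, worths in rows:
        counts[label] += 1
        for attr, value in worths.items():
            sums[label][attr] += value
    return {
        label: {attr: total / counts[label] for attr, total in attrs.items()}
        for label, attrs in sums.items()
    }


# Invented part-worths for four respondents in two segments.
respondents = [
    (0, {"speed": 0.9, "capacity": 0.2, "low_price": 0.1}),
    (0, {"speed": 0.8, "capacity": 0.3, "low_price": 0.2}),
    (1, {"speed": 0.2, "capacity": 0.9, "low_price": 0.8}),
    (1, {"speed": 0.1, "capacity": 0.8, "low_price": 0.9}),
]
for label, profile in segment_profiles(respondents).items():
    print(label, profile)
```

A segment whose average part-worth for speed is high and for low price is low reads as the ‘fast and premium’ group; the reverse pattern reads as the ‘high-capacity, budget’ group.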

Always collect as much data as possible. Collect, record, store and analyze. Your efforts will pay off.