Telecom Case Study – Customer Segmentation
For the last few articles we have been working on a telecom case study to create customer segments (Part 1, Part 2 and Part 3). In this case study, you are the head of customer insights and marketing at a telecom company, ConnectFast Inc. Recall that in the first part you created cluster centroids through iterative calculation of Euclidean distances. The objective of the iterations was to move each centroid to the center of its cluster members (see Part 1). Have a look at the animation below (you have seen this data in Part 1 of this series): with each iteration the Sum of Squared Errors (SSE) reduces. For the time being, don’t worry about the calculation of SSE but try to understand its purpose. In the animation, we started with an SSE of 29 for the original random seeds and converged to a stabilized SSE of 7 at the final iteration (further iterations won’t change the SSE or the positions of the cluster centroids). This is precisely the objective: iteratively reduce SSE until it stabilizes, and voila! You have found your cluster centroids / black holes (see Part 1). As discussed in the previous article, most machine-learning algorithms try to converge iteratively to an optimal solution; for cluster analysis, the idea is to minimize SSE. I hope you have noticed that this is somewhat similar to Archimedes’ method of converging to the value of π (discussed in the previous article).
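For readers who want to peek at the calculation anyway, here is a minimal sketch of the iteration described above. It assumes SSE is the sum of squared Euclidean distances from each point to its assigned centroid. The toy points, the two starting seeds and all function names are made up for illustration; they are not the data behind the animation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]

def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if it has none)."""
    moved = []
    for j, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            moved.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            moved.append(old)
    return moved

def kmeans(points, seeds, max_iter=100):
    """Iterate assign/update until the centroids stop moving; record SSE per pass."""
    centroids = list(seeds)
    history = []
    for _ in range(max_iter):
        labels = assign(points, centroids)
        history.append(sse(points, centroids, labels))
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # stabilized: nothing moved
            break
        centroids = new_centroids
    return centroids, labels, history

# Hypothetical toy data: two obvious groups and two deliberately poor seeds.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
seeds = [(1, 1), (2, 1)]
centroids, labels, history = kmeans(points, seeds)
print(history)
```

On this toy data the recorded SSE falls from 284 to about 2.67 over three passes and then stops changing, which is the same pattern as the 29-to-7 convergence in the animation.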
Coming back to the case study, you are at the final stage of the customer segmentation exercise: forming clusters based on customers’ service usage behavior. As a telecom company, ConnectFast offers several services on top of its existing cellphone plans (with prepaid and postpaid billing). Some of them are listed below:
- National/international calling
- National/international roaming
- 2G/3G/4G internet plans
- National/international data roaming
Before moving further, let us try to build some intuition for customer segmentation using cluster analysis. For simplicity, let us consider just 3 different services (i.e. variables: international roaming, national roaming, and 3G) with 4 levels each (i.e. attributes: non-usage, low, medium and heavy usage). This is displayed in the adjacent figure. Theoretically, a maximum of 4^3 = 64 clusters could be formed. However, our analysis for customer segmentation has generated just 4 clusters (displayed as orange customer segments). Let us pause and think about this for a while: there are 64 different locations where customers could be found based on their service usage behavior, yet most customers are concentrated in the 4 clusters detected through cluster analysis. I hope you can see some relationship with the universe and galaxies (discussed in Part 1) here: the mass is concentrated in limited areas, and the majority is empty space.
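The count of possible locations is simply the number of levels raised to the number of services. A small sketch that enumerates the cells for the three-service example follows; the names `LEVELS`, `count_cells` and `three_service_cells` are made up for illustration.

```python
from itertools import product

# The four usage levels named in the text.
LEVELS = ("non-usage", "low", "medium", "heavy")

def count_cells(n_services, n_levels=len(LEVELS)):
    """Number of distinct usage profiles: n_levels to the power n_services."""
    return n_levels ** n_services

# Every combination of levels across three services, e.g. ("low", "heavy", "non-usage").
three_service_cells = list(product(LEVELS, repeat=3))

print(len(three_service_cells))  # 64
print(count_cells(3))            # 64
```

The same function works for any number of services, which shows how quickly the space of possible profiles grows as variables are added.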
For the ten variables that you will be using in your analysis for ConnectFast, with 4 levels each, you could have a little more than one million clusters, i.e. 4^10 = 1,048,576. One of the biggest challenges with cluster analysis, as also discussed in a previous article, is choosing the right number of clusters before the analysis. That is, you need to fix the number of clusters you are going to form before you run your cluster analysis through the K-means algorithm (K is the number of clusters you want to form, or equivalently the number of initial cluster seeds you provide to the algorithm). The best answer to this challenge is a mix of analytical methods and business acumen to arrive at the initial number of cluster seeds. Business acumen is something you build over time by developing an intuitive feel for the business. In the next segment, let us focus on the analytical procedure to form optimal customer segments.
Optimizing the Number of Clusters
One useful analytical method for choosing the optimum value of K is to plot the stabilized SSE against the number of cluster seeds (i.e. different values of K in K-means clustering). An illustration of this is shown in the adjacent figure (Graph A). This is the graph you obtained from your own analysis of ConnectFast’s data with 10 variables. On a technical note, an outer loop that performs cluster analysis with incrementally larger values of K generates the values for this kind of plot. You have plotted the number of cluster seeds on the horizontal axis and the minimum (stabilized) SSE on the vertical axis. There is a significant drop in SSE as you move from 9 initial cluster seeds to 10. Your business sense also justifies the presence of 10 customer segments, hence you have decided to stick with that number.
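The outer loop mentioned above can be sketched as follows. This is a minimal pure-Python illustration, not the tooling used for the ConnectFast analysis. It assumes SSE is the sum of squared Euclidean distances to the assigned centroid, and it keeps the best of several random restarts for each K, because a single random start can get stuck in a poor solution. The toy data (three well-separated blobs) and the names `stabilized_sse` and `sse_curve` are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def stabilized_sse(points, k, rng, max_iter=100):
    """Run K-means from k random seeds; return SSE once assignments stop changing."""
    centroids = rng.sample(points, k)  # k distinct data points as initial seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # stabilized
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))

def sse_curve(points, k_values, restarts=25, seed=0):
    """Outer loop: for each K, keep the lowest stabilized SSE over several restarts."""
    rng = random.Random(seed)
    return {
        k: min(stabilized_sse(points, k, rng) for _ in range(restarts))
        for k in k_values
    }

# Hypothetical toy data: three well-separated blobs of five points each.
offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
blob_centers = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

curve = sse_curve(points, k_values=range(1, 6), restarts=50)
for k, value in curve.items():
    print(k, round(value, 2))
```

Plotting `curve` with K on the horizontal axis and SSE on the vertical axis gives the kind of graph described above. With data this cleanly separated, the curve should drop steeply until K matches the number of blobs and flatten afterwards; real customer data is rarely this obliging.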
You are feeling good, but you also know you were lucky with the output plot. The clues from plotting SSE against the number of clusters may not have been as clear, i.e. you might have got a smooth line as shown in the adjacent plot (Graph B). In that case you may have had to rely completely on your business sense. Now you are left with one final task for customer segmentation: naming these clusters based on their attributes.
You have completed the task of naming the 10 customer segments. The customer segments are arranged in descending order of value to the company. The following are the highest and lowest value customer segments:
1) Affluent corporate – very high spenders, have more than 4 services activated, high-usage on most services, predominantly senior management in large corporates, frequent foreign/domestic travelers, and high profit to the company
10) Stingy prepaids – low spenders, barely use one service, run their phone on minimum prepaid amount, mostly enjoy free incoming calls, and high cost to the company
After naming the customer segments, you have performed some quick analysis on some of the company’s key performance indicators (KPIs). In your analysis you have found some crucial information that you will share with the CEO and the COO of the company to redefine the company’s business strategy.
An Application of Customer Segmentation
For the last few years there has been a special emphasis on customer attrition, or churn rate, a concern for the industry since telecom regulators implemented number portability. When the chief operating officer (COO) joined the company four years ago, keeping a close check on the churn rate was made a major part of his responsibility. There is constant communication to product managers in the field to keep a close tab on customer churn. On the surface, their effort is certainly showing a positive influence, as the churn rate is gradually decreasing (shown in the adjoining figure). However, if we analyze the churn rate across different segments we get a completely different picture.
Let us have a look at the customer churn rate across the two segments with the highest and lowest value to the company. Shockingly, the churn rate for ‘Affluent corporate’ (the highest value segment) is steadily increasing at a worrying pace. On the other hand, ‘Stingy prepaids’ are enjoying the hospitality of the company and showing a steady decline in their 18-month vintage churn rate. Because the overall churn rate blends all segments, the decline in one segment masks the rise in the other. This is certainly strong evidence for the management to modify its business strategy and focus on the right business metrics.
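To see how an overall rate can hide a segment-level problem, here is a minimal sketch with two segments over three periods. Every figure in it is invented purely for illustration and is not ConnectFast data; the structure (customers at the start of a period, customers lost during it) and the function names are also assumptions.

```python
# Invented figures for illustration only: (customers at start, customers churned)
# for three consecutive periods in each segment.
cohorts = {
    "Affluent corporate": [(1000, 20), (1000, 30), (1000, 40)],
    "Stingy prepaids": [(9000, 900), (9000, 720), (9000, 540)],
}

def segment_rates(cohorts):
    """Churn rate per segment and period: churned / customers at start."""
    return {
        segment: [churned / customers for customers, churned in periods]
        for segment, periods in cohorts.items()
    }

def overall_rates(cohorts):
    """Churn rate per period with all segments pooled together."""
    n_periods = len(next(iter(cohorts.values())))
    rates = []
    for t in range(n_periods):
        customers = sum(periods[t][0] for periods in cohorts.values())
        churned = sum(periods[t][1] for periods in cohorts.values())
        rates.append(churned / customers)
    return rates

by_segment = segment_rates(cohorts)
overall = overall_rates(cohorts)

for segment, rates in by_segment.items():
    print(segment, [f"{r:.1%}" for r in rates])
print("Overall", [f"{r:.1%}" for r in overall])
```

In this made-up example the pooled rate falls every period because the large low-value segment dominates the total, while the small high-value segment's churn doubles. Only the per-segment view reveals it.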
What we have seen in the above case study is not very unusual. In dynamic customer portfolios, where customers move in and out at high velocity and volume, such signals of portfolio deterioration often go unnoticed until it is too late. Creating static frames or cohorts, along with customer segmentation, is a very helpful analytical tool for keeping a close tab on building a healthy customer base. See you soon with a new topic.