{"id":9519,"date":"2017-01-08T14:54:07","date_gmt":"2017-01-08T09:24:07","guid":{"rendered":"http:\/\/ucanalytics.com\/blogs\/?p=9519"},"modified":"2017-01-28T14:19:00","modified_gmt":"2017-01-28T08:49:00","slug":"cluster-analysis-learn-by-doing-analytics-challenge-part-1","status":"publish","type":"post","link":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/","title":{"rendered":"Cluster Analysis Puzzle &#8211; Learn by Doing! (Part 1)"},"content":{"rendered":"<div id=\"attachment_9520\" style=\"width: 315px\" class=\"wp-caption alignright\"><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg\"><img aria-describedby=\"caption-attachment-9520\" data-attachment-id=\"9520\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/twins-and-cluster-analysis\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?fit=427%2C598&amp;ssl=1\" data-orig-size=\"427,598\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Twins and Cluster Analysis\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?fit=214%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?fit=427%2C598&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" 
class=\"wp-image-9520\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?resize=305%2C427\" width=\"305\" height=\"427\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?w=427&amp;ssl=1 427w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?resize=179%2C250&amp;ssl=1 179w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?resize=214%2C300&amp;ssl=1 214w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis.jpg?resize=21%2C30&amp;ssl=1 21w\" sizes=\"(max-width: 305px) 100vw, 305px\" data-recalc-dims=\"1\" \/><\/a><p id=\"caption-attachment-9520\" class=\"wp-caption-text\">Twins and Cluster Analysis &#8211; by Roopam<\/p><\/div>\n<hr \/>\n<p>Cluster analysis is a powerful analytical technique for grouping or segmenting similar elements such as customers or products. In this series of articles, you will explore the nuances of cluster analysis and its applications. Analytics challenges on YOU CANalytics are designed like puzzles, where your participation is extremely important to move things forward. Hence, please share your thoughts and answers in the discussion section at the bottom.<\/p>\n<p>Cluster analysis has several business applications where it plays a pivotal role:<\/p>\n<p><strong>&#8211; Lifestyle or psychographic segments for marketing:<\/strong> grouping customers into clusters based on their interests and belief systems. This, in turn, helps marketers offer the right product to the right customer.<\/p>\n<p><strong>&#8211; Product grouping<\/strong>: grouping products into relevant\u00a0categories based on product attributes. For instance, grouping\u00a0movies into genres such as 
action, rom-com, horror etc.<\/p>\n<p><strong>&#8211; News\/content categorization<\/strong>: identifying categories for media content through text mining, and organizing the content accordingly. Google does this extensively to show you the right content and news.<\/p>\n<p>Cluster analysis is an unsupervised analytical technique. Unsupervised here means that there is no labeled target variable to learn from; it does not mean that the technique runs on its own without any supervision. On the contrary, it requires\u00a0analysts to have an extremely good understanding of the business context and problem, which is essential for choosing the right set of input variables for segmentation. This demands a greater degree of creativity and awareness from the analysts. Let us explore how the initial choice of input variables can be critical for cluster analysis by creating a link between&#8230;<\/p>\n<h2><span style=\"color: #3366ff;\">Twins Paradox and Cluster Analysis<\/span><\/h2>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg\"><img data-attachment-id=\"9542\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/ram_aur_shyam_poster_18458\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?fit=300%2C324&amp;ssl=1\" data-orig-size=\"300,324\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Ram_Aur_Shyam_poster_18458\" data-image-description=\"\" data-image-caption=\"\" 
data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?fit=278%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?fit=300%2C324&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-9542 alignright\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?resize=264%2C285\" alt=\"\" width=\"264\" height=\"285\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?w=300&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?resize=231%2C250&amp;ssl=1 231w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?resize=278%2C300&amp;ssl=1 278w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Ram_Aur_Shyam_poster_18458.jpg?resize=28%2C30&amp;ssl=1 28w\" sizes=\"(max-width: 264px) 100vw, 264px\" data-recalc-dims=\"1\" \/><\/a>The quintessential Bollywood script of the 1970s and\u00a080s involved identical twins who were separated at birth and reunited as adults. The twins got completely different upbringings, and the drama before their reunion entertained audiences without fail. The lost-and-found siblings thus became the most trusted formula in the history of Bollywood cinema. By the way, identical twins\u00a0are the closest example of genetic similarity found in nature. While Bollywood was delivering hit after hit using the formula, a researcher on the other side of the globe was studying identical twins separated\u00a0at birth.<\/p>\n<p>In his study, Thomas Bouchard Jr. 
of the University of Minnesota analyzed identical twins adopted by different families.\u00a0The aim was to identify the roles of nature (genetics) and nurture (upbringing) in shaping personality.\u00a0Most of these twins did not meet again after their separation at birth until they were fully grown adults.<\/p>\n<h4><span style=\"color: #3366ff;\">Results from the Twins Study<\/span><\/h4>\n<p>There are several interesting stories and patterns in this research. In one instance, twin brothers James Lewis and James Springer were living strikingly similar lives, oblivious to each other&#8217;s existence. Both married and divorced first wives named Linda, then married second wives named Betty. They both named their childhood pets Toy. Can you guess the names of their firstborn sons? James Alan Lewis and James Allan Springer. In another instance, a pair of twins, Oskar and\u00a0Jack, were raised by families as dissimilar as chalk and cheese. Oskar was raised as a Nazi youth in Germany and Jack as a Jewish boy in the Caribbean. Despite this, on the day of their reunion as adults they unknowingly showed up wearing identical clothes. 
Apparently, the brothers had independently developed a similar taste in fashion.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg\"><img data-attachment-id=\"9597\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/twins\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?fit=345%2C468&amp;ssl=1\" data-orig-size=\"345,468\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"twins\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?fit=221%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?fit=345%2C468&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-9597 alignright\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?resize=268%2C364\" alt=\"\" width=\"268\" height=\"364\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?w=345&amp;ssl=1 345w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?resize=184%2C250&amp;ssl=1 184w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?resize=221%2C300&amp;ssl=1 221w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/twins.jpg?resize=22%2C30&amp;ssl=1 22w\" sizes=\"(max-width: 268px) 
100vw, 268px\" data-recalc-dims=\"1\" \/><\/a>These stories, however, do not prove that an identical twin is always a mirror image of their co-twin in terms of personality. There are plenty of instances where twins have completely different personalities. In his study, Thomas Bouchard was interested in understanding the role of proximity or upbringing in twins developing a similar personality.\u00a0In a much wider analysis of a large number of twins, he found counter-intuitive results. The results suggest that proximity or upbringing plays little or no role in the development of personality or interests. A\u00a0twin is about as likely to have a personality similar to their co-twin whether the two were brought up together or apart. The most significant factor in developing a personality, according to the twins study, is nature (genes) and not nurture (upbringing).<\/p>\n<h4><span style=\"color: #3366ff;\">How is this Related to Cluster Analysis?<\/span><\/h4>\n<p>So, how is this related to the choice of variables in cluster analysis? At this point, I must remind you that cluster analysis, unlike supervised machine learning methods, does not identify a small set of significant variables from the large list of input variables. Instead, it segments the population based on all the input variables. This is why analysts need to be careful to choose input variables that suit the problem at hand. Let me try to explain this using the results from the twins study.<\/p>\n<p>Imagine that, based on the twins study, we have 200 different pairs of twins. 100 of these pairs were brought up together and 100 were brought up apart by different adoptive parents. Also assume that 30% of the pairs\u00a0in\u00a0each of these groups share a similar personality. Now, if you cluster these pairs of twins based on either proximity variables (e.g. shared houses, schools, parenting etc.) 
or personality variables (e.g. shared interests, attitudes, beliefs), you will get\u00a0completely different sets of clusters. Neither set of clusters is wrong; each is appropriate for a different problem. Cluster analysis is extensively used for customer segmentation and profiling. Customer segmentation is not very different from clustering the twins in our example: the choice of input variables determines the customer segments, just as the choice of proximity or personality variables determines the clusters of twins.<\/p>\n<p>Now, let&#8217;s dive into the cluster analysis puzzle:<\/p>\n<h2><span style=\"color: #3366ff;\">Analytics Challenge &#8211; Cluster Analysis<\/span><\/h2>\n<p>In this analytics challenge, we will use the k-means clustering algorithm. We have discussed the k-means algorithm in detail in this\u00a0<strong><a href=\"http:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/telecom-case-study-example\/\">cluster analysis case study for telecom<\/a><\/strong>. You may want to read this case study to brush up on the concepts behind k-means clustering.<\/p>\n<p>First things first: for this cluster analysis puzzle we will simulate some data with 2 input variables (x and y). 
We will use R to do all our calculations.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Simulate three groups of 30 points centred near (0,0), (10,10) and (20,20)\r\nset.seed(57)\r\nx = c(rnorm(30,0,1),rnorm(30,10,1),rnorm(30,20,1))\r\ny = c(rnorm(30,0,1),rnorm(30,10,1),rnorm(30,20,1))\r\na = as.data.frame(cbind(x,y))\r\nplot(a)\r\n# Only the data frame a is needed from here on\r\nrm(x,y)<\/pre>\n<p>This dataset, as you can see in the plot, has three well-separated clusters.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1.jpeg\"><img data-attachment-id=\"9578\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/cluster-analysis-challenge-1\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?fit=700%2C447&amp;ssl=1\" data-orig-size=\"700,447\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Cluster analysis challenge 1\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?fit=300%2C192&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?fit=640%2C409&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-9578 size-full\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?resize=640%2C409\" 
width=\"640\" height=\"409\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?w=700&amp;ssl=1 700w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?resize=250%2C160&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?resize=300%2C192&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Cluster-analysis-challenge-1-e1483773216799.jpeg?resize=30%2C19&amp;ssl=1 30w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Now, let&#8217;s see how the k-means algorithm performs on this data. We will set the number of clusters to 3, i.e. k=3. For this problem, the choice is easy since we can clearly see 3 distinct clusters. However, don&#8217;t expect the choice of k to be so simple for most datasets.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nset.seed(42)\r\nkmean = kmeans(a,3)\r\nplot(a, col=kmean$cluster,pch=16)\r\n# col=1:3 uses the same palette indices as the points, so legend colours match the clusters\r\nlegend(-3,23,c('cluster 1','cluster 2','cluster 3'),pch= 16,col=1:3)\r\n<\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2.jpeg\"><img data-attachment-id=\"9579\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/cluster-analysis-challenge-2\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?fit=693%2C444&amp;ssl=1\" data-orig-size=\"693,444\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"cluster analysis challenge 2\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?fit=300%2C192&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?fit=640%2C410&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-9579 size-full\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?resize=640%2C410\" width=\"640\" height=\"410\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?w=693&amp;ssl=1 693w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?resize=250%2C160&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?resize=300%2C192&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-2-e1483773283742.jpeg?resize=30%2C19&amp;ssl=1 30w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Not bad: the k-means algorithm has clustered the data as expected. 
Now, let&#8217;s run the same algorithm again with a different random seed.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nset.seed(57)\r\nkmean1 = kmeans(a,3)\r\nplot(a, col=kmean1$cluster,pch=16)\r\n# col=1:3 uses the same palette indices as the points, so legend colours match the clusters\r\nlegend(-3,23,c('cluster 1','cluster 2','cluster 3'),pch= 16,col=1:3)\r\n<\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3.jpeg\"><img data-attachment-id=\"9577\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/cluster-analysis-challenge-3\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?fit=693%2C447&amp;ssl=1\" data-orig-size=\"693,447\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"cluster analysis challenge 3\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?fit=300%2C194&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?fit=640%2C413&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-9577 size-full\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?resize=640%2C413\" width=\"640\" 
height=\"413\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?w=693&amp;ssl=1 693w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?resize=250%2C161&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?resize=300%2C194&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/cluster-analysis-challenge-3-e1483773333805.jpeg?resize=30%2C19&amp;ssl=1 30w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a>Ouch! This doesn&#8217;t look right. Now we have completely different clusters. Here are a few questions for discussion. Post your opinions\/answers in the discussion section.<\/p>\n<ol>\n<li>What has gone wrong in the second analysis?<\/li>\n<li>Suggest a few strategies that will help avoid such spurious and contradictory results.<\/li>\n<li>Choosing an appropriate value of k, the number of clusters, is essential to get the right results from cluster analysis. How do you choose the value of k\u00a0in k-means clustering?<\/li>\n<\/ol>\n<h4><span style=\"color: #3366ff;\">Sign-off Note<\/span><\/h4>\n<p>Don&#8217;t treat these questions like an exam; be creative and imaginative while answering them. Remember, there are no wrong answers here. And who knows, we may find some completely novel ways to do cluster analysis through our discussion.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cluster analysis is a powerful analytical technique for grouping or segmenting similar elements such as customers or products. In this series of articles, you will explore the nuances of cluster analysis and its applications. Analytics challenges on YOU CANalytics are designed like puzzles, where your participation is extremely important to move things forward. 
Hence, please share your<\/p>\n<p><a class=\"excerpt-more blog-excerpt\" href=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/\">Read More&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":9554,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[75,81],"tags":[],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v17.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Cluster Analysis Puzzle - Learn by Doing! (Part 1) &ndash; YOU CANalytics |<\/title>\n<meta name=\"description\" content=\"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of puzzles.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cluster Analysis Puzzle - Learn by Doing! 
(Part 1) &ndash; YOU CANalytics |\" \/>\n<meta property=\"og:description\" content=\"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of puzzles.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/\" \/>\n<meta property=\"og:site_name\" content=\"YOU CANalytics |\" \/>\n<meta property=\"article:author\" content=\"roopam\" \/>\n<meta property=\"article:published_time\" content=\"2017-01-08T09:24:07+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-01-28T08:49:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&#038;ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"427\" \/>\n\t<meta property=\"og:image:height\" content=\"233\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Roopam Upadhyay\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\",\"name\":\"YOU CANalytics\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"sameAs\":[],\"logo\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"width\":607,\"height\":120,\"caption\":\"YOU CANalytics\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"name\":\"YOU CANalytics |\",\"description\":\"Explore the Power of Data Science\",\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1\",\"width\":427,\"height\":233},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/\",\"name\":\"Cluster Analysis Puzzle - Learn by Doing! (Part 1) &ndash; YOU CANalytics |\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage\"},\"datePublished\":\"2017-01-08T09:24:07+00:00\",\"dateModified\":\"2017-01-28T08:49:00+00:00\",\"description\":\"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of 
puzzles.\",\"breadcrumb\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucanalytics.com\/blogs\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cluster Analysis Puzzle &#8211; Learn by Doing! (Part 1)\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage\"},\"author\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\"},\"headline\":\"Cluster Analysis Puzzle &#8211; Learn by Doing! 
(Part 1)\",\"datePublished\":\"2017-01-08T09:24:07+00:00\",\"dateModified\":\"2017-01-28T08:49:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage\"},\"wordCount\":1346,\"commentCount\":7,\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1\",\"articleSection\":[\"Analytics Challenge\",\"Cluster Analysis - Analytics Challenge\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#respond\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\",\"name\":\"Roopam Upadhyay\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"caption\":\"Roopam Upadhyay\"},\"description\":\"This blog contains my personal views and thoughts on predictive Analytics and big data. - Roopam Upadhyay\",\"sameAs\":[\"roopam\"],\"url\":\"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Cluster Analysis Puzzle - Learn by Doing! 
(Part 1) &ndash; YOU CANalytics |","description":"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of puzzles.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/","og_locale":"en_US","og_type":"article","og_title":"Cluster Analysis Puzzle - Learn by Doing! (Part 1) &ndash; YOU CANalytics |","og_description":"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of puzzles.","og_url":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/","og_site_name":"YOU CANalytics |","article_author":"roopam","article_published_time":"2017-01-08T09:24:07+00:00","article_modified_time":"2017-01-28T08:49:00+00:00","og_image":[{"width":427,"height":233,"url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1","type":"image\/jpeg"}],"twitter_misc":{"Written by":"Roopam Upadhyay","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/ucanalytics.com\/blogs\/#organization","name":"YOU CANalytics","url":"https:\/\/ucanalytics.com\/blogs\/","sameAs":[],"logo":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#logo","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","width":607,"height":120,"caption":"YOU CANalytics"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/#logo"}},{"@type":"WebSite","@id":"https:\/\/ucanalytics.com\/blogs\/#website","url":"https:\/\/ucanalytics.com\/blogs\/","name":"YOU CANalytics |","description":"Explore the Power of Data Science","publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1","width":427,"height":233},{"@type":"WebPage","@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage","url":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/","name":"Cluster Analysis Puzzle - Learn by Doing! 
(Part 1) &ndash; YOU CANalytics |","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage"},"datePublished":"2017-01-08T09:24:07+00:00","dateModified":"2017-01-28T08:49:00+00:00","description":"Learn cluster analysis by doing it yourself, This analytics challenge will expose you to important concepts in cluster analysis in the form of puzzles.","breadcrumb":{"@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucanalytics.com\/blogs\/"},{"@type":"ListItem","position":2,"name":"Cluster Analysis Puzzle &#8211; Learn by Doing! (Part 1)"}]},{"@type":"Article","@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#article","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage"},"author":{"@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6"},"headline":"Cluster Analysis Puzzle &#8211; Learn by Doing! 
(Part 1)","datePublished":"2017-01-08T09:24:07+00:00","dateModified":"2017-01-28T08:49:00+00:00","mainEntityOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#webpage"},"wordCount":1346,"commentCount":7,"publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1","articleSection":["Analytics Challenge","Cluster Analysis - Analytics Challenge"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ucanalytics.com\/blogs\/cluster-analysis-learn-by-doing-analytics-challenge-part-1\/#respond"]}]},{"@type":"Person","@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6","name":"Roopam Upadhyay","image":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#personlogo","inLanguage":"en-US","url":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","caption":"Roopam Upadhyay"},"description":"This blog contains my personal views and thoughts on predictive Analytics and big data. 
- Roopam Upadhyay","sameAs":["roopam"],"url":"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Twins-and-Cluster-Analysis-1.jpg?fit=427%2C233&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3L0jT-2tx","jetpack-related-posts":[{"id":1116,"url":"https:\/\/ucanalytics.com\/blogs\/customer-segmentation-cluster-analysis-telecom-case-study-example\/","url_meta":{"origin":9519,"position":0},"title":"Customer Segmentation &#038; Cluster Analysis &#8211; Telecom Case Study Example (Part 1)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Galaxies and Cluster Analysis I live in Mumbai (Bombay), the financial capital of India and one of the largest cities in the world. One of the problems of living in a large city is that you rarely see stars in the night sky. The limited sky one can see through\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"The Night Sky - by Roopam","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/11\/sky-1.jpg?fit=768%2C1024&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/11\/sky-1.jpg?fit=768%2C1024&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/11\/sky-1.jpg?fit=768%2C1024&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/11\/sky-1.jpg?fit=768%2C1024&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":1259,"url":"https:\/\/ucanalytics.com\/blogs\/customer-segmentation-cluster-analysis-telecom-case-study-part-2\/","url_meta":{"origin":9519,"position":1},"title":"Customer Segmentation &#038; Cluster Analysis \u2013 Telecom Case Study 
Example(Part 2)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"In one of\u00a0the previous articles, we have started with a case study example from the telecom sector. We learned about cluster analysis using black holes as an analogy. In that article, we used Euclidean distance to form customer segments. Let us continue with the same case study and learn about\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"Euclid - by Roopam","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/12\/unnamed.jpg?fit=524%2C615&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1385,"url":"https:\/\/ucanalytics.com\/blogs\/customer-segmentation-outliers-telecom-case-study-part-3\/","url_meta":{"origin":9519,"position":2},"title":"Cluster Analysis and Outliers \u2013 Telecom Case Study Example (Part 3)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Outliers \"I refuse to join any club that would have me as a member.\" - Groucho Marx This witty statement came from (according to me) one of the funniest men in the history of American cinema \u2013 Julius Henry Marx better known as Groucho Marx. 
Groucho was certainly a very\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"Groucho - by Roopam","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/01\/photo.jpg?fit=768%2C1024&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/01\/photo.jpg?fit=768%2C1024&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/01\/photo.jpg?fit=768%2C1024&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/01\/photo.jpg?fit=768%2C1024&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":1532,"url":"https:\/\/ucanalytics.com\/blogs\/customer-segmentation\/","url_meta":{"origin":9519,"position":3},"title":"Telecom Case (Part 4) &#8211; Customer Segmentation and Application","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Telecom Case Study \u2013 Customer Segmentation For the last few articles we have been working on a telecom case study to create customer segments (Part 1, Part 2 and Part 3). In this case, you are the head of customer insights and marketing at a telecom company, ConnectFast Inc. 
Recall,\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"Customer Segmentation - by Roopam","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/02\/photo1.jpg?fit=742%2C1024&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/02\/photo1.jpg?fit=742%2C1024&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/02\/photo1.jpg?fit=742%2C1024&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/02\/photo1.jpg?fit=742%2C1024&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":9649,"url":"https:\/\/ucanalytics.com\/blogs\/cluster-analysis-puzzle-initial-random-seeds-learn-by-doing-part-2\/","url_meta":{"origin":9519,"position":4},"title":"Cluster Analysis Puzzle : Initial Random Seeds &#8211; Learn by Doing! (Part 2)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"This is a continuation of the\u00a0cluster analysis puzzle.\u00a0In this puzzle, we noticed different results for k-means clusters in different runs. Some of you (Emily, Ramya, Alard, and Pintu) have pointed out initial random seeds as the reason for this inconsistency. 
Now, this inconsistency of results is a big problem\u2026","rel":"","context":"In &quot;Analytics Challenge&quot;","block_context":{"text":"Analytics Challenge","link":"https:\/\/ucanalytics.com\/blogs\/category\/analytics-challenge\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Freedom-1.jpg?fit=632%2C372&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Freedom-1.jpg?fit=632%2C372&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/01\/Freedom-1.jpg?fit=632%2C372&ssl=1&resize=525%2C300 1.5x"},"classes":[]},{"id":1251,"url":"https:\/\/ucanalytics.com\/blogs\/murder-cases-evidence-and-logical-rigor-addendum\/","url_meta":{"origin":9519,"position":5},"title":"Murder Cases, Evidence and Logical Rigor &#8211; Addendum","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"I know this article should be a continuation of our telecom case study on customer segmentation and cluster analysis. Though it\u2019s not intended, but possibly is apt that the articles on cluster analysis are separated by articles on other topics \u2013 forming them into perfect clusters. 
Last time, we had\u2026","rel":"","context":"In &quot;Analytics Graffiti&quot;","block_context":{"text":"Analytics Graffiti","link":"https:\/\/ucanalytics.com\/blogs\/category\/analytics-graffiti\/"},"img":{"alt_text":"Return to the Dark Ages - by Roopam","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/12\/peace.jpg?fit=357%2C489&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/9519"}],"collection":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/comments?post=9519"}],"version-history":[{"count":0,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/9519\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media\/9554"}],"wp:attachment":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media?parent=9519"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/categories?post=9519"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/tags?post=9519"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}