{"id":8700,"date":"2016-09-10T14:36:25","date_gmt":"2016-09-10T09:06:25","guid":{"rendered":"http:\/\/ucanalytics.com\/blogs\/?p=8700"},"modified":"2018-07-10T11:28:05","modified_gmt":"2018-07-10T05:58:05","slug":"principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4","status":"publish","type":"post","link":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/","title":{"rendered":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4)"},"content":{"rendered":"<div id=\"attachment_8701\" style=\"width: 301px\" class=\"wp-caption alignright\"><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg\"><img aria-describedby=\"caption-attachment-8701\" data-attachment-id=\"8701\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/death-principal-component-analysis\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?fit=495%2C675&amp;ssl=1\" data-orig-size=\"495,675\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Death &#038; Principal Component Analysis\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?fit=220%2C300&amp;ssl=1\" 
data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?fit=495%2C675&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-8701\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?resize=291%2C397\" alt=\"Death &amp; Principal Component Analysis\" width=\"291\" height=\"397\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?w=495&amp;ssl=1 495w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?resize=183%2C250&amp;ssl=1 183w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Death-Principal-Component-Analysis.jpg?resize=220%2C300&amp;ssl=1 220w\" sizes=\"(max-width: 291px) 100vw, 291px\" data-recalc-dims=\"1\" \/><\/a><p id=\"caption-attachment-8701\" class=\"wp-caption-text\">Death and Principal Component Analysis &#8211; by Roopam<\/p><\/div>\n<hr \/>\n<p>Principal component analysis is a wonderful technique for data reduction without losing critical information. Yes, you could reduce 2 GB of data to a few MB without losing a lot of information. This is like an MP3 version of music. Many, including some experienced data scientists, find principal component analysis (PCA) difficult to understand. However, I believe that after reading this article you will understand PCA and appreciate that it is a highly intuitive and powerful data science technique with several business applications. This article is a continuation of our <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/\">regression case study example<\/a> <\/strong>where you are helping an investment firm make money through price arbitrage. 
In this article, I will help you gain an intuitive understanding of principal component analysis by highlighting both practical applications and the underlying mathematical fundamentals. Principal component analysis is also extremely useful when dealing with multicollinearity in regression models. In the subsequent article, we will use this property of PCA to develop a model that estimates property prices.<\/p>\n<p>Before we explore further nuances of principal component analysis, in the true tradition of YOU CANalytics, let&#8217;s digress a bit and create links between:<\/p>\n<h2><span style=\"color: #3366ff;\">Principal Component Analysis and Death<\/span><\/h2>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg\"><img data-attachment-id=\"8737\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/human-body-chemical-composition\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?fit=439%2C625&amp;ssl=1\" data-orig-size=\"439,625\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Human Body Chemical Composition\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?fit=211%2C300&amp;ssl=1\" 
data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?fit=439%2C625&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-8737 alignright\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?resize=283%2C404\" alt=\"Human Body Chemical Composition\" width=\"283\" height=\"404\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?w=439&amp;ssl=1 439w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?resize=176%2C250&amp;ssl=1 176w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Human-Body-Chemical-Composition.jpg?resize=211%2C300&amp;ssl=1 211w\" sizes=\"(max-width: 283px) 100vw, 283px\" data-recalc-dims=\"1\" \/><\/a>What happens when people die? Where do they go? All of us have pondered this at some point or other. There are several religious and a few non-religious theories to explain events after death. A theory which makes sense to me is: after death, we go back to our fundamental elements. Here, I am not talking about metaphysical or divine elements but chemistry. There are close to 120 known elements including lithium, oxygen, uranium, hydrogen, argon, and carbon. These elements form the periodic table we learned in the chemistry lessons in high-school. Despite the choice of roughly 120 elements, the human body does not have all of them in the same proportion. Incidentally, six elements, as shown in the chart, constitute close to 99% of the human body. Think of these elements as the principal components of the human body. The remaining ~1% of the human body has a little over 20 other elements. 
This means that around three-fourths of the elements in the periodic table are absent from the human body.<\/p>\n<p>We noticed that about 1% of the human body is made up of ~20 elements. The idea with PCA is that if you remove these 20 elements you will lose just 1% of the essence of the human body. This also means that the number of elements you need to track will decline by ~77% (20 of 26). These ideas will form the basis of our understanding of principal component analysis as we progress with our pricing case study example. You can find the previous parts at this link: <a href=\"http:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/\">regression case study example<\/a>.<\/p>\n<h2><span style=\"color: #3366ff;\">Principal Component Analysis &#8211; Case Study Example<\/span><\/h2>\n<p>You got a stern message from your client last evening. They want you to turn around the price estimation model soon so that they can integrate it into their business operations. Luckily, you have made good progress while preparing your data for the regression modeling. The original data had some outliers and missing observations, which you have decided to drop for this analysis. You can download the cleaned data from this link: <a href=\"http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Regression-Clean-Data.csv\">regression-clean-data<\/a>. In this analysis, you are trying to estimate the response variable (house price) through the other numeric and categorical predictor variables. This estimation will be used by your client, a property investment firm, to make money through price arbitrage.<\/p>\n<p>In the previous part, using <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/bivariate-analysis-leverage-regression-case-study-example-part-3\/\">bivariate analysis<\/a><\/strong>, we noticed that there is a high correlation between some of the numeric predictor variables. 
This, you know, will create a problem of multicollinearity during regression model development. One of the most effective methods for eliminating multicollinearity is principal component analysis. I suggest you go back and read that article to get a good understanding of the results of PCA (<a href=\"http:\/\/ucanalytics.com\/blogs\/bivariate-analysis-leverage-regression-case-study-example-part-3\/\">bivariate analysis<\/a>). You have six numeric predictor variables in your dataset:<\/p>\n<div>\n<table>\n<tbody>\n<tr>\n<td style=\"border-color: #bababa; background-color: #d1efff;\">\n<ul>\n<li><strong>Dist_Taxi<\/strong> &#8211; distance to nearest taxi stand from the property<\/li>\n<li><strong>Dist_Market<\/strong> &#8211; distance to nearest grocery market from the property<\/li>\n<li><strong>Dist_Hospital<\/strong> &#8211; distance to nearest hospital from the property<\/li>\n<li><strong>Carpet<\/strong> &#8211; carpet area of the property in square feet<\/li>\n<li><strong>Builtup<\/strong> &#8211; built-up area of the property in square feet<\/li>\n<li><strong>Rainfall<\/strong> &#8211; annual rainfall in the area where property is located<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<\/div>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg\"><img data-attachment-id=\"8816\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/principal-component-analysis\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?fit=435%2C626&amp;ssl=1\" data-orig-size=\"435,626\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"principal-component-analysis\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?fit=208%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?fit=435%2C626&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-8816 alignright\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?resize=228%2C328\" alt=\"principal-component-analysis\" width=\"228\" height=\"328\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?w=435&amp;ssl=1 435w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?resize=174%2C250&amp;ssl=1 174w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis.jpg?resize=208%2C300&amp;ssl=1 208w\" sizes=\"(max-width: 228px) 100vw, 228px\" data-recalc-dims=\"1\" \/><\/a>Before we jump to PCA, think of these 6 variables collectively as the human body and the components generated from PCA as elements (oxygen, hydrogen, carbon etc.). When you run the principal component analysis on these 6 variables, you notice that just 3 components explain ~88% of the variation in these variables (37.7% + 33.4% + 16.6% = 87.7%). 
This means that you could reduce these 6 variables to 3 principal components while losing only about 12% of the information (100 - 87.7 = 12.3%). That is not a bad bargain: a 50% reduction in variables at the cost of about 12% of the information. Moreover, these components will never have any multicollinearity between them as they are orthogonal, i.e. perfectly uncorrelated.<\/p>\n<p>Later in this article, we will come back to how we have derived these components from our dataset. For now, let&#8217;s decipher the inner workings of principal component analysis. We will explore more about orthogonal components and variable reduction.<\/p>\n<p>Let&#8217;s create a loose parallel to understand orthogonal rotation and loss of information, which are at the core of PCA. A word of caution: this example is not how principal component analysis works, but it will help you appreciate the inner workings of PCA. When you rotate your cell phone orthogonally (this is a fancy way of saying make it perpendicular) you kind of reduce the size of a landscape picture. Essentially, you are losing some information. Remember that in the tilted states (i.e. positions between orthogonal states of the phone) the picture doesn&#8217;t change, hence we and PCA are not interested in non-orthogonal positions of the phone. PCA will find all the orthogonal positions for you, along with the quantum of information each position captures. 
Eventually, you will pick the positions of the phone that provide you with maximum information.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg\"><img data-attachment-id=\"8863\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/orthogonal-rotation\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?fit=1083%2C451&amp;ssl=1\" data-orig-size=\"1083,451\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"orthogonal-rotation\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?fit=300%2C125&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?fit=640%2C266&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-8863 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?resize=639%2C266\" alt=\"orthogonal-rotation\" width=\"639\" height=\"266\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?w=1083&amp;ssl=1 1083w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?resize=250%2C104&amp;ssl=1 250w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?resize=300%2C125&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?resize=768%2C320&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Orthogonal-rotation.jpg?resize=1024%2C426&amp;ssl=1 1024w\" sizes=\"(max-width: 639px) 100vw, 639px\" data-recalc-dims=\"1\" \/><\/a>For other kinds of pictures, e.g. a profile picture, the upright position of the phone is better than the horizontal position. Treat the pictures as data: principal component analysis tries to find the orthogonal positions (distinct components) of the phone that capture maximum information. Unlike the 2-dimensional world of the phone rotation, PCA rotates the axes in the n-dimensional world of the data. Here, n is equal to the number of variables. For our data with 6 variables, we have 6 orthogonal axes possible.<\/p>\n<p>Now, you are ready to test these concepts and the mathematical theory behind them using the predictor variables in our dataset. The first thing we will do is extract principal components using R. 
Then we will see how eigenvalues and eigenvectors (yes, those dreaded concepts we learned in matrix algebra in high school) are used to find the level of information and the orthogonal axes.<\/p>\n<p>To begin, let us prepare our data for PCA (you can find the complete code at this link: <a href=\"http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Principal-Component-Analysis-R-Code.txt\">principal-component-analysis-r-code<\/a>):<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\ndata = read.csv('http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Regression-Clean-Data.csv')<\/pre>\n<p>Now, select the numeric predictor variables in the dataset for the principal component analysis.<\/p>\n<pre class=\"brush: r; first-line: 2; title: ; notranslate\" title=\"\">\r\nnumeric_predictors=c('Dist_Taxi','Dist_Market','Dist_Hospital','Carpet','Builtup','Rainfall')\r\nData_for_PCA = data[,numeric_predictors]<\/pre>\n<p>Now the data is ready for analysis. 
Let&#8217;s load a package called FactoMineR in R to run the principal component analysis. Passing ncp = 6 keeps all six components in the results; by default, PCA() keeps only the first five.<\/p>\n<pre class=\"brush: r; first-line: 4; title: ; notranslate\" title=\"\">\r\nif (!require('FactoMineR')) { install.packages('FactoMineR'); library(FactoMineR) }\r\npca=PCA(Data_for_PCA, ncp = 6)\r\n<\/pre>\n<p>This command will generate a graph similar to this.<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg\"><img data-attachment-id=\"8884\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/pricipal-component-analysis-2\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?fit=1134%2C871&amp;ssl=1\" data-orig-size=\"1134,871\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"pricipal-component-analysis-2\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?fit=300%2C230&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?fit=640%2C492&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-8884 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?resize=640%2C492\" alt=\"pricipal-component-analysis-2\" width=\"640\" height=\"492\" 
srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?w=1134&amp;ssl=1 1134w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?resize=250%2C192&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?resize=300%2C230&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?resize=768%2C590&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Pricipal-Component-Analysis-2.jpg?resize=1024%2C787&amp;ssl=1 1024w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Here, distance to taxi, market, and hospital have formed a composite variable (comp 1) which explains 37.7% of the information (variance) in the data. Another orthogonal axis (comp 2) explains a further 33.4% of the variation through the composite of carpet and built-up area. Rainfall is not a part of comp 1 or comp 2 but forms a 3rd orthogonal component. You can get this information in a much friendlier tabular form that displays the contribution of every component. 
Use this command in R.<\/p>\n<pre class=\"brush: r; first-line: 6; title: ; notranslate\" title=\"\">\r\npca$eig\r\n<\/pre>\n<table border=\"2\">\n<tbody>\n<tr style=\"background-color: #4d6edb; height: 71px;\">\n<td style=\"height: 71px;\" width=\"80\"><strong><span style=\"color: #ffffff;\">Components<\/span><\/strong><\/td>\n<td style=\"height: 71px;\" width=\"87\"><strong><span style=\"color: #ffffff;\">Eigenvalue<\/span><\/strong><\/td>\n<td style=\"height: 71px;\" width=\"177\"><strong><span style=\"color: #ffffff;\">% Variance Explained<\/span><\/strong><\/td>\n<td style=\"height: 71px;\" width=\"92\"><strong><span style=\"color: #ffffff;\">Cumulative Variance (%)<\/span><\/strong><\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23px;\" width=\"80\">comp 1<\/td>\n<td style=\"height: 23px;\" width=\"87\">2.262<\/td>\n<td style=\"height: 23px;\" width=\"177\">37.7%<\/td>\n<td style=\"height: 23px;\" width=\"92\">37.7%<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23px;\" width=\"80\">comp 2<\/td>\n<td style=\"height: 23px;\" width=\"87\">2.004<\/td>\n<td style=\"height: 23px;\" width=\"177\">33.4%<\/td>\n<td style=\"height: 23px;\" width=\"92\">71.1%<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23px;\" width=\"80\">comp 3<\/td>\n<td style=\"height: 23px;\" width=\"87\">0.996<\/td>\n<td style=\"height: 23px;\" width=\"177\">16.6%<\/td>\n<td style=\"height: 23px;\" width=\"92\">87.7%<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23px;\" width=\"80\">comp 4<\/td>\n<td style=\"height: 23px;\" width=\"87\">0.564<\/td>\n<td style=\"height: 23px;\" width=\"177\">9.4%<\/td>\n<td style=\"height: 23px;\" width=\"92\">97.1%<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23px;\" 
width=\"80\">comp 5<\/td>\n<td style=\"height: 23px;\" width=\"87\">0.174<\/td>\n<td style=\"height: 23px;\" width=\"177\">2.9%<\/td>\n<td style=\"height: 23px;\" width=\"92\">100.0%<\/td>\n<\/tr>\n<tr style=\"height: 23.1375px;\">\n<td style=\"background-color: #fff3cf; width: 80px; height: 23.1375px;\" width=\"80\">comp 6<\/td>\n<td style=\"height: 23.1375px;\" width=\"87\">0.001<\/td>\n<td style=\"height: 23.1375px;\" width=\"177\">0.0%<\/td>\n<td style=\"height: 23.1375px;\" width=\"92\">100.0%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We now have the percentage of variance in the data explained by each component. Did you notice the second column named &#8216;Eigenvalue&#8217;? The eigenvalue is used in the principal component analysis to calculate the % variance explained. For instance, if you divide the eigenvalue of comp 1 by the sum of all the eigenvalues {i.e. 2.262\/(2.262+2.004&#8230;+0.001)} you will get 37.7%. The sum of the eigenvalues is approximately 6, the number of standardized variables. Alright, so eigenvalues do have some cool properties.<\/p>\n<p>Ok, what about eigenvectors? Eigenvectors give the directions of these components. Matrix multiplication of our standardized dataset with &#8216;eigenvector-1&#8217; will generate the dataset for comp 1. Similarly, you can generate data for the other components as well. This has essentially rotated our data for the predictor variables onto orthogonal (eigenvector) axes.<\/p>\n<p>We can find the loading of our predictor variables on these components through a matrix of squared correlations constructed with these commands in R.<\/p>\n<pre class=\"brush: r; first-line: 7; title: ; notranslate\" title=\"\">\r\nCorrelation_Matrix=as.data.frame(round(cor(Data_for_PCA,pca$ind$coord)^2*100,0))\r\nCorrelation_Matrix[with(Correlation_Matrix, order(-Correlation_Matrix[,1])),]\r\n<\/pre>\n<p>This matrix tells us that 88% of the variance in the distance to the hospital is loaded on comp 1. 
100% of both carpet and built-up area is loaded on comp 2.<\/p>\n<table cellspacing=\"2\">\n<tbody>\n<tr style=\"background-color: #4d6edb;\">\n<td width=\"140\"><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 1<\/span><\/strong><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 2<\/span><\/strong><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 3<\/span><\/strong><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 4<\/span><\/strong><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 5<\/span><\/strong><\/td>\n<td width=\"80\"><strong><span style=\"color: #ffffff;\">comp 6<\/span><\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Dist_Hospital<\/span><\/strong><\/td>\n<td width=\"80\">88%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">2%<\/td>\n<td width=\"80\">10%<\/td>\n<td width=\"80\">0%<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Dist_Taxi<\/span><\/strong><\/td>\n<td width=\"80\">76%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">1%<\/td>\n<td width=\"80\">17%<\/td>\n<td width=\"80\">6%<\/td>\n<td width=\"80\">0%<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Dist_Market<\/span><\/strong><\/td>\n<td width=\"80\">61%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">38%<\/td>\n<td width=\"80\">1%<\/td>\n<td width=\"80\">0%<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Rainfall<\/span><\/strong><\/td>\n<td width=\"80\">1%<\/td>\n<td width=\"80\">1%<\/td>\n<td width=\"80\">98%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td 
width=\"80\">0%<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Carpet<\/span><\/strong><\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">100%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 140px; background-color: #ffe552;\" width=\"140\"><strong><span style=\"color: #000000;\">Builtup<\/span><\/strong><\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">100%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<td width=\"80\">0%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we remove components 4 to 6 from our data we will lose about 12% of the information (9.4% + 2.9% + 0.0%). This also means that ~39% of the information available in &#8216;distance to market&#8217; will be lost with components 4 &amp; 5.<\/p>\n<p>Now, you are feeling much more confident that you have addressed the issues of multicollinearity in the numeric predictor variables. You are ready to start developing regression models to estimate house prices.<\/p>\n<h4><span style=\"color: #3366ff;\">Sign-off Note<\/span><\/h4>\n<p>In the human body, after removing ~20 elements, we lost just 1% of the essence of the body. An important question: is this 1% responsible for something critical? You can&#8217;t ignore this question even while developing models with the principal components. Will the loss of 39% of the information captured in &#8216;distance to market&#8217; reduce the estimation power of our regression model? Maybe, maybe not; we will see when we develop our first estimation model in the next post.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Principal component analysis is a wonderful technique for data reduction without losing critical information. Yes, you could reduce the size of 2GB data to a few MBs without losing a lot of information. 
This is like a mp3 version of music. Many, including some experienced data scientists, find principal component analysis (PCA) difficult to understand.<\/p>\n<p><a class=\"excerpt-more blog-excerpt\" href=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/\">Read More&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":8703,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[80],"tags":[],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v17.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |<\/title>\n<meta name=\"description\" content=\"Principal component analysis is a wonderful technique for data reduction without losing critical information. 
Read this article to understand PCA.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |\" \/>\n<meta property=\"og:description\" content=\"Principal component analysis is a wonderful technique for data reduction without losing critical information. Read this article to understand PCA.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/\" \/>\n<meta property=\"og:site_name\" content=\"YOU CANalytics |\" \/>\n<meta property=\"article:author\" content=\"roopam\" \/>\n<meta property=\"article:published_time\" content=\"2016-09-10T09:06:25+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-07-10T05:58:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&#038;ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"495\" \/>\n\t<meta property=\"og:image:height\" content=\"329\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Roopam Upadhyay\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\",\"name\":\"YOU CANalytics\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"sameAs\":[],\"logo\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"width\":607,\"height\":120,\"caption\":\"YOU CANalytics\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"name\":\"YOU CANalytics |\",\"description\":\"Explore the Power of Data Science\",\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1\",\"width\":495,\"height\":329},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/\",\"name\":\"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage\"},\"datePublished\":\"2016-09-10T09:06:25+00:00\",\"dateModified\":\"2018-07-10T05:58:05+00:00\",\"description\":\"Principal component analysis is a wonderful technique for data reduction without losing critical information. 
Read this article to understand PCA.\",\"breadcrumb\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucanalytics.com\/blogs\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4)\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage\"},\"author\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\"},\"headline\":\"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 
4)\",\"datePublished\":\"2016-09-10T09:06:25+00:00\",\"dateModified\":\"2018-07-10T05:58:05+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage\"},\"wordCount\":1726,\"commentCount\":21,\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1\",\"articleSection\":[\"Pricing Case Study Example\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#respond\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\",\"name\":\"Roopam Upadhyay\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"caption\":\"Roopam Upadhyay\"},\"description\":\"This blog contains my personal views and thoughts on predictive Analytics and big data. - Roopam Upadhyay\",\"sameAs\":[\"roopam\"],\"url\":\"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |","description":"Principal component analysis is a wonderful technique for data reduction without losing critical information. Read this article to understand PCA.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/","og_locale":"en_US","og_type":"article","og_title":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |","og_description":"Principal component analysis is a wonderful technique for data reduction without losing critical information. Read this article to understand PCA.","og_url":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/","og_site_name":"YOU CANalytics |","article_author":"roopam","article_published_time":"2016-09-10T09:06:25+00:00","article_modified_time":"2018-07-10T05:58:05+00:00","og_image":[{"width":495,"height":329,"url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1","type":"image\/jpeg"}],"twitter_misc":{"Written by":"Roopam Upadhyay","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/ucanalytics.com\/blogs\/#organization","name":"YOU CANalytics","url":"https:\/\/ucanalytics.com\/blogs\/","sameAs":[],"logo":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#logo","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","width":607,"height":120,"caption":"YOU CANalytics"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/#logo"}},{"@type":"WebSite","@id":"https:\/\/ucanalytics.com\/blogs\/#website","url":"https:\/\/ucanalytics.com\/blogs\/","name":"YOU CANalytics |","description":"Explore the Power of Data Science","publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1","width":495,"height":329},{"@type":"WebPage","@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage","url":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/","name":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4) &ndash; YOU CANalytics |","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage"},"datePublished":"2016-09-10T09:06:25+00:00","dateModified":"2018-07-10T05:58:05+00:00","description":"Principal component analysis is a wonderful technique for data reduction without losing critical information. 
Read this article to understand PCA.","breadcrumb":{"@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucanalytics.com\/blogs\/"},{"@type":"ListItem","position":2,"name":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 4)"}]},{"@type":"Article","@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#article","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage"},"author":{"@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6"},"headline":"Principal Component Analysis: Step-by-Step Guide using R- Regression Case Study Example (Part 
4)","datePublished":"2016-09-10T09:06:25+00:00","dateModified":"2018-07-10T05:58:05+00:00","mainEntityOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#webpage"},"wordCount":1726,"commentCount":21,"publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1","articleSection":["Pricing Case Study Example"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ucanalytics.com\/blogs\/principal-component-analysis-step-step-guide-r-regression-case-study-example-part-4\/#respond"]}]},{"@type":"Person","@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6","name":"Roopam Upadhyay","image":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#personlogo","inLanguage":"en-US","url":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","caption":"Roopam Upadhyay"},"description":"This blog contains my personal views and thoughts on predictive Analytics and big data. 
- Roopam Upadhyay","sameAs":["roopam"],"url":"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/principal-component-analysis-Death-Profile.jpg?fit=495%2C329&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3L0jT-2gk","jetpack-related-posts":[{"id":9018,"url":"https:\/\/ucanalytics.com\/blogs\/step-step-regression-models-pricing-case-study-example-part-5\/","url_meta":{"origin":8700,"position":0},"title":"Step by Step Regression Modeling Using Principal Component Analysis &#8211; Case Study Example (Part 5)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"This is a continuation of our case study example to estimate property pricing. In this part, you will learn nuances of regression modeling by building three different regression models and compare their results.\u00a0We will also use results of the principal component analysis, discussed in the last part, to develop a\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=700%2C400 
2x"},"classes":[]},{"id":8388,"url":"https:\/\/ucanalytics.com\/blogs\/regression-analysis-pricing-case-study-example-part-1\/","url_meta":{"origin":8700,"position":1},"title":"Regression Analysis &#8211; Pricing Case Study Example (Part 1)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"How to figure out if you are paying the right price for the property you are about to purchase? Welcome to a new data science case study example on YOU CANalytics to identify the right housing price. Pricing is a highly important and\u00a0specialized function for any business. A right price\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/07\/Connect-the-Dots.jpg?fit=397%2C603&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":8649,"url":"https:\/\/ucanalytics.com\/blogs\/bivariate-analysis-leverage-regression-case-study-example-part-3\/","url_meta":{"origin":8700,"position":2},"title":"Bivariate Analysis &#038; Leverage &#8211; Regression Case Study Example (Part 3)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Welcome back to the\u00a0case study example for regression analysis where you are helping an investment firm make money through property price arbitrage. In the last two parts (Part 1 & Part 2) you started with the univariate analysis to identify patterns in the data including missing data and outliers. 
In\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":9145,"url":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/","url_meta":{"origin":8700,"position":3},"title":"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 6)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"\"Data! Data! Data!\" he cried impatiently. \"I can't make bricks without clay.\" - Sherlock Holmes This is a continuation of our regression case study example. In the previous parts, we have learned, as Sherlock Holmes says, to make bricks i.e. develop regression models. 
In this part, we will learn how\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":3405,"url":"https:\/\/ucanalytics.com\/blogs\/association-analysis-retail-case-study-example-part-4\/","url_meta":{"origin":8700,"position":4},"title":"Association Analysis &#8211; Retail Case Study Example (Part 4)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"This is a continuation of the case study example of\u00a0marketing analytics we have been\u00a0discussing for the last few articles. You can find the previous parts at the following links\u00a0(\u00a0Part 1,\u00a0Part 2,\u00a0and Part 3). \u00a0In the last part, we discussed exploratory data analysis (EDA:\u00a0Part 3). 
In this article\u00a0we will talk about\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"A-Beautiful-Mind","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/07\/A-Beautiful-Mind.jpg?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/07\/A-Beautiful-Mind.jpg?resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/07\/A-Beautiful-Mind.jpg?resize=525%2C300 1.5x"},"classes":[]},{"id":5632,"url":"https:\/\/ucanalytics.com\/blogs\/step-by-step-graphic-guide-to-forecasting-through-arima-modeling-in-r-manufacturing-case-study-example\/","url_meta":{"origin":8700,"position":5},"title":"Step-by-Step Graphic Guide to Forecasting through ARIMA Modeling using R &#8211; Manufacturing Case Study Example (Part 4)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"This article is a continuation of our manufacturing case study example to\u00a0forecast tractor sales through time series and ARIMA models. 
You can\u00a0find the previous parts at the following links: Part 1\u00a0: Introduction to time series modeling & forecasting Part 2: Time series decomposition to decipher patterns and trends before forecasting\u2026","rel":"","context":"In &quot;Manufacturing Case Study Example&quot;","block_context":{"text":"Manufacturing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/manufacturing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/06\/photo-1.jpg?fit=412%2C336&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/8700"}],"collection":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/comments?post=8700"}],"version-history":[{"count":0,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/8700\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media\/8703"}],"wp:attachment":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media?parent=8700"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/categories?post=8700"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/tags?post=8700"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}