{"id":9145,"date":"2016-10-30T13:07:06","date_gmt":"2016-10-30T07:37:06","guid":{"rendered":"http:\/\/ucanalytics.com\/blogs\/?p=9145"},"modified":"2016-12-06T09:50:23","modified_gmt":"2016-12-06T04:20:23","slug":"data-simulation-regression-modeling-pricing-case-study-example-part-6","status":"publish","type":"post","link":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/","title":{"rendered":"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 6)"},"content":{"rendered":"<blockquote>\n<p style=\"text-align: left;\"><strong>&#8220;Data! Data! Data!&#8221; he cried impatiently. &#8220;I can&#8217;t make bricks without clay.&#8221;<\/strong><\/p>\n<p style=\"text-align: right;\"><span style=\"color: #3366ff;\"><strong>&#8211; Sherlock Holmes<\/strong><\/span><\/p>\n<\/blockquote>\n<div id=\"attachment_9392\" style=\"width: 329px\" class=\"wp-caption alignright\"><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg\"><img aria-describedby=\"caption-attachment-9392\" data-attachment-id=\"9392\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/data-simulation-creation-by-roopam\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?fit=377%2C538&amp;ssl=1\" data-orig-size=\"377,538\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" 
data-image-title=\"data-simulation-creation-by-roopam\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?fit=210%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?fit=377%2C538&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-9392\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?resize=319%2C455\" alt=\"data-simulation-creation-by-roopam\" width=\"319\" height=\"455\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?w=377&amp;ssl=1 377w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?resize=175%2C250&amp;ssl=1 175w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Data-Simulation-Creation-\u2013-by-Roopam.jpg?resize=210%2C300&amp;ssl=1 210w\" sizes=\"(max-width: 319px) 100vw, 319px\" data-recalc-dims=\"1\" \/><\/a><p id=\"caption-attachment-9392\" class=\"wp-caption-text\">Data Simulation &amp; Creation &#8211; by Roopam<\/p><\/div>\n<p>This is a continuation of our regression case study example. In the previous parts, we learned, as Sherlock Holmes says, to make bricks i.e. develop regression models. In this part, we will learn how to make clay from scratch i.e. create raw data through data simulation. Data simulation is a great way to learn the deeper nuances of modeling and analysis. Since\u00a0we cook up our own data in data simulation, we can test whether the model captures the patterns we hid in the data. In this article, we will simulate the data we used for the last 5 parts of this case study example. 
We will also go back and see whether our models decoded our encryption. You can find all the parts of this case study at this link: <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/\">Regression Case Study Example<\/a><\/strong>.<\/p>\n<p>After reading this article, I suggest you simulate more datasets and analyze them to learn the nuances of analysis and regression. I have learned a great deal about data science through data simulation. Data simulation empowers analysts, in an extremely small way, to play God. Let&#8217;s see how.<\/p>\n<h2><span style=\"color: #3366ff;\">Data Simulation and Creation Myths<\/span><\/h2>\n<p>How did the universe and humans come into being? In the modern world, it is scientifically accepted that the Big Bang created the universe and evolution produced humans from primitive life forms. But before these scientific theories were developed and tested, most cultures had their mythical stories about the creation of the universe, the earth, and humans by mythical characters. These stories are referred to as creation myths.<\/p>\n<p>For instance, Genesis, the first book of the Bible, describes how God created the entire universe in seven days. The story goes that God created day\u00a0and night on the first day followed by other stuff in the universe on the subsequent days. He made animals and humans on the sixth day and rested on the seventh. Based on references in the Bible, the age of the universe is estimated at close to 6000 years. The actual age of the universe calculated by scientists is roughly 14 billion years. 
That&#8217;s a massive calculation error when estimating the age of the universe using the Bible.<\/p>\n<div id=\"attachment_9264\" style=\"width: 322px\" class=\"wp-caption alignleft\"><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg\"><img aria-describedby=\"caption-attachment-9264\" data-attachment-id=\"9264\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/drawing-hands\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?fit=498%2C416&amp;ssl=1\" data-orig-size=\"498,416\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"drawing-hands\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?fit=300%2C251&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?fit=498%2C416&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-9264\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?resize=312%2C261\" alt=\"Drawing Hands - M.C. 
Escher\" width=\"312\" height=\"261\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?w=498&amp;ssl=1 498w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?resize=250%2C209&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Drawing-Hands.jpg?resize=300%2C251&amp;ssl=1 300w\" sizes=\"(max-width: 312px) 100vw, 312px\" data-recalc-dims=\"1\" \/><\/a><p id=\"caption-attachment-9264\" class=\"wp-caption-text\">Drawing Hands &#8211; M.C. Escher<\/p><\/div>\n<p>OK, so let&#8217;s forget about science for now and imagine that this creation myth is true. God created the universe 6000 years ago. Now, let&#8217;s also imagine that God has decided\u00a0today to analyze his creation piece by piece. Will he find everything the same as when he created the universe? The answer is no, since all the elements of the universe are interlinked with each other in a complicated fashion. The creation, in this case, has taken a shape of its own. Complex and intertwined systems have this tendency.<\/p>\n<p>Data simulation empowers analysts to create reasonably complicated datasets through random numbers. Also, like God&#8217;s creation and other natural systems, if the data is complicated, it has the power to surprise analysts at the time of analysis. Mathematical simulations\u00a0are used by scientists to study complicated natural phenomena like turbulence, financial markets, weather patterns, quantum mechanics, chaos etc. Monte Carlo simulations are used extensively in risk modeling.<\/p>\n<p>I must also say that simulation is good fun. After all, how often do you get the chance to play God? 
Let&#8217;s go back to our case study example and simulate some data.<\/p>\n<h2><span style=\"color: #3366ff;\">Data Simulation for Regression Case Study Example<\/span><\/h2>\n<p>We are simulating the data we had used in this <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/\">Regression Case Study Example<\/a><\/strong>. If you could recall, we had 8 predictors or independent variables in this regression dataset, and a numeric response or dependent variable i.e. house price.<\/p>\n<table style=\"height: 279px; width: 654px;\" border=\"1\">\n<tbody>\n<tr style=\"height: 23px;\">\n<td style=\"width: 257px; height: 23px; background-color: #62a3d9;\"><strong><span style=\"color: #ffffff;\">Variable Type<\/span><\/strong><\/td>\n<td style=\"width: 133px; height: 23px; background-color: #62a3d9;\"><strong><span style=\"color: #ffffff;\">Variable Name<\/span><\/strong><\/td>\n<td style=\"width: 248px; height: 23px; background-color: #62a3d9;\"><strong><span style=\"color: #ffffff;\">Features<\/span><\/strong><\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 257px; height: 138.333px;\" rowspan=\"6\">Numeric independent variable<\/td>\n<td style=\"width: 133px; height: 23px;\">Dist_Taxi<\/td>\n<td style=\"width: 248px; height: 69.3334px;\" rowspan=\"3\">Distance to taxi stand, market and hospital are correlated<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 133px; height: 23px;\">Dist_Market<\/td>\n<\/tr>\n<tr style=\"height: 23.3334px;\">\n<td style=\"width: 133px; height: 23.3334px;\">Dist_Hospital<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 133px; height: 23px;\">Carpet Area<\/td>\n<td style=\"width: 248px; height: 46px;\" rowspan=\"2\">Carpet and built-up area are highly correlated<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 133px; height: 23px;\">Built-up Area<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 133px; height: 
23px;\">Rainfall<\/td>\n<td style=\"width: 248px; height: 23px;\">\u00a0Random variable<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 257px; height: 46px;\" rowspan=\"2\">Categorical independent variable<\/td>\n<td style=\"width: 133px; height: 23px;\">Parking<\/td>\n<td style=\"width: 248px; height: 23px;\">\u00a04 categories<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 133px; height: 23px;\">City_Category<\/td>\n<td style=\"width: 248px; height: 23px;\">\u00a03 categories<\/td>\n<\/tr>\n<tr style=\"height: 23px;\">\n<td style=\"width: 257px; height: 23px;\">Numeric dependent variable<\/td>\n<td style=\"width: 133px; height: 23px;\">House_Price<\/td>\n<td style=\"width: 248px; height: 23px;\">To be predicted from the independent variables<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Based on the requirements, these are our objectives for data simulation:<\/p>\n<ol>\n<li>Create 3 correlated variables: distance to taxi, market, and hospital using a correlation matrix<\/li>\n<li>Create 2 highly correlated variables: carpet area and built-up area<\/li>\n<li>Create categorical variables: parking and city category. Also, create a random variable: rainfall.<\/li>\n<li>Create a dependent variable with a defined relationship with some of the independent variables<\/li>\n<\/ol>\n<h5><span style=\"color: #3366ff;\">Objective 1. 
Create Correlated Variables by Cholesky Decomposition<\/span><\/h5>\n<p>The first thing we need to define is the correlation matrix for the 3 numeric variables i.e.\u00a0distance to taxi, market, and hospital.<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=Correlation%5C+Matrix%5C+%28Expected%29+%3D+%5Cbegin%7Bpmatrix%7D1.00+%26+0.45+%26+0.80%5C%5C+0.45+%26+1.00+%26+0.6%5C%5C+0.80+%26+0.6+%26+1.00%5Cend%7Bpmatrix%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"Correlation&#92; Matrix&#92; (Expected) = &#92;begin{pmatrix}1.00 &amp; 0.45 &amp; 0.80&#92;&#92; 0.45 &amp; 1.00 &amp; 0.6&#92;&#92; 0.80 &amp; 0.6 &amp; 1.00&#92;end{pmatrix}\" class=\"latex\" \/><\/pre>\n<p>Cholesky decomposition is a powerful mechanism to generate correlated variables from the random numbers as displayed in this schematic.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg\"><img data-attachment-id=\"9306\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/cholesky-decomposition-correlation\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?fit=1838%2C711&amp;ssl=1\" data-orig-size=\"1838,711\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"cholesky-decomposition-correlation\" data-image-description=\"\" data-image-caption=\"\" 
data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?fit=300%2C116&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?fit=640%2C248&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-9306 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?resize=640%2C248\" alt=\"cholesky-decomposition-correlation\" width=\"640\" height=\"248\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?w=1838&amp;ssl=1 1838w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?resize=250%2C97&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?resize=300%2C116&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?resize=768%2C297&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?resize=1024%2C396&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Cholesky-Decomposition-Correlation.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Essentially, first, you decompose the expected correlation matrix through Cholesky decomposition. 
Then, multiplying the Cholesky component by random numbers generates the desired dataset.<\/p>\n<p>This R code generates the Cholesky component of the correlation matrix. R&#8217;s chol function returns the upper-triangular factor, so we transpose it to get the lower-triangular one.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nCorrlation.Matix = matrix(cbind(1,.45,.8, .45,1,.6, .8,.6,1),nrow=3)\r\nCholesky = t(chol(Corrlation.Matix))\r\n<\/pre>\n<p>Now we will multiply the Cholesky component by 3 normally distributed random variables to produce correlated variables i.e.\u00a0distance to taxi, market, and hospital.<\/p>\n<pre class=\"brush: r; first-line: 3; title: ; notranslate\" title=\"\">\r\nset.seed(2)\r\nrandom.normal = matrix(rnorm(3*930,8000,2500), nrow=3, ncol=930)\r\nData = as.data.frame(t(Cholesky %*% random.normal))\r\nnames(Data) = c(&quot;Dist_Taxi&quot;,&quot;Dist_Market&quot;,&quot;Dist_Hospital&quot;)\r\n<\/pre>\n<p>Okay, so let&#8217;s see how the transformation worked by calculating correlation matrices. First, let&#8217;s estimate the correlation matrix for the random variables.<\/p>\n<pre class=\"brush: r; first-line: 7; title: ; notranslate\" title=\"\"> cor(t(random.normal))\r\n<\/pre>\n<p>As expected, there is almost no correlation between the 3 random variables.<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=Correlation%5C+Matrix%5C+%28Random%5C+Variables%29+%3D+%5Cbegin%7Bpmatrix%7D1.00+%26+-0.008+%26+0.022%5C%5C+-0.008+%26+1.00+%26+0.052%5C%5C+0.022+%26+0.052+%26+1.00%5Cend%7Bpmatrix%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"Correlation&#92; Matrix&#92; (Random&#92; Variables) = &#92;begin{pmatrix}1.00 &amp; -0.008 &amp; 0.022&#92;&#92; -0.008 &amp; 1.00 &amp; 0.052&#92;&#92; 0.022 &amp; 0.052 &amp; 1.00&#92;end{pmatrix}\" class=\"latex\" \/><\/pre>\n<p>Now, let&#8217;s see the correlation of the data generated by transforming these random variables with the Cholesky component.<\/p>\n<pre class=\"brush: r; first-line: 8; title: ; notranslate\" 
title=\"\">\r\ncor(Data) <\/pre>\n<p>Not bad: this data has a correlation\u00a0matrix quite close to the expected correlation matrix.<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=Correlation%5C+Matrix%5C+%28Observed%5C+Data%29+%3D+%5Cbegin%7Bpmatrix%7D1.00+%26+0.444+%26+0.797%5C%5C+0.444+%26+1.00+%26+0.615%5C%5C+0.797+%26+0.615+%26+1.00%5Cend%7Bpmatrix%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"Correlation&#92; Matrix&#92; (Observed&#92; Data) = &#92;begin{pmatrix}1.00 &amp; 0.444 &amp; 0.797&#92;&#92; 0.444 &amp; 1.00 &amp; 0.615&#92;&#92; 0.797 &amp; 0.615 &amp; 1.00&#92;end{pmatrix}\" class=\"latex\" \/>\r\n<\/pre>\n<p>We\u00a0have accomplished our\u00a0first objective for data simulation. Now, let&#8217;s proceed to the second objective.<\/p>\n<h5><span style=\"color: #3366ff;\">Objective 2. Create 2 Highly Correlated Variables<\/span><\/h5>\n<p>We could use Cholesky decomposition again to generate these variables. However, let&#8217;s try something different. We will generate the first variable i.e. carpet area from a random normal distribution.<\/p>\n<pre class=\"brush: r; first-line: 9; title: ; notranslate\" title=\"\">\r\n\r\nset.seed(245)\r\nData$carpet = rnorm(930,1500,250)\r\n\r\n<\/pre>\n<p>Now, we will produce the second variable i.e. built-up area by adding about 20% to the carpet area along with a tiny bit of noise (a standard deviation of 1% of the carpet area).<\/p>\n<pre class=\"brush: r; first-line: 11; title: ; notranslate\" title=\"\">\r\nData$builtup = Data$carpet+rnorm(930,.2*Data$carpet,.01*Data$carpet)\r\ncor(Data$carpet,Data$builtup)\r\nData = round(Data,0)\r\n<\/pre>\n<p>Since we have added just a small fraction of noise, the correlation between the two variables turned out to be very high i.e. 0.998. This is an almost perfect correlation. You may want to play around with the noise factor, i.e. the 
rnorm parameters, to see how correlation varies with different inputs.<\/p>\n<p>We will also use this same method for adding noise to our final model.<\/p>\n<h5><span style=\"color: #3366ff;\">Objective 3. Create Categorical Variables\u00a0<\/span><\/h5>\n<p>Now, we will generate categorical variables: parking and city category. For parking, we will use a predefined probability distribution to generate four classes in this categorical variable.<\/p>\n<pre class=\"brush: r; first-line: 14; title: ; notranslate\" title=\"\"> \r\nset.seed(5) \r\nData$parking = as.factor(sample(c(&quot;Open&quot;, &quot;Covered&quot;, &quot;No Parking&quot;, &quot;Not Provided&quot;),size = 930, prob = c(0.4, 0.2, 0.15, 0.25),replace = TRUE)) \r\n<\/pre>\n<p>Similarly, we will also generate 3 classes of city category.<\/p>\n<pre class=\"brush: r; first-line: 16; title: ; notranslate\" title=\"\"> set.seed(20) \r\nData$City_Category = as.factor(sample(c(&quot;CAT A&quot;, &quot;CAT B&quot;, &quot;CAT C&quot;),size = 930, prob = c(0.35, 0.4, 0.25),replace = TRUE))<\/pre>\n<p>Finally, we will generate the last independent variable: rainfall. This variable will have no correlation with the other independent variables or the dependent variable i.e. house price.<\/p>\n<pre class=\"brush: r; first-line: 18; title: ; notranslate\" title=\"\">\r\nset.seed(30)\r\nData$rainfall = round(rnorm(930,80,25),0)*10 <\/pre>\n<h5><span style=\"color: #3366ff;\">Objective 4. Create Dependent Variable with Relationship to Independent Variables\u00a0<\/span><\/h5>\n<p>I am sharing the method I used to generate the data for the regression case study example but I seriously recommend that you play around with different combinations of the relationship between the dependent and independent variable and make different regression models. 
Trust me, from the practical point of view, there is no better method to learn about aspects of modeling than simulating data and developing models with the simulated data.<\/p>\n<pre class=\"brush: r; first-line: 20; title: ; notranslate\" title=\"\"> \r\nrequire(FactoMineR) \r\npca1 = PCA(Data[,1:5]) \r\nData = cbind(Data,pca1$ind$coord[,1:2]) \r\nData = cbind(Data,model.matrix( ~ parking-1, data=Data),model.matrix( ~ City_Category-1, data=Data))<\/pre>\n<p>For this <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/\">case study example<\/a><\/strong>, I took the principal components of the five numeric independent variables created so far (the three distances, carpet area, and built-up area) and used the first two components (Dim.1 and Dim.2) in the model. Moreover, I converted the categorical variables into dummy variables (code line 23) to be used in the model. This is the equation that I used to create the dependent variable (house price).<\/p>\n<pre class=\"brush: r; first-line: 24; title: ; notranslate\" title=\"\">\r\nset.seed(253)\r\nData$price = (round((Data$Dim.1+abs(Data$Dim.1)+1)*1.85+(Data$Dim.2+abs(Data$Dim.2)+1)*1.39 +rnorm(930,30,12),digits = 2)+ (Data$`City_CategoryCAT A`*35+Data$`City_CategoryCAT B`*17+Data$`City_CategoryCAT C`*8)+ (Data$parkingCovered*4+Data$parkingOpen*1.5+Data$`parkingNo Parking`*0.45))<\/pre>\n<pre class=\"brush: r; first-line: 26; title: ; notranslate\" title=\"\"> Data$price = Data$price*10^5 <\/pre>\n<p>I will let you check whether our <a href=\"http:\/\/ucanalytics.com\/blogs\/step-step-regression-models-pricing-case-study-example-part-5\/\"><strong>regression models<\/strong><\/a> in the last part deciphered this equation.<\/p>\n<h4><span style=\"color: #3366ff;\">Sign-off Note<\/span><\/h4>\n<p>Usually, natural phenomena are captured by humans in the form of data. Simulation empowers humans to generate their own data and hence, in a small way, allows them to be powerful like nature. If you prefer a more dramatic name for nature, call it God. 
Enjoy playing God while you simulate and learn more about regression modeling.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Data! Data! Data!&#8221; he cried impatiently. &#8220;I can&#8217;t make bricks without clay.&#8221; &#8211; Sherlock Holmes This is a continuation of our regression case study example. In the previous parts, we have learned, as Sherlock Holmes says, to make bricks i.e. develop regression models. In this part, we will learn how to make clay from scratch<\/p>\n<p><a class=\"excerpt-more blog-excerpt\" href=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/\">Read More&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":9382,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[80],"tags":[],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v17.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics |<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics 
|\" \/>\n<meta property=\"og:description\" content=\"&#8220;Data! Data! Data!&#8221; he cried impatiently. &#8220;I can&#8217;t make bricks without clay.&#8221; &#8211; Sherlock Holmes This is a continuation of our regression case study example. In the previous parts, we have learned, as Sherlock Holmes says, to make bricks i.e. develop regression models. In this part, we will learn how to make clay from scratchRead More...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/\" \/>\n<meta property=\"og:site_name\" content=\"YOU CANalytics |\" \/>\n<meta property=\"article:author\" content=\"roopam\" \/>\n<meta property=\"article:published_time\" content=\"2016-10-30T07:37:06+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2016-12-06T04:20:23+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&#038;ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"403\" \/>\n\t<meta property=\"og:image:height\" content=\"301\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Roopam Upadhyay\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\",\"name\":\"YOU CANalytics\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"sameAs\":[],\"logo\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"width\":607,\"height\":120,\"caption\":\"YOU CANalytics\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"name\":\"YOU CANalytics |\",\"description\":\"Explore the Power of Data Science\",\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1\",\"width\":403,\"height\":301},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/\",\"name\":\"Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics |\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage\"},\"datePublished\":\"2016-10-30T07:37:06+00:00\",\"dateModified\":\"2016-12-06T04:20:23+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucanalytics.com\/blogs\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 
6)\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage\"},\"author\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\"},\"headline\":\"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 6)\",\"datePublished\":\"2016-10-30T07:37:06+00:00\",\"dateModified\":\"2016-12-06T04:20:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage\"},\"wordCount\":1615,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1\",\"articleSection\":[\"Pricing Case Study Example\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#respond\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\",\"name\":\"Roopam Upadhyay\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"caption\":\"Roopam Upadhyay\"},\"description\":\"This blog contains my personal views and thoughts on 
predictive Analytics and big data. - Roopam Upadhyay\",\"sameAs\":[\"roopam\"],\"url\":\"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics |","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/","og_locale":"en_US","og_type":"article","og_title":"Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics |","og_description":"&#8220;Data! Data! Data!&#8221; he cried impatiently. &#8220;I can&#8217;t make bricks without clay.&#8221; &#8211; Sherlock Holmes This is a continuation of our regression case study example. In the previous parts, we have learned, as Sherlock Holmes says, to make bricks i.e. develop regression models. In this part, we will learn how to make clay from scratchRead More...","og_url":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/","og_site_name":"YOU CANalytics |","article_author":"roopam","article_published_time":"2016-10-30T07:37:06+00:00","article_modified_time":"2016-12-06T04:20:23+00:00","og_image":[{"width":403,"height":301,"url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1","type":"image\/jpeg"}],"twitter_misc":{"Written by":"Roopam Upadhyay","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/ucanalytics.com\/blogs\/#organization","name":"YOU CANalytics","url":"https:\/\/ucanalytics.com\/blogs\/","sameAs":[],"logo":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#logo","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","width":607,"height":120,"caption":"YOU CANalytics"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/#logo"}},{"@type":"WebSite","@id":"https:\/\/ucanalytics.com\/blogs\/#website","url":"https:\/\/ucanalytics.com\/blogs\/","name":"YOU CANalytics |","description":"Explore the Power of Data Science","publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1","width":403,"height":301},{"@type":"WebPage","@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage","url":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/","name":"Data Simulation for Regression Modeling - Pricing Case Study Example (Part 6) &ndash; YOU CANalytics 
|","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage"},"datePublished":"2016-10-30T07:37:06+00:00","dateModified":"2016-12-06T04:20:23+00:00","breadcrumb":{"@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucanalytics.com\/blogs\/"},{"@type":"ListItem","position":2,"name":"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 6)"}]},{"@type":"Article","@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#article","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage"},"author":{"@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6"},"headline":"Data Simulation for Regression Modeling &#8211; Pricing Case Study Example (Part 
6)","datePublished":"2016-10-30T07:37:06+00:00","dateModified":"2016-12-06T04:20:23+00:00","mainEntityOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#webpage"},"wordCount":1615,"commentCount":1,"publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1","articleSection":["Pricing Case Study Example"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ucanalytics.com\/blogs\/data-simulation-regression-modeling-pricing-case-study-example-part-6\/#respond"]}]},{"@type":"Person","@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6","name":"Roopam Upadhyay","image":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#personlogo","inLanguage":"en-US","url":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","caption":"Roopam Upadhyay"},"description":"This blog contains my personal views and thoughts on predictive Analytics and big data. 
- Roopam Upadhyay","sameAs":["roopam"],"url":"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/10\/Potter-1.jpg?fit=403%2C301&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3L0jT-2nv","jetpack-related-posts":[{"id":8388,"url":"https:\/\/ucanalytics.com\/blogs\/regression-analysis-pricing-case-study-example-part-1\/","url_meta":{"origin":9145,"position":0},"title":"Regression Analysis &#8211; Pricing Case Study Example (Part 1)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"How to figure out if you are paying the right price for the property you are about to purchase? Welcome to a new data science case study example on YOU CANalytics to identify the right housing price. Pricing is a highly important and\u00a0specialized function for any business. A right price\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/07\/Connect-the-Dots.jpg?fit=397%2C603&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":8488,"url":"https:\/\/ucanalytics.com\/blogs\/data-preparation-regression-pricing-case-study-example-part-2\/","url_meta":{"origin":9145,"position":1},"title":"Data Preparation for Regression &#8211; Pricing Case Study Example (Part 2)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"In the last post we had started a case study example for regression analysis to help an investment firm make money through property price arbitrage\u00a0(read part 1 :\u00a0regression case study example).\u00a0This is an interactive case study example and required your help to move forward. 
These are some of your observations\u2026","rel":"","context":"In &quot;Analytics Labs&quot;","block_context":{"text":"Analytics Labs","link":"https:\/\/ucanalytics.com\/blogs\/category\/analytics-labs\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-analysis.jpg?fit=448%2C528&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":6112,"url":"https:\/\/ucanalytics.com\/blogs\/data-analytics-challenge-1-solve-the-case-of-a-shady-gambler-clue-2\/","url_meta":{"origin":9145,"position":2},"title":"Data Analytics Challenge 1(Clue # 2) : Solve the Case of a Shady Gambler","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Sometimes answers lead to more questions. In data science this sometime is true almost everytime. Analytics Challenge - The Shady Gamble In the previous article, you were approached by Scotland Yard to investigate charges against a gambler with dubious character. The allegation against him was that he had loaded his\u2026","rel":"","context":"In &quot;Analytics Challenge&quot;","block_context":{"text":"Analytics Challenge","link":"https:\/\/ucanalytics.com\/blogs\/category\/analytics-challenge\/"},"img":{"alt_text":"Mystery Challenge plot","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/09\/Mystery-Challenge-plot.jpg?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/09\/Mystery-Challenge-plot.jpg?resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/09\/Mystery-Challenge-plot.jpg?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/09\/Mystery-Challenge-plot.jpg?resize=700%2C400 2x"},"classes":[]},{"id":9018,"url":"https:\/\/ucanalytics.com\/blogs\/step-step-regression-models-pricing-case-study-example-part-5\/","url_meta":{"origin":9145,"position":3},"title":"Step by 
Step Regression Modeling Using Principal Component Analysis &#8211; Case Study Example (Part 5)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"This is a continuation of our case study example to estimate property pricing. In this part, you will learn nuances of regression modeling by building three different regression models and compare their results.\u00a0We will also use results of the principal component analysis, discussed in the last part, to develop a\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/09\/Sumo-and-Regression-Model.jpg?fit=918%2C384&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":8649,"url":"https:\/\/ucanalytics.com\/blogs\/bivariate-analysis-leverage-regression-case-study-example-part-3\/","url_meta":{"origin":9145,"position":4},"title":"Bivariate Analysis &#038; Leverage &#8211; Regression Case Study Example (Part 3)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Welcome back to the\u00a0case study example for regression analysis where you are helping an investment firm make money through property price arbitrage. In the last two parts (Part 1 & Part 2) you started with the univariate analysis to identify patterns in the data including missing data and outliers. 
In\u2026","rel":"","context":"In &quot;Pricing Case Study Example&quot;","block_context":{"text":"Pricing Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/pricing-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/08\/Regression-Case-Study-Example.jpg?fit=1156%2C720&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":3973,"url":"https:\/\/ucanalytics.com\/blogs\/model-selection-retail-case-study-example-part-7\/","url_meta":{"origin":9145,"position":5},"title":"Model Selection &#8211; Retail Case Study Example (Part 7)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Model Selection This is a continuation of our retail case study example for campaign and marketing analytics. In the previous two parts, we discussed a couple of decision tree algorithms (CART and C4.5)\u00a0for classification. 
Recall a previous case study example on\u00a0banking and risk management where we discussed logistic regression\u00a0which is\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=1050%2C600 
3x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/9145"}],"collection":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/comments?post=9145"}],"version-history":[{"count":0,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/9145\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media\/9382"}],"wp:attachment":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media?parent=9145"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/categories?post=9145"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/tags?post=9145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}