{"id":10469,"date":"2017-09-27T16:32:41","date_gmt":"2017-09-27T11:02:41","guid":{"rendered":"http:\/\/ucanalytics.com\/blogs\/?p=10469"},"modified":"2018-09-16T17:39:16","modified_gmt":"2018-09-16T12:09:16","slug":"gradient-descent-logistic-regression-simplified-step-step-visual-guide","status":"publish","type":"post","link":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/","title":{"rendered":"Gradient Descent for Logistic Regression Simplified &#8211; Step by Step Visual Guide"},"content":{"rendered":"<hr \/>\n<p><img data-attachment-id=\"10470\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/star-trek\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?fit=639%2C852&amp;ssl=1\" data-orig-size=\"639,852\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"star trek\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?fit=225%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?fit=639%2C852&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-10470 alignright\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?resize=300%2C400\" alt=\"\" width=\"300\" height=\"400\" 
srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?w=639&amp;ssl=1 639w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?resize=188%2C250&amp;ssl=1 188w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/star-trek.jpg?resize=225%2C300&amp;ssl=1 225w\" sizes=\"(max-width: 300px) 100vw, 300px\" data-recalc-dims=\"1\" \/><\/p>\n<p>If you want to gain a sound understanding of machine learning, then you must know gradient descent optimization.\u00a0In this article, you will get a detailed and intuitive understanding of gradient descent as a way to solve machine learning algorithms. The entire tutorial uses images and visuals to make things easy to grasp.<\/p>\n<p>Here, we will use an example from sales and marketing to identify customers who will purchase perfumes. Gradient descent, by the way, is a numerical method to solve such business problems using machine learning algorithms such as regression, neural networks, and deep learning. Moreover, in this article, you will build an end-to-end logistic regression model using gradient descent. Do check out this earlier article on <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/intuitive-machine-learning-gradient-descent-simplified\/\" target=\"_blank\" rel=\"noopener\">gradient descent for estimation through linear regression<\/a><\/strong>.<\/p>\n<p>But before that, let&#8217;s boldly go where no man has gone before and explore a few linkages between&#8230;<\/p>\n<h2><span style=\"color: #3366ff;\">Gradient Descent &amp;\u00a0Star Trek<\/span><\/h2>\n<p>Star Trek is a science fiction TV series created by Gene Roddenberry. The original show first aired in the mid-1960s. Since then, Star Trek has become one of Hollywood&#8217;s largest\u00a0franchises. 
To date, Star Trek has spawned seven television series and thirteen motion pictures based on its different avatars.<\/p>\n<p>The original 1960s show is among the first few TV shows I remember from my childhood. Captain Kirk and Spock are cult\u00a0figures. The opening line of Star Trek still gives me goosebumps. It goes like this:<\/p>\n<blockquote><p>Space, the final frontier. These are the voyages of the starship Enterprise. Its five-year mission: to explore strange new worlds, to seek out new life and new civilizations, to boldly go where no man has gone before.<\/p>\n<p style=\"text-align: right;\">&#8211; Opening Dialogue, Star Trek<\/p>\n<\/blockquote>\n<p>Star Trek is about exploring the unknown. Essentially, trekking as a concept is about making a difficult journey to arrive at a destination. You will soon learn that gradient descent, a numerical approach to solving machine learning algorithms, is no different from trekking.<\/p>\n<p>Moreover, Star Trek has this fascinating device called the Transporter &#8211; a machine that could teleport Captain Kirk and his crew members to the desired location in no time.\u00a0Keep the ideas of trekking and the Transporter in your mind because soon you will play Captain Kirk to solve a gradient descent problem. But before that, let&#8217;s define our business problem and solution objectives.<\/p>\n<h2><span style=\"color: #3366ff;\">Market Research Problem &#8211; Logistic Regression<\/span><\/h2>\n<p>You are a market researcher helping the perfume industry understand its customer\u00a0segments. You asked your team to survey 200 buyers (1) and 200 non-buyers (0) of perfumes. This information is stored in the y variable of <a href=\"http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Data-L-Reg-Gradient-Descent.csv\" target=\"_blank\" rel=\"noopener\"><strong>this marketing dataset<\/strong><\/a>. 
Moreover, you asked these surveyees about their monthly expenditure on cosmetics (x<sub>1<\/sub> &#8211; reported in units of 100) and their annual income (x<sub>2<\/sub>\u00a0&#8211; reported in units of 100,000). In the plot below, the 400 surveyees are plotted on the x<sub>1<\/sub> and x<sub>2<\/sub> axes, with the buyers and non-buyers marked in different colors. You can find the complete code used in this article at <strong><a href=\"http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Logistic-Regression-R-Code.txt\">Gradient Descent &#8211; Logistic Regression (R Code)<\/a>.<\/strong><\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg\"><img data-attachment-id=\"10517\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/scatter-plot-logistic-regression\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?fit=1205%2C905&amp;ssl=1\" data-orig-size=\"1205,905\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Scatter Plot &#8211; Logistic Regression\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?fit=300%2C225&amp;ssl=1\" 
data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?fit=640%2C481&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10517 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?resize=640%2C481\" alt=\"\" width=\"640\" height=\"481\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?w=1205&amp;ssl=1 1205w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?resize=250%2C188&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?resize=300%2C225&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?resize=768%2C577&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-Logistic-Regression.jpg?resize=1024%2C769&amp;ssl=1 1024w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Clearly, the buyers and non-buyers are clustered in separate areas of this plot. Now, you want to create a clear boundary or wall to separate the buyers from non-buyers. This is kind of similar to the wall Donald Trump wants to build between the USA and Mexico. 
You would still see some Mexicans on American territory, and\u00a0vice versa.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg\"><img data-attachment-id=\"10518\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/scatter-plot-with-boundary-logistic-regression\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?fit=1369%2C1030&amp;ssl=1\" data-orig-size=\"1369,1030\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Scatter Plot with Boundary- Logistic Regression\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?fit=300%2C226&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?fit=640%2C481&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10518 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?resize=640%2C482\" alt=\"\" width=\"640\" height=\"482\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?w=1369&amp;ssl=1 1369w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?resize=250%2C188&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?resize=300%2C226&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?resize=768%2C578&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?resize=1024%2C770&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Scatter-Plot-with-Boundary-Logistic-Regression.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Another representation of this wall is the density plot as shown below. Notice, in this plot, we have taken both\u00a0x<sub>1<\/sub> and x<sub>2 <\/sub>collectively as x<sub>1<\/sub>\u00a0+ x<sub>2<\/sub>. 
The reason we could do this is that the data was prepared in a way that simplifies things.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg\"><img data-attachment-id=\"10519\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/density-plot-logistic-regression\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?fit=1204%2C908&amp;ssl=1\" data-orig-size=\"1204,908\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Density Plot &#8211; Logistic Regression\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?fit=300%2C226&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?fit=640%2C483&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10519 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?resize=640%2C483\" alt=\"\" width=\"640\" height=\"483\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?w=1204&amp;ssl=1 1204w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?resize=250%2C189&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?resize=300%2C226&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?resize=768%2C579&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Density-Plot-Logistic-Regression.jpg?resize=1024%2C772&amp;ssl=1 1024w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>As you may have noticed, if you divide this graph in half at x<sub>1<\/sub>\u00a0+ x<sub>2<\/sub> = 140, then on the right-hand side you predominantly have the buyers. The left-hand side, meanwhile, has more non-buyers than buyers. However, there is still a bit of infringement of the buyers into the non-buyers&#8217; territory, and vice versa.<\/p>\n<p>A better representation would be to divide the territories fuzzily or probabilistically to accommodate both sets of people. This is precisely the point political leaders like Donald Trump miss. This is possibly also the reason we have territorial disputes like the Gaza Strip, Kashmir, etc.<\/p>\n<h2><span style=\"color: #3366ff;\">Logistic Regression Equation and Probability<\/span><\/h2>\n<p>Mathematics, however, allows data to mingle and live in better harmony. To discover this, let&#8217;s plot y as a function of x<sub>1<\/sub> + x<sub>2<\/sub>\u00a0in a simple scatter plot. 
Don&#8217;t forget y=1 is for the buyers of perfumes and y=0 is for the non-buyers.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg\"><img data-attachment-id=\"10521\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/logit-plot\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?fit=1211%2C911&amp;ssl=1\" data-orig-size=\"1211,911\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Logit Plot\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?fit=300%2C226&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?fit=640%2C481&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10521 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?resize=640%2C481\" alt=\"\" width=\"640\" height=\"481\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?w=1211&amp;ssl=1 1211w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?resize=250%2C188&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?resize=300%2C226&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?resize=768%2C578&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot.jpg?resize=1024%2C770&amp;ssl=1 1024w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><br \/>\nThe reason we have plotted this bland looking scatter plot is that we want to fit a logit function P(y=1) =<img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B1%7D%7B%281%2Be%5E%7B-z%7D%29%7D++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{1}{(1+e^{-z})}  \" class=\"latex\" \/> to this dataset. Here, P(y=1) is the probability of being a buyer in the entire space of\u00a0x<sub>1<\/sub> + x<sub>2<\/sub>. Moreover, z is a linear combination of x1 and x2 represented as\u00a0<img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=z%3D%5Cbeta_%7B0%7D+%2B+%5Cbeta_%7B1%7D+x_%7B1%7D+%2B%5Cbeta_%7B2%7D+x_%7B2%7D++&#038;bg=ffffff&#038;fg=000&#038;s=1&#038;c=20201002\" alt=\"z=&#92;beta_{0} + &#92;beta_{1} x_{1} +&#92;beta_{2} x_{2}  \" class=\"latex\" \/>.<\/p>\n<p>In addition, for simplicity, we will assume \u03b2<sub>1<\/sub> = \u03b2<sub>2 <\/sub>. 
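<\/p>\n<p>As a quick illustration (the article&#8217;s linked code is in R; this standalone sketch is in Python, and the function names here are illustrative), the logit function and the linear combination z can be written as:<\/p>

```python
import math

def sigmoid(z):
    # P(y=1) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def p_buyer(x1, x2, b0, b1, b2):
    # z = b0 + b1*x1 + b2*x2, squashed into a probability between 0 and 1
    return sigmoid(b0 + b1 * x1 + b2 * x2)
```

<p>Note that sigmoid(0) = 0.5, so z = 0 is the boundary where a surveyee is equally likely to be a buyer or a non-buyer; large positive z pushes P(y=1) toward 1, and large negative z pushes it toward 0.<\/p>\n<p>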
This is not a good assumption in a general sense but will work for our dataset which was designed to simplify things.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg\"><img data-attachment-id=\"10520\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/logit-plot-logistic-regression\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?fit=1482%2C1058&amp;ssl=1\" data-orig-size=\"1482,1058\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Logit Plot &#8211; Logistic Regression\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?fit=300%2C214&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?fit=640%2C457&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10520 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?resize=640%2C457\" alt=\"\" width=\"640\" height=\"457\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?w=1482&amp;ssl=1 1482w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?resize=250%2C178&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?resize=300%2C214&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?resize=768%2C548&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?resize=1024%2C731&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logit-Plot-Logistic-Regression.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Now, all we need to do is find the values of \u03b2<sub>0<\/sub>, \u03b2<sub>1<\/sub>, and \u03b2<sub>2<\/sub> that will minimize the prediction errors within this data. In other words, we want to find the values of \u03b2<sub>0<\/sub>, \u03b2<sub>1<\/sub>, and \u03b2<sub>2<\/sub> so that most, if not all, buyers get high probabilities on this logit function P(y=1). Similarly, most non-buyers must get low probabilities on P(y=1).<\/p>\n<p>The quest for such\u00a0\u03b2 values is the job of gradient descent. Like other quests, as you will soon see, this requires a long journey and lots of walking.<\/p>\n<h2><span style=\"color: #3366ff;\">Gradient Descent Optimization and Trekking<\/span><\/h2>\n<p>Gradient descent works like a hiker walking down hilly terrain. Essentially, this terrain is analogous to the error or loss function of a machine learning algorithm. The idea, as already discussed, is to get to the bottom-most or minimum-error point by changing the ML coefficients \u03b2<sub>0<\/sub>, \u03b2<sub>1<\/sub>, and \u03b2<sub>2<\/sub>. 
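<\/p>\n<p>The article does not write the error function out explicitly; for logistic regression it is typically the log-loss (cross-entropy), sketched here in Python for concreteness (an assumption about the loss, not a quote from the article&#8217;s R code):<\/p>

```python
import math

def log_loss(y, p, eps=1e-15):
    # Average cross-entropy between labels y (0 or 1) and
    # predicted probabilities p; eps guards against log(0).
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += -(yi * math.log(pi) + (1 - yi) * math.log(1 - pi))
    return total / len(y)
```

<p>Confident correct predictions drive this loss toward 0, while confident wrong ones blow it up; it is this surface, spread over the \u03b2 values, that gradient descent walks down.<\/p>\n<p>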
The hilly terrain is spread across the x and y axes of the physical world. Similarly, the error function is spread across the coefficients of the machine learning algorithm, i.e. \u03b2<sub>0<\/sub>, \u03b2<sub>1<\/sub>, and \u03b2<sub>2<\/sub>.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg\"><img data-attachment-id=\"10538\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/gradient-descent-hiker-trekking\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?fit=1980%2C814&amp;ssl=1\" data-orig-size=\"1980,814\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Gradient Descent &#8211; Hiker (Trekking)\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?fit=300%2C123&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?fit=640%2C263&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-10538 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?resize=640%2C263\" alt=\"\" width=\"640\" height=\"263\" 
srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?w=1980&amp;ssl=1 1980w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?resize=250%2C103&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?resize=300%2C123&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?resize=768%2C316&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?resize=1024%2C421&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?w=1280 1280w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Hiker-Trekking.jpg?w=1920 1920w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>The idea is to find the values of the coefficients at which the error is minimized. A better analogy for the gradient descent algorithm comes from Star Trek, Captain Kirk, and the Transporter &#8211; the teleportation device. I promise you will get to say &#8220;<em>Beam Me Up, Scotty<\/em>&#8221;, the legendary line Captain Kirk used to instruct Scotty, the operator of the Transporter, to teleport him around.<\/p>\n<h2><span style=\"color: #3366ff;\">Captain Kirk to Solve Gradient Descent<\/span><\/h2>\n<p>Assume you are Captain Kirk. A wicked alien has abducted several crew members of the Starship Enterprise, including Spock. The alien has given you one last chance to save your crew, if only you can solve a problem. 
The problem involves finding the minimum value of the variable y over all possible values of x between -\u221e and \u221e.<\/p>\n<pre style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=y+%3D+%5Cfrac%7Bx%5E%7B6%7D%7D%7B6%7D-%5Cfrac%7B3x%5E%7B5%7D%7D%7B5%7D-%5Cfrac%7B5x%5E%7B4%7D%7D%7B4%7D%2B%5Cfrac%7B15x%5E%7B3%7D%7D%7B3%7D%2B%5Cfrac%7B4x%5E%7B2%7D%7D%7B2%7D-12x++&#038;bg=ffffff&#038;fg=000&#038;s=4&#038;c=20201002\" alt=\"y = &#92;frac{x^{6}}{6}-&#92;frac{3x^{5}}{5}-&#92;frac{5x^{4}}{4}+&#92;frac{15x^{3}}{3}+&#92;frac{4x^{2}}{2}-12x  \" class=\"latex\" \/><\/pre>\n<p>Now, you are Captain Kirk and not a mathematician, so you will use your own method to find the minimum or lowest value of y by changing the values of x. You have found a landscape in the exact shape of the function, with x and y axes, that spreads across the Universe. You will ask Scotty to teleport you to a random location on this landscape, and then you will walk down the landscape to find the value of x that generates the\u00a0lowest value of y. 
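<\/p>\n<p>This teleport-then-walk procedure can be sketched numerically. In the Python sketch below (illustrative, not taken from the article&#8217;s R code), the walker estimates the slope by comparing the ground height a tiny step to either side &#8211; the programmatic version of feeling the slope with your feet &#8211; and repeatedly steps downhill:<\/p>

```python
def y(x):
    # the alien's landscape
    return x**6 / 6 - 3 * x**5 / 5 - 5 * x**4 / 4 + 15 * x**3 / 3 + 4 * x**2 / 2 - 12 * x

def slope(x, h=1e-6):
    # central finite difference: compare heights a tiny step either side
    return (y(x + h) - y(x - h)) / (2 * h)

def walk_down(x, step=0.001, iters=100000):
    # take small downhill steps until the ground is (nearly) flat
    for _ in range(iters):
        s = slope(x)
        if abs(s) < 1e-6:
            break
        x -= step * s
    return x
```

<p>Starting at x = 2.5 the walk settles near x = 3, while starting at x = -2.5 it settles near x = -2 &#8211; the walker only ever goes downhill, so where Scotty beams you down matters.<\/p>\n<p>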
By the way, y here is similar to the error function and x is similar to the ML coefficients i.e\u00a0\u03b2 values.<a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg\"><img data-attachment-id=\"10569\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/polynomial-gradient-descent-capatin-kirk\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?fit=1372%2C739&amp;ssl=1\" data-orig-size=\"1372,739\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Polynomial &#8211; Gradient Descent Capatin Kirk\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?fit=300%2C162&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?fit=640%2C345&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-10569\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?resize=640%2C345\" alt=\"\" width=\"640\" height=\"345\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?w=1372&amp;ssl=1 1372w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?resize=250%2C135&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?resize=300%2C162&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?resize=768%2C414&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?resize=1024%2C552&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Polynomial-Gradient-Descent-Capatin-Kirk.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Now all of us humans, including Captain Kirk, can figure out which way is the downward slope just by walking. It takes a lot more effort to walk upwards than downwards. We can also recognize a flat plane because the effort is the same no matter which direction we walk. In the landscape Captain Kirk is walking, there are just 5 flat points, with A, B, and C as the 3 bottom points. The problem with this terrain is that it has three minima &#8211; A and B are local minima, while C is the global minimum, the lowest value of y, at x=3. Depending on where Captain Kirk is beamed down at random, he will settle at different minima, since he will only walk down.<\/p>\n<p>We will come back to this problem of local minima, but before that let&#8217;s identify the mathematical equivalent of walking up or down, which the actual gradient descent optimization will use.<\/p>\n<h2><span style=\"color: #3366ff;\">Gradient Descent and Differential Calculus<\/span><\/h2>\n<p>The mathematical equivalent of the human ability to identify a downward slope is the differentiation of a function, also called its gradient. 
The derivative of the landscape function Captain Kirk was walking is:<\/p>\n<pre style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D+%3D+x%5E%7B5%7D-3x%5E%7B4%7D-5x%5E%7B3%7D%2B15x%5E%7B2%7D%2B4x-12+&#038;bg=ffffff&#038;fg=000&#038;s=4&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x} = x^{5}-3x^{4}-5x^{3}+15x^{2}+4x-12 \" class=\"latex\" \/><\/pre>\n<p>Now, if you insert x=3.1 in this equation, you will get <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x}  \" class=\"latex\" \/> = 4.83. The positive value indicates that there is an upslope ahead. This means Captain Kirk, or the pointer of the gradient descent algorithm, needs to walk backward, toward the lower values of x. Similarly, at x=2.9,\u00a0<img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x}  \" class=\"latex\" \/> = -3.27. The negative value means there is a downslope ahead, and Captain Kirk will walk forward, toward the higher values of x.<\/p>\n<p>At x=3, the global minimum,\u00a0<img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x}  \" class=\"latex\" \/> = 0. This means there is no slope, or that you have reached a flat plane. As it turns out, the above polynomial equation can be simplified to this. 
It was a trick question from the alien.<\/p>\n<pre style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D+%3D+%28x-1%29%28x%2B1%29%28x-2%29%28x%2B2%29%28x-3%29++&#038;bg=ffffff&#038;fg=000&#038;s=4&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x} = (x-1)(x+1)(x-2)(x+2)(x-3)  \" class=\"latex\" \/><\/pre>\n<p>We can find the flat points of this function without gradient descent by equating this derivative to 0. <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+y%7D%7B%5Cpartial+x%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial y}{&#92;partial x}  \" class=\"latex\" \/> =0 for x = -2, -1, 1, 2, and 3. So, if Captain Kirk had known some math, he could have avoided all the walking.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg\"><img data-attachment-id=\"10550\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/derivative-solution\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?fit=1412%2C749&amp;ssl=1\" data-orig-size=\"1412,749\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Derivative solution\" data-image-description=\"\" data-image-caption=\"\" 
data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?fit=300%2C159&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?fit=640%2C339&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10550 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?resize=640%2C339\" alt=\"\" width=\"640\" height=\"339\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?w=1412&amp;ssl=1 1412w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?resize=250%2C133&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?resize=300%2C159&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?resize=768%2C407&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?resize=1024%2C543&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Derivative-solution.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a>Let&#8217;s use this knowledge to solve the machine learning prediction problem for our market research data.<\/p>\n<h2><span style=\"color: #3366ff;\">Solution &#8211; Logistic Regression<\/span><\/h2>\n<p>In the market research data, you are trying to fit the logit function to find the probability that a customer buys the perfume, P(y=1). 
We don&#8217;t want to write P(y=1) many times, hence we will define a simpler notation: P(y=1) = <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cltimes++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;ltimes  \" class=\"latex\" \/>.<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=P%28y%3D1%29%3D%5Cfrac%7B1%7D%7B1%2Be%5E%7B-z%7D%7D%C2%A0%3D%5Cltimes+&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"P(y=1)=&#92;frac{1}{1+e^{-z}}\u00a0=&#92;ltimes \" class=\"latex\" \/><\/pre>\n<p>We also know that z in the above equation is a linear function of the x values with \u03b2 coefficients, i.e.<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=z%3D%5Cbeta_%7B0%7D+%2B+%5Cbeta_%7B1%7D+x_%7B1%7D+%2B%5Cbeta_%7B2%7D+x_%7B2%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"z=&#92;beta_{0} + &#92;beta_{1} x_{1} +&#92;beta_{2} x_{2}  \" class=\"latex\" \/><\/pre>\n<p>Now, to solve this logit function, we need to minimize the error or loss function with respect to the \u03b2 coefficients. 
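In code, this model is tiny. Here is a minimal Python sketch (the function names are my own; the article's companion code is in R):

```python
import math

def sigmoid(z):
    """The logit curve: P(y=1) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def p_buy(b0, b1, b2, x1, x2):
    """z = b0 + b1*x1 + b2*x2, squashed into a probability between 0 and 1."""
    return sigmoid(b0 + b1 * x1 + b2 * x2)

print(sigmoid(0.0))  # z = 0 sits exactly on the fence: probability 0.5
```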
The loss function ( <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cell+f++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;ell f  \" class=\"latex\" \/> ) for a logistic regression problem is:<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cell+f%3D-%281-y%29%5Ccdot+ln%281-%5Cltimes%29-y%5Ccdot+ln%28%5Cltimes%29++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;ell f=-(1-y)&#92;cdot ln(1-&#92;ltimes)-y&#92;cdot ln(&#92;ltimes)  \" class=\"latex\" \/><\/pre>\n<p>This loss function ( <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cell+f++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;ell f  \" class=\"latex\" \/> ) is cleverly designed to make this a convex optimization problem, which is a fancy way of saying that the function is shaped like a bowl. Like a bowl, it has just one base, the global minimum, and no local minima. This plot shows the loss function for our dataset &#8211; notice how it is like a bowl. 
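The bowl shape is easy to verify numerically: fix \u03b20, sweep \u03b21 over a grid, and the loss falls and then rises exactly once, with no local dips. The sketch below uses a tiny made-up dataset, not the article's market research data:

```python
import math

def loss(b0, b1, data):
    """Mean logistic loss: -(1-y)*ln(1-p) - y*ln(p), with p = 1/(1+e^(-(b0+b1*x)))."""
    total = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += -(1 - y) * math.log(1 - p) - y * math.log(p)
    return total / len(data)

data = [(-2, 0), (-1, 0), (0, 1), (1, 0), (2, 1)]  # made-up, deliberately noisy
losses = [loss(0.0, b1 / 10.0, data) for b1 in range(-30, 31)]
m = losses.index(min(losses))
# One descent, one ascent along the slice: a bowl, not a bumpy terrain
assert all(losses[i] > losses[i + 1] for i in range(m))
assert all(losses[i] < losses[i + 1] for i in range(m, len(losses) - 1))
```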
This convexity solves the big problem Captain Kirk faced: several local minima.<br \/>\n<a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg\"><img data-attachment-id=\"10529\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/logistic-regression-loss-function-3d-plot\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?fit=1146%2C1048&amp;ssl=1\" data-orig-size=\"1146,1048\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Logistic Regression &#8211; Loss Function &#8211; 3D plot\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?fit=300%2C274&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?fit=640%2C585&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10529 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?resize=640%2C585\" alt=\"\" width=\"640\" height=\"585\" srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?w=1146&amp;ssl=1 1146w, 
https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?resize=250%2C229&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?resize=300%2C274&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?resize=768%2C702&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-3D-plot.jpg?resize=1024%2C936&amp;ssl=1 1024w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Remember, for simplicity, we are assuming \u03b2<sub>1<\/sub> = \u03b2<sub>2<\/sub> in the above plot so that we can use x<sub>1<\/sub> and x<sub>2<\/sub> collectively as x<sub>1<\/sub> + x<sub>2<\/sub>. The data was also prepared to support this assumption; despite this, the real \u03b2<sub>1<\/sub> and \u03b2<sub>2<\/sub> will take different values if you solve for x<sub>1<\/sub> and x<sub>2<\/sub> independently.<\/p>\n<h2><span style=\"color: #3366ff;\">Getting to the bottom<\/span><\/h2>\n<p>Now, you want to solve the logit equation by minimizing the loss function over \u03b2<sub>1<\/sub> and \u03b2<sub>0<\/sub>. To identify the gradient or slope of the function at each point, we need the derivatives of the loss function with respect to \u03b2<sub>1<\/sub> and \u03b2<sub>0<\/sub>. 
Since we assumed \u03b2<sub>1<\/sub> = \u03b2<sub>2<\/sub>, we are essentially working with one variable (x<sub>1<\/sub> + x<sub>2<\/sub>) instead of two (x<sub>1<\/sub> &amp; x<sub>2<\/sub>).<\/p>\n<p>It turns out that the derivatives are simply (I have shown the derivation at the bottom of this article after the Sign-Off Note &#8211; and trust me, these results are not that difficult to derive, so do check them out):<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B1%7D%7D%3D%28%5Cltimes-y%29%5Ccdot+%28x_%7B1%7D%2Bx_%7B2%7D%29+%C2%A0+&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{1}}=(&#92;ltimes-y)&#92;cdot (x_{1}+x_{2}) \u00a0 \" class=\"latex\" \/><\/pre>\n<p>and,<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B0%7D%7D%3D%28%5Cltimes-y%29++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{0}}=(&#92;ltimes-y)  \" class=\"latex\" \/><\/pre>\n<p>By using these gradients you can now send Captain Kirk to find the bottom of this loss function bowl. Also, while running the gradient descent code, it is prudent to normalize the x values in the data. This will reduce Captain Kirk&#8217;s walking time, i.e. make the algorithm converge faster. 
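Putting the pieces together, the whole algorithm is a short loop: compute the gradients above across the data, step each \u03b2 against its gradient, repeat. The sketch below is my Python rendering of the idea (the article's companion code is in R); the customer data and learning rate are made up for illustration.

```python
import math

def train(xs, ys, lr=0.5, epochs=5000):
    """Batch gradient descent for logistic regression on one combined feature x = x1 + x2."""
    mean = sum(xs) / len(xs)
    sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    xs = [(x - mean) / sd for x in xs]        # normalize: shortens Captain Kirk's walk
    b0 = b1 = 0.0
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # current P(y=1)
            g0 += p - y                       # d(loss)/d(b0), summed over the data
            g1 += (p - y) * x                 # d(loss)/d(b1), summed over the data
        b0 -= lr * g0 / len(xs)               # walk against the gradient
        b1 -= lr * g1 / len(xs)
    return b0, b1

# Hypothetical customers: combined x1 + x2 score, and whether they bought the perfume
xs = [20, 25, 32, 41, 47, 55, 61, 70]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = train(xs, ys)
print(round(b0, 3), round(b1, 3))
```

On this toy data the fitted \u03b21 comes out positive, i.e. the buying probability rises with the combined score, which is the qualitative behavior the article's own fit shows.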
It turned out that the loss function bowl has the bottom at\u00a0\u03b2<sub>0<\/sub> =\u00a0-0.0315 and\u00a0\u03b2<sub>1<\/sub> = 2.073 for normalized x1+x2 values.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg\"><img data-attachment-id=\"10527\" data-permalink=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/logistic-regression-loss-function-2d-contour\/\" data-orig-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?fit=1900%2C1000&amp;ssl=1\" data-orig-size=\"1900,1000\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Logistic Regression &#8211; Loss Function &#8211; 2D Contour\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?fit=300%2C158&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?fit=640%2C337&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-10527 aligncenter\" src=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?resize=640%2C337\" alt=\"\" width=\"640\" height=\"337\" 
srcset=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?w=1900&amp;ssl=1 1900w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?resize=250%2C132&amp;ssl=1 250w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?resize=300%2C158&amp;ssl=1 300w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?resize=768%2C404&amp;ssl=1 768w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?resize=1024%2C539&amp;ssl=1 1024w, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Logistic-Regression-Loss-Function-2D-Contour.jpg?w=1280 1280w\" sizes=\"(max-width: 640px) 100vw, 640px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>The \u03b2 values for the normalized x<sub>1<\/sub>+x<sub>2<\/sub> can easily be converted to the scale of the actual x<sub>1<\/sub> and x<sub>2<\/sub> values; in that case the non-normalized \u03b2<sub>0<\/sub> = -15.4438 and \u03b2<sub>1<\/sub> = 0.1095. I will leave it up to you to figure out the logic of denormalizing the \u03b2 values.<\/p>\n<p>If you run the gradient descent without assuming \u03b2<sub>1<\/sub> = \u03b2<sub>2<\/sub>, then \u03b2<sub>0<\/sub> = -15.4233, \u03b2<sub>1<\/sub> = 0.1090, and \u03b2<sub>2<\/sub> = 0.1097. 
I suggest you try all these solutions using this code:\u00a0<strong><a href=\"http:\/\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Gradient-Descent-Logistic-Regression-R-Code.txt\">Gradient Descent &#8211; Logistic Regression (R Code)<\/a>.<\/strong><\/p>\n<h4><span style=\"color: #3366ff;\">Sign-Off Note<\/span><\/h4>\n<p>The original script of the Star Trek series was written without any mention of the Transporter, the teleporting device. It was later realized that the show&#8217;s modest budget would not allow filming of the starship landing on unknown planets. The writers then devised a smaller landing vessel, but that too was beyond the show&#8217;s budget. Finally, a low-cost teleportation device &#8211; the Transporter &#8211; was conceived. Thank heavens for that low-budget creativity, because of which we have the Transporter, a fascinating device!<\/p>\n<h4><span style=\"color: #0000ff;\">Derivation of Gradients for Gradient Descent<\/span><\/h4>\n<p>You just need to know the following four basic derivatives to derive the gradients of the loss function, i.e.<\/p>\n<p>1: The derivative of x with respect to itself is<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial+x%7Dx%3D1++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial}{&#92;partial x}x=1  \" class=\"latex\" \/><\/pre>\n<p>2: The derivative of x<sup>2<\/sup> with respect to x is<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial+x%7Dx%5E%7B2%7D%3D2+%5Ctimes+x++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial}{&#92;partial x}x^{2}=2 &#92;times x  \" class=\"latex\" \/><\/pre>\n<p>3: The derivative of e<sup>x<\/sup> with respect to x is<\/p>\n<pre><img decoding=\"async\" 
src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial+x%7De%5E%7Bx%7D%3De%5E%7Bx%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial}{&#92;partial x}e^{x}=e^{x}  \" class=\"latex\" \/><\/pre>\n<p>4: And finally, the derivative of ln(x) is<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial+x%7Dln%28x%29%3D%5Cfrac%7B1%7D%7Bx%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial}{&#92;partial x}ln(x)=&#92;frac{1}{x}  \" class=\"latex\" \/><\/pre>\n<p>The loss function is defined as<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cell+f%3D-%281-y%29%5Ccdot+ln%281-%5Cltimes%29-y%5Ccdot+ln%28%5Cltimes%29++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;ell f=-(1-y)&#92;cdot ln(1-&#92;ltimes)-y&#92;cdot ln(&#92;ltimes)  \" class=\"latex\" \/><\/pre>\n<p>The logit function, which feeds into the loss function, is<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cltimes%3DP%28y%3D1%29%3D%5Cfrac%7B1%7D%7B1%2Be%5E%7B-z%7D%7D%C2%A0+&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;ltimes=P(y=1)=&#92;frac{1}{1+e^{-z}}\u00a0 \" class=\"latex\" \/><\/pre>\n<p>And finally, z is a linear function of the x values with \u03b2 coefficients, which feeds into the logit function:<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=z%3D%5Cbeta_%7B0%7D+%2B+%5Cbeta_%7B1%7D+x_%7B1%7D+%2B%5Cbeta_%7B2%7D+x_%7B2%7D++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"z=&#92;beta_{0} + &#92;beta_{1} x_{1} +&#92;beta_{2} x_{2}  \" class=\"latex\" \/><\/pre>\n<p>Now, we want to find the derivative or slope of the loss function with respect to the \u03b2 coefficients, i.e.<\/p>\n<pre><img decoding=\"async\" 
src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta%7D++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta}  \" class=\"latex\" \/><\/pre>\n<p>Using the chain rule, this derivative can be expanded into the individual components of the loss function ( <img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cell+f+&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;ell f \" class=\"latex\" \/> ), the logit, and z:<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B1%7D%7D+%3D%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cltimes+%7D%5Ccdot%5Cfrac%7B%5Cpartial+%5Cltimes+%7D%7B%5Cpartial+z%7D%5Ccdot%5Cfrac%7B%5Cpartial+z%7D%7B%5Cpartial+%5Cbeta_%7B1%7D%7D++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{1}} =&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;ltimes }&#92;cdot&#92;frac{&#92;partial &#92;ltimes }{&#92;partial z}&#92;cdot&#92;frac{&#92;partial z}{&#92;partial &#92;beta_{1}}  \" class=\"latex\" \/><\/pre>\n<p>Let&#8217;s calculate the individual components of this formula. 
The first component is<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cltimes%7D+%3D+%5Cfrac%7B1-y%7D%7B1-%5Cltimes+%7D-%5Cfrac%7By%7D%7B%5Cltimes%7D++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;ltimes} = &#92;frac{1-y}{1-&#92;ltimes }-&#92;frac{y}{&#92;ltimes}  \" class=\"latex\" \/><\/pre>\n<p>The second component:<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cltimes+%7D%7B%5Cpartial+z%7D%3D%5Cltimes%5Ccdot+%281-%5Cltimes%29++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ltimes }{&#92;partial z}=&#92;ltimes&#92;cdot (1-&#92;ltimes)  \" class=\"latex\" \/><\/pre>\n<p>Since,<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cltimes+%7D%7B%5Cpartial+z%7D%3D+%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial+z%7D%28%7B1%2Be%5E%7B-z%7D%7D%29%5E%7B-1%7D%3De%5E%7B-z%7D%5Ccdot+%28%7B1%2Be%5E%7B-z%7D%7D%29%5E%7B-2%7D%3D%5Cltimes%5Ccdot+%281-%5Cltimes%29++&#038;bg=ffffff&#038;fg=000&#038;s=2&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ltimes }{&#92;partial z}= &#92;frac{&#92;partial}{&#92;partial z}({1+e^{-z}})^{-1}=e^{-z}&#92;cdot ({1+e^{-z}})^{-2}=&#92;ltimes&#92;cdot (1-&#92;ltimes)  \" class=\"latex\" \/><\/pre>\n<p>Finally,\u00a0the third component<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+z%7D%7B%5Cpartial+%5Cbeta_%7B1%7D%7D%3Dx_%7B1%7D+%3B%5C+%5Cfrac%7B%5Cpartial+z%7D%7B%5Cpartial+%5Cbeta_%7B2%7D%7D%3Dx_%7B2%7D%3B%5C+%5Cfrac%7B%5Cpartial+z%7D%7B%5Cpartial+%5Cbeta_%7B0%7D%7D%3D1++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial z}{&#92;partial &#92;beta_{1}}=x_{1} ;&#92; &#92;frac{&#92;partial z}{&#92;partial &#92;beta_{2}}=x_{2};&#92; &#92;frac{&#92;partial z}{&#92;partial 
&#92;beta_{0}}=1  \" class=\"latex\" \/><\/pre>\n<p>Hence, the product of three components will provide us with the derivative of the loss function with respect to the beta coefficients<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B1%7D%7D%3D+%28%5Cfrac%7B1-y%7D%7B1-%5Cltimes+%7D-%5Cfrac%7By%7D%7B%5Cltimes%7D%29+%5Ccdot%C2%A0%28%5Cltimes%5Ccdot+%281-%5Cltimes%29%29+%5Ccdot+x_%7B1%7D%3D%28%5Cltimes-y%29%5Ccdot+x_%7B1%7D+%C2%A0+&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{1}}= (&#92;frac{1-y}{1-&#92;ltimes }-&#92;frac{y}{&#92;ltimes}) &#92;cdot\u00a0(&#92;ltimes&#92;cdot (1-&#92;ltimes)) &#92;cdot x_{1}=(&#92;ltimes-y)&#92;cdot x_{1} \u00a0 \" class=\"latex\" \/><\/pre>\n<p>Similarly,<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B2%7D%7D%3D%28%5Cltimes-y%29%5Ccdot+x_%7B2%7D+%C2%A0+&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{2}}=(&#92;ltimes-y)&#92;cdot x_{2} \u00a0 \" class=\"latex\" \/><\/pre>\n<p>and,<\/p>\n<pre><img decoding=\"async\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cfrac%7B%5Cpartial+%5Cell+f%7D%7B%5Cpartial+%5Cbeta_%7B0%7D%7D%3D%28%5Cltimes-y%29++&#038;bg=ffffff&#038;fg=000&#038;s=3&#038;c=20201002\" alt=\"&#92;frac{&#92;partial &#92;ell f}{&#92;partial &#92;beta_{0}}=(&#92;ltimes-y)  \" class=\"latex\" \/><\/pre>\n<p>That&#8217;s it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you want to gain a sound understanding of machine learning then you must know gradient descent optimization.\u00a0In this article, you will get a detailed and intuitive understanding of gradient descent to solve machine learning algorithms. The entire tutorial uses images and visuals to make things easy to grasp. 
Here, we will use an example<\/p>\n<p><a class=\"excerpt-more blog-excerpt\" href=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/\">Read More&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":10480,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[87,84],"tags":[],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v17.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU CANalytics |<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU CANalytics |\" \/>\n<meta property=\"og:description\" content=\"If you want to gain a sound understanding of machine learning then you must know gradient descent optimization.\u00a0In this article, you will get a detailed and intuitive understanding of gradient descent to solve machine learning algorithms. The entire tutorial uses images and visuals to make things easy to grasp. 
Here, we will use an exampleRead More...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/\" \/>\n<meta property=\"og:site_name\" content=\"YOU CANalytics |\" \/>\n<meta property=\"article:author\" content=\"roopam\" \/>\n<meta property=\"article:published_time\" content=\"2017-09-27T11:02:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-09-16T12:09:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&#038;ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"997\" \/>\n\t<meta property=\"og:image:height\" content=\"587\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Roopam Upadhyay\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\",\"name\":\"YOU CANalytics\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"sameAs\":[],\"logo\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120\",\"width\":607,\"height\":120,\"caption\":\"YOU CANalytics\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#logo\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/\",\"name\":\"YOU CANalytics |\",\"description\":\"Explore the 
Power of Data Science\",\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1\",\"width\":997,\"height\":587},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage\",\"url\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/\",\"name\":\"Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU CANalytics 
|\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage\"},\"datePublished\":\"2017-09-27T11:02:41+00:00\",\"dateModified\":\"2018-09-16T12:09:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucanalytics.com\/blogs\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Gradient Descent for Logistic Regression Simplified &#8211; Step by Step Visual Guide\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage\"},\"author\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\"},\"headline\":\"Gradient Descent for Logistic Regression Simplified &#8211; Step by Step Visual 
Guide\",\"datePublished\":\"2017-09-27T11:02:41+00:00\",\"dateModified\":\"2018-09-16T12:09:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage\"},\"wordCount\":2393,\"commentCount\":26,\"publisher\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#organization\"},\"image\":{\"@id\":\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1\",\"articleSection\":[\"Gradient Descent\",\"Machine Learning and Artificial Intelligence\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#respond\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6\",\"name\":\"Roopam Upadhyay\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/ucanalytics.com\/blogs\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g\",\"caption\":\"Roopam Upadhyay\"},\"description\":\"This blog contains my personal views and thoughts on predictive Analytics and big data. - Roopam Upadhyay\",\"sameAs\":[\"roopam\"],\"url\":\"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU CANalytics |","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/","og_locale":"en_US","og_type":"article","og_title":"Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU CANalytics |","og_description":"If you want to gain a sound understanding of machine learning then you must know gradient descent optimization.\u00a0In this article, you will get a detailed and intuitive understanding of gradient descent to solve machine learning algorithms. The entire tutorial uses images and visuals to make things easy to grasp. Here, we will use an exampleRead More...","og_url":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/","og_site_name":"YOU CANalytics |","article_author":"roopam","article_published_time":"2017-09-27T11:02:41+00:00","article_modified_time":"2018-09-16T12:09:16+00:00","og_image":[{"width":997,"height":587,"url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1","type":"image\/jpeg"}],"twitter_misc":{"Written by":"Roopam Upadhyay","Est. 
reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/ucanalytics.com\/blogs\/#organization","name":"YOU CANalytics","url":"https:\/\/ucanalytics.com\/blogs\/","sameAs":[],"logo":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#logo","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2015\/11\/YOU-CANalytics-Logo.jpg?fit=607%2C120","width":607,"height":120,"caption":"YOU CANalytics"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/#logo"}},{"@type":"WebSite","@id":"https:\/\/ucanalytics.com\/blogs\/#website","url":"https:\/\/ucanalytics.com\/blogs\/","name":"YOU CANalytics |","description":"Explore the Power of Data Science","publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucanalytics.com\/blogs\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage","inLanguage":"en-US","url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1","contentUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1","width":997,"height":587},{"@type":"WebPage","@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage","url":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/","name":"Gradient Descent for Logistic Regression Simplified - Step by Step Visual Guide &ndash; YOU 
CANalytics |","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage"},"datePublished":"2017-09-27T11:02:41+00:00","dateModified":"2018-09-16T12:09:16+00:00","breadcrumb":{"@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucanalytics.com\/blogs\/"},{"@type":"ListItem","position":2,"name":"Gradient Descent for Logistic Regression Simplified &#8211; Step by Step Visual Guide"}]},{"@type":"Article","@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#article","isPartOf":{"@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage"},"author":{"@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6"},"headline":"Gradient Descent for Logistic Regression Simplified &#8211; Step by Step Visual 
Guide","datePublished":"2017-09-27T11:02:41+00:00","dateModified":"2018-09-16T12:09:16+00:00","mainEntityOfPage":{"@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#webpage"},"wordCount":2393,"commentCount":26,"publisher":{"@id":"https:\/\/ucanalytics.com\/blogs\/#organization"},"image":{"@id":"https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1","articleSection":["Gradient Descent","Machine Learning and Artificial Intelligence"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ucanalytics.com\/blogs\/gradient-descent-logistic-regression-simplified-step-step-visual-guide\/#respond"]}]},{"@type":"Person","@id":"https:\/\/ucanalytics.com\/blogs\/#\/schema\/person\/55961a1cea272ecdf290cb387be069b6","name":"Roopam Upadhyay","image":{"@type":"ImageObject","@id":"https:\/\/ucanalytics.com\/blogs\/#personlogo","inLanguage":"en-US","url":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dd1aa0b0e813f7639800bcfad6a554f1?s=96&d=mm&r=g","caption":"Roopam Upadhyay"},"description":"This blog contains my personal views and thoughts on predictive Analytics and big data. 
- Roopam Upadhyay","sameAs":["roopam"],"url":"https:\/\/ucanalytics.com\/blogs\/author\/roopam\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2017\/09\/Star-Trek-GD-main.jpg?fit=997%2C587&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3L0jT-2IR","jetpack-related-posts":[{"id":7923,"url":"https:\/\/ucanalytics.com\/blogs\/intuitive-machine-learning-gradient-descent-simplified\/","url_meta":{"origin":10469,"position":0},"title":"Intuitive Machine Learning : Gradient Descent Simplified","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"How do machines learn? They learn the same way as humans. Humans learn from experience and so do machines. For machines, the experience is in the form of data. Machines use powerful algorithms to make sense of the data. They identify underlining patterns within the data to learn things about\u2026","rel":"","context":"In &quot;Gradient Descent&quot;","block_context":{"text":"Gradient Descent","link":"https:\/\/ucanalytics.com\/blogs\/category\/gradient-descent\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/03\/Rplot01.png?fit=1000%2C600&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/03\/Rplot01.png?fit=1000%2C600&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/03\/Rplot01.png?fit=1000%2C600&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2016\/03\/Rplot01.png?fit=1000%2C600&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":11290,"url":"https:\/\/ucanalytics.com\/blogs\/math-of-deep-learning-neural-networks-simplified-part-2\/","url_meta":{"origin":10469,"position":1},"title":"Math of Deep Learning Neural Networks &#8211; Simplified (Part 2)","author":"Roopam 
Upadhyay","date":false,"format":false,"excerpt":"Welcome back to this series of articles on deep learning and neural networks. In the last part, you learned how training a\u00a0deep learning network is similar to a plumbing job. This time you will learn the math of deep learning. We will continue to use the plumbing analogy to simplify\u2026","rel":"","context":"In &quot;Deep Learning Neural Networks&quot;","block_context":{"text":"Deep Learning Neural Networks","link":"https:\/\/ucanalytics.com\/blogs\/category\/deep-learning-neural-networks\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/08\/Deep-Learning-Neural-Networks-and-Plumbing-Job.jpg?fit=796%2C597&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/08\/Deep-Learning-Neural-Networks-and-Plumbing-Job.jpg?fit=796%2C597&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/08\/Deep-Learning-Neural-Networks-and-Plumbing-Job.jpg?fit=796%2C597&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/08\/Deep-Learning-Neural-Networks-and-Plumbing-Job.jpg?fit=796%2C597&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":11578,"url":"https:\/\/ucanalytics.com\/blogs\/deep-learning-models-simplified-part-3\/","url_meta":{"origin":10469,"position":2},"title":"Deep Learning Models Simplified (Part 3)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Facebook was a major sensation and a source of great amusement in a British\u00a0country house in the early 20th century. It was such a big hit that it got a special mention in a newspaper published in the year 1902. 
Facebook, then, of course, had a completely different meaning than\u2026","rel":"","context":"In &quot;Deep Learning Neural Networks&quot;","block_context":{"text":"Deep Learning Neural Networks","link":"https:\/\/ucanalytics.com\/blogs\/category\/deep-learning-neural-networks\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/10\/Popeye-Deep-Learning.jpg?fit=960%2C686&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/10\/Popeye-Deep-Learning.jpg?fit=960%2C686&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/10\/Popeye-Deep-Learning.jpg?fit=960%2C686&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2018\/10\/Popeye-Deep-Learning.jpg?fit=960%2C686&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":3973,"url":"https:\/\/ucanalytics.com\/blogs\/model-selection-retail-case-study-example-part-7\/","url_meta":{"origin":10469,"position":3},"title":"Model Selection &#8211; Retail Case Study Example (Part 7)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"Model Selection This is a continuation of our retail case study example for campaign and marketing analytics. In the previous two parts, we discussed a couple of decision tree algorithms (CART and C4.5)\u00a0for classification. 
Recall a previous case study example on\u00a0banking and risk management where we discussed logistic regression\u00a0which is\u2026","rel":"","context":"In &quot;Marketing Analytics&quot;","block_context":{"text":"Marketing Analytics","link":"https:\/\/ucanalytics.com\/blogs\/category\/marketing-analytics\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2014\/09\/photo.jpg?fit=1200%2C1029&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":756,"url":"https:\/\/ucanalytics.com\/blogs\/case-study-example-banking-logistic-regression-3\/","url_meta":{"origin":10469,"position":4},"title":"Logistic Regression &#8211; Banking Case Study Example (Part 3)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"The Beautiful Formula Mathematicians often conduct competitions for the most beautiful formulae of all. The first position, almost every time, goes to the formula discovered by Leonhard Euler. Displayed below is the formula. 
$latex e^{i\\pi }+1=0 &s=2$ This formula is phenomenal because it is a combination of the five most\u2026","rel":"","context":"In &quot;Banking Risk Case Study Example&quot;","block_context":{"text":"Banking Risk Case Study Example","link":"https:\/\/ucanalytics.com\/blogs\/category\/risk-analytics\/banking-risk-case-study-example\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/10\/The-Beautiful-Equation-Copy.jpg?fit=1200%2C879&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/10\/The-Beautiful-Equation-Copy.jpg?fit=1200%2C879&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/10\/The-Beautiful-Equation-Copy.jpg?fit=1200%2C879&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/10\/The-Beautiful-Equation-Copy.jpg?fit=1200%2C879&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/10\/The-Beautiful-Equation-Copy.jpg?fit=1200%2C879&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":110,"url":"https:\/\/ucanalytics.com\/blogs\/credit-scorecards-logistic-regression-part-5\/","url_meta":{"origin":10469,"position":5},"title":"Credit Scorecards &#8211; Logistic Regression (part 5 of 7)","author":"Roopam Upadhyay","date":false,"format":false,"excerpt":"A Primer on Logistic Regression - Are you Happy? A few years ago, my wife and I took a couple of weeks\u2019 vacation to England and Scotland. Just before boarding the British Airway\u2019s plane, an air-hostess informed us that we were upgraded to business class. Jolly good! 
What a wonderful\u2026","rel":"","context":"In &quot;Credit Risk Analytics Series&quot;","block_context":{"text":"Credit Risk Analytics Series","link":"https:\/\/ucanalytics.com\/blogs\/category\/risk-analytics\/credit-risk-analytics-series\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/07\/5-Logit-Happy.jpg?fit=614%2C386&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/07\/5-Logit-Happy.jpg?fit=614%2C386&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/ucanalytics.com\/blogs\/wp-content\/uploads\/2013\/07\/5-Logit-Happy.jpg?fit=614%2C386&ssl=1&resize=525%2C300 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/10469"}],"collection":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/comments?post=10469"}],"version-history":[{"count":0,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/posts\/10469\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media\/10480"}],"wp:attachment":[{"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/media?parent=10469"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/categories?post=10469"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucanalytics.com\/blogs\/wp-json\/wp\/v2\/tags?post=10469"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}