YOU CANalytics | Information Value (IV) & Weight of Evidence (WOE)

Information Value (IV) and Weight of Evidence (WOE) – A Case Study from Banking (Part 4)

· Roopam Upadhyay 77 Comments

This is a continuation of our banking case study on scorecard development. In this part, we will discuss information value (IV) and weight of evidence (WOE). These concepts are useful for variable selection while developing credit scorecards. We will also learn how to use WOE in logistic regression modeling. The previous three parts are available at the following links: (Part 1), (Part 2) & (Part 3).

Experts in Expensive Suits

Mr. Expert – by Roopam

A couple of weeks ago I was watching this show called ‘Brain Games’ on the National Geographic Channel. In one of the segments, they had a comedian dressed up as a television news reporter. He had a whole television camera crew along with him. He was informing the people coming out of a mall in California that Texas has decided to form an independent country, not part of the United States. Additionally, while on camera he was asking for their opinion on the matter. After the initial amusement, people took him seriously and started giving their serious viewpoints. This is the phenomenon psychologists describe as ‘expert fallacy’ or obeying authority, no matter how irrational the authorities seem. Later after learning the truth, the people on this show agreed that they believed this comedian because he was in an expensive suit with a TV crew.

Nate Silver, in his book The Signal and the Noise, described a similar phenomenon. He analyzed the forecasts made by the panel of experts on the TV program The McLaughlin Group. The forecasts turned out to be true in only 50% of cases; you could have forecasted just as well by tossing a coin. We do take experts in expensive suits seriously, don’t we? These are not one-off examples. Men in suits or uniforms come in all different forms – from army generals to security personnel in malls. We take them all very seriously.

We have just discovered that rather than accept an expert’s opinion, it would be better to look at the value of the information and make decisions oneself. Let us continue with the theme and explore how to assign a value to information using information value and weight of evidence. Then we will create a simple logistic regression model using WOE (weight of evidence). However, before that let us recap the case study we are working on.

Case Study Continues ..

This is a continuation of our case study on CyndiCat bank. The bank had disbursed 60816 auto loans, with a bad rate of around 2.5%, in the quarter between April–June 2012. We did some exploratory data analysis (EDA) using data visualization tools in the first two parts (Part 1) & (Part 2). In the previous article, we developed a simple logistic regression model with just age as the variable (Part 3). This time, we will continue from where we left off in the previous article and use weight of evidence (WOE) for age to develop a new model. Additionally, we will also explore the predictive power of the variable (age) through information value.

Information Value (IV) and Weight of Evidence (WOE)

Information value is a very useful concept for variable selection during model building. The roots of information value, I think, are in information theory proposed by Claude Shannon. The reason for my belief is the similarity information value has with the widely used concept of entropy in information theory. Chi Square value, an extensively used measure in statistics, is a good replacement for IV (information value). However, IV is a popular and widely used measure in the industry. The reason for this is some very convenient rules of thumb for variable selection associated with IV – these are really handy, as you will discover later in this article. The formula for information value is shown below, where the sum runs over the groups i of the variable.

$$\mathrm{IV} = \sum_{i} \left(\text{Distribution Good}_i - \text{Distribution Bad}_i\right) \times \ln\left(\frac{\text{Distribution Good}_i}{\text{Distribution Bad}_i}\right)$$

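What distribution good and distribution bad mean will become clear when we calculate IV for our case study. This is probably an opportune moment to define Weight of Evidence (WOE), which is the log component in information value.

$$\mathrm{WOE}_i = \ln\left(\frac{\text{Distribution Good}_i}{\text{Distribution Bad}_i}\right)$$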

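Hence, IV can further be written as follows.

$$\mathrm{IV} = \sum_{i} \left(\text{Distribution Good}_i - \text{Distribution Bad}_i\right) \times \mathrm{WOE}_i$$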

If you examine both information value and weight of evidence carefully, you will notice that both break down when either the distribution good or the distribution bad of a group goes to zero. A mathematician will hate it. The assumption, a fair one, is that this will never happen during scorecard development because of the reasonable sample size. A word of caution: if you are developing non-standardized scorecards with a smaller sample size, use IV carefully.
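The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation from the attached Excel: the function name `woe_iv` and the two-bin counts are hypothetical, and raising an error for an empty cell is just one way to make the zero-count breakdown explicit.

```python
import math

def woe_iv(goods, bads):
    """Return (list of per-bin WOE values, total IV) from per-bin counts.

    goods[i] and bads[i] are the numbers of good and bad loans in bin i.
    """
    total_good = sum(goods)
    total_bad = sum(bads)
    woes = []
    iv = 0.0
    for g, b in zip(goods, bads):
        if g == 0 or b == 0:
            # ln(DG/DB) is undefined (or infinite) when a bin has no goods or no bads.
            raise ValueError("WOE is undefined for a bin with zero goods or zero bads")
        dist_good = g / total_good   # distribution good for this bin
        dist_bad = b / total_bad     # distribution bad for this bin
        woe = math.log(dist_good / dist_bad)
        woes.append(woe)
        iv += (dist_good - dist_bad) * woe
    return woes, iv

# Hypothetical counts for two bins (not taken from the case study).
woes, iv = woe_iv(goods=[900, 100], bads=[50, 50])
print(woes, iv)
```

Every term (DG - DB) * WOE is non-negative, because the difference and the log always share the same sign, so IV can only grow as bins separate goods from bads more sharply.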

Back to the Case Study

In the previous article, we have created coarse classes for the variable age in our case study. Now, let us calculate both information value and weight of evidence for these coarse classes.

Let us examine this table. Here, distribution of loans is the ratio of loans for a coarse class to total loans. For the group 21-30, this is 4821/60801 = 0.079. Similarly, distribution bad (DB) = 206/1522 = 0.135 and distribution good (DG) = 4615/59279 = 0.078. Additionally, DG-DB = 0.078 – 0.135 = -0.057. Further, WOE = ln(0.078/0.135) = -0.553.

Download the attached Excel to understand this calculation : Information Value (IV) and Weight of Evidence (WOE)

Finally, the IV component for this group is (-0.057)*(-0.553) = 0.0318. Similarly, calculate the IV components for all the other coarse classes. Adding these components produces an IV of 0.1093 (last column of the table). Now the question is how to interpret this value of IV. The answer is the rule of thumb described below.
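The arithmetic for the 21-30 group can be checked with a short Python sketch. It uses only the counts quoted in the text; the variable names are illustrative, and small differences in the last digit come from rounding the intermediate values.

```python
import math

# Counts quoted in the text for the 21-30 age group and for the whole portfolio.
good_21_30, bad_21_30 = 4615, 206
total_good, total_bad = 59279, 1522

dist_good = good_21_30 / total_good          # distribution good (DG)
dist_bad = bad_21_30 / total_bad             # distribution bad (DB)
woe = math.log(dist_good / dist_bad)         # weight of evidence for the group
iv_component = (dist_good - dist_bad) * woe  # this group's contribution to IV

print(f"DG={dist_good:.3f} DB={dist_bad:.3f} WOE={woe:.3f} IV component={iv_component:.4f}")
```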

Information Value	Predictive Power
< 0.02	Useless for prediction
0.02 to 0.1	Weak predictor
0.1 to 0.3	Medium predictor
0.3 to 0.5	Strong predictor
> 0.5	Suspicious or too good to be true

Typically, variables with medium and strong predictive power are selected for model development. However, some schools of thought advocate using just the variables with medium IVs for broad-based model development. Notice that the information value for age is 0.1093, so it barely falls in the medium predictor range.
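The rule of thumb is easy to encode as a lookup. The sketch below is illustrative only: the function name is hypothetical, and because the table's ranges share their end points (0.1 appears in two rows), it assumes each lower bound is inclusive.

```python
def iv_predictive_power(iv):
    """Map an information value to the rule-of-thumb label from the table.

    Assumption: shared boundaries (0.02, 0.1, 0.3) belong to the higher band,
    and exactly 0.5 still counts as a strong predictor.
    """
    if iv < 0.02:
        return "Useless for prediction"
    if iv < 0.1:
        return "Weak predictor"
    if iv < 0.3:
        return "Medium predictor"
    if iv <= 0.5:
        return "Strong predictor"
    return "Suspicious or too good to be true"

# IV of age from the case study.
print(iv_predictive_power(0.1093))
```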

Logistic Regression with Weight of Evidence (WOE)

Finally, let us create a logistic regression model with the weight of evidence of the coarse classes as the value of the independent variable age. The following results were generated by statistical software.

Logistic Regression Results (Age Groups and Bad Rates)
Predictor	Coefficient	Std. Error	Z	P	Odds Ratio
Constant	-3.66223	0.0263162	-139.16	0
WOE_Age	-1	0.0796900	-12.55	0	0.37
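A short derivation may help explain why a single coefficient of -1 covers all the coarse classes. It assumes, as in this case study, that WOE_Age is the only predictor and that the model is fitted on the same data used to compute the WOE values. The symbols are introduced here for illustration: g_i and b_i are the goods and bads in class i, and G and B are the total goods and bads.

```latex
\begin{align*}
\ln\frac{b_i}{g_i}
  &= \ln\frac{B}{G} + \ln\frac{b_i / B}{g_i / G} \\
  &= \ln\frac{B}{G} - \ln\frac{g_i / G}{b_i / B} \\
  &= \ln\frac{B}{G} - \mathrm{WOE}_i
\end{align*}
```

So an intercept of ln(B/G) and a coefficient of -1 on WOE reproduce the observed log odds of bad for every class. With the totals quoted earlier, ln(1522/59279) ≈ -3.662, which agrees with the constant in the table.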

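Let us now estimate the bad rate for the age group 21-30 using the above information. The WOE for this group is -0.553, so

$$P(\text{bad}) = \frac{1}{1 + e^{-\left(-3.66223 + (-1)(-0.553)\right)}} = \frac{1}{1 + e^{3.109}} \approx 0.0427$$

This is precisely the value we obtained last time (see the previous part) and is consistent with the bad rate for the group (206/4821 = 0.0427).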
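The same estimate can be reproduced in Python as a quick check. This sketch takes the constant and coefficient from the results table and the WOE and counts for the 21-30 group from the earlier calculation; the variable names are illustrative.

```python
import math

intercept = -3.66223    # Constant from the regression results
coef_woe_age = -1.0     # Coefficient of WOE_Age
woe_21_30 = -0.553      # WOE of the 21-30 group computed earlier

# Log odds of a bad loan for the group, then the logistic transform.
log_odds = intercept + coef_woe_age * woe_21_30
p_bad = 1.0 / (1.0 + math.exp(-log_odds))

# Observed bad rate for the same group: bad loans / total loans.
observed_bad_rate = 206 / 4821

print(f"predicted={p_bad:.4f} observed={observed_bad_rate:.4f}")
```

The two numbers agree up to the rounding of the WOE value, which is what the derivation of log odds from WOE would lead one to expect for a single-variable model.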

Sign-off note

I wish there were an instrument similar to information value to estimate the value of information coming from so-called experts. Until then, the next time an expert on a business channel advises you to buy a certain stock, take that advice with a pinch of salt.

Read the remaining part of credit scoring series

Part 1: Data visualization for scoring

Part 2: Creating ratio variables for better scoring

Part 3: Logistic regression

Part 5: Reject inference

Part 6: Population stability index for scorecard monitoring

References
1. Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring – Naeem Siddiqi
2. Credit Scoring for Risk Managers: The Handbook for Lenders – Elizabeth Mays and Niall Lynas


77 thoughts on “Information Value (IV) and Weight of Evidence (WOE) – A Case Study from Banking (Part 4)”

Shivi Bhatia says:

April 4, 2014 at 9:54 pm

very nice explanation, terrific. not much literature is available over the web on IV and WOE. Also the examples selected help to understand the concepts easily

Reply
Nhlanhla Tambo says:

April 11, 2014 at 2:46 pm

How would you perform logistic regression with WOE in MS Excel and what would you use as your dependent variable?

Reply
- Roopam Upadhyay says:
  
  April 11, 2014 at 3:08 pm
  
  In my opinion, Excel is not the best platform to perform logistic regression. However, there are Excel plugins that you could use for the purpose, such as XLMiner etc. You could be a little adventurous and use inbuilt Excel-Solver to optimize logistic regression equation – though this will be a little too ambitious for a beginner. Learning logistic regression on Excel also won’t help you much for most applied large data projects because of the constraint of number of rows in Excel. I would recommend you use R or SAS or SPSS.
  
  Reply
  - Saket says:
    
    September 10, 2019 at 5:43 pm
    
    Excellent explanation.
    Could you please elaborate on the below also –
    1) How does WOE ensure Linear Transformation with example ?
    2) How does that particular formula for IV decide whether a variable is important or not? (Basically why that formula? why log in that ?)
    
    Reply
Nhlanhla Tambo says:

April 11, 2014 at 5:52 pm

Thank you for your response.

I have tried to use R with the Riv package, but as I am not sure what the independent variable will be for grouped data, I haven’t managed to make this work. I do not have access to SAS or SPSS as they are unaffordable; otherwise I would have managed. Do you have any articles that you have written or guidelines on the actual regression for grouped data in R?

Thanks

Nhlanhla

Reply
- Roopam Upadhyay says:
  
  April 13, 2014 at 1:58 pm
  
  SAS Enterprise Miner offers interactive-grouping and interactive-binning of independent variables to create weight-of-evidence. However, for R I am not sure if there is a similar package. In this case you will have to create groups or bins through the traditional way of eyeballing the normalized histogram (check out a previous article link). You could also use one variable decision trees to get clues about actual size of the bins. SAS eminer also uses the same logic at the back-end. It’s a little time consuming in R but you will learn more by doing it the hard way.
  
  Reply
Himanshu says:

May 26, 2014 at 2:25 pm

Hi Roopam – Suppose we made the same logistic model with two numeric variables, AGE and CREDIT_SCORE, and I have no missing values for either variable. Then wouldn’t the IV and WoE for AGE and CREDIT_SCORE remain the same?

Reply
Samy says:

August 3, 2014 at 7:21 pm

HI Roopam
Would you please inform how to calculate studentized S(C) value of W+ and W-
or studentized S(C) from Dg and DB in your explanation here

some explanations say it is the standard deviation of C or (DG-DB)

Thanks so much

Reply
pankaj says:

August 20, 2014 at 10:35 am

Hi, aptly explained.. though an article on ‘use of WOE for binning’ would be a good extension of this article…if it’s there already, please share the link…thanks!

Reply
sanjay shukla says:

September 3, 2014 at 12:20 pm

Hi – Thank you very much for putting these things together, your blog is very helpful. I have a question about using WOE in building a logistic regression model. While fitting the model it seems OK to use WOE as a predictor, as we have values of the dependent variable (good, bad). But how will we score the model when we do not have any information about the dependent variable? How are we going to calculate WOE? Are we going to use the same WOE in scoring that was calculated while fitting the model?

Reply
- Roopam Upadhyay says:
  
  September 20, 2014 at 2:26 pm
  
  Hi Sanjay, Once you have created your logit model using WOE, the same WOEs are used for prediction as well.
  
  Reply
kumar says:

September 21, 2014 at 3:10 pm

Hi,

Very useful blog indeed. How to use WOE for binning…please explain with example

Thanks,

Reply
- Roopam Upadhyay says:
  
  September 22, 2014 at 10:41 am
  
  Hi, WOE is not used for binning; WOE is a numeric representation of bins (i.e. log of odds of each bin). Binning is carried out either through visual analysis of data as described in the previous part of this case study by creation of fine & coarse classes, or by using automated algorithms (like the binning algorithm in SAS E-miner).
  
  Reply
Danny says:

September 24, 2014 at 2:08 am

Hi Roopam – Thanks for the website. It’s very informative. I am new to this…can you explain what’s the value in incorporating WOE into a logistic regression model, vs. just leaving it as is like in your Case Study Part 3? Especially if both methods provide the same P(bad loan)?

Thanks!!

Reply
- Roopam Upadhyay says:
  
  September 26, 2014 at 8:25 am
  
  Through WOE you convert discrete groups to a continuous variable. There are many benefits to this. For instance, you could identify and remove multicollinearity much more easily with continuous variables.
  
  Reply
  - Danny says:
    
    September 29, 2014 at 8:39 am
    
    Got it! Thank you!!
    
    Reply
  - Robson Tavares Nonato says:
    
    August 8, 2015 at 2:39 am
    
    Roopam…We have a paradoxical scenario inside your answer. You wrote: “WOE convert discrete groups to a continuous variable…” … but we already have a continuous variable (Age)… Furthermore, the original variable has more cardinality… and more cardinality means (on a probability basis) lower odds of multicollinearity. So… just converting to continuous is perhaps not the best answer. I don’t have the right answer either. Perhaps someone reading this may help us. Can anybody do it?
    
    Reply
    - Bob Stato says:
      
      October 8, 2018 at 11:25 pm
      
      You should use a continuous transformation of the age variable – use it raw or find a transformation, e.g. logarithm, until the relationship is linear with the log odds of the event. This allows the use of the age variable without loss of information from discretisation, that occurs due to categorisation (binning/bucketing/grouping) of the variable.
      
      Reply
    - Roopam Upadhyay says:
      
      October 9, 2018 at 12:58 pm
      
      I know it could be a bit confusing that we transformed a continuous variable (age) to discrete bins and then further transformed the discrete bins to WoE i.e. a continuous variable. This, however, will help our modeling process big time. Firstly, it will take care of the non-linear relationship between the original continuous variable and the dependent variable (bad rate). Notably, logistic regression doesn’t work well for non-linear relationships between independent and dependent variables. Secondly, it will produce a scorecard format which is preferred by business users since it is easy to interpret and implement. And finally, by using WoE you reduce the dimensionality of the model since several dummy variables (one-hot encoded variables) are now reduced to far fewer continuous variables. The dummy variables, as you would appreciate, produce a patchy model because it is possible that not all bins of a variable turn out to be significant.
      
      Reply
      - Rohit Narain says:
        
        August 10, 2019 at 3:02 pm
        
        Hello Roopam ,
        
        Do the WOE and IV methods work on small data sets? Suppose I have a data set of 1800 rows and I have to predict customer behavior. Will it work properly or not? Or what other options do we have if we want to predict whether customers are good or bad?
Srikanth T says:

September 25, 2014 at 10:49 am

Hi Roopam,

How do I calculate points for a categorical variable in a scorecard? That is, how do I combine the WOE of the attributes and the coefficients from logistic regression, and what do I do about the reference category of the categorical variable? Any pointers in this direction would be helpful.

Thanks
Srikanth

Reply
- Roopam Upadhyay says:
  
  September 26, 2014 at 8:16 am
  
  For calculating score points one transforms logistic equation, with WOE, to score point scales. It is a fairly straight forward process. I think, all the books referenced in the linked article will have a dedicated section on scoring.
  
  Reply
  - Srikanth T says:
    
    September 28, 2014 at 10:24 pm
    
    Hi Roopam,
    
    Is it required that WOE vary monotonically with respect to attributes for a continuous variable?
    
    Thanks
    Srikanth
    
    Reply
    - Srikanth T says:
      
      September 28, 2014 at 10:32 pm
      
      And how do we deal with attributes with zero bad cases.
      
      Reply
      - Roopam Upadhyay says:
        
        September 29, 2014 at 11:31 am
        
        As mentioned in the article, this is an unlikely case for a large dataset. Both, information value and weight of evidence cannot tolerate attributes with zero bad cases. You will have to modify your attributes in such an event.
    - Roopam Upadhyay says:
      
      September 29, 2014 at 11:28 am
      
      Hi Srikanth,
      
      I am not sure about your definition of ‘vary monotonically’. If you mean that they are desired to have a specific trend the answer is yes. This makes sense since you don’t want some random variation (zigzag movement of WOE) to be part of your scoring model. Hope this helped.
      
      Reply
      - Srikanth T says:
        
        September 29, 2014 at 3:41 pm
        
        So let’s say WOE has a V shape with respect to the attributes of a continuous variable. Is it legal to still use this WOE as an input to the logistic regression? Thanks
        Srikanth
Manisha says:

October 3, 2014 at 1:38 am

What do you do if 1 or more of the decile groups of the variable under study have zero percentage good or bad?

Should I reduce the groups from being a decile to lesser or make the IV of the decile group = 0?

Reply
- Roopam Upadhyay says:
  
  October 3, 2014 at 11:39 am
  
  There is no condition that you need to use only deciles for information value and weight of evidence. In any case, coarse classing reduces the number of bins. So I recommend you join groups with zero bad/good records to adjacent groups and reduce the number of groups. Assigning the IV of the decile group = 0 is completely wrong since this group has infinite, or in other words really large, IV.
  
  Reply
Srikanth T says:

October 8, 2014 at 6:45 pm

Hi Roopam,

1) Can WOE have a V shape with respect to the attributes of a continuous variable and it makes business sense. 2) Is it legal to still use this WOE as an input to the logistic regression?
3) When I include WOE variables in logistic reg model, I see that some of the WOE variables have positive coefficient, which is unexpected because going by the definition of WOE, it is reasonable to assume a negative coefficient for a WOE variable. Is this usual?

Thanks
Srikanth

Reply
- Srikanth T says:
  
  October 19, 2014 at 11:56 am
  
  Hi Roopam, please help me with the above questions. Thanks,
  
  Reply
  - Roopam Upadhyay says:
    
    November 12, 2014 at 2:08 pm
    
    Hi Srikanth,
    
    Sorry for delay in response, was tied up with many things. To answer your first two questions:
    
    Yes WOE can have V or U or inverted U shape. This is a non linear relationship which needs to be handled similar to any other non linear relationship for logistic regression. For instance, age with bad rate could have U shape curve and this is logical.
    
    For your third question you will have to elaborate on the way you are using WOE, variables, model, software package, and logit coefficients for me to explain the results.
    
    Best,
    Roopam
    
    Reply
    - Srikanth T says:
      
      November 12, 2014 at 10:05 pm
      
      I’m using WOE variables (not raw variables) as IVs in logistic reg. Since WOE is another way of expressing log odds, I assume linearity of logit is taken care of. As WOE variables are being added to the model, there are changes in signs from negative to positive for some WOE variables. Don’t know how to interpret this.
      
      I’m using SAS – proc logistic.
      
      Reply
      - Roopam Upadhyay says:
        
        November 13, 2014 at 9:46 am
        
        Check for multicollinearity in your variables, that’s the most likely cause for what you are observing.
      - Srikanth T says:
        
        November 13, 2014 at 2:20 pm
        
        Check multicollinearity in WOE variables (which are input to my logistic reg) or raw variables?
        
        I checked VIFs of WOE variables and found them to be acceptable (<2).
pk says:

December 9, 2014 at 12:03 am

hi. in your example there are 4 groups and thus 4 woes. are you inputting all of them in your logistic equation ?

Reply
- Roopam Upadhyay says:
  
  December 9, 2014 at 7:32 am
  
  The new variable with WOE is a continuous variable, hence you are no longer using discrete groups. This is like developing regression models with any other continuous variable.
  
  Reply
Sid says:

January 9, 2015 at 7:57 pm

Hi Roopam,

I’ve got some output in SAS with my intercept and beta estimates for a variety of WOE-transformed categorical variables.

I cannot determine the resulting attribute points allocation. Is there a formula?

Ultimately i want a score distribution that is defined by PDO -20 points 160 doubles the odds.

Reply
Sucheta Ray says:

September 4, 2015 at 10:36 am

Hi Roopam,

Just wanted to know is there any range of IV? What can be the maximum and minimum value (0?) of IV?

Reply
Sanjay shukla says:

September 4, 2015 at 11:42 pm

I have found that IV and WOE work well with a small number of covariates and fail badly as the number of variables increases. Computation of these stats is very expensive. I tried IV in the retail domain to do variable selection for category models (4000 variables and 500k obs). It took 10 hours just for one model and I had to build around 200 models!!!

Reply
David says:

October 14, 2015 at 3:14 pm

Excellent article, thanks. Would an IV for a single variable of 1.8 be suspicious?

This variable in question is a bureau score.

Reply
- Roopam Upadhyay says:
  
  October 14, 2015 at 10:21 pm
  
  Yes, IV of 1.8 is highly suspicious. Since credit bureau is a 3rd party aggregator’s data – it is possible that information about your own bad rate is embedded in this data.
  
  Let’s assume that everything is fine with this data. Even then, an extremely high IV for a variable will make your model highly unstable, since the entire predictive power is captured in just one variable. Usually you want to avoid such variables in your model.
  
  Reply
Varun Tyagi says:

January 25, 2016 at 6:30 pm

Hi,
Can you please explain how to find woe of dummy variables(0,1) and use it in logistic regression and what to do if monotonicity of groups is not there.

Reply
- Roopam Upadhyay says:
  
  January 26, 2016 at 10:37 am
  
  The process to derive information value (IV) and weight of evidence (WoE) for a binary variable will stay the same as described in this article for multi-nominal groups. About your second question on ‘monotonicity of groups’, the important idea here is to find logical trend between dependent and independent variable.
  
  Reply
Jason says:

February 29, 2016 at 6:36 pm

Hi Roopam,

Very good material! easy to understand and useful techniques! Thank you. I am wondering is it possible to publish the data you used for the case study. So readers can learn your stuff by practice. That will be very helpful. Thanks.

Jason

Reply
Soma B says:

May 2, 2016 at 11:30 pm

Hi, How would you interpret the odds ratio of 0.37 here. Thanks.

Reply
jim says:

August 15, 2016 at 6:23 pm

After converting variable (i.e. var1) into WOE, and using WOE as a predictive variable. How should i interpret the log odds output of var1 from the regression? Will i need to multiply var1 WOE against var1 Log odds of the predictive variable?

Reply
Eyup says:

July 21, 2017 at 8:13 pm

Excellent explanation, thank you very much but

the table says that if IV is bigger than 0.5, it’s misleading. Can you explain why it’s so?

If there’s a satisfying answer, can changing the ranges be a solution.

Reply
- Roopam Upadhyay says:
  
  October 13, 2017 at 9:29 am
  
  High IV corresponds to higher predictive power for just one variable – there are two reasons why you want to be cautious about a high IV for a variable i.e.
  
  1) The high predictive power for the variable could point to a ‘too good to be true’ kind of scenario – one needs to be careful about such relationships and explore the logical reasons behind the high predictive power. Also check whether this relationship will hold true in the future as well.
  
  2) Secondly, even if the trend is logical, the final model will have a very high dependence on just one variable. It is always better to create a broad-based model where many variables share the load of prediction to achieve higher accuracy in the future.
  
  Reply
Scott says:

October 21, 2017 at 1:47 am

Thanks, very informative!

Reply
Panagiotis Ballis-Papanastasiou says:

March 12, 2018 at 6:26 pm

Dear Roopam,

I have the same issue as with part 3. I manage to get the same numbers for the coefficients; however, the z-statistics are nowhere near. Note that I have verified this with various statistical software packages such as MATLAB, EXCEL (I have a logit VBA function in EXCEL) and R. So I am pretty sure it is not a software issue but rather an input issue.

Because the devil lies in the details could you please tell me exactly (or even post as a separate EXCEL file) how my dependent and independent variables look like? i.e. how many observations they have etc.

If I understood correctly, the dependent variable (Y) is bad rates and the independent variable (X) is now WOE for age. So both Y and X have 4 observations. If that is the case, how are the standard errors so small with so few observations?

I hope this email is clear. If I have confused you please let me know and I will try and make my question more precise.

Thank you
Panagiotis Ballis-Papanastasiou

Reply
- Heysem Kaya says:
  
  May 21, 2018 at 4:43 pm
  
  Dear Panagiotis Ballis-Papanastasiou,
  I downloaded and checked the Excel file and I am sure you are mistaken about something very important. The contents of the Excel file are not the raw data but a summary (like histogram counts for the 4 bins w.r.t. age); the actual number of observations is 60801 (see the total number of loans). So what you do is transform the feature age first into a discrete variable (by age grouping) and then each category value into a continuous value using WOE.
  
  Reply
sherko says:

March 13, 2018 at 11:38 pm

Hello,
Please, I need to know the credit scoring algorithm from A to Z: how to start and how to calculate it by hand.
Thank you

Reply
Pallavi Singh says:

September 12, 2018 at 1:12 pm

What if we want to bring a trend in woe values,like either values go in increasing manner or in decreasing manner?

Reply
- Roopam Upadhyay says:
  
  September 13, 2018 at 9:04 am
  
  Not sure if I understood your question, please rephrase.
  
  Reply
Cong Tran says:

October 9, 2018 at 8:07 am

Hi Roopam,
Thank you for the informative blog.
I’ve gone through almost all the articles and understood more about modeling.

One question about WOE. How do you interpret a WOE? How do you interpret the upward trend in the Excel file?
Does a higher WOE mean higher risk?
Cong

Reply
Josh says:

October 10, 2018 at 2:46 pm

Is WoE still suitable for data with a low number of occurrences of the target?

Reply
- Roopam Upadhyay says:
  
  October 10, 2018 at 3:22 pm
  
  No
  
  Reply
  - Josh says:
    
    October 10, 2018 at 3:39 pm
    
    So oversampling or undersampling would not work either?
    
    Reply
    - Roopam Upadhyay says:
      
      October 10, 2018 at 4:50 pm
      
      What is your sample size for the events and non-events? It is advisable to have 1000 observations of each. When you have strong prior business knowledge, even then at least 150/200 observations of each type is required. Over/undersampling are to balance the sample and are not good for this purpose.
      
      Reply
      - Josh says:
        
        October 10, 2018 at 5:22 pm
        
        I mean percentage wise small. The number of events is about 20.000 and the number of non-events is about 20 million…
      - Roopam Upadhyay says:
        
        October 10, 2018 at 5:51 pm
        
        Yes, that’s fine. This is a typical rare event problem.
Aditya says:

January 3, 2019 at 8:42 pm

Mr Upadhyay,
I read your comments on WOE transformation and linearity. WOE is always inversely related to the log odds; I couldn’t find this in your answers. A WOE transformation will never have a V-shaped trend, because bad rate and WOE are inversely proportional!

Reply
Andrey P says:

March 5, 2019 at 3:37 pm

Hi, in the previous article you use logistic regression with dummies. Is it better to use logistic regression with dummies or WoE?

Reply
- Roopam Upadhyay says:
  
  June 5, 2019 at 9:14 pm
  
  Using WoE is better in my opinion because working with continuous variables is much more straightforward than working with several discrete dummy variables for regression.
  
  Reply
Andrey P says:

June 5, 2019 at 9:00 pm

Can IV be larger than 1

Reply
- Roopam Upadhyay says:
  
  June 5, 2019 at 9:10 pm
  
  Create a perfect fit in the Excel to calculate for yourself the upper bound of IV.
  
  Reply
  - Andrey P says:
    
    June 6, 2019 at 3:08 pm
    
    what do you mean by perfect fit?
    
    Reply
  - Andrey P says:
    
    June 6, 2019 at 3:11 pm
    
    Do you mean there is just 1 category that includes all goods and bads in entire dataset?
    
    Reply
  - Andrey P says:
    
    June 6, 2019 at 3:37 pm
    
    I have created a toy example with 2 categories, the first category includes all goods and the second includes all bads. So when I compute WoE1 I get log(1/0), 0 in denominator because bads_1 = 0. When I compute WoE2 I get 0 in log numerator log(0/1). Is that the perfect fit?
    
    Reply
    - Roopam Upadhyay says:
      
      June 6, 2019 at 8:40 pm
      
      Yes, that’s the max value. Now, add a little impurity to your perfect fit to notice how the upper bound of IV changes.
      
      Reply
      - Andrey P says:
        
        June 6, 2019 at 9:37 pm
        
        Got it, IV max value is plus infinity, thank you!)
      - Roopam Upadhyay says:
        
        June 6, 2019 at 9:38 pm
        
        You are welcome!
marisa23 says:

June 20, 2019 at 10:12 am

Hello, in the process of credit scoring, WOE value and IV this is used to get the score for each group, maybe you know where the number below is:

Define a target:
Target Score Value (ts): 600
Inverted Target Odds (to): 50

Read as: at my target score 600 the odds should be 1:50

Define slope:
points to double the odds (pdo): 20

thank you

Reply
Argi says:

February 14, 2020 at 8:12 pm

Hi Roopam, thanks for sharing this tutorial.

I would like to ask how do you get just one coefficient: WOEAge = -1 as output after you fit the logistic regression model. Since the classes for the variable age are 4, you will have 4 different weight of evidence input values as independent variables ( one for each class) and therefore I would expect to have 4-1= 3 weight of evidence coefficients at the output (since one of them will be the “reference” point). Instead, you end up with just one coefficient for WOEAge.

Could you explain this as well as what was exactly the fitted model that you used at the logistic regression?

Thank you

Reply
Saswata Ray says:

January 25, 2021 at 10:15 am

Hi Roopam! Thanks for the learner-friendly content. 🙂 I come from a non-stats background. I am using IV for my PhD thesis. While writing about it I explained it as “One limitation of IV is that it does not have a standard range (like correlation which ranges between -1 to +1). Hence, heuristics and general guidelines are often used to select features by benchmarking IV values against the maximum IV observed in a particular dataset. In line with such established guidelines (Ref), all predictors which had information value of >=25% of the maximum IV were categorised as “Strong”, between 10-25% were categorised as “Moderate” and less than 10% as “Weak”.” I am unable to find a reference to cite for this. As you know, it is mandatory in academic writing to validate it with references. Can you please help me with this? In other words, please point me to some publications I can cite for this section.

Reply
Arghya says:

March 26, 2022 at 8:32 pm

Hi Roopam,

Though I hold experience in credit underwriting, I am a newbie with credit scoring. Your article is of a lot of help for beginners like me. I have some doubts with this part:

1. How does using WOE of the coarse classes of the age variable improve the model?
2. Somewhere else I learnt to use WOE and IV for coarse classing independent variables. Is this a correct approach?
3. Can we use IV for reducing the number of independent variables, and how does it fare against other available options such as PCA, forward selection, backward elimination etc.?

Reply