YOU CANalytics | Thompson Sampling & Artificial Intelligence

This is the final part of the digital marketing case study example. In this case study, you are a digital analytics consultant working with your client Helping Hand, an NGO, to extract the maximum value out of their marketing campaign. To do this, you are using the principles of reinforcement learning to get the maximum donation from the recipients of the emails. Notably, reinforcement learning helps in the development of intelligent systems and artificial intelligence. In the previous two parts, you compared the performance of three different ads using A/B testing and then improved on A/B testing through Bayesian statistics. In this part, we will use Bayesian statistics and reinforcement learning to develop an intelligent marketing campaign design.

Thompson Sampling and Reinforcement Learning

In the previous part, your client noticed that you had set an uneven sample design for the three ads sent out for the email campaign.

Advertisement	Number of Recipients	% Recipients
A	5500	20.4%
B	9000	33.3%
C	12500	46.3%
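
As a quick check of the arithmetic in the table, the shares can be recomputed from the recipient counts. This is only a small sketch; the variable names are illustrative and not part of the original campaign tooling.

```python
# Recipient counts copied from the table above.
recipients = {"A": 5500, "B": 9000, "C": 12500}

total = sum(recipients.values())

# Share of all recipients per ad, as a percentage rounded to one decimal.
shares = {ad: round(100 * n / total, 1) for ad, n in recipients.items()}

print(total)   # 27000
print(shares)  # {'A': 20.4, 'B': 33.3, 'C': 46.3}
```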

The email recipients received these three ads.

Now, let us try to understand the basis of this uneven sampling. In the process, we will also explore how Thompson Sampling and reinforcement learning are used for the development of intelligent systems. To do so, let’s first delve a bit into some sociological and psychological studies that formed the basis of our uneven sample design. We will learn more about…

Empathy and Emotions

Mother Teresa and Joseph Stalin are two completely different personalities. Mother Teresa was declared a saint in 2016 for her work with the underprivileged in Calcutta. Stalin, on the other hand, is widely accused of human rights abuses.

These two, however, agree on one issue: people feel empathy towards a single human being but lack it towards large groups of humans.

If I look at the mass, I will never act. If I look at the one, I will.

– Mother Teresa

Mother Teresa, in this quote, is highlighting her reasons for helping the poor in Calcutta. Joseph Stalin expresses the same idea in a sterner, more aggressive way, though he was not trying to help anyone!

A single death is a tragedy; a million deaths is a statistic.

– Joseph Stalin

Incidentally, both Mother Teresa and Stalin are pointing towards the phenomenon psychologists describe as the identifiable victim effect. The identifiable victim effect is described by Dan Ariely in his fascinating book The Upside of Irrationality.

Identifiable Victim Effect

Dan Ariely, in his book, discusses an experiment to quantify this effect. In this experiment, the experimenters gave each student in a group $5 and then asked each student to voluntarily donate a portion of that money. One set of students read a message like this before making the donation.

Food Shortages in Malawi are affecting more than 3 million children…More than 11 million people in Ethiopia need immediate food assistance [you could contribute for that]

In contrast, the second set of students read a more personal message:

Her life would be changed for the better as a result of your financial gift. With your support, Save the Children will work with Rokia’s family to feed her…

The experimenters observed that Rokia received more than twice the share of the $5 from students than the 3 million children in Malawi did. Let’s park this idea about how people donate more for an individual and relatable victim than for a mass of victims. We will use it soon for our digital marketing case study example. For now, look at this data from Dr. Ariely’s book which demonstrates the identifiable victim effect at a large scale.

People are donating more money to a smaller set of affected victims than to a large set. This could be for multiple reasons including advertising, news time, etc. Still, the correlation is startling, and one of the reasons for it could very well be the identifiable victim effect. It is sad but true.

Let us make our transition back to some useful concepts that will help us derive the maximum value out of our email marketing campaign. To do so, let’s visit a casino in Las Vegas and gamble a bit.

Multi-Armed Bandits and Thompson Sampling

If you have ever been to a casino, you will know of slot machines. The idea with a slot machine is that you insert your money and pull the lever or press a button. If you hit a jackpot or get the right combination on the screen, you make money on your initial bet. Otherwise, you lose your bet. A slot machine is also, for good reason, called a one-armed bandit, since you are more likely to lose your money than to make a profit.

Now imagine you entered a casino with multiple slot machines and want to figure out which of these machines is going to make you the most profit. This is a famous problem in optimization called “multi-armed bandits”. Each slot machine in the casino has a different probability of success i.e. the chance of making a profit for you. At first, you don’t know the probabilities of success for these machines. Hence, you need to explore to learn about these probabilities. At the same time, you also want to make money. This is essentially an example of the explore and exploit optimization we discussed at length with reinforcement learning.
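
To make the explore side of this trade-off concrete, here is a minimal sketch of a casino with three slot machines, using only Python's standard library. The hidden success probabilities in `TRUE_PROBS` are made-up values for illustration, and the strategy shown is pure exploration: it pulls machines at random and only estimates their success rates, without trying to exploit the best one.

```python
import random

# Hypothetical hidden success probabilities, one per slot machine.
# In a real casino (or campaign) these are unknown to the player.
TRUE_PROBS = [0.05, 0.10, 0.15]


def pull(machine, rng):
    """Return 1 for a win and 0 for a loss on the chosen machine."""
    return 1 if rng.random() < TRUE_PROBS[machine] else 0


def explore_uniformly(n_pulls, seed=0):
    """Pull machines uniformly at random and count wins and pulls per machine."""
    rng = random.Random(seed)
    wins = [0] * len(TRUE_PROBS)
    pulls = [0] * len(TRUE_PROBS)
    for _ in range(n_pulls):
        machine = rng.randrange(len(TRUE_PROBS))
        pulls[machine] += 1
        wins[machine] += pull(machine, rng)
    return wins, pulls


wins, pulls = explore_uniformly(3000)

# Estimated success rate per machine (0.0 if a machine was never pulled).
estimates = [w / p if p else 0.0 for w, p in zip(wins, pulls)]
print(estimates)
```

Uniform exploration learns about every machine, but it keeps spending about two thirds of the pulls on machines that are not the best one. That wasted share is what a smarter explore/exploit strategy tries to shrink.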

I am sure you could make the link between the “multi-armed bandits” problem and our marketing campaign. Our three ads are no different than the slot machines. It’s all about making the maximum profit while exploring to find the probability of success. There are multiple strategies to solve the “multi-armed bandits” problem. However, one strategy that usually does better than others is Thompson Sampling. This paper from Yahoo/Microsoft Inc. illustrates the same point.

Thompson Sampling and Bayesian Priors

William R. Thompson proposed a sampling method that exploits Bayesian priors in his research paper published in 1933. In the paper, Thompson was trying to design an effective sampling strategy for clinical trials to save as many patients as possible while exploring new drugs and methods of treatment. The loss of a patient to a trial drug despite the availability of a better drug is called regret. A clinical trial is essentially an explore/exploit optimization where one is trying to minimize regret while finding new and effective ways to treat the patients.

Thompson pointed out that regret can be minimized when the sampling design uses Bayesian methods to constantly update the sample distribution based on new knowledge. This may sound complicated, but it is actually a reasonably simple method, as we will soon see. Remember, in the last post we discussed how knowledge is an incremental process. Moreover, Bayesian statistics captures this essence of scientific growth better than the traditional frequentist approach to statistics. Thompson, essentially, was pointing towards the Beta prior distribution for Bernoulli trials that we discussed at great length in the previous article.
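
The Beta prior update for Bernoulli trials mentioned here is short enough to sketch directly. A Beta(α, β) belief updated with s successes and f failures becomes Beta(α + s, β + f). The helper names below are illustrative, and the starting prior Beta(1, 1) and the 7 successes / 3 failures are made-up numbers, not figures from the case study.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus Bernoulli outcomes."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Start from a flat Beta(1, 1) prior (an assumption for this sketch),
# then observe 7 successes and 3 failures (hypothetical data).
alpha, beta = 1, 1
alpha, beta = update_beta(alpha, beta, successes=7, failures=3)

print(alpha, beta)             # 8 4
print(beta_mean(alpha, beta))  # about 0.667
```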

Example for Thompson Sampling

In his paper, Thompson suggested that clinical researchers must design the clinical trial samples using the existing knowledge about the effectiveness of medicines. This strategy will save the largest number of patients while exploring new and better drugs. For instance, let’s assume that you are testing two different drugs, A and B. You have a prior belief that drug A is slightly better at treating most patients. Let’s assume you have quantified this belief, i.e. for every 100 patients treated with drug A, 53 will show signs of improvement vs. 47 for drug B. Accordingly, when a patient walks in for treatment, you toss a biased coin that has a 53% chance of heads vs. 47% of tails. If it’s heads, the patient is treated with drug A; otherwise, with drug B.
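
The biased coin toss described above can be sketched in a few lines. The function name `assign_drug` and the seed are illustrative choices; the 53% figure comes from the example in the text.

```python
import random


def assign_drug(rng, p_drug_a=0.53):
    """Toss a biased coin: heads (probability p_drug_a) means drug A, tails means drug B."""
    return "A" if rng.random() < p_drug_a else "B"


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Assign 10,000 simulated patients and check the realized split.
assignments = [assign_drug(rng) for _ in range(10_000)]
share_a = assignments.count("A") / len(assignments)

print(round(share_a, 3))  # close to 0.53
```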

One question could be: why not treat every patient with drug A, since it is more effective? The answer is that the effectiveness is based on your belief, and this belief needs to be validated by hard evidence or data. The important aspect of Bayesian thinking is that belief can be made a part of the research, and the belief gets updated with evidence.

This 53/47 split between drugs A and B is just the initial sample design. The design is constantly updated with new evidence through the Beta prior distribution for Bernoulli trials. Let’s assume you found drug B to be more effective for 70% of patients vs. 30% of patients for drug A. Clearly, this is against your initial belief. Now, the next patient who walks in will receive drug B with 70% probability.

Now, let’s head back to our case study example:
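
Below is a minimal sketch of how this update loop is commonly implemented as Beta-Bernoulli Thompson Sampling. Note that it is the standard textbook formulation rather than the exact procedure above: instead of assigning drugs in proportion to observed success shares, it draws one sample from each drug's Beta posterior and treats the patient with the drug whose draw is highest. Encoding the 53/47 belief as Beta(53, 47) and Beta(47, 53) priors is an assumption of this sketch, as are the simulated true improvement rates of 0.3 and 0.7.

```python
import random


def thompson_trial(true_rates, priors, n_patients, seed=0):
    """Simulate a trial where each patient is assigned by Thompson Sampling.

    true_rates: hypothetical real improvement probability per drug (unknown in practice).
    priors: (alpha, beta) parameters of the Beta prior per drug.
    """
    rng = random.Random(seed)
    posterior = {drug: list(ab) for drug, ab in priors.items()}
    counts = {drug: 0 for drug in priors}
    for _ in range(n_patients):
        # Draw one plausible improvement rate per drug from its current posterior.
        draws = {d: rng.betavariate(a, b) for d, (a, b) in posterior.items()}
        chosen = max(draws, key=draws.get)
        counts[chosen] += 1
        # Observe the simulated outcome and update the chosen drug's posterior:
        # index 0 (alpha) counts improvements, index 1 (beta) counts failures.
        improved = rng.random() < true_rates[chosen]
        posterior[chosen][0 if improved else 1] += 1
    return counts, posterior


counts, posterior = thompson_trial(
    true_rates={"A": 0.3, "B": 0.7},         # simulated truth, contradicting the prior
    priors={"A": (53, 47), "B": (47, 53)},   # prior belief slightly favouring drug A
    n_patients=1000,
)
print(counts)
```

Even though the prior favours drug A, the evidence pulls most of the later assignments towards drug B, which is the self-correcting behaviour the example describes.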

Digital Case Study Example – Thompson Sampling

You are in a meeting with your client that she had set up last week. She is keen to learn about your reasons for the uneven sample design for the three ads. Moreover, she is curious to know how this design will help her maximize the donation money. You tell her about the identifiable victim effect experiment. You also tell her how Rokia, or in her case Elisa, will make people donate almost twice as much money as the less identifiable map of Africa. At this point, she asks why, if the ad with Elisa was going to get more money, you tested the map of Africa at all. You think that this woman is clever and asking all the right questions.

To answer her question, you tell her that you were not sure about the click rate for Elisa vs. the map of Africa. Moreover, the experiment described by Dr. Ariely was for a group of students, and you wanted to test the concepts on a larger population of email recipients. She seems satisfied with your answer and asks you to tell her how you used Thompson Sampling to create the sample design. You tell her that had you divided the sample evenly, i.e. roughly 33% per ad, you would not have exploited the knowledge shared by Dr. Ariely in his book. You also tell her how the job of a data scientist includes staying aware of research going on in many different fields, including psychology, computer science, sociology, marketing, and operations research, and using this knowledge to improve results for clients.

Artificial Intelligence

You expected ‘the map of Africa’ to carry a little over twice the regret of ‘Elisa’, and this expectation is reflected in the sample design: 46.3% of recipients get the Elisa ad vs. 20.4% for the map of Africa. The mixed ad, i.e. ad B, was sent to the remaining 33.3% of recipients.

The interesting part, as you explain to your client, is that this is an evolutionary design. With each round of experiment and evidence, the sample design gets optimized to generate more profit. Moreover, this design is highly flexible: new ads (say D and E) can be added, and underperforming ads (e.g. ad A) can be removed, with reasonable ease.

You tell your client that her organization has taken its first baby step towards an artificial intelligence driven marketing campaign design, a design that modifies itself based on evidence and experience. This is somewhat similar to the human brain, which learns from experience. This design is highly effective in a real-time digital marketing effort where banners are used to test the click-through rate and the associated conversion rate.
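
For a batch channel such as email, one common way to turn updated beliefs into the next round's sample design is to estimate, for each ad, the probability that it is the best one, and to send ads in those proportions. The sketch below does this by Monte Carlo sampling from Beta posteriors. The posterior parameters are hypothetical placeholders, not results from this campaign, and `allocation_shares` is an illustrative helper name.

```python
import random


def allocation_shares(posteriors, n_draws=20_000, seed=0):
    """Estimate P(ad is best) per ad by sampling from each ad's Beta posterior."""
    rng = random.Random(seed)
    best_counts = {ad: 0 for ad in posteriors}
    for _ in range(n_draws):
        draws = {ad: rng.betavariate(a, b) for ad, (a, b) in posteriors.items()}
        best_counts[max(draws, key=draws.get)] += 1
    return {ad: count / n_draws for ad, count in best_counts.items()}


# Hypothetical Beta(alpha, beta) posteriors after one round of emails,
# where alpha - 1 is the number of donors and beta - 1 the number of non-donors.
posteriors = {"A": (41, 461), "B": (61, 441), "C": (76, 426)}

shares = allocation_shares(posteriors)
print(shares)  # share of the next batch to send with each ad
```

In practice one might also keep a small minimum share for every ad so that none is starved of data; that is a design choice outside what the case study specifies.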
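
The flexibility to add and remove ads follows from the fact that each ad only needs its own pair of Beta parameters. A minimal sketch, assuming a flat Beta(1, 1) prior for any new ad; the class `AdBandit` and its methods are hypothetical names for illustration, not an existing library API.

```python
import random


class AdBandit:
    """Keeps one Beta posterior per ad and picks ads by Thompson Sampling."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.posteriors = {}  # ad name -> [alpha, beta]

    def add_ad(self, name, alpha=1.0, beta=1.0):
        """Register a new ad with a prior (flat Beta(1, 1) by default)."""
        self.posteriors[name] = [alpha, beta]

    def remove_ad(self, name):
        """Drop an underperforming ad from the design."""
        del self.posteriors[name]

    def choose(self):
        """Draw from every ad's posterior and return the ad with the highest draw."""
        draws = {ad: self.rng.betavariate(a, b) for ad, (a, b) in self.posteriors.items()}
        return max(draws, key=draws.get)

    def record(self, name, success):
        """Update the chosen ad's posterior with one observed outcome."""
        self.posteriors[name][0 if success else 1] += 1


bandit = AdBandit()
for ad in ("A", "B", "C"):
    bandit.add_ad(ad)

bandit.remove_ad("A")  # retire ad A
bandit.add_ad("D")     # introduce new ads D and E
bandit.add_ad("E")

next_ad = bandit.choose()
bandit.record(next_ad, success=True)
print(next_ad, bandit.posteriors[next_ad])
```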

Sign-off Note

The loss of a patient to an inferior drug or treatment is defined as regret in Thompson Sampling. Regret, as defined in the dictionary, is the feeling of sadness, repentance, or disappointment about an action. It’s possible that a collective scientific intelligence is giving us a message to make artificial intelligence more human. The day artificial intelligence feels the sorrow of losing a patient and the joy of donating and giving back to others is the day artificial intelligence and machines will feel alive. It is possibly a good message for humans to feel somewhat more alive, too.

3 thoughts on “Thompson Sampling for Artificial Intelligence – Digital Marketing Case Study Example (Part 4)”

Dheeraj Singh says:

November 27, 2017 at 10:41 pm

Could you please elaborate with code in R as well, so that we can understand it better with examples?

Ranbir says:

June 2, 2019 at 2:56 pm

What we are trying to do is serve templates in real time, in proportions based on Thompson Sampling.
It looks like this is possible on a website but not for mass email marketing.
What are the best practices for A/B testing in email marketing?

- Roopam Upadhyay says:
  
  June 2, 2019 at 5:59 pm
  
  There is no reason why this construct doesn’t work for email mass marketing especially if they run as a process, as many regular campaigns often do. This is a way to automate the process with limited supervision.
  
  For seasonal campaigns, I suggest you read about ‘design of experiments’. DOE will help you test multiple factors such as communication, creative, day, time etc. in the same go. DOE is a much superior form of A/B testing.