Multi-armed bandit algorithm vs classical A/B testing

How do you know which ad to serve to a customer to get more conversions, or which version of a website to show?

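A multi-armed bandit (“MAB”) works by assigning weights to multiple experiments, called “arms”. One common algorithm for this is epsilon-greedy, which uses an explore vs exploit strategy to choose an arm to show: most of the time it shows the arm with the best observed conversion rate (exploit), and a small fraction of the time, epsilon, it shows a randomly chosen arm (explore).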
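To make the explore vs exploit idea concrete, here is a minimal sketch of epsilon-greedy in Python. The class name `EpsilonGreedy`, the `simulate` helper, and the conversion rates passed to it are all made up for illustration; a production system would likely differ in how it stores and updates the weights.

```python
import random


class EpsilonGreedy:
    """Minimal epsilon-greedy bandit over a fixed number of arms (illustrative)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how many times each arm was shown
        self.values = [0.0] * n_arms  # running mean reward (conversion rate) per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore: with probability epsilon, show a random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: otherwise show the arm with the best observed mean.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, n_rounds=10_000, epsilon=0.1, seed=0):
    """Simulate visitors; true_rates are hypothetical conversion rates."""
    bandit = EpsilonGreedy(len(true_rates), epsilon, seed)
    rng = random.Random(seed + 1)
    for _ in range(n_rounds):
        arm = bandit.select_arm()
        # Reward is 1.0 if the simulated visitor converts, else 0.0.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

Calling `simulate([0.02, 0.05, 0.10])` and inspecting `bandit.counts` should show most of the traffic drifting toward the arm with the highest rate, while the other arms still get a trickle of visitors through exploration.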

In ‘classical’ A/B testing, you conclude that the difference between experiment B and experiment A is significant if the confidence is more than 95%, that is, if the p-value is below 0.05.
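One common way to run that check on conversion counts is a two-proportion z-test. The sketch below is one possible version using only the standard library; the function name `two_proportion_z_test` and the example counts are hypothetical, and other tests (for example chi-squared) are also used in practice.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 10,000 visitors converted on A, 260 of 10,000 on B.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
significant = p_value < 0.05  # the 95% confidence threshold
```

Here `significant` is `True` when the observed difference would be unlikely if A and B really converted at the same rate.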

MAB is especially useful when you have more than two experiments to run to see which gives better conversions. This is where it truly shines: classical A/B testing is poorly suited to this type of testing, since every extra variant needs more traffic and a correction for multiple comparisons.

Here’s a paper by Google that runs several different tests and reports their results.

This article challenges the argument that ‘MAB is better than A/B Testing’, using tests that compare the two and get similar results.

Watch this quick intro to learn about the multi-armed bandit.

Next, read about the ‘contextual bandit’, a strategy that Netflix uses to show personalized artwork for its shows to get maximum views.

