Imagine that you are in the market for a new digital camera, and you visit Amazon.com to begin your search. You peruse the site’s user-generated reviews. While some cameras have many reviews, most have very few or no ratings. How much would it take to get you to experiment and purchase a scarcely reviewed camera and hopefully give it a rating?
In “Incentivizing Exploration,” which won the best paper award at the 2014 ACM Conference on Economics and Computation, USC Viterbi Computer Science Professor David Kempe and his collaborators at Cornell University used an extension of the Multi-Armed Bandit (MAB) model to determine how to incentivize people to look outside their comfort zones and explore new opportunities.
Developed in the 1950s by Herbert Robbins, and refined by many researchers, the MAB model explores the tradeoff between sticking with what is known and exploring new alternatives.
The MAB model is usually illustrated with the image of slot machines. There are multiple machines and the user must determine which machine they will use and for how long to achieve the best benefit or payout.
For non-gamblers, the MAB model is easily illustrated by considering a route to work with many options: freeways, side streets and various combinations. After trying each route once, do you decide that the fastest route is your preferred one? Not necessarily, Kempe believes. He suggests that you should take each route several different times before choosing a favorite. That’s because the route you may have chosen after one trip may have had less traffic because it was a holiday, while another one may have experienced construction, causing delays. But as Kempe explained, “the more often repeated trials confirm an experience, the more you will stick with that choice.”
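To make the commuting example concrete, here is a minimal sketch of one common bandit heuristic, often called epsilon-greedy. It is an illustration only, not the method from the paper. The function names, the `explore_prob` parameter and the travel times are all hypothetical. The commuter tries every route once, then usually takes the route with the best average so far, and occasionally tries a random route in case an early impression was misleading.

```python
import random


def choose_route(avg_minutes, counts, rng, explore_prob=0.1):
    """Pick a route index (hypothetical helper for illustration)."""
    # Try every route at least once before comparing averages.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    # Occasionally explore a random route, in case an earlier trip was a fluke.
    if rng.random() < explore_prob:
        return rng.randrange(len(counts))
    # Otherwise exploit: take the route with the lowest average time so far.
    return min(range(len(counts)), key=lambda i: avg_minutes[i])


def simulate_commutes(true_minutes, noise, days, seed=0, explore_prob=0.1):
    """Simulate repeated commutes; return (average times, trip counts) per route."""
    rng = random.Random(seed)
    k = len(true_minutes)
    avg = [0.0] * k
    counts = [0] * k
    for _ in range(days):
        i = choose_route(avg, counts, rng, explore_prob)
        # Observed travel time = typical time plus day-to-day randomness
        # (holidays, construction, and so on).
        observed = rng.gauss(true_minutes[i], noise)
        counts[i] += 1
        # Incremental update of the running average for route i.
        avg[i] += (observed - avg[i]) / counts[i]
    return avg, counts


if __name__ == "__main__":
    # Made-up typical travel times in minutes for three routes.
    averages, trips = simulate_commutes([30.0, 25.0, 40.0], noise=5.0, days=500)
    print(averages, trips)
```

The running average is what captures Kempe's point: a single unusually fast or slow trip matters less and less as more trips on the same route accumulate.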
Cornell professor and collaborator Robert Kleinberg said, “Our research benefits the designers of platforms that rely on user-supplied evaluations to rank a set of alternatives. However, rather than thinking of our paper as supplying practical advice on how to make those services better, I prefer to think of it as offering insight into a fundamental question about why those services work well in the first place.”
Crowdsourcing, the process of soliciting many users to provide information or reviews, is typically used online in apps such as Waze, Yelp and Amazon. Crowdsourcing aims to encourage current users to provide recommendations to guide future users. Each individual user, however, wants only to choose the single best option at that moment. How can these differing goals become more aligned?
In a word: money. But how much would incentivize someone to change behavior? The team set out to answer this question mathematically. Their analysis found it is best to decide randomly for each customer between offering no discounts and offering a large enough discount to convince the user to buy a product that needs more reviews. Kempe said that, by offering a discount, a company like Amazon can quickly learn about a product’s quality, but offering too many large discounts could result in a profit loss.
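The randomized policy described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the mechanism analyzed in the paper. The function name, the `discount_prob` parameter and the idea of measuring a product's appeal as a single estimated value are all hypothetical. For each customer, the retailer flips a biased coin. With probability `discount_prob` it offers a discount just large enough to cover the gap between the best-known product and the under-reviewed one. Otherwise it offers nothing, and the customer buys the best-known product.

```python
import random


def offer_for_customer(est_best, est_under_reviewed, discount_prob, rng):
    """Return (product the customer is steered to, discount offered).

    est_best and est_under_reviewed are hypothetical estimates of how much
    a customer values each product, based on the reviews seen so far.
    """
    # Smallest discount that makes the under-reviewed product at least as
    # attractive as the best-known one (zero if it already is).
    gap = max(0.0, est_best - est_under_reviewed)
    # Randomize per customer: discount with probability discount_prob.
    if rng.random() < discount_prob:
        return "under_reviewed", gap
    return "best_known", 0.0


if __name__ == "__main__":
    rng = random.Random(42)
    # Made-up values: discount one customer in five, on average.
    for _ in range(5):
        print(offer_for_customer(4.5, 3.0, discount_prob=0.2, rng=rng))
```

Raising `discount_prob` gathers reviews faster but costs more in discounts, which is the tradeoff Kempe describes.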
The team’s mathematical analysis showed that a relatively small incentive can bring the goals of designer and user into close alignment. Accounting for customers who receive a discount and customers who do not, the team found that with a relatively small number of discounts, even just 20 percent of the overall profit, Amazon could guarantee its customers an 80 percent satisfaction level. Simply put, companies need to find the right balance: a discount large enough to attract new buyers and reviews, but not so large that they lose money.
Not everyone is equally easy to entice into exploring new options, so Kempe believes that “to save money, a retailer should use all available knowledge to estimate the best individually tailored discounts to offer.”