Dawn of the Predictable Shopper

A team of computer science students takes the mystery out of shopping for the second-largest retailer on the planet


It comes as no surprise that Target Corp. is particularly adept at hitting the bull’s eye when it comes to predicting shoppers’ behavior.

As the ability to analyze patterns in purchasing behavior using data extracted from emails, recent transactions and social media has grown, Target wants to stay on the cutting edge of analytics. That’s why the retail giant turned to USC students for a fresh take.

This spring, Target collaborated with the USC Viterbi Institute for Innovation to launch the Target Data Challenge. Twenty teams competed head to head to create a system that predicts future purchases of sample shoppers using a data set of 700 million actual transactions. The aim was to produce an algorithm that most closely predicted the shoppers’ purchases for the following month.

The data points were all relabeled with random values to disguise personal information. With this important privacy measure in place, the winning team developed a system of “random walking” that combed the purchasing histories of 20 million users.

USC Viterbi computer science majors Chung Ming Cheung, Palash Goyal and Ajitesh Srivastava snagged first place in the challenge. Mentored by Research Associate Arash Saber Tehrani and Professor Viktor Prasanna of USC Viterbi’s Data Science Lab, the winning team originally planned to build on the best predictive analysis models the market had to offer. Along the way, however, they discovered that the “best” models were slow and not as accurate as they had anticipated.

Typically, you don’t think of inventing a whole new system when competing.

Yet the team did exactly that. Their program randomly picks an item from a user’s purchases, finds another user who bought the same item, picks an item at random from that second user’s cart, and then finds yet another user who purchased it. Repeating these steps, the program randomly “walks” through thousands of customer carts. And voilà, the item purchased most commonly among everyone sampled can be suggested to the first user.
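For readers curious how such a walk might look in code, here is a minimal Python sketch of the idea as described above. It is not the team's program: the function names, the toy carts, the step count and the choice to skip items the first shopper already owns are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def build_buyers_index(carts):
    """Map each item to the users who bought it (hypothetical helper)."""
    buyers = defaultdict(list)
    for user, items in carts.items():
        for item in dict.fromkeys(items):  # de-duplicate, keep order
            buyers[item].append(user)
    return buyers


def random_walk_suggest(carts, start_user, steps=1000, seed=None):
    """Suggest one item for start_user by randomly walking through carts.

    Assumption: items the starting user already bought are not counted,
    so the suggestion is always something new to that shopper.
    """
    if not carts.get(start_user):
        return None  # nothing to start the walk from

    rng = random.Random(seed)
    buyers = build_buyers_index(carts)
    owned = set(carts[start_user])
    counts = Counter()
    user = start_user

    for _ in range(steps):
        item = rng.choice(carts[user])    # pick a random item from this cart
        user = rng.choice(buyers[item])   # hop to a random buyer of that item
        # Tally what the visited shopper bought, skipping owned items.
        counts.update(i for i in dict.fromkeys(carts[user]) if i not in owned)

    return counts.most_common(1)[0][0] if counts else None


# Toy data, invented for this sketch.
sample_carts = {
    "ann": ["milk"],
    "bob": ["milk", "bread"],
    "cat": ["milk", "bread"],
}

print(random_walk_suggest(sample_carts, "ann", steps=200, seed=1))
```

In this toy data set, bread is the only item the first shopper does not already own, so it is the only possible suggestion. With real transaction data, the number of steps and the way visited carts are tallied would need tuning.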

The team, which hopes to further test the model’s reliability, demonstrated that even multibillion-dollar corporations could learn a thing or two from a handful of determined young scientists.