When recommendations go bad: Walmart & the “Planet of the Apes” fiasco

The recent recommendations fiasco at Walmart: people looking for Planet of the Apes DVDs were directed to movies about Martin Luther King Jr. Predictably, many found this offensive, and it was all over the blogosphere. Walmart now says it was due to some mis-cataloging by an employee in 2005. The end result: it's not just that one recommendation being changed; the whole recommender system is being taken down.

This is not the first time that recommender systems have been in the news for strange recommendations, though in the past it's been about ludicrous rather than offensive ones. For example, recall the "My TiVo thinks I'm gay" article and Amazon recommending "underwear" to people looking for .NET books.

Recommendations are often used to provide cross-category interlinking, going beyond staid genre systems, "books by the same author," and the like. As such, recommendations are inherently risky, since they help customers discover new tastes. I feel 100% confident in predicting that this is not the last recommendation fiasco.

It turns out that the Walmart fiasco was due to human error (in fact, it turns out that Walmart bases its recommendations on a type of manual cross-linking, not on user input or collaborative filtering algorithms):

Walmart.com manually assigns movies to specific “item display groups,” such as science fiction or African-American culture. The company’s internally developed software then generates links guiding shoppers to other movies in that group.
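In code, that kind of group-based linking might look something like the sketch below. This is purely hypothetical: the group names, titles, and function are my own illustration, not Walmart's actual data or software.

```python
# Hypothetical sketch of group-based linking: every movie is hand-assigned
# to an "item display group", and a movie's page simply links to the other
# movies in the same group. All data here is made up for illustration.
display_groups = {
    "science-fiction": ["Planet of the Apes", "Star Wars"],
    "african-american-culture": ["Martin Luther King: I Have a Dream"],
}

def related_movies(title):
    """Return every other title that shares a display group with `title`."""
    related = []
    for titles in display_groups.values():
        if title in titles:
            related.extend(t for t in titles if t != title)
    return related

# One mis-assignment silently rewrites every link its group generates:
print(related_movies("Planet of the Apes"))  # ['Star Wars']
```

Note the single point of failure: mis-filing one title changes every link generated from that group.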

This approach makes the system very susceptible to human error (and not much of a recommender system, I might add). But one can imagine more automated approaches, e.g., systems using collaborative filtering, making such recommendations under specific circumstances. Let me describe a few possibilities.

Recommender system bombing: John Battelle used this term first, and it's entirely within the realm of possibility. A group of people could decide to teach the recommender system that two items are linked: for example, they could buy both items together, or rate both highly over a short period of time. Recommendation algorithms look for signals in a sea of background noise; given enough people, the system would learn that there was a link between the two items.
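Here is a minimal sketch of how that could work against a naive co-purchase recommender. The items and numbers are invented, and no real system is this simple, but the principle is the same:

```python
import random
from collections import Counter
from itertools import combinations

random.seed(0)
catalog = [f"item{i}" for i in range(50)]

# Background noise: 1,000 ordinary shoppers, each buying two random items.
baskets = [random.sample(catalog, 2) for _ in range(1000)]

# "Bombing": 30 coordinated shoppers all buy the same two items together.
baskets += [["item0", "item1"]] * 30

# A naive item-to-item recommender: count how often items co-occur.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The bombed pair towers over the honest signal, so "item1" becomes
# the top recommendation for "item0".
print(pair_counts.most_common(3))
```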

By chance: You could end up with the same result without any deliberate manipulation on anyone's part. Imagine that Walmart was experimenting with showing different movies paired with Star Wars (rotating among a bunch of movies chosen at random), as a trial phase while it gathered user data to make real recommendations. Imagine that one of the pairings was so odd (maybe offensive, or maybe simply weird) that people clicked on it just to understand why it was recommended. Pretty soon, the recommender system "learned" that these two items were linked in people's minds, and an offensive recommendation was born.
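A toy simulation of that feedback loop (the titles and click probabilities are invented): curiosity clicks make the odd pairing look like the strongest signal, so it wins the trial.

```python
import random

random.seed(1)

# Pairings for the Star Wars page, rotated at random during the trial.
candidates = ["The Empire Strikes Back", "2001: A Space Odyssey", "Oddball Title"]
# Assumed click probabilities: the odd pairing draws *more* clicks,
# purely out of curiosity.
click_prob = {"The Empire Strikes Back": 0.10,
              "2001: A Space Odyssey": 0.10,
              "Oddball Title": 0.30}

clicks = {m: 0 for m in candidates}
shows = {m: 0 for m in candidates}

for _ in range(3000):
    shown = random.choice(candidates)   # random rotation during the trial
    shows[shown] += 1
    if random.random() < click_prob[shown]:
        clicks[shown] += 1

ctr = {m: round(clicks[m] / shows[m], 3) for m in candidates}
print(ctr)
print("Promoted:", max(ctr, key=ctr.get))  # the oddball pairing wins
```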

Recommender systems pick up politically incorrect stereotypes: At a basic level, recommendation systems simply pick up correlations in people's preferences. They are tuned to detect cultural associations and are blind to whether those associations are politically correct or not. So an offending Walmart recommendation might simply reflect what enough people in society think at a deeper level. (Refer to the work on Implicit Associations and how to test for them.)
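To make that concrete, here is a minimal collaborative-filtering sketch with invented ratings: the similarity score only sees that preferences correlate, and has no way to tell a benign association from an offensive one.

```python
from math import sqrt

ratings = {            # user -> {item: rating on a 1-5 scale}
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"A": 5, "B": 4},
    "u4": {"A": 1, "C": 5},
}

def cosine(x, y):
    """Cosine similarity between items x and y over their common raters."""
    common = [u for u, r in ratings.items() if x in r and y in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][x] * ratings[u][y] for u in common)
    norm_x = sqrt(sum(ratings[u][x] ** 2 for u in common))
    norm_y = sqrt(sum(ratings[u][y] ** 2 for u in common))
    return dot / (norm_x * norm_y)

print(round(cosine("A", "B"), 3))  # ~0.985: raters agree, so A links to B
print(round(cosine("A", "C"), 3))  # ~0.428: weaker signal, no link
```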

So, are recommender systems to blame? I think not. Can we make sure this never happens again? I think that's going to be hard.

There are no easy answers. Recommender systems exist to make links between items based on user feedback, and it is not possible to examine every such pair and decide what's offensive and what's not. Maybe one can build in some heuristics and reduce human error. But it's silly to stop using these systems because of an occasional problem, and it's not as if this process can be done manually.

If anyone has better ideas on how to guard against such incidents in the future without taking down the whole recommender system, do share. Apart from increased accuracy and more testing, I cannot think of any other ways…

4 thoughts on “When recommendations go bad: Walmart & the “Planet of the Apes” fiasco”

  1. Based on your description, it doesn’t look like a mistake. They will never post the details, but the guy who categorized the movies must have put the Planet of the Apes movie in the African-American category. If this is true, it wasn’t the primitive recommender system that had a problem, but the Walmart process. I don’t believe they would allow someone to write anything on their website without reviewing it first. Why would they allow someone to enter data into their system without reviewing it first?

  2. Paulo,

    You might be right that this was no mistake; it’s not completely clear from their description why this happened. But I do think it would be difficult for any company to be vigilant about the connotations of every recommendation. Editing content is an entirely different ballgame: content can and should be reviewed. But with recommendations, there is no content to edit; instead, you would need to look at every recommended pair and examine it for offensive connotations. That would be pretty difficult for any company to check. (My point is really about recommender systems in general.)

    In this case, I think it comes down to whether you think Walmart was giving the right story or not.

  3. A good solution might be to apply the flagging model of community sites.

    Users flag offensive recommendations. Then a reviewer breaks the connection between the items (in effect “resetting” the system for that pair) if, in his/her judgment, it is actually offensive.

  4. Chris,

    That might work quite well. As more recommendations get flagged, it might become possible to develop heuristics and automatically flag other problematic pairs.
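    Something like this, perhaps (a rough sketch with invented names, not a description of any existing system):

```python
from collections import Counter

flags = Counter()    # (item_a, item_b) -> number of user flags
suppressed = set()   # pairs a human reviewer confirmed as offensive

def flag(item_a, item_b):
    """A shopper flags a recommended pairing as offensive."""
    flags[tuple(sorted((item_a, item_b)))] += 1

def review_queue(threshold=3):
    """Pairs flagged often enough to deserve a human look."""
    return [p for p, n in flags.items()
            if n >= threshold and p not in suppressed]

def serve(item, raw_recommendations):
    """Filter a recommender's raw output against the suppression list."""
    return [r for r in raw_recommendations
            if tuple(sorted((item, r))) not in suppressed]
```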

    As has been said: many eyes make all bugs shallow!
