Behavioural targeting gauges Web browsing habits to spot the services and products consumers want to buy. But is the technique putting privacy on trial?
Gavin Potter knows a thing or two about online dating. As a psychologist interested in artificial intelligence, it's Potter's job to teach machines how to match prospective couples more effectively. The more data you collect, he says, the more predictable human behaviour becomes. More than 90 per cent of men seek women who are half their age plus seven, claims Potter. Female daters, on the other hand, tend to click on profiles of men who are on average three years older. "That's up until the age of 40," he adds. "Then they go for toy boys."
To Potter, love is a numbers game. "People think they are unique," he laughs. "They say, 'you can't possibly profile me'. The depressing thing is that you can actually do it pretty accurately." But the numbers don't tell the whole story. The data can reveal which profile a person is interested in, but not why they refuse to click on other people with similar profiles.
To fill in the gaps, Potter draws on his knowledge of psychology to teach the system to recognise human idiosyncrasies, such as personality and mood. Men are more active on dating sites, he explains, because in general they are more confident. "They see a profile they like and they click on it." But women need more persuasion. They are more likely to respond to a computer-generated recommendation than to someone they have found themselves.
A former management consultant for IBM, Potter started IntroAnalytics with Nick Tsinonis, owner of dating website yesnomayB.com. IntroAnalytics is at the sharp end of a growing industry, behavioural targeting: the practice of segmenting Web users to help predict what they might want to do next. For content providers, this means showing more tailored information, including advertising. For retailers, it means presenting goods that a user is more likely to buy. And for an online dating company, it means suggesting profiles that seem somehow more appealing.
There are many different applications of the technique. UK start-up Imagini supplies profiling software known as VisualDNA, which it claims measures customers' emotions, allowing clients, such as travel website Hotels.com and dating company Match.com, to serve personalised information. To detect personality type, users are asked quick-fire questions, which they answer by clicking on images. Founder Alex Willcock says visual profiling can generate a 526 per cent rise in click-through rates over randomly selected ads.
At its most fundamental, behavioural targeting might rely on time of day, weather conditions, or location. More complex methods use software known as recommender engines, which group customers based on their perceived interests. Or in Amazon parlance: "Customers who bought this item also bought..." Thirty per cent of Amazon sales are derived from its collaborative filter algorithm, a type of recommender engine that makes use of transaction history and user ratings to predict demand for other products.
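A minimal sketch of the "customers who bought this item also bought" idea behind item-based collaborative filtering: count how often pairs of items appear in the same purchase history, then rank co-purchases. The item names and baskets below are invented for illustration; a production system like Amazon's operates at vastly larger scale and also weighs ratings.

```python
from collections import Counter
from itertools import combinations

# Purchase histories: one set of item ids per customer (invented data).
baskets = [
    {"kettle", "toaster", "mugs"},
    {"kettle", "toaster"},
    {"kettle", "mugs", "teapot"},
    {"toaster", "bread_bin"},
]

# Count how often each ordered pair of items shares a basket.
co_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def also_bought(item, top_n=2):
    """Items most often bought alongside `item`."""
    scores = Counter({b: n for (a, b), n in co_counts.items() if a == item})
    return [b for b, _ in scores.most_common(top_n)]

# 'mugs' and 'toaster' each share two baskets with 'kettle'.
print(also_bought("kettle"))
```

Real recommender engines replace the raw counts with similarity measures normalised for item popularity, so a bestseller isn't recommended alongside everything.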
Recommender engines are particularly useful for companies that specialise in aggregating vast libraries of content. Mendeley is a social network for researchers, allowing students to manage and share research papers online. Its goal is to democratise academic research, recommending papers to users based on usefulness and popularity.
"The good thing about Mendeley is that we can analyse the content of a researcher's library and then come up with relevant additional content for the user, via collaborative filtering, co-citation analysis, or full-text semantic analysis," explains Mendeley co-founder Jan Reichelt.
Advances in computing power, which allow large data sets to be analysed quickly, combined with the application of psychological theory to what were previously just reams of numbers, have let companies such as Amazon make their collaborative filters more effective, and so more predictive of user behaviour.
Fashion website Net-a-Porter uses collaborative filtering to automatically and anonymously segment customers, allowing the company to serve different groups with content designed to appeal specifically to them. The company behind Net-a-Porter's technology is Cognitive Match, a UK start-up that claims its algorithm can boost the rate of click-throughs by up to 50 per cent inside a week. "The system will build [customer] segments, modify them and potentially throw them away several times a day, testing to see what works and what doesn't," says Glen Coneybeare, Cognitive Match's co-founder.
Those segments are full of individuals displaying similar types of behaviour. But it isn't just about analysing where customers click. In the fashion world, location is also highly predictive, says Coneybeare. "The system is saying, 'you're west coast, you've got a high-end PC, it's a Friday evening. You visited before and you seemed to like shoes'. That might be your segment."
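The kind of segmentation Coneybeare describes can be pictured as bucketing visits by contextual features and tracking which buckets respond, so weak segments can be rebuilt or thrown away. A sketch under invented features and data; Cognitive Match's actual system is proprietary and far more dynamic.

```python
from collections import defaultdict

# Each visit is described by contextual features (all invented).
visits = [
    {"region": "west_coast", "device": "high_end", "day": "fri", "clicked": True},
    {"region": "west_coast", "device": "high_end", "day": "fri", "clicked": True},
    {"region": "east_coast", "device": "mobile",   "day": "mon", "clicked": False},
]

def segment_key(visit):
    """Bucket a visit by its contextual features."""
    return (visit["region"], visit["device"], visit["day"])

# Track click-through rate per segment.
stats = defaultdict(lambda: [0, 0])  # segment -> [clicks, visits]
for v in visits:
    s = stats[segment_key(v)]
    s[0] += v["clicked"]
    s[1] += 1

ctr = {seg: clicks / total for seg, (clicks, total) in stats.items()}
print(ctr)
```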
There are limits to the success of automated segmentation. Suppose you are logging on from a partner's PC, or the last purchase you made was a present for somebody else. Any prediction system worth its salt must be able to account for unpredictable, idiosyncratic behaviour. The birth of a child, Coneybeare suggests, will change buying behaviour overnight. "Say you're Tesco and you keep pushing the latest PSP games to a person that always used to buy PSP games but now buys nappies, if you keep doing that for six months you're really going to irritate them."
To be effective, collaborative filters must be able to make machines seem less artificial. "What we're trying to replicate is sitting next to someone and asking, what do you want to buy today?" says Cognitive Match co-founder and chief executive Alex Kelleher. "What we don't want to do is create a world in which you once said to your aunt you like blue sweaters and for the next 29 years she's buying you blue sweaters."
A machine's ability to engineer serendipitous discoveries is rooted in the psychology of reward, specifically an area known as reinforcement learning. One of its key challenges is balancing how much time to spend exploring uncharted territory against how much to spend exploiting current knowledge. For a retailer such as Amazon, how often should the collaborative filter recommend books or films inside a customer's comfort zone, and how often should it show riskier items, ones it isn't so sure the customer would like?
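The explore/exploit trade-off is often sketched as an epsilon-greedy policy: usually recommend the item with the best observed response, but with a small probability show something untested. The item names and click-through estimates below are invented for illustration:

```python
import random

random.seed(0)  # fixed seed so the toy run is repeatable

# Observed click-through estimates per item (invented numbers).
estimates = {"safe_bestseller": 0.30, "niche_novel": 0.05, "new_release": 0.0}

def recommend(epsilon=0.1):
    """Epsilon-greedy: exploit the best-known item, explore occasionally."""
    if random.random() < epsilon:
        return random.choice(list(estimates))   # explore uncharted territory
    return max(estimates, key=estimates.get)    # exploit current knowledge

picks = [recommend() for _ in range(1000)]
# Roughly 90% of picks are the bestseller; the rest are explorations,
# whose observed responses would update the estimates over time.
print(picks.count("safe_bestseller"))
```

In a full bandit or reinforcement-learning system, each exploration's outcome feeds back into the estimates, which is how a recommender discovers that a customer likes something outside their comfort zone.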
A recommendation engine with a good understanding of reinforcement learning helps counter a criticism of behavioural targeting: that automation stifles random discovery. "What people worry about is that if I'm always making recommendations based on past behaviour then there isn't room to discover anything new," says Potter. But with increased input from psychologists, he says, machines are learning that many consumers don't know what they want until offered it.
Potter became interested in recommendation systems in 2007 after hearing of a crowdsourcing competition organised by US DVD rental company Netflix. Aware that customer retention depended on how accurately the firm could predict the type of films customers might like, Netflix released a database of 100 million film ratings from 480,000 of its customers to the public. The challenge for contestants was simple: predict how these customers might rate other films at least 10 per cent more accurately than Netflix's recommender system. The first competitor to develop a winning algorithm would receive a $1m prize.
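Netflix measured accuracy by root-mean-square error (RMSE) between predicted and actual star ratings; the winning algorithm had to beat the incumbent system's RMSE by 10 per cent. A minimal sketch of that metric, with all ratings invented:

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual star ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))

actual     = [4, 3, 5, 2, 4]   # what customers really rated (invented)
incumbent  = [3, 3, 4, 3, 4]   # the baseline system's predictions
challenger = [4, 3, 5, 2, 3]   # a contestant's predictions

baseline = rmse(incumbent, actual)
score = rmse(challenger, actual)
improvement = (baseline - score) / baseline
print(improvement >= 0.10)  # the 10 per cent bar for the prize
```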
More than 50,000 statisticians entered the contest, won by a team from AT&T. Netflix was so pleased with the results that it announced a follow-up competition, with an even bigger data set, to try to improve accuracy further. But last December the company was hit with a privacy lawsuit from a customer who claimed insufficiently anonymised data made it possible for her to be identified, outing her as a lesbian. The follow-up prize was cancelled.
Netflix insists it had not used personally identifiable information (PII). Many in the industry adhere to a voluntary code: PII should not be used to predict behaviour, meaning no names, addresses, or ages, without consent. Coneybeare says: "If someone were to give me that data on a plate today I don't think I'd use it. For the additional benefit of what you'd get, the risk of going against what people are prepared to give up is too great."
Such an acknowledgement of consumer privacy owes much to the story of UK start-up Phorm. The Office of Fair Trading launched an inquiry into behavioural targeting after Phorm was accused of breaking UK data-interception laws with controversial software that, through partnerships with ISPs, matches keywords from the content of a page to an anonymous profile. Users are then targeted with adverts on partner websites that have signed up to Phorm's technology. Despite campaigning from privacy advocates, including Sir Tim Berners-Lee, who called Phorm the "internet equivalent of wiretapping", the government has backed the technology, but only if users consent to being tracked.
Phorm is yet to launch, but its progress will be monitored by companies whose business models depend on leveraging huge databases of personal information. Facebook's huge valuation is seen by many experts as a bet on how the social networking giant might one day exploit its vast arsenal of data. Facebook has already angered users with arcane privacy settings and, through its ad platform Beacon, with attempts to monitor and manipulate user behaviour.
Greg Lastowka, a professor at US law school Rutgers, says it's difficult to assert privacy rights when a person voluntarily shares data by posting it on a social networking site. "We talk about the ownership of coins and apples because these are objects that are subject to the control of only one person," he explains. "While it is hard to share many forms of tangible property, information has the opposite characteristic. So you might say, 'it's your data', but what does that mean?"
To Web entrepreneur David Murray Hundley, it means a lot. His drinksin.com website mixes AI and geolocation to help suggest bars and pubs. All social networks are competing to operate at the centre of an ultra-personalised Web, he says, which makes data ownership a big land grab. "Why did Microsoft pay so much for [a share of] Facebook?" Possibly because Facebook is only "three or four years away from knowing everything about you from your status updates". Like online dating, it's how you interpret the information that counts.