
Data mining (sometimes called knowledge discovery; you will see why) is not the most exciting subject, and memorable examples can be hard to find. It is often hard to see how understanding and knowledge can be gained from lots of separate little pieces of data. This story about the US retailer Target makes a very memorable example!
Forbes reports that the retailer’s data mining systems are so advanced that they can quite accurately predict buying habits based on a customer’s situation (such as being pregnant). For example, at some point in early pregnancy, women typically stock up on vitamin and mineral supplements. Women in the second trimester apparently buy lots of unscented lotions. Women closer to their due date buy a lot more hand sanitiser and washcloths. These are fairly easy predictions to make, if you know a customer is pregnant.
More significantly, Target were able to do the reverse, using their vast database of buying habits to predict a customer’s situation (such as being pregnant) and therefore predict future behaviour (i.e. what they are likely to buy soon). Target went as far as using purchasing data to assign each customer a “pregnancy predictor”: a percentage figure estimating how likely the customer was to be pregnant.
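To see how a percentage like this could be produced from purchases, here is a minimal sketch in Python. It is not Target's actual method, which is not public in this detail. The item names, the weights, the baseline value and the function name `pregnancy_score` are all invented for illustration, and the logistic function is just one common way of turning a weighted total into a probability.

```python
import math

# Hypothetical weights: how strongly each purchase raises the score.
# These numbers are invented for illustration, not taken from Target.
WEIGHTS = {
    "vitamin_supplements": 1.2,
    "unscented_lotion": 1.5,
    "hand_sanitiser": 0.8,
    "washcloths": 0.7,
}

# Assumed baseline: with no telling purchases, the score should be low.
BIAS = -3.0


def pregnancy_score(basket):
    """Return a percentage estimate based on the items a customer bought."""
    # Each distinct item counts once; unknown items add nothing.
    total = BIAS + sum(WEIGHTS.get(item, 0.0) for item in set(basket))
    # The logistic function squeezes any total into the range 0 to 1.
    probability = 1.0 / (1.0 + math.exp(-total))
    return round(100 * probability, 1)


if __name__ == "__main__":
    print(pregnancy_score(["bread", "milk"]))
    print(pregnancy_score(["unscented_lotion", "vitamin_supplements"]))
    print(pregnancy_score(list(WEIGHTS)))
```

Notice that no single purchase says much on its own. The score only climbs when several of the telling items turn up in the same customer's history, which is the "lots of separate little pieces of data" idea in action. A real system would learn its weights from the purchase histories of customers known to be pregnant rather than having them typed in by hand.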
If you know a customer’s situation and what they are likely to buy, you can send them promotional material for those items. This is apparently exactly what Target did when they sent advertisements for pregnancy-related items to a teenage girl in Minneapolis. The girl’s father was apparently quite angry about the adverts his daughter received, but not as angry as he was when he found out why she received them...
What Target did was not illegal: the terms and conditions and privacy policy for the store card the girl used presumably informed her that data would be collected and processed. But who reads terms and conditions and privacy policies? And legal, of course, does not necessarily mean ethical. The biggest problem here is the power of data. As in the case of the woman whose anonymous search data was released by AOL (she was tracked down and interviewed by a New York Times journalist), people don’t understand how small pieces of data can build up into a history of who they are and what they have done. This is why privacy is such a hard concept to understand: individual items of data may not be private, but combined they can be, especially when we consider how long digital data is kept in databases.
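The point about combined data can be shown with a tiny sketch in Python. The records below are entirely invented, as are the field names and the helper function `matches`. The idea is simply to count how many people fit each piece of information on its own, and then how many fit all of the pieces together.

```python
# Invented records: (postcode area, birth year, recent search topic).
PEOPLE = [
    ("A1", 1950, "gardening"),
    ("A1", 1950, "dog training"),
    ("A1", 1982, "gardening"),
    ("B2", 1950, "gardening"),
    ("B2", 1982, "dog training"),
    ("B2", 1982, "gardening"),
]

# Position of each (hypothetical) field inside a record.
FIELDS = {"area": 0, "birth_year": 1, "topic": 2}


def matches(records, **known):
    """Return the records that agree with every known piece of information."""
    return [
        record
        for record in records
        if all(record[FIELDS[name]] == value for name, value in known.items())
    ]


if __name__ == "__main__":
    # Each fact alone fits several people...
    print(len(matches(PEOPLE, area="A1")))
    print(len(matches(PEOPLE, birth_year=1950)))
    print(len(matches(PEOPLE, topic="gardening")))
    # ...but all three together fit only one.
    print(len(matches(PEOPLE, area="A1", birth_year=1950, topic="gardening")))
```

In this made-up data, each fact on its own fits three or four people, so none of them looks private. Put the three together and only one person is left.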
Sources:
Forbes news article via tylerdo itgs blog
NY Times article – longer but very detailed article