The Economics of Personal Data And The (Reckless?) Use Of Unreliable Statistics

A paper by a scholar of the University of Trento (IT), co-authored by people from the Kessler Foundation, Telefonica Network, Telecom Italia and Google, finds that we are ready to sell our personal data for about two Euros.

Although the conclusions are – in principle – fair enough and match the “gut feeling” of anyone who works in personal-data handling, I wonder how statistical evidence could be drawn from the criteria adopted.

I’m not a statistician, but the only part of the paper dedicated to the sample’s composition reads:

All volunteers were recruited within the target group of young families with children, using a snowball sampling approach where existing study subjects recruit future subjects from among their acquaintances … A total of 60 volunteers from the living lab chose to participate in our mobile personal data monetization study. Participants’ age ranged from 28 to 44 years old (μ = 38, σ = 3.4). They held a variety of occupations and education levels, ranging from high school diplomas to PhD degrees.
All were savvy Android users who had used the smartphones provided by the living lab since November 2012. Regarding their socio-economic status, the average personal net income amounted to €21169 per year (σ = 5955); while the average family net income amounted to €36915 per year (σ = 10961). All participants lived in Italy and the vast majority were of Italian nationality.

While, again, I have limited knowledge of statistics, there are a few oddities in the method applied by the researchers that undermine the value of the findings:

  1. The sample consists of only 60 people, all belonging to young (and wealthy enough) families with children. This isn’t a fair depiction of Italian socio-economic conditions. Furthermore, there is not enough information about either the socio-economic status or the geographic location of the participants to actually assess the sample quality.
  2. Even Wikipedia knows that the “snowball” sample selection method is prone to biases. No evidence is given in this paper of how the biases are handled.
  3. Though broadly used, Android isn’t the only platform. A well-balanced sample should have taken into account BlackBerry, iOS and Windows Mobile (or whatever the name is).
  4. The “measurements” of individual traits rely upon psychological categories and methods. Psychology is not a science, and putting a bunch of equations into a highly subjective discipline doesn’t turn it into a hard science (I know, I know, positivism is dead, natural sciences aren’t so “absolute” etc. But try to send a rocket to the moon by assessing the “mood” of a ballistic trajectory and tell me the results.)
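
To put a rough number on the first point, here is a minimal sketch. It assumes, generously, that the 60 participants were a simple random sample, which a snowball sample is not. The function names are mine, not the paper's, and the 1.96 multiplier is the usual normal approximation for a 95% interval. These formulas cover random sampling error only; they say nothing about the selection biases listed above.

```python
import math


def margin_of_error_mean(sigma, n, z=1.96):
    """Approximate 95% margin of error for a sample mean (normal approximation)."""
    return z * sigma / math.sqrt(n)


def margin_of_error_proportion(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion; p=0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)


n = 60               # sample size reported in the paper
income_sigma = 5955  # reported standard deviation of personal net income (euros)

# Uncertainty on the average personal income, in euros
print(round(margin_of_error_mean(income_sigma, n)))

# Worst-case uncertainty on any yes/no share, in percentage points
print(round(margin_of_error_proportion(n) * 100, 1))
```

Under these idealized assumptions, any share estimated from 60 people (say, "X% would sell their location data") carries an uncertainty of about 13 percentage points either way, before any bias from the recruiting method is considered.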

Before concluding that this paper offers no scientific evidence for its findings, I would like to have these (and maybe other, expert-made) questions answered. But I’m afraid that the final judgement wouldn’t change.

A final remark: the lack of scientific method shown in this paper is dangerous because, as often happens, poorly informed journalists jump on the news and “sell” it without any warning to the readers, thus luring them – and the Data Protection Authority, I fear – into thinking that a limited, partial and non-relevant work actually leads to factual conclusions.