Location data is the big problem, the Singapore-led group says: even if the resolution of a phone's GPS records is reduced in a stored dataset, following a user's track (trajectory in the paper) for long enough will easily identify that user.
Just how revealing location trajectories are is revealed in their analysis of 56 million records: “with two random points, more than 60 per cent of the trajectories are unique”, they write.
The researchers say anonymisation of mobile datasets is improved if the “trajectories” – literally, the “where the user has been” location datasets – are reduced. This way, they write, anonymity can be better protected, without trashing the utility of the dataset.
The researchers, Yi Song of the National University of Singapore (working under an SAP internship), Daniel Dahlmeier of SAP and Stephane Bressan of the National University of Singapore, note that the longer a user's location data can be strung into a trajectory, the easier it is to identify that user. In other words: a couple of location data points is nowhere near as useful as 24-hours' worth of the user's movements.
Splitting a user's trajectory can help preserve anonymity
The attraction of their approach, they hope, is that only one
parameter needs to be adjusted to give users better anonymity: the time
window that trajectories are cut into. That's a simple enough operation
that they believe it will be scalable to very large data sets
No comments:
Post a Comment