Author and consultant Geoffrey Moore once tweeted: “Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway”. An innovative thought when it was first written in 2012, it was in many ways prophetic of the world we now live in.
Last year’s Cambridge Analytica scandal – which brought to light the company’s improper use and distribution of consumer data, later used for political advertising purposes – is the most prominent example of public anger at the misuse of valuable data.
As awareness of the matter increases, so does the volume of concern from security companies and advocacy groups around the world, who in particular stress the need for privacy, as well as improved data use and protection.
Transport for London (TfL) – the operator responsible for running the Underground, bus and Overground services across the British capital – is one of the latest companies to come under the spotlight for its use of passenger figures. In particular, as of July this year, the company is facing public scrutiny for starting to track its commuters’ smartphones through Wi-Fi.
The move, which TfL is arguing will help it understand journey patterns and improve services across the network, has been criticised for presenting a number of issues related to anonymisation of data and transparency.
The issue has alerted experts in the UK and overseas, with some calling on TfL for more clarity on its programme. But is there anything to be concerned about?
Collecting data from passengers
TfL first trialled its smartphone tracking scheme in 2016; tests were carried out at 50 stations across the network, as part of a pilot scheme that collected more than 509 million pieces of data in the space of four weeks.
During that period, customers didn’t necessarily have to connect to a specific Wi-Fi network for the collection to happen, as mobile devices automatically send out signals that are then picked up by TfL.
As a result, some 5.6 million Wi-Fi-enabled mobile devices searched for a network on the Underground by sending out a unique identifier – the so-called Media Access Control address – to nearby routers. Information extracted from these connection requests was gathered and depersonalised by TfL.
As the operator wrote in a press release in May, the results of the scheme revealed pivotal information about commuter movements – such as the number of people passing through a specific station during a particular timeframe – that was impossible to obtain through paper-based surveys or traditional ticketing data, such as that extracted from Oyster card journeys.
Three years – and a lot of close cooperation with the Information Commissioner’s Office – later, the programme is now fully operational at 260 Wi-Fi-enabled London Underground stations.
Privacy and data usage concerns
The fact that information is gathered even when devices are not connected to Wi-Fi leads to a number of implications that, according to experts, TfL should be looking into.
First of all, says Andrew van der Stock, senior principal consultant at software specialist Synopsys, it means that customers who don’t want their data harvested will have to turn the Wi-Fi off or switch their device to airplane mode.
“This [data gathering process] has been done in shopping centres for a very long time,” he adds. “They use it to identify crowds for traffic.”
But the problem is that while visiting particular shopping centres is optional – implying customers voluntarily choose to have their movements tracked – the Tube is often the only option for hundreds of thousands of London commuters.
“Technically, there is no difference between what shopping centres and TfL are trying to achieve here,” explains van der Stock. “But in privacy relation, you have a different class of collection happening. Unless people change their mobile phones on a regular basis, or deliberately try to shield themselves, they have no option but to be collected.”
Further concerns, he says, should be raised about the privacy and data protection policy the operator is putting in place. As TfL explained in a press release, passenger information is automatically depersonalised and encrypted through a pseudonymisation process that protects the individual’s identity.
However, van der Stock argues that pseudonymisation doesn’t guarantee full anonymity and risks exposing data to de-anonymisation processes.
As he puts it: “It does worry me that they don’t seem to have much in the way of a data protection guarantee here, such as making sure that they are eliminating the data as quickly as possible and being very transparent about what they’re using it for and who they’re sharing it with.”
TfL has indeed confirmed it might share collected data with the police or third parties, though it gives assurances that no specific information – only the analysis of it – will be passed on.
However, van der Stock says that this is not enough of an assurance, especially considering the precedent created by the Cambridge Analytica case.
“Even if you don’t know exactly who these people are, you can make some very interesting demographic information about them and be very specific about who this person might be,” he claims. “These trends risk being sent to companies like Cambridge Analytica, whose job is to associate interesting pieces of information, and then sell them to someone else.”
TfL fires back
Ever since announcing the tracking scheme, TfL has been working to reassure its customers that the situation is not as grave as many portray it. For starters, it argues, fully anonymising passenger data would compromise the very purpose of the scheme.
As a TfL spokesperson explains, “If you were to fully anonymise it, then when you would collect it, you would scramble the numbers and get a completely randomly generated figure. So, if you were trying to understand a pattern, you would never be able to do that.
“[By depersonalising the data] we can understand the patterns of the system, but we have no ways of actually identifying who people are.”
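The distinction the spokesperson draws can be illustrated with a short Python sketch. It is not TfL's actual method, which the operator has not detailed here: the keyed hash (HMAC-SHA256), the secret key, the function names and the sample MAC addresses and stations are all assumptions made for illustration. A pseudonymised identifier is the same every time a device is seen, so its sightings can be linked into a journey; a randomly generated one cannot.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key, assumed for this sketch only.
SECRET_KEY = b"example-key-not-a-real-secret"


def pseudonymise(mac: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash: the same MAC address always maps to the same token."""
    normalised = mac.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


def fully_anonymise(mac: str) -> str:
    """Random token unrelated to the input, so sightings cannot be linked."""
    return secrets.token_hex(32)


def journeys(sightings, tokenise):
    """Group the stations seen under each token, in order of sighting."""
    routes = {}
    for mac, station in sightings:
        routes.setdefault(tokenise(mac), []).append(station)
    return routes


# Hypothetical connection requests: (MAC address, station).
sightings = [
    ("AA:BB:CC:DD:EE:01", "Station A"),
    ("AA:BB:CC:DD:EE:02", "Station A"),
    ("AA:BB:CC:DD:EE:01", "Station B"),
]

# Two devices, one of which is seen travelling from Station A to Station B.
pseudonymised_routes = journeys(sightings, pseudonymise)

# Three unrelated tokens: the journey pattern is lost.
anonymised_routes = journeys(sightings, fully_anonymise)
```

The same property cuts both ways: because a pseudonymised token stays stable, the movements recorded under it remain linkable to one another, which is the basis of van der Stock's point that pseudonymisation is not full anonymity.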
To further ensure depersonalisation is preserved, TfL has also set a minimum number of devices below which it will not monitor the data. So, if there are fewer than five device connections at a particular location, they will be counted as zero, meaning “you can’t unintentionally identify someone by the fact that they were the only ones at that specific location.”
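A minimal Python sketch of such a threshold rule follows. Only the figure of five comes from TfL's description; the function name, the data layout and the sample locations are hypothetical, and the operator's real implementation may differ.

```python
MIN_DEVICES = 5  # threshold described by TfL; everything else here is assumed


def suppressed_counts(sightings, minimum=MIN_DEVICES):
    """Count distinct devices per location, reporting small counts as zero."""
    devices = {}
    for token, location in sightings:
        devices.setdefault(location, set()).add(token)
    return {
        location: (len(tokens) if len(tokens) >= minimum else 0)
        for location, tokens in devices.items()
    }


# Hypothetical data: six devices on a platform, one in a corridor.
sample = [(f"device-{i}", "Platform 1") for i in range(6)]
sample.append(("device-99", "Corridor"))

counts = suppressed_counts(sample)
# The platform count is reported as-is; the lone corridor device becomes zero.
```

Counting distinct tokens rather than raw sightings means a single device that is detected several times at one location still counts once towards the threshold.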
As for sharing information with third parties, TfL’s representative assures that it will only be a matter of passing on analysis of trends and passenger flows. “We’ve worked with the Information Commissioner’s Office all the way through on a privacy by design basis,” he explains. “So, we’re making them design the whole project around privacy rather than it being last-minute thoughts.”
Should TfL go softer on the scheme?
Now a few months into the scheme, TfL looks unlikely to make any changes, although concerns among security experts have not disappeared.
“We leave a digital footprint and wake as we navigate the real world that is analogous to a trail of bread crumbs, but less private,” explains Sam Curry, chief security officer at Cybereason, a cyber security company based in the US.
For this reason, he says, TfL should start to be more transparent about its use of passenger data, which could be achieved in a number of ways.
“[TfL should] create a privacy advisory board or committee, keep the data and analytics under lock and key,” Curry argues, mentioning it would also need to “have policies on others collecting underground digital telemetry and make these known.”
Adding to this statement, van der Stock believes TfL should switch to an ‘opt-in’ scheme that allows customers to sign up to the Wi-Fi tracking scheme instead of having to opt out of it.
But TfL’s spokesman reiterates this is definitely not in the cards. “We’ve run a pilot scheme and we’ve done a lot of customer awareness campaigns to actually make sure that people understand the reasons why we’re collecting this,” he says.
“We’re not collecting any personal data, we’re not collecting any further information, and we will not be combining this data with other methods to identify people. This is purely so we can understand how people move across our network so that we can actually give a better service.”
Does the end ultimately justify the means, though? As van der Stock himself concedes, the thousands of London Underground commuters are unlikely to ask themselves this very question.
“People have accepted the fact that when they go to a shopping centre, this sort of information is collected,” he concludes. “And I think that there’s probably not going to be enough of a kerfuffle to change TfL’s mind.
“Once it becomes the norm, it’ll be too late.”