Open data, in short, is data available to any one of us. Released by large organisations, it is changing the way passengers in the transport sector are fed information and, moreover, how they travel.
In April, Transport for London (TfL) announced that more than 2,000 new developers had signed up for its open data programme in the preceding six months. The Application Programming Interface (API) allows developers to register to access TfL data free of charge, and use it as they please – creating journey planner apps, live maps, real-time location services, and so on.
“In 2009 we began to make large volumes of data available,” says Phil Young, head of online at TfL. “Initially that was things like the location of stations, but very quickly it became the live data. We have now around 200 items that people query and it is all forms of transport. All we ask is that people attribute us, say you’re powered by TfL.”
In total, more than 8,000 developers are signed up with TfL, ranging from large technology companies to ‘high-volume app developers’ to SMEs and academic institutions. “I think we’ve reached a critical mass with open data in London,” says Young. “It is widely talked about and known.”
It is not just TfL and London. Network Rail (NR), the Association of Train Operating Companies (ATOC) and National Rail Enquiries (NRE) release feeds such as train timetables, station footfall, and train positioning and movement event data. Those wishing to use this have to create accounts and agree to the terms and conditions.
As one example, data from a system known as Darwin powers NRE and train company websites, mobile apps and departure board screens, but it can also be used by the open data community. Openness is growing, but when and why did this subset of the industry emerge?
2011: gaining popularity
For Peter Hicks, the “big moment” was 2011. Hicks, who runs OpenTrainTimes – built on open data from NR – the Open Rail Data Wiki and a dedicated forum, all in his spare time, took the decision “to knock on the right doors” and get the word out. In 2010, the UK Government also pushed a transparency agenda, encouraging organisations to share what they have. NR and ATOC soon did just that.
“Ten years ago, if I wanted to know if my train was on time, I’d have to go to the NRE website before I left the office to check. There was no possibility of doing something different,” he says.
“Where we are now, the industry is more transparent, which in turn means they are more honest about what goes on. I’m of the belief that being transparent and honest puts you in a better light. People are incredibly cynical about rail travel in the UK, but open data can help.”
Transparency is something Hicks mentions more than once. His OpenTrainTimes website (he’s currently working on an app version) started life as a very simple interface using static data from NR, whereby the user would enter their location to see the corresponding timetables.
Demand has since seen Hicks introduce data and maps that show the location of trains on the railway in real time. He’s now seen as something of an expert in the field, though it has not all been easy. “I’ve had examples when my data has been wrong,” he continues, “but I don’t try to claim my website is always 100% accurate all the time, or infer things within the data that I know cannot be inferred.” The OpenTrainTimes website itself states that “It is as accurate as we can make it”.
Trains.im: trend setter
In 2012, a chemistry undergraduate made it into national newspapers when word of his website spread. Ian Shortman built trains.im using timetable information, a story The Guardian headlined: ‘How a student used open data to beat National Rail Enquiries at its own game’.
“I found myself frustrated when using the rail network,” explains Shortman. “I decided I would see if I could solve my own problem.” Was there, he asks, an easier way to access timetables for journeys, especially when you might not know when you plan to travel?
The purpose of trains.im, Shortman says, “is to provide a simple and free way to access train timetables”. “At the time, railway company websites and apps were not as responsive, or detailed, as they are today,” he adds. “Trains.im shows the final destination station of the train on the timetable, so you know which train to look and listen out for if you’re not familiar with the routes. That information has always been available, but trains.im was the first to incorporate it.”
Mike Flynn is another developer in the same mould as Shortman. He uses data from NR and NRE for his TrainTime, RailwayStation, Stationboard and Tubestation websites. “The more data the more potential, is how I see it,” he says. “I make use of national timetable data, a raw data file which I currently download weekly, so there’s a lot of processing at my end.” His most popular site, traintime.uk, had 699,848 page views in May.
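To give a flavour of the kind of processing Flynn describes, and of the final-destination feature Shortman mentions, here is a minimal Python sketch. It is not either developer's code. It assumes a made-up, heavily simplified timetable extract with one row per calling point; the column names, the sample trains and the `departures_from` function are all invented for illustration, and the real national timetable files are in a different, richer format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, simplified timetable extract: one row per calling point.
# Column names and train times are invented; real timetable files differ.
SAMPLE = """train_id,stop_order,station,departs
A1,1,Leeds,08:00
A1,2,Wakefield,08:15
A1,3,Doncaster,08:40
B2,1,Wakefield,08:20
B2,2,Sheffield,08:55
"""


def departures_from(raw_text, station):
    """Return sorted (departure time, final destination) pairs for a station."""
    # Group the calling points by train.
    calls = defaultdict(list)
    for row in csv.DictReader(io.StringIO(raw_text)):
        calls[row["train_id"]].append(
            (int(row["stop_order"]), row["station"], row["departs"])
        )

    board = []
    for stops in calls.values():
        stops.sort()  # order each train's calling points by stop_order
        final_destination = stops[-1][1]
        for _, name, departs in stops:
            # Skip trains that terminate here: nothing departs onward.
            if name == station and name != final_destination:
                board.append((departs, final_destination))
    return sorted(board)


print(departures_from(SAMPLE, "Wakefield"))
```

Called with "Wakefield", the sketch lists the two invented departures alongside the station where each train terminates, which is the detail Shortman says helps passengers unfamiliar with a route.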
There are numerous other case studies, including Live PPM, which shows real-time performance information for each train company and has approximately ten million hits per month; Raildar, highlighting real-time arrival and departure information; and Real Time Trains, showing similar information. And have you heard of Citymapper? The hugely successful transport app uses open data from TfL, Google, Apple, and more.
Benefits, yes, but concerns emerge
These sites and apps share the ambition of making rail travel and transport just that bit easier, even if it will never be completely hassle-free. Hicks and the cohort of developers can also make money through ad revenue, although Hicks is running OpenTrainTimes out of his own pocket.
But what does it mean for passengers? “They get a lot more choice,” says Young. “Everywhere you go now you can have an app or a screen showing how our [TfL] services are running, really there’s no shortage of availability. It gives us a greater reach to passengers. Now we get very few calls that relate to how to get from A to B.”
Hicks follows a similar line: “From the passenger point of view, there are apps that can help people make better informed decisions. That’s the key.”
However, there are legitimate questions to be asked. The consensus is that, while there will be some market consolidation, more and more people will begin to see themselves as open data developers in the making. If that materialises, is there not a danger of too much information complicating the process rather than simplifying it?
“There’s a challenge [in making sure] that people don’t just develop services, but services that help Londoners [and the whole country],” admits Young. To this end, Young and his team are keen to engage with the community, as is Hicks through his forum. There is a need to “vocalise what the issues and limitations are”, he claims. “Essentially we want people singing from the same hymn sheet.”
A foremost issue is a lack of knowledge. Go into open data with your eyes open. “If you’re not interpreting data in the right way you can come up with the wrong conclusions,” warns Hicks. The pitfall, therefore, is that people “visit your website but you’re giving out information that is not consistent with everyone else. That can detract from the value of open data in general”. Also, bad data in equals bad data out: “Data will never be perfect,” says Hicks.
Hicks and Young agree, however, that if an app or website always shows false information, the reputational damage will ensure it falls by the wayside.
“The cases where people do the right thing will outweigh the cases of bad practice,” Hicks says confidently. “Maybe you have 10,000 good cases but ten bad cases. And of course you always have the official sources.”
Open data: the new normal
Hicks is similarly emphatic in his prediction for the future. He strongly believes that open data will become the new normal, where people will no longer have to chase organisations to open their vaults; it will all be published proactively.
TfL is working on a release of historical crowding data on key London Underground stations. It hopes to have this out by the end of the year, although no date has been set in stone.
“What we think will be really useful for people is to know, in as many places as possible, where the network is likely to be at its busiest. They can make a choice about whether they can avoid those busiest times,” explains Young.
“The data itself is telling you, in time segments, what the busiest times are. We’re working to ready this right now and we’ll encourage developers to get it into their apps.”
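As a rough illustration of how a developer might turn time-segmented crowding data into advice for passengers, consider the short Python sketch below. The format of TfL's planned release is not described here, so the list of segments, the passenger counts, the `busiest_segments` function and the 80% threshold are all assumptions made for the example.

```python
# Hypothetical crowding figures: (segment start time, station entries).
# Invented numbers for illustration only; this is not TfL data.
SEGMENTS = [
    ("07:30", 410),
    ("07:45", 620),
    ("08:00", 890),
    ("08:15", 940),
    ("08:30", 760),
    ("08:45", 520),
]


def busiest_segments(segments, share_of_peak=0.8):
    """Return start times whose count is at least share_of_peak of the maximum."""
    peak = max(count for _, count in segments)
    return [start for start, count in segments if count >= share_of_peak * peak]


print(busiest_segments(SEGMENTS))
```

With these invented figures, the segments starting at 08:00, 08:15 and 08:30 are flagged as the ones to avoid. An app could present the same result as a simple warning on a journey plan.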
Much has changed since the likes of Hicks, Shortman and Flynn launched their websites. A whole open rail data community has emerged and it is only going to get bigger as NR, TfL and others – with no shortage of demand – drive the creation and dissemination of data.