The State of Eindhoven
Analysis: Eindhoven’s data code sets a good example, but is it enough?
Eindhoven, like many cities, is hard at work developing its views on data use, partly because it aims to evolve into a "smart society". In the following analysis, Klaas Kuitenbrouwer of Het Nieuwe Instituut examines Eindhoven's data policy. In the process, he looks at the nature, history and development of the use of data.
Data in the city
In recent years, discussions around the smart city focused mainly on the opportunities presented by technology. They looked ahead to how data-driven applications could transform messy, largely unpredictable cities into shiny green-and-white living environments glowing with health. The dreams were ambitious: big data would finally enable us to solve a range of stubborn urban problems. We would untangle city traffic, usher in the circular economy, predict thieves' and hooligans' behaviour and nab them before they could do anything bad. And data would help us to shine light on a number of other complicated social issues and subsequently get control of them. In this version of the smart city, people were mainly passive providers of data. Their behaviour was something to be read, modelled and ultimately managed.
Recently, however, the debate around the smart city has changed course. The new smart city is more about citizens' needs and less about perfect models. (The UK think tank Nesta outlines this development in a report.) Today, the question is no longer how the city's hardware can be made smart but how the city can become the setting for a smart society. Data, however, continues to play a central role. But data is not a uniform substance. There are different types of data and different ways of collecting it. So what kind of data are we talking about? Where does it come from? Which parties in the city possess it, and which do not? Who is permitted to use what data for what purposes, under what conditions? And is Eindhoven's municipal data policy equipped to help citizens play a central role in the smart society?
A very brief history of urban data
Even the Bible contains a famous story (link in Dutch) featuring a government that collects data. For a long time, governments and local councils were the biggest keepers of data, and it has been used for ages to track the development of cities. Before the rise of computers, information was recorded on paper and stored in files and archives. Citizens provided data by performing actions in society: putting their names on official registers, buying and selling houses, consulting doctors and visiting hospitals, buying and insuring cars, and so on. Censuses were also regularly conducted, with representative groups of citizens asked to answer series of questions. Information stored in various places could be accessed by journalists, policy researchers, historians and others.
From the 1960s onward, increasing amounts of data gradually came to be stored in computer databases. The big advantage of databases was that they made searching for and analysing information much faster. The downside was that searching through data necessitated a new kind of specialist: the programmer. In the 1980s, computers started to be connected to the Internet, and data could be sent instantaneously from one machine to another. This made it easier to combine data sets. Computers also became more user-friendly, so that a broader group of people could read and use the data stored on them. In the 1990s, the Internet exploded, growing with lightning speed into the global network used by almost half the world's population today. With the emergence of the Internet, for the first time, nongovernmental parties became the primary owners of data. At the moment, Alphabet (formerly Google) and Facebook (probably) hold the world's biggest data sets.
Everyone who uses the Internet is constantly generating data, both technical and personal, through their computer activity. On social media in particular, people provide masses of information about themselves (the rise of Facebook began around 2007). Thanks to the large-scale use of RFID since 2000, and more recently the arrival of the Internet of Things, a vast number of machines have joined the army of human users, generating data via sensors and exchanging it online.
Open data
The data traditionally possessed by municipal authorities was mostly explicitly collected in censuses, with citizens' knowledge and consent. This information was used to devise policies and justify decisions. From 2007, a movement gained momentum - first in the United States and later in European countries, including the Netherlands to a significant degree - to press governments and other organisations to make their data publicly available online. The idea, first of all, was that this would serve democracy. If everyone could examine the information used to make public policy, they could take part in discussions and decision-making in a more informed way, and governments would be prevented from twisting the facts. Second, citizens who had access to official government data could use it to identify problems and come up with solutions themselves. Access to open data gave a new practical dimension to citizenship.
The open data movement was, of course, successful. Today, it is standard practice for municipal and state governments to make their data publicly available. This open data is structured in such a way that one can see direct, concrete links between, say, house prices in a certain area and financial information for that area, even when the two sets of figures come from separate databases.
Dutch public organisations publish open data on the Buurtmonitor website. The availability of these figures promotes transparency of governance, helps to justify official decisions, and can be useful in holding parties to account. They can also be helpful in setting and influencing public agendas. Data posted on Buurtmonitor is collected in annual sample surveys and aggregated at postcode 4 level - that is, in units corresponding to the numerical segment of a postcode, denoting a neighbourhood. Buurtmonitor users can zoom in only as far as a particular block of houses. In theory, municipal authorities are interested in more detailed information on development, in both the spatial sense and the temporal one - i.e., more frequent updates.
In Straatkubus, a pilot project currently under way in Almere, the city provides open data aggregated at the postcode 6 level - that is, at the scale of an individual block of houses.
At this scale, however, the private lives of individual citizens begin to become visible, and these are protected by Article 10 of the Dutch constitution. While the government does have certain rights when it comes to interfering in people's private lives (as regulated in the Dutch privacy act), street level is about as far as a city can go in terms of collecting and publicising data.
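The difference between the two aggregation levels can be made concrete with a minimal sketch. It assumes the usual Dutch postcode shape of four digits followed by two letters; the sample records, the function names and the minimum group size are invented for illustration and are not taken from Buurtmonitor or Straatkubus.

```python
from collections import defaultdict

def aggregate(records, level):
    """Group (postcode, value) pairs by the first `level` characters of the
    postcode (4 = digits only, 6 = digits plus letters) and return, per
    group, the number of records and their average value."""
    groups = defaultdict(list)
    for postcode, value in records:
        key = postcode.replace(" ", "").upper()[:level]
        groups[key].append(value)
    return {key: (len(vals), sum(vals) / len(vals)) for key, vals in groups.items()}

def small_groups(aggregated, minimum):
    """Return the areas whose figure rests on fewer than `minimum` records."""
    return sorted(key for key, (count, _) in aggregated.items() if count < minimum)

# Invented survey answers: (postcode, score given by one respondent).
records = [
    ("5611 AB", 7.0),
    ("5611 AB", 8.0),
    ("5611 AC", 4.0),
    ("5612 CD", 6.0),
    ("5612 CE", 5.0),
]

pc4 = aggregate(records, 4)  # neighbourhood level
pc6 = aggregate(records, 6)  # street or block level

print(small_groups(pc4, 2))  # areas at postcode 4 level with a single respondent
print(small_groups(pc6, 2))  # areas at postcode 6 level with a single respondent
```

In this invented sample every postcode 4 area combines at least two answers, while three of the four postcode 6 areas contain exactly one. At that point the published "average" is simply one person's answer, which is the sense in which private lives start to show through at this scale.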
Now that government data has become publicly accessible, it's somewhat disappointing that in spite of all the programming workshops ("hackathons") devoted to the subject, no killer app has emerged. One likely reason is that government data is constructed and formatted specifically for the purpose of underpinning policy, and citizens' needs can't really be addressed using this type of data.
Big Data
As far back as 1941, people were talking about an "information explosion", and the term "big data" has been in circulation since 1998 (see this Forbes article), though the amount then deemed "big" was microscopic compared to what is generated these days. What we call big data today is mainly a product of Web 2.0. It is the endless reservoir of information produced by the online actions of millions of people and machines. It's the digital traces of the human and machine users of Google, Twitter, Facebook, Instagram and other platforms. And it's the information produced by CCTV cameras, traffic sensors, air quality meters, weather stations, smart phones, cars and so on.
Big data is neither structured nor officially "published" by anyone or anything. It is generally the property of the private companies that make the hardware and software used to generate it. Big data is not neatly organised in public databases; it can be found and made visible only using specialised technical knowhow. With big data - unlike open government data - the value lies less in the content of individual data points than in the meaningful patterns that can be discerned in vast numbers of them by using advanced algorithms to find correlations. It's the tide of big data that's primarily responsible for smart-city fantasies.
The promises of big data interest those who are concerned with urban development out of a need for efficiency and manageability. Big data provides insight into urban processes at a new, incredibly rich level of detail. But with the new focus on citizens' place at the heart of the city, the use of big data comes with its share of issues.
Issues
First of all, the average citizen can't read big data. The traditional policy figures now being made available to the public are hard enough to interpret, in spite of efforts to make them accessible. Reading big data requires advanced programming knowledge. Some citizens, of course, have that knowledge, and Eindhoven, with its history as a technological hub, is probably home to an above-average number of them; still, they are extremely few. So, paradoxically, even as individuals become increasingly visible within the ever-richer data streams - to companies in particular - they become less and less visible to themselves.
For that matter, plenty of government statisticians aren't versed in dealing with big data either, having come of age in a world of traditional policy data.
Second, once again, the generation and correlation of data don't necessarily flow from citizens' needs or demands. The question isn't only which data citizens are able and permitted to access; a more fundamental question is how they can have a say in the systems used to gather it.
This touches on a third complex issue connected to big data: that of ownership. Traditional open data belonged to the government, and therefore literally to the citizens it represented. Big data is generated mainly by businesses, and it's highly valuable.
While there is legislation (such as the Dutch privacy act) protecting personal details (name, address, place of residence), it's possible to build up a precise picture of an individual's personal life even when such details aren't technically in play: think of online user history and purchasing behaviour or a mobile phone's location. Current law effectively assumes individuals are responsible for enforcing their own data privacy rights, but citizens clearly have little idea of the various parties processing their data, how it is being used to assess and characterise them, and what the implications are.
This points to the underlying problem: the current legal principles applied to the use of data no longer suffice in the face of the technological reality of big data. The Nederlandse Juristen-Vereniging (NJV, the Dutch Association of Lawyers), in preparing its preliminary advice on issues expected to be pertinent in 2016, examined legislation for the digital sphere and came to the following conclusion. Current Dutch privacy law focuses on the purpose of data processing: some purposes are legitimate, others are not. Safeguarding health, for instance, is a justifiable goal. But the law says nothing about the interest parties may have with regard to data processing - and this is its weakness. Under the law, a commercial party that builds an app to let users see their chances of catching the flu is as legitimate as the World Health Organisation, even if that commercial party then sells the data it collects to insurance companies. The NJV argues that if people are to have a say in what happens to their data, the law must pay more attention to a party's interest in processing data than to the purpose it may technically serve in doing so.
Eindhoven's data policy
The city of Eindhoven aims to design a smart society and is keenly conscious of sensitivities around the handling of data and personal details. Therefore, in September 2015, it drew up an eight-line data code to serve as a guiding framework for how data should be processed in public space. In principle, it's an example worthy of imitation. With this code, the city of Eindhoven is taking initiative and claiming a say in how the smart society should be organised. Nevertheless, on closer inspection, it's apparent that the code fails to address a number of issues.
Read the data code here (in Dutch)
In any case, between them, Rule 1 - "Data in public space belong to everyone. These data are public property. Data collected, generated, or measured (e.g., by sensors positioned in the city) must be made publicly available so that everyone may make use of them for commercial and noncommercial purposes." - and Rule 4 - "Data that do not contain, or no longer contain, personal details should be made available in such a way that everyone has equal access to those data (e.g., via an open data portal). No technical or legal impediments limiting access to data shall be imposed." - establish terms for making collected data publicly accessible, and thereby extend the possibility of democratic control over how data is handled. Of course, this depends on another condition: citizens must have the technical ability to read this new open data.
Rule 2 states: "Data may contain personal details. Therefore, these data may concern individuals' private lives. The rules of the Wet Bescherming Persoonsgegevens [Dutch privacy act - ed.] apply to these data. These data may be made publicly available only following anonymisation."
Technically, this means the data must be stripped of people's names, addresses and places of residence. Thanks to big data, though, a detailed picture of an individual's private life can be obtained even without them. So the question is: how far does anonymisation go? Are IP addresses, for example, also removed?
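A small sketch can illustrate why removing names and addresses may not be enough. Everything in it is hypothetical: the field names, the three invented records and the idea of a "cell" as a coarse location are assumptions made for the example, not a description of how Eindhoven or the Dutch privacy act defines anonymisation.

```python
from collections import Counter

# Fields treated as direct identifiers in this sketch (an assumption).
DIRECT_IDENTIFIERS = {"name", "address", "residence"}

def strip_direct_identifiers(record):
    """Return a copy of the record without name, address and place of residence."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

def count_unique(records, fields):
    """Count the records that are the only one with their combination of
    values for the given fields, i.e. that can still be singled out."""
    combos = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(combos)
    return sum(1 for combo in combos if counts[combo] == 1)

# Invented records; the IP addresses come from a range reserved for documentation.
records = [
    {"name": "Resident A", "address": "Example Street 1", "residence": "Eindhoven",
     "ip": "192.0.2.10", "home_cell": "cell-17", "work_cell": "cell-42"},
    {"name": "Resident B", "address": "Example Street 2", "residence": "Eindhoven",
     "ip": "192.0.2.11", "home_cell": "cell-17", "work_cell": "cell-08"},
    {"name": "Resident C", "address": "Example Street 3", "residence": "Eindhoven",
     "ip": "192.0.2.12", "home_cell": "cell-23", "work_cell": "cell-42"},
]

anonymised = [strip_direct_identifiers(r) for r in records]

print(count_unique(anonymised, ["home_cell"]))               # singled out by home location alone
print(count_unique(anonymised, ["home_cell", "work_cell"]))  # singled out by home plus work location
print(count_unique(anonymised, ["ip"]))                      # singled out by IP address
```

In this invented sample, one record can be singled out by home location alone, and all three by the combination of home and work location or by IP address, even though no name or address remains. How far anonymisation has to go therefore depends on which of the remaining fields are removed or coarsened as well.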
Rule 3 states: "Data that do pose a privacy or security risk may be processed only within the framework of the Dutch privacy act. Storage and processing of data must be carried out according to existing legislation." In fact, all this says is that the law is applicable - a possibly reassuring but redundant statement. The NJV, however, argues that the law is no longer actually adequate. Though that isn't the city of Eindhoven's fault, it does mean its code doesn't truly suffice as a set of guidelines for handling data obtained in public space.
Finally, Rule 8 states: "The municipality will remain in dialogue with the parties that contribute to the data infrastructure in the city and will strive to create income opportunities and a fertile economic climate."
This position seems to have been adopted in order not to overly deter private parties that are, or could be, involved in maintaining the local data infrastructure. It also suggests - correctly - that there are still many considerations to weigh up with respect to the balance between public and private agendas in connection with Eindhoven's data infrastructure.
[This article has also been published on E52, media partner of The State of Eindhoven.]