The Past and Present of Open Data
The history of open data can be divided into several waves, each reflecting the priorities and values of the era in which it emerged.
The Three Waves of Open Data
Emerging from freedom of information legislation that appeared in many countries in the latter half of the 20th century, the first wave of open data focused primarily on questions of transparency and accountability. Open data was seen as a way to remove secrecy from government activities and to allow experts (journalists, lawyers, and activists) to answer specific queries. The scope of open data was limited, but it served a specific value in making government more accessible.
Some saw this approach as too limited and narrow. With the dawn of the Web 2.0 era, open data practitioners began to look at how data could be used for greater transparency and new ways of problem-solving. This second wave called upon national governments to make their data open by default (rather than releasing it only on specific demand) so that civic technologists could develop responses to systemic problems such as climate change, development, and epidemics. The resulting open data platforms, such as the United Kingdom’s data.gov.uk and the United States’ data.gov, served as showcases for new tools and resources.
While the second wave expanded the scope of open data, it too had its limitations. In particular, many data silos remained untouched (and inaccessible) at the subnational level and in the private sector. Moreover, technologists did not always focus on issues of pressing public concern — there was often a mismatch between the supply of open data and the demand (i.e., the true social and public challenges for which it was needed).
The Third Wave of Open Data
The Third Wave of Open Data emerges from a recognition of both the achievements and the limitations of these earlier movements. While the contemporary approach to open data in many ways builds on prior waves, it also has several distinguishing features.
Major Elements of the Third Wave of Open Data
First, the Third Wave adopts a purpose-driven approach to publishing open data. Instead of publishing data merely for the sake of publishing, an approach likely to flood the ecosystem with datasets that have little actual value, the Third Wave seeks to understand public needs and how those needs may truly be amenable to data solutions. Put more broadly, this latest iteration of the open data movement concerns itself with the technical, social, political, and economic context in which data is produced and consumed. It pays as much attention to the needs and requests of those using data as to those capable of supplying it.
Second, the Third Wave centers the role of partnership and collaboration, expanding the types of stakeholders and data holders who participate in and benefit from open data initiatives. Among other changes, this involves reconceptualizing what constitutes “open” by calling on data holders across sectors and regions to adopt approaches that make data accessible to community organizations, NGOs, academics, and small businesses. Using data collaboratives and other data sharing methods, a wide variety of organizations can bring public and private sector assets to bear on public problems.
The Third Wave also seeks to accelerate open data at the subnational level. Datasets released in previous waves often privileged larger institutions, such as national governments, over smaller ones with fewer resources, such as local and regional governments. By contrast, the Third Wave approach acknowledges that local organizations have a role to play in addressing local needs. It seeks to support organizations closest to people by providing the expertise, technical infrastructure, and resources needed to address a paucity of local information.
Finally, the Third Wave approach prioritizes data responsibility and rights. While the open data movement has always sought to promote public welfare, the Third Wave emphasizes that fairness, accountability, and equity need to be at the core of any data effort. In practice, this means anticipating biases and risks to privacy in project design phases and developing mechanisms to mitigate them. It also entails understanding the entire context in which a data project takes place so as to avoid perpetuating existing inequities.
Many of these features are evident in new and emerging open data movements around the world. The emphasis on partnerships and data responsibility is, for example, what motivates many cutting-edge data collaborative efforts, such as the Yale University Open Data Access program, which allows clinical data holders to share data with medical researchers to develop new drugs and treatments. These and many other examples are featured more fully at our website. Considered together, they suggest that there is indeed an emergent Third Wave of Open Data, and that it offers new ways for researchers, policymakers, civil activists, and the public at large to benefit from previously inaccessible data assets.
The Way Forward
As Audrey Tang, Taiwan’s digital minister, has argued, addressing modern challenges requires more than just basic digital literacy. New innovative research requires institutions to foster a form of data competence that would allow citizens to participate more fully in data efforts as both producers and consumers. As we mark this Open Data Day, we seek a more expansive understanding of the open data ecosystem, one that encourages data re-use and that makes data accessible to key stakeholders in a responsible manner, whether they be public servants, public advocates, or average citizens.
The emerging Third Wave offers encouraging evidence that such a movement is underway. But fully enabling the necessary transformation requires a number of changes in data policy and governance. While specific actions may vary by sector and context, there are a few cross-cutting approaches and principles that need to be emphasized.
For public and private data holders, the need of the hour is to build the policies, systems, and expertise needed to facilitate data reuse, which is central to the Third Wave. More specifically:
First, they can create and empower data stewards: responsible data leaders within organizations who are empowered to identify opportunities for responsible data re-use and who seek new ways of creating public value through cross-sector collaboration. They can act to promote data capacity throughout the organizations they manage, in an effort to break down the silos that often make data re-use efforts seem ad hoc and isolated rather than more generally effective. Policymakers can also work with other organizations in their field to develop data intermediaries (e.g., institutions like Sage Bionetworks and StatsNZ’s Data Ventures) that match potential data research collaborators and help address some of the transaction costs in data collaborative relationships. Finally, they can support the technical infrastructure needed for broad reuse, whether that be in the form of funding new data portals or subsidizing computing capacity among researchers, especially in fields where the datasets are prohibitively large and complex.
For data users, including researchers and civil society, promoting a Third Wave is about making the value that open data produces more apparent and more equitable.
First, data users can work to build an evidence base. They can publish broadly accessible research that demonstrates the insights offered by open data and how those insights can produce tangible benefits to people’s livelihoods beyond increased transparency and accountability. Second, they can support efforts to foster public data competence and encourage meaningful participation in data projects. This work can be achieved through research challenges and participatory agenda-setting. Finally, data users should insist on systems to track decision provenance, so that others understand the decision points impacting data’s collection, processing, sharing, analysis, and (re)use. Awareness of these decision points is crucial to proactively identifying gaps and biases, both of which can undermine project goals.
Finally, all open data stakeholders can work together to establish legitimate governance frameworks to guide data re-use. A recent MIT survey found that 64% of executives in the United States are reluctant to embrace open data because of regulatory uncertainty. Many organizations themselves lack internal guidance on how they can and should engage others. Everyone is affected by data re-use and, as such, needs to have a voice in the policies, plans, and procedures governing it.
Conclusion
The achievements of the open data movement are real, and there is much to celebrate on this International Open Data Day 2021. At the same time, open data researchers and practitioners cannot lose sight of how much work remains. COVID-19, climate change, rising inequality, social and economic divisions, and a panoply of other threats require renewed commitment to harnessing the potential of open data, and recognizing ways in which the movement needs to continue evolving. The Third Wave of Open Data offers a path forward. In this paper, we have sought to highlight its key principles and actions. Considered together, these offer a more effective and legitimate data ecosystem — and, potentially, solutions to some of our most pressing and complex problems.