Event
Fourth Wave of Open Data Seminar: Data Commons for the Public Good
Posted on 13th of June 2025 by Christopher Rosselot
How can we build and sustain data commons that balance openness and trust while fueling innovation for the public good? Can they ensure that communities and other networks have a say in how their data is used? Can they ensure that public interest organizations can get access to the data they need to meet societal challenges?
In “Data Commons for the Public Good,” the Open Data Policy Lab’s fourth panel on the Fourth Wave of Open Data, Vivienne Ming (Chief Scientist, The Human Trust), Angie Raymond (Director of Data Management and Information Governance, Ostrom Workshop), Heather Coates (Data Steward & Data Librarian, Indiana University), and Alek Tarkowski (Director of Strategy, Open Future) joined The GovLab’s Stefaan Verhulst to answer these questions.
Together, they discussed the design principles, co-creation, and purpose of data commons.
Design Principles and Incentives
The discussion began by asking participants about the foundational aspects of data commons arrangements. What principles guide data commons? How is understanding and trust created among those that provide and use data within the commons?
Vivienne Ming responded by explaining the design principles that guide The Human Trust's use of unstructured human development data to build biobehavioral foundation models.
"Instead of predicting the next word in a sentence, we want to predict the next beat in a lifetime," Ming said.
According to Ming, the Trust's data-pooling model removes data ownership as the factor that complicates and slows down innovation. Developers using the Trust's data should be able to touch the entire dataset, generate new data, and add value back in.
The Trust depends on individuals providing personal and sensitive data, including medical, education, and legal records. Ming made the case that both the individuals providing data and the groups needing access to it stand to benefit from a data commons approach. Data contributors receive wellness recommendations, while entrepreneurs and researchers get to interact with large datasets to which they would not otherwise have access. Data at this scale is rarely accessible given the proprietary nature of traditional, privately held governance models.
Angie Raymond from the Ostrom Workshop pointed the audience to a December 2021 article on how to apply Elinor Ostrom's eight design principles to data commons governance:
Clearly-defined boundaries;
Appropriate rules;
Collective choice;
Monitoring;
Sanctions;
Conflict resolution mechanisms;
Right to self-governance; and
Interoperability.
Raymond distinguished between data commons and trusts, noting that different collaborative data governance models (e.g., commons, trusts) are needed depending on the relative sensitivity of the data they hold. In short, different approaches benefit different types of work.
Problem Orientation and Co-Creation
The participants then discussed how data commons facilitate cooperation. They noted that the data commons governance model exists to drive responsible use of open data in solving public problems. Commons should be problem-oriented and co-created.
In the context of academic research, Heather Coates from Indiana University referenced the need to shift mindsets from extracting data and delivering solutions to prototyping tools that allow the community to co-create data and data applications.
"How do we begin to work with the community and not study them?" Coates said.
Vivienne Ming agreed, saying, “To me, it’s like hammers for good. Don’t hit people with hammers; build things. We can get caught up in building these frameworks or these philosophical commitments when the starting point for doing good with AI is problem-solving and problem-solving means owning the whole problem.”
As institutions, such as research bodies at universities, must contract with entities and not individuals, a crucial step in co-creation is recognizing data providers as communities and not just individuals.
Alek Tarkowski continued the argument for protecting collective data rights, particularly in the context of linguistic data, by pointing to the BLOOM GenAI model from BigScience in 2022. The data commons governance model holds promise for GenAI applications when considered as a framework for data collection.
Future and Purpose of Data Commons
Data collaboration for AI-usable data is still in its infancy. Experimentation is needed to determine what governance models are fit for which purposes. This teed up Stefaan Verhulst's final question to the panelists: How is purpose defined in a commons-focused manner, and what questions should be prioritized?
For the Human Trust, Ming stated, the board simply asks, "Does this project (that uses Human Trust data) advance Human Development?" If so, the board grants approval.
Angie Raymond remarked that amid a data explosion (and the potential for a data winter), granularity on the types of data is essential. Heather Coates added that the question space originates in relationships and co-creation.
Alek Tarkowski closed out the seminar by pointing to alignment assemblies as a vehicle for using collective intelligence to define the “good” that AI can help create.
***
These are just a few of the reflections offered in our latest seminar on the Fourth Wave of Open Data. To follow the full discussion, watch the video here.
Stay tuned for the announcement of our final seminar and our forthcoming State of Open Data Policy Summit, which will examine larger challenges and trends.