Primer 6: Creating the Technical Infrastructure for Re-use
Investing in innovative and sophisticated technologies to improve data use on both the supply and demand sides
Posted on 22nd of March 2021 by Andrew Young, Andrew Zahuranec, Stefaan Verhulst, Kateryna Gazaryan
Open data portals have been key in enabling open data, combining various institutional datasets and allowing users to browse, filter, search, and download data to their machines. While the open data portal format will likely remain a common piece of technical infrastructure, new and sophisticated technological developments could facilitate greater collaboration and responsibility in data re-use. These developments could include improved computing capacity to analyze large datasets and new and secure ways of transmitting data. To facilitate this improved technological development, an intersectoral, multidisciplinary research and development effort will be useful.
Unlocking Funds and Resources: Technological innovation and infrastructure development are often cost-intensive exercises with extended time frames. Organizations can look to various internal and external sources of funding to develop the technical infrastructure necessary for systematizing impactful and responsible data re-use.
Prioritizing Purpose-Driven Innovation: With the support of governments and foundations, data scientists and researchers can co-design and co-develop the technologies needed to implement data collaboration at scale, responsibly and sustainably. This collaborative research could focus initially on core needs such as privacy-preserving, security, and access-control technologies.
Experimenting with New Innovations in Responsible Data: Several recent technical innovations are helping organizations ensure the safe and responsible re-use of data. Three tools in particular can help support data re-use while mitigating risks: differential privacy, which releases information about a dataset without revealing the personal or classified details it contains; synthetic data, which is artificially created replica data; and safe sandboxes, which are tightly controlled data processing environments.
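To make the idea of differential privacy more concrete, the sketch below shows one common textbook approach, the Laplace mechanism, applied to a simple counting query. It is an illustration under stated assumptions rather than a production implementation: the function names (laplace_noise, private_count), the sample records, and the chosen epsilon are all hypothetical, and a real deployment would rely on a vetted library and a managed privacy budget.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred on 0.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1 / epsilon is added.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical records, invented purely for illustration.
    people = [
        {"age": 34, "city": "A"},
        {"age": 71, "city": "B"},
        {"age": 68, "city": "A"},
        {"age": 25, "city": "B"},
    ]
    rng = random.Random()  # unseeded, so each run gives a different answer
    noisy = private_count(people, lambda p: p["age"] >= 65, epsilon=0.5, rng=rng)
    print(f"Noisy count of people aged 65 or over: {noisy:.2f}")
```

The published figure is deliberately imprecise: an analyst still learns roughly how many records match, but the noise makes it hard to tell whether any single individual is in the dataset. Choosing epsilon is a policy decision as much as a technical one, since it trades accuracy against privacy.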
State of Open Data: Data Infrastructure: A chapter from The State of Open Data explaining key elements of data infrastructure as it pertains to open data using an analogy to physical infrastructure.
Where Is Data Sharing Headed?: A piece from the Boston Consulting Group introducing several technical models to facilitate inter-organizational and cross-sector data sharing.
Differential Privacy for Privacy-Preserving Data Analysis: A blog series from the U.S. National Institute of Standards and Technology providing an introduction to differential privacy and practical implementation guidance.
Data Sharing and Interoperability Through APIs: Insights from European Regulatory Strategy: A paper by Oscar Borgogno and Giuseppe Colangelo which examines the importance of APIs within data sharing ecosystems.