Definition
Data stewards are responsible data leaders from the public, private, and civil sectors seeking to create public value through data collaboration.
Communities across the world face unprecedented challenges. Strained by climate change, crumbling infrastructure, growing economic inequality, and the continued costs of the COVID-19 pandemic, institutions need new ways of solving public problems and improving how they operate.
In recent years, data has been increasingly used to inform policies and interventions targeted at these issues. Yet, many of these data projects, data collaboratives, and open data initiatives remain scattered. As we enter into a new age of data use and re-use, a third wave of open data, it is more important than ever to be strategic and purposeful, to find new ways to connect the demand for data with its supply to meet institutional objectives in a socially responsible way.
This self-directed learning program, adapted from a selective executive education course, will help data stewards (and aspiring data stewards) develop a data re-use strategy to solve public problems. Noting the ways data resources can inform their day-to-day and strategic decision-making, the course provides learners with ways they can use data to improve how they operate and pursue goals in the public's interests. By working differently—using agile methods and data analytics—public, private, and civil sector leaders can promote data re-use and reduce data access inequities in ways that advance their institution's goals.
In this self-directed learning program, we will teach participants how to develop a 21st century data strategy. Participants will learn:
- Why It Matters: A discussion of the three waves of open data and how data re-use has proven to be transformative;
- The Current State of Play: Current practice around data re-use, including deficits of current approaches and the need to shift from ad hoc engagements to more systematic, sustainable, and responsible models;
- Defining Demand: Methodologies for how organizations can formulate questions that data can answer and make data collaboratives more purposeful;
- Mapping Supply: Methods for organizations to discover and assess the open and private data that may be available to them and that is needed to answer the questions at hand;
- Matching Supply with Demand: Operational models for connecting and meeting the needs of supply- and demand-side actors in a sustainable way;
- Identifying Risks: Overview of the risks that can emerge in the course of data re-use;
- Mitigating Risks and Other Considerations: Technical, legal and contractual issues that can be leveraged or may arise in the course of data collaboration and other data work; and
- Institutionalizing Data Re-use: Suggestions for how organizations can incorporate data re-use into their organizational structure and foster future collaboration and data stewardship.
The Data Stewardship Executive Education Course was designed and implemented by program leads Stefaan Verhulst, co-founder and chief research development officer at the GovLab, and Andrew Young, The GovLab's knowledge director, in close collaboration with a global network of expert faculty and advisors. It aims to:
- Empower you to solve public problems through systemic, sustainable, and responsible data re-use;
- Provide you with a powerful toolkit to draft a data re-use strategy that addresses public needs;
- Introduce the latest policies, tools and technologies for use in designing, implementing, and managing data initiatives;
- Enable you to pursue more purposeful, demand-driven partnerships and collaborations with a broad and inter-sectoral group of actors;
- Guide you on the best ways to mitigate risks and promote responsible data practices; and
- Ensure you can translate data into insights and insights into meaningful action.
Weekly Activities
Description
In this introductory session, we review the history of data re-use and the ways it has been used to solve public problems. Through the framework of the three waves of open data, participants will learn how views of data use and re-use have changed over time, the impact data re-use has made, and major challenges facing the field.
Key Takeaways
- New advances in data science and data analytics offer tremendous potential for creating public value. New models of cross-sector collaboration, or "data collaboratives," can help to provide actors working in the public interest with functional access to previously inaccessible data assets.
- A (chief) data steward is an essential new function and profession with the mandate to guide data-holding institutions in the creation of cross-sector data collaboratives and data re-use efforts to deliver on the public-interest value of the data they hold. Data stewards have five key roles:
- Partnership and Community Engagement, to reach out and vet potential partners while also informing beneficiaries of insights generated from efforts;
- Internal Coordination and Staff Engagement, to coordinate actors internally and gain sign-off from them;
- Data Audit, Ethics, and Assessment of Value and Risk, to monitor and assess the value, potential, and risk of all data held within an organization;
- Dissemination and Communication of Findings, to act as the "face" of the company's data projects and communicate shared outcomes to external actors; and
- Nurture Data Collaboratives to Sustainability, to work with stakeholders to gather the needed resources and support for broad, long-term impact.
- In addition to public value creation, there is an emerging business case for data stewardship. Several private sector companies are establishing models and processes to create public value with data they hold while also generating business value. Data stewards in the private sector can advance this work by:
- Aligning Corporate Social Responsibility goals with data stewardship;
- Establishing Codes of Conduct (i.e. outlining requirements of parties as well as expected governance frameworks);
- Creating Standardized Legal Agreements and Describing Roles of Participants (i.e. addressing routine compliance and liability issues);
- Identifying Fiduciary Responsibilities of the parties to any beneficiaries in advance;
- Developing common methodologies for use case definition, data discovery, and fitness for use;
- Creating common assessment tools for vetting partners and determining the reputation of institutions and individual participants;
- Developing data access protocols; and
- Developing common standards for retention and destruction across data classes.
Presentations
- Why Does Data Stewardship Matter? — Stefaan G. Verhulst and Andrew Young, The GovLab
- Recent Advances in Data Analytics and their Potential for Data Reuse for Public Good — Ciro Cattuto, ISI Foundation
- Reciprocity of Incentives: The Business and Societal Case for Data Collaboration — JoAnn Stonier, Mastercard
- Why Develop a Data Stewards Program? — Stefaan G. Verhulst and Andrew Young, The GovLab
- AI For Good: Lessons from COVID-19 — Richard Benjamins, Telefonica
Recommended Readings and Multimedia
- "Data, Data Everywhere." – The Economist. 2010.
- William Davies. "How statistics lost their power – and why we should fear what comes next." – The Guardian. 2017.
- Nick Barrowman. "Why Data Is Never Raw." – The New Atlantis. 2018.
- Stefaan G. Verhulst. "Data responsibility: using corporate data to improve our lives." – TEDxMidAtlantic. 2017.
- Thilo Klein and Stefaan Verhulst. "Access to New Data Sources for Statistics: Business Models and Incentives for the Corporate Sector." – OECD Statistics Working Paper. 2017.
- Stefaan G. Verhulst, Andrew Zahuranec, Andrew Young, and Michelle Winowatan. "Wanted: Data Stewards: (Re-)Defining The Roles and Responsibilities of Data Stewards for an Age of Data Collaboration." – The GovLab. 2020.
Recommended Activity
- Develop the preamble to a Data Stewardship Strategy for your organization or institutions outlining, in a broad but tangible manner, how effective data stewardship can create societal and institutional value (1 page maximum).
Description
Following a discussion of the history of the movement, we will discuss current approaches to data re-use around the world. This session will develop the discussion of data models introduced previously, focusing on freedom of information, open government data, and data collaboration. It will then turn to the failures of current approaches and the need to shift from ad hoc engagements to more systematic, sustainable, and responsible models. Participants will receive a detailed explanation of the changes needed to realize a third wave of open data and the ways in which the third wave can produce public value.
Key Takeaways
- The First Wave of Open Data centered around the freedom of information, making data available on request. The Second Wave of Open Data took hold with a focus on open government data and an open-by-default approach. Recent years have witnessed the emergence of a Third Wave of Open Data. This Third Wave of Open Data builds on lessons learned from previous efforts and is defined by four key components: Publishing with Purpose; Fostering Partnerships and Data Collaboration; Prioritizing Data Responsibility and Data Rights; and Advancing Open Data at the Subnational Level.
- Data stewards can capitalize on the potential of the Third Wave of Open Data by using the lessons learned from earlier waves as well as related fields such as the civic technology movement and global efforts to use data to achieve the Sustainable Development Goals. They can do so by establishing a data-driven culture, forging collaborative and cross-functional teams, creating targeted metrics and impact assessment processes, and innovating with an eye toward sustainability and scaling.
Presentations
- Open Government Data Policies and the (Digital) Transformation of Governments — Main Trends and Considerations — Barbara Ubaldi, OECD
- Open Data & Civic Tech: Building on the 2nd Wave to Deliver Excellent Digital Services — Jaimie Boyd, Government of British Columbia
- Operational Strategies and Lessons Learned; The Need for a New Profession and Methodology — Stefaan G. Verhulst and Andrew Young, The GovLab
- Lessons from Leveraging Data for the SDGs — Jessica Espey, former SDSN TReNDS
- Field Visit: LinkedIn Economic Graph — Paul Ko, LinkedIn
Recommended Readings and Multimedia
- Stefaan G. Verhulst, Andrew Young, Andrew J. Zahuranec, Susan Ariel Aaronson, Ania Calderon, and Matt Gee. "The Emergence of a Third Wave of Open Data: How To Accelerate the Re-Use of Data for Public Interest Purposes While Ensuring Data Rights and Community Flourishing" – Open Data Policy Lab. 2020.
- Geoff Mulgan and Vincent Straub. "The new ecosystem of trust: How data trusts, collaboratives and coops can help govern data for the maximum public benefit" – Nesta. 2019.
- Stefaan G. Verhulst, Andrew Zahuranec, Andrew Young, and Michelle Winowatan. "Wanted: Data Stewards: (Re-)Defining The Roles and Responsibilities of Data Stewards for an Age of Data Collaboration." – The GovLab. 2020.
- The GovLab's Data Collaborative Case Studies (2020):
Recommended Activity
- Review and reflect on the draft preambles shared by your peers for Week 1's assignment. Consider how their articulation of the value proposition of data stewardship relates to or could inform your own.
Description
Week three focuses on the first step in a data re-use project: understanding the problem an organization intends to solve, for whom, and how that translates into questions data can answer. Participants will learn approaches for quickly researching complex problems, assessing public interest in identified problems, and segmenting potential stakeholders according to interests and needs. Using experiences from The GovLab's 100 Questions Initiative, this session will describe participatory problem definition and agenda setting, both of which require the capacity to listen to all people affected by or involved in a project, inside and outside the organization.
Key Takeaways
- "Publishing with purpose" necessitates a clear and actionable problem definition to inform stewardship activities and enable the effective matching of data supply and the public-interest demand for it. A good problem definition can reflect on, for example, the key issue to be addressed, why that issue matters, who the intended beneficiaries of addressing that issue are, what solutions were previously tried, why one might be better positioned to address the issue, assumptions present in one's understanding of the issue, relevant counter-arguments and controversies, and current efforts working to address the issue both internally and externally.
- To develop an actionable problem definition, data stewards can experiment with a new methodology and science of questioning. This involves a demand-driven approach that works to address complex issues by asking the right questions. Data stewards can focus on defining questions to generate new situational awareness, better understanding of cause and effect, new predictive capabilities, and improved impact assessment.
- Data stewards can segment the demand for the data that they hold by identifying and mapping potential users or partners. One approach involves mapping potential demand-side actors according to their potential for delivering impact in the near-term and their institutional readiness and capacity to leverage data.
- There is a need for increased responsiveness, connectedness, and engagement in the data re-use ecosystem. Data stewards can take a people-led approach to engagement that is purposeful in whom to engage (e.g. intended beneficiaries, domain experts, local anchor institutions), in what manner and at which stage of the problem-solving lifecycle (e.g. reviewing and commenting on a data re-use strategy or co-creating a problem definition).
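The segmentation approach described above, mapping demand-side actors by near-term impact potential and by institutional readiness, can be sketched as a simple two-axis classification. This is a minimal illustration, not the course's own tool: the 0 to 1 scores, the 0.5 threshold, the quadrant labels, and the example actors are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    name: str
    impact: float     # assumed 0-1 score: potential for near-term impact
    readiness: float  # assumed 0-1 score: institutional readiness and capacity


def segment(actor, threshold=0.5):
    """Place an actor in one of four hypothetical quadrants."""
    high_impact = actor.impact >= threshold
    high_readiness = actor.readiness >= threshold
    if high_impact and high_readiness:
        return "engage now"
    if high_impact:
        return "build capacity first"
    if high_readiness:
        return "ready, lower impact"
    return "monitor"


# Hypothetical demand-side actors with illustrative scores.
actors = [
    Actor("City health department", impact=0.8, readiness=0.7),
    Actor("Neighborhood nonprofit", impact=0.9, readiness=0.3),
    Actor("University lab", impact=0.4, readiness=0.9),
]
segments = {a.name: segment(a) for a in actors}
```

In practice the scores would come from interviews or a scoring rubric rather than fixed numbers, and the threshold is a judgment call for the steward.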
Presentations
- Defining Problems, Topic and Systems Mapping, and a New Science of Questioning; and Identifying and Segmenting Partners, Users, and Beneficiaries — Stefaan G. Verhulst and Andrew Young, The GovLab
- Defining Demand: Participatory Problem Definition and Questioning — Stephen Chacha, Tanzania dLab
- Open Cities: The Drive Towards Data-driven Government — Lilian Coral, Knight Foundation
- Identifying and Segmenting Partners, Users, and Beneficiaries — Stefaan G. Verhulst and Andrew Young, The GovLab
- Field Visit: Data for Children Collaborative with UNICEF — Alex Hutchison, UNICEF
Recommended Readings and Multimedia
- Dwayne Spradlin. 2012. "Are You Solving the Right Problem?" Harvard Business Review, September 1, 2012.
- Stefaan G. Verhulst. 2019. "Raw Data Won't Solve Our Problems — Asking the Right Questions Will." Medium. September 9, 2019.
- Andrew Young, Jeffrey Brown, Hannah Pierce, and Stefaan Verhulst. 2018. "People-Led Innovation: Toward a Methodology for Solving Urban Problems in the 21st Century." The GovLab and Bertelsmann Foundation.
- Stefaan Verhulst and Andrew Young. 2018. "Toward an Open Data Demand Assessment and Segmentation Methodology." New York, New York: The GovLab.
- Aleise Barnett, David Dembo, and Stefaan Verhulst. 2015. "Toward Metrics for Re(Imagining) Governance: The Promise and Challenge of Evaluating Innovations in How We Govern."
Recommended Activity
Develop a detailed overview of a public problem you will seek to address through your institution's data stewardship work. This problem statement should comprise:
- A brief description of the issue area;
- An organizing question;
- An assessment of the problem's likely root causes;
- A list of ways life would be different if the problem were addressed (including positive impacts, and potentially disruptive knock-on effects);
- Assumptions related to your understanding of the problem;
- Relevant counter-arguments, contradictions, or controversies;
- A mapping of stakeholders and potential collaborators; and
- A shortlist of data types and sources that could help to provide insight into the problem and potential solutions.
Description
Data re-use requires both relevant data resources and technical capacity. This session will teach participants how to assess both. In the first part of the session, we will explain how participants can find useful data in their organization and data outside it. Participants will learn how to conduct data audits, map the location of relevant assets among partner organizations (e.g. companies, nonprofits, and academia), and make use of open data platforms. In the latter portion of the session, participants will learn the value of internal capacity reviews and receive recommendations on how they can build internal support for data projects. These explanations will be informed by real-world examples.
Key Takeaways
- To support their work, data stewards often need to tap into sources of data and insights as well as expertise and capacity from both within and outside their institution. An internal and external data and expertise audit can help to clarify which assets could be brought to bear as part of a data reuse strategy.
- Several considerations or criteria will help data stewards to determine which datasets are fit for purpose, such as the level of disaggregation, time scale, geolocation, comparability or interoperability with other data, and the scope or "level of abstraction."
- Institutions can enable intra-institutional discovery and re-use of data and activate interdisciplinary data teams through efforts to standardize metadata and adhere to open data standards, treat data as a service through APIs, and clarify an internal data governance framework with well-defined roles, sharing protocols, standard operating procedures, and rules of engagement.
- Data stewards often need to build internal buy-in and high-level support for their work to unlock necessary institutional resources. They can obtain this support through a clear articulation of the potential value and purpose of the effort, recognition of institutional priorities (and pitfalls), an understanding of how the work would fit into the broader data economy and ecosystem, and a clear strategy for mobilizing existing tools and resources to operationalize the strategy.
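The fit-for-purpose criteria listed above (level of disaggregation, time scale, geolocation, and interoperability) can be turned into a small checklist that compares a dataset profile against what a question requires. The sketch below is only one way to do this. The field names, the example dataset, and the requirement values are hypothetical and would need to be adapted to a real data audit.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    name: str
    disaggregation: str    # e.g. "individual", "neighborhood", "city"
    start_year: int
    end_year: int
    has_geolocation: bool
    interoperable: bool    # shares identifiers or standards with other data in use


@dataclass
class Requirement:
    accepted_disaggregation: set
    first_year: int
    last_year: int
    needs_geolocation: bool
    needs_interoperability: bool


def fitness_gaps(ds, req):
    """Return the criteria on which the dataset falls short of the requirement."""
    gaps = []
    if ds.disaggregation not in req.accepted_disaggregation:
        gaps.append("disaggregation")
    if ds.start_year > req.first_year or ds.end_year < req.last_year:
        gaps.append("time scale")
    if req.needs_geolocation and not ds.has_geolocation:
        gaps.append("geolocation")
    if req.needs_interoperability and not ds.interoperable:
        gaps.append("interoperability")
    return gaps


# Hypothetical dataset and requirement, for illustration only.
ridership = DatasetProfile("transit ridership", "city", 2018, 2022,
                           has_geolocation=True, interoperable=False)
need = Requirement({"neighborhood", "city"}, first_year=2015, last_year=2020,
                   needs_geolocation=True, needs_interoperability=True)
gaps = fitness_gaps(ridership, need)
```

An empty list means the dataset passes every check encoded here. It does not cover criteria such as the "level of abstraction", which need human review.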
Presentations
- Field Visit: Open Data Charter – 5 Years of the Principles in Action — Natalia Carfi, Open Data Charter
- Data and Expertise Mapping — Stefaan G. Verhulst and Andrew Young, The GovLab
- The State of the Data Economy — Pieter De Leenheer, Collibra
- Creating and Training Interdisciplinary Data Teams — Rudi Borrmann, Open Government Partnership
- Building Internal Support and Buy-In for Data Stewardship — Tyler Kleykamp, Beeck Center
Recommended Readings and Multimedia
- Jean-Claude Burgelman, et al. 2019. "Open Science, Open Data, and Open Scholarship: European Policies to Make Science Fit for the Twenty-First Century." Frontiers in Big Data: Data Mining and Management.
- Stefaan Verhulst and Thilo Klein. 2017. "Access to New Data Sources for Statistics: Business Models and Incentives for the Corporate Sector" The GovLab and OECD for Paris 21.
- Beth S. Noveck, et al. 2017. "Smarter Health: Boosting Analytical Capacity at NHS." The GovLab.
- "Open Up Field Guides." Open Data Charter, Open Up Field Guides: Methodology
- "Strategic Planning for Internal Investments." UNICEF, Strategic Planning for Data Investments
- Natalie E. Harris. 2019. "Sharing Data for Social Impact: Guidebook to Establishing Responsible Governance Practices." Beeck Center, Beeck Center appendix
- High-Level Expert Group on Business to Government Data Sharing. "Towards a European Strategy on Business-to-Government Data Sharing for the Public Interest." pp. 95–116.
Recommended Activity
Based on your problem statement from Week 3, identify a few specific, core questions and develop a list of internal and external sources of data and expertise that could support efforts to address the problem.
Description
Data re-use efforts can take several forms depending on the operational and organizational needs of supply and demand stakeholders. In this session, we will provide an overview of the forms these relationships can take, focusing on two axes: Engagement and Accessibility. We will also provide guidance on the organizational decisions that data reuse stakeholders must make about how their work is to be governed and maintained.
Key Takeaways
- There is no one-size-fits-all approach. It is important for data stewards to make decisions and develop organizational models for data collaboration that depend on the needs of data supply and demand actors. To do this, a data steward can determine what is fit for purpose by examining several variables, including:
- Engagement: The degree to which the data supply and demand actors will directly collaborate on the data re-use initiative.
- Accessibility: The conditionality of accessing data by external parties. Some approaches might allow for open access while others might place more restrictions on accessing data.
- Analysis of the current field of practice clarifies an emerging typology of data collaboratives, comprising six core operational models:
- Public Interfaces: Data holders provide open access to certain data assets, enabling independent uses of the data by external parties.
- Data Pooling: Data holders agree to create a unified presentation of datasets as a collection accessible by multiple parties.
- Prizes and Challenges: Data holders make data available to participants who compete to develop apps; answer problem statements; test hypotheses and premises; or pioneer innovative uses of data for the public interest and to provide business value.
- Trusted Intermediary: Third-party actors support collaboration between data holders and data users working in the public interest.
- Research & Analysis Partnerships: Data holders engage directly with public-sector partners and share certain proprietary data assets to generate new knowledge with public value.
- Intelligence Generation: Data holders internally develop data-driven analyses, tools, and other resources, and release those insights to the broader public.
- An understanding of these issues can also help to inform the design of a fit-for-purpose governance framework that can enhance trust among the public and other key stakeholders. Data stewards are experimenting with several emerging governance models for data re-use—such as ethical councils and independent review boards—as well as more traditional approaches—such as contracts and terms and conditions.
- When working towards data collaboration, data stewards may find it useful to map out the data ecosystem to understand the values and incentive structures of relevant actors and partners. By focusing on a few priority use cases at the outset, data stewards can determine which operational and governance approaches are most appropriate for the opportunity and stakeholders involved.
Presentations
- Operational Models and Variables of Data Collaboration — Stefaan G. Verhulst and Andrew Young, The GovLab
- Building and Maintaining Trust Related to Data Sharing — Alison Paprica, University of Toronto
- Field Visit: Microsoft — Jule Sigall, Kenji Takeda, Juan Miguel Lavista, Krishna Sood, Microsoft
- Models for Collaboration: Governance Structures and Case Studies in Biomedical Research — Brian Bot, Sage Bionetworks
Recommended Readings and Multimedia
- Lara M. Mangravite, Avery Sen, and John T. Wilbanks. 2020. "Mechanisms to Govern Responsible Sharing of Open Data: A Progress Report." Sage Bionetworks.
- Alison Paprica, et al. 2020. "Essential requirements for establishing and operating data trusts: practical guidance based on a working meeting of fifteen Canadian organizations and initiatives."
- Stefaan Verhulst, Andrew Young, Michelle Winowatan, and Andrew J. Zahuranec. 2019. "Leveraging Private Data for Public Good: A Descriptive Analysis and Typology of Existing Practices." Data Collaboratives. October 2019.
- Scan 1) Data Collaboratives Explorer and 2) #Data4COVID19 repositories
Recommended Activity
Using your mapping from Week 4 or another current project, draft a brief project road map outlining how you will operationalize collaboration and enable the exchange of data and expertise. Describe which operational model you will use and why it is most appropriate for the opportunity. Outline key steps necessary to forge the partnership, enable collaboration, and clarify decision-making processes.
Description
The use and re-use of data is not without risks to data practitioners, the intended beneficiaries of their projects, and others. In this session, we will review the various dangers data practitioners need to be aware of, including those related to privacy, data security, poor decision-making, and open washing. Participants will learn about the different risks data re-use poses to individuals, a relatively well-discussed issue, and groups, a relatively under-discussed problem.
Key Takeaways
- Although data is often treated as a tangible thing, data is the result of a process comprising several stages: a data lifecycle or value chain. Taking a lifecycle approach to risk identification can help data stewards to diagnose potential issues or sensitivities, develop fit-for-purpose mitigation strategies, and ensure data ethics are at the center of their work. Some examples of notable risks at each stage of the data lifecycle include:
- Collection Stage: Poor data entry, duplication, inconsistencies, and non-representative collection;
- Processing Stage: Insufficient security provisions, aggregation and correlation challenges;
- Sharing Stage: Lack of interoperable institutional norms and practices, improper or unauthorized access, conflicting legal jurisdiction, and different levels of security;
- Analyzing Stage: Inaccurate data modeling, biased algorithms, and poor problem definition/design; and
- Using Stage: Faulty reporting, lack of understanding, and misinterpretation.
- Various parties make decisions across this data lifecycle with significant implications on the effectiveness and responsibility of data re-use efforts. Data stewards can help to track and monitor the "decision provenance" by documenting who is (or should be) responsible, accountable, consulted, or informed about activities and choices. This documentation can help data stewards identify gaps or deficiencies in current decision-making procedures that could introduce new risks.
- Although much of the data responsibility discussion centers on the misuse of data, such as privacy violations or biased data analysis, risks can also arise from the missed use of data that could have created value. Data stewards can help to avoid missed use of data by treating data as the "nervous system" informing and guiding strategic decision-making and operations.
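The "decision provenance" idea above, recording who is responsible, accountable, consulted, or informed for decisions across the data lifecycle, lends itself to a simple log that can be checked for gaps. The following is a minimal sketch under stated assumptions: the stage names follow the five stages listed above, while the `Decision` record, the rule that every decision needs at least one responsible and one accountable party, and the example entries are all hypothetical.

```python
from dataclasses import dataclass, field

# The five lifecycle stages named in the takeaways above.
STAGES = ["collection", "processing", "sharing", "analyzing", "using"]


@dataclass
class Decision:
    stage: str
    description: str
    responsible: list = field(default_factory=list)
    accountable: list = field(default_factory=list)
    consulted: list = field(default_factory=list)
    informed: list = field(default_factory=list)


def provenance_gaps(decisions):
    """List decisions that lack a responsible or an accountable party."""
    gaps = []
    for d in decisions:
        if d.stage not in STAGES:
            raise ValueError(f"unknown lifecycle stage: {d.stage}")
        missing = [role for role in ("responsible", "accountable")
                   if not getattr(d, role)]
        if missing:
            gaps.append((d.stage, d.description, missing))
    return gaps


def stages_without_decisions(decisions):
    """List lifecycle stages for which no decision has been documented."""
    recorded = {d.stage for d in decisions}
    return [s for s in STAGES if s not in recorded]


# Hypothetical decision log, for illustration only.
log = [
    Decision("collection", "Which survey fields to record",
             responsible=["field team"], accountable=["data steward"]),
    Decision("sharing", "Who may access the pooled dataset",
             responsible=["partner liaison"]),
    Decision("analyzing", "Which model to use for forecasting"),
]
gaps = provenance_gaps(log)
uncovered = stages_without_decisions(log)
```

Here `gaps` flags the sharing and analyzing decisions, and `uncovered` shows that nothing has been documented yet for the processing and using stages.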
Presentations
- Introduction to the Data Lifecycle — Stefaan G. Verhulst and Andrew Young, The GovLab
- Avoiding Missed Use — Jacqueline Lu, Mozilla Foundation
- Data Ethics in Practice — Natalia Domagala, United Kingdom Cabinet Office
- Field Visit: Data Responsibility in Humanitarian Action — Stuart Campo, Sarah Telford, Jos Berens, and Fanny Weicherding, UN OCHA
Recommended Readings and Multimedia
- Inter-Agency Standing Committee. 2021. Operational Guidance: Data Responsibility in Humanitarian Action.
- The GovLab and UNICEF. Responsible Data for Children Opportunity and Risk Diagnostic.
- The Global Data Responsibility Imperative
- Principles for Development: DIAL, Principles for Digital Development
- Privacy Policy Guidance Memorandum, U.S. Department of Homeland Security, 2008. Fair Information Practice Principles
- Stefaan Verhulst. 2017. "Why We Should Care About Bad Data." The Governance Lab @ NYU (blog). August 2, 2017.
- Andrew Young. 2020. "Responsible group data for children." UNICEF Issue Brief.
- Meg Young, Luke Rodriguez, Emily Keller, Feiyang Sun, Boyang Sa, Jan Whittington, and Bill Howe. 2019. "Beyond Open vs. Closed: Balancing Individual Privacy and Public Accountability in Data Sharing." In Proceedings of the Conference on Fairness, Accountability, and Transparency.
Recommended Activity
Use the inventory of potentially useful and relevant data assets from your earlier data audit activity and develop a list of potential sensitivities or risks involved in handling those assets across the data lifecycle.
Description
Data re-use projects, such as data collaboratives, need policies and procedures that allow all participants to understand their roles and interact with the data in an ethical, legal, and responsible manner. In this session, we will seek to demystify the legal and governance elements of data collaboration to address the risks identified in the previous section. Using the Contractual Wheel of Data Collaboration as a model, we will invite participants to explore the Why, What, Who, How, When, and Where of their legal agreements and demonstrate the different forms these elements can take.
Key Takeaways
- Data stewardship and re-use efforts often occur without a clear social license or engagement with relevant stakeholders, data subjects, and intended beneficiaries. New forms of engagement, such as data assemblies (citizens' assemblies on the use and re-use of data), can help data stewards to better align their work with public perceptions and preferences and gain a social license for their work.
- To support responsible data re-use, data stewards can prioritize equity in their work. Although new and innovative approaches like differential privacy and synthetic data can help to mitigate risks, no single technical approach can achieve all three types of data equity:
- Representation Equity: ensuring that data used is an accurate reflection of the "world."
- Access Equity: ensuring organizations have the information (i.e. features, data, and models) needed to study and assess inequity and prevent technical bias.
- Outcome Equity: mitigating the risk of inequities arising outside of the system's direct control, including through the creation of "nutritional labels" for data with clear and concise descriptions of important data and metadata elements.
- Data re-use often relies on contracts, data sharing agreements, and memoranda of understanding to establish a legal basis for collaboration with external parties. While these legal instruments tend to include much legalese and jargon, data stewards can focus on core issues and considerations related to the Why (e.g. scope and limitations), What (e.g. data provenance and standards), Who (e.g. rights and responsibilities of parties to the agreement), How (e.g. dispute resolution strategies), When (e.g. data retention provisions), and Where (e.g. jurisdiction implications) of the data collaborative in question.
- Data stewards can consider several data licensing regimes to enable external re-use, such as the Creative Commons, Open Data Commons, and the Community Data License. Data licensing regimes involve different provisions—such as only allowing noncommercial re-use or requiring attribution of the source of the data—that could impact whether they are fit for purpose.
- Data stewards in many contexts are tasked with navigating an uncertain legal, policy, and regulatory landscape. In cross-jurisdictional efforts, they might be faced with the even more challenging task of adhering to conflicting or diverging provisions. To better understand the opportunities and potential legal barriers at the outset of a data re-use effort, data stewards can review potentially relevant privacy, administrative, procedural, and analytical policies that restrict certain activities and alter their strategies accordingly.
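The Why, What, Who, How, When, and Where questions raised above for legal agreements can be used as a completeness check on a draft data sharing agreement. The sketch below is illustrative only and is not legal guidance. The example topic for each question is taken from the takeaways above, while the `missing_dimensions` helper and the draft contents are hypothetical.

```python
# Six questions with the example topic the takeaways above give for each.
WHEEL = {
    "why": "scope and limitations",
    "what": "data provenance and standards",
    "who": "rights and responsibilities of parties to the agreement",
    "how": "dispute resolution strategies",
    "when": "data retention provisions",
    "where": "jurisdiction implications",
}


def missing_dimensions(agreement):
    """Return the questions a draft agreement leaves blank or omits."""
    return [dim for dim in WHEEL
            if not (agreement.get(dim) or "").strip()]


# Hypothetical draft agreement summary, for illustration only.
draft = {
    "why": "Estimate neighborhood heat exposure; no commercial re-use",
    "what": "Aggregated mobility counts with documented provenance",
    "who": "",
    "when": "Delete raw records after 12 months",
}
todo = missing_dimensions(draft)
for dim in todo:
    print(f"{dim.title()}: still needs language on {WHEEL[dim]}")
```

A check like this only shows that each question has been addressed in some form. Whether the language is adequate remains a matter for legal review.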
Presentations
- Governance Frameworks, Contracts for Data Collaboration, and Data Licensing Regimes — Stefaan G. Verhulst and Andrew Young, The GovLab
- Building Data Equity Systems — Julia Stoyanovich, New York University Tandon School of Engineering
- 'Everything Transaction' or Data Sovereignty Introduction — Douwe Lycklama, INNOPAY
- Data Policies and Regulations: Cross-Cutting and Jurisdictional Issues — Nick Hart, Data Foundation
Recommended Readings and Multimedia
- Hayden Dahmm. 2020. Laying the Foundation for Effective Partnerships: An Examination of Data Sharing Agreements. Sustainable Development Solutions Network.
- Haoyue Ping, Julia Stoyanovich, and Bill Howe. 2017. "DataSynthesizer: Privacy-Preserving Synthetic Datasets." SSDBM '17.
- Andrew Young, Stefaan G. Verhulst, Nadiya Safonova, and Andrew J. Zahuranec. 2020. "The Data Assembly Responsible Data Re-Use Framework." The GovLab.
- Andrew Young and Stefaan Verhulst. 2019. "The Contractual Wheel of Data Collaboration." Data Stewards Network.
- GovLab Selected Readings on:
- Juliet McMurren. 2020. "Indigenous Data Sovereignty." The GovLab.
- Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young. 2018. "Data Responsibility, Refugees and Migration." The GovLab.
- "Responsible Data for Children." 2020. The GovLab.
Recommended Activity
Use the Data Responsibility Journey tool to build a risk mitigation strategy for each stage of the data lifecycle to inform the project or program you have been developing throughout the course.
Description
Data re-use projects can address public problems, but their potential is best realized when they can be maintained in the long term. Organizations therefore need to change how they operate so that they are positioned to take advantage of future cross-sector data collaborations. In this concluding session, we will discuss how systematic, sustainable, and responsible data re-use can best be realized through the creation of new roles and processes and a focus on purposeful institutional change management. The session will also feature final reflections on the 8-week course and the path ahead for participants' data stewardship strategies and the field at large.
Key Takeaways
- The field of data stewardship and data re-use has been dominated by pilot projects and experiments that fail to reach scale or are abandoned when funding dries up or the initial timeline comes to a close. To maximize the value and impact of their work, data stewards need to manage institutional change in order to cement a culture of data stewardship.
- To initiate this change and institutionalize their work, data stewards can establish a strategic foundation for implementation, learning, and iteration. They can approach each project or use case as an opportunity to further refine and build an institutional evidence base on the value created through these efforts, the intended (and realized) outcomes, guiding principles, enabling conditions, common challenges, and necessary institutional capacity and capabilities.
- Data stewards may face challenges in ensuring the long-term financial sustainability of their work, regardless of the size of their institution or business. By capturing and amplifying proofs of concept and examples of internal and external value creation and impact, data stewards can better make the case for continued internal investment and/or establish new funding relationships.
Presentations
- Field Visit: Cloudera Foundation — Claudia Juech and Ananthan Srinivasan, Cloudera
- Field Visit: Data Strategy of the Secretary-General for Action by Everyone, Everywhere — Kersten Jauer, Executive Office of the Secretary-General, United Nations
- Final Thoughts and the Path Forward — Stefaan G. Verhulst and Andrew Young
Recommended Readings and Multimedia
- World Bank Group. 2018. Improving Public Sector Performance through Innovation and Inter-Agency Coordination.
- World Economic Forum. 2018. Data Collaboration for the Common Good: Enabling Trust and Innovation Through Public-Private Partnerships.
- Alison Taylor. 2017. We Shouldn't Always Need a "Business Case" to Do the Right Thing. Harvard Business Review.
- Stefaan G. Verhulst. 2020. Today's Rembrandts in the Attic: Unlocking the Hidden Value of Data. Data Stewards Network / Harvard Business Review.
- Andrew Young, Andrew J. Zahuranec, Stefaan G. Verhulst, and Kateryna Gazaryan. 2021. The Third Wave of Open Data Toolkit. Open Data Policy Lab.
Recommended Activity
Compile the elements of the data strategy you have been developing over the previous eight weeks, and create a new executive summary or briefing memo to introduce and summarize the key actions, procedures, considerations, and needs.