Summer of Open Data

Panel #10

Defining the Value Proposition, Building Common Infrastructure, and Avoiding Missed Use

Posted on 23rd of September 2020 by Andrew Young

Panelists, clockwise from top left: Andrew Young (The GovLab); Daniel Jarratt (Infinite Campus); Vanessa Brown (National Student Clearinghouse); Matt Gee (BrightHive); and Felix Schapiro (Office of the Governor, Commonwealth of Virginia)

The Summer of Open Data is a three-month project spearheaded by the Open Data Policy Lab (an initiative of The GovLab with support from Microsoft) in partnership with the Digital Trade & Data Governance Hub, Open Data Institute, the Open Data Charter, and BrightHive. Each week, we speak with data experts in local and regional governments, national statistical agencies, international bodies, and private companies to advance our understanding of how to establish a vision of open data focused on collaboration, responsibility, and purpose.

The Panel

Moderated by BrightHive CEO Matt Gee, the cross-cutting panel featured:

Vanessa Brown, Managing Director, Strategic Initiatives, National Student Clearinghouse;
Felix Schapiro, Workforce Policy Analyst, Office of the Governor, Commonwealth of Virginia; and
Daniel Jarratt, Head of Learning Science Technologies, Infinite Campus.

In a 45-minute conversation, Matt and the panelists spoke on a variety of issues, including incentives for data collaboration and reuse, the need for repurposable legal and technical infrastructure, and avoiding missed use of potentially valuable data.

The full conversation, as well as a brief overview of highlights, is below:

Highlights

Leveraging Administrative Data and Avoiding Missed Use

Matt kicked off the discussion by asking Felix to reflect on how the Commonwealth of Virginia is unlocking the value of administrative data to support workers and learners across the state.

Felix described the importance of laying the groundwork for Virginia’s moves toward inter-agency and inter-departmental data sharing. Many of the early, difficult challenges on data integration, political territory, prioritization of certain information streams, and which data systems should be treated as the authoritative source of “truth” in different contexts have been hashed out over many years. As a result, much of the state’s current work focuses on the operational aspects of data sharing and re-use, instead of getting bogged down in foundational questions that have largely been addressed.

With that said, redundant data processing and analysis, as well as missed uses of data that already exists, can still delay the impact of institutional data.

“The great irony of state government is that we rely on multiplying factors derived by the census to estimate populations when we possess data within the Department of Tax and other data systems that we could use to objectively answer questions about the commonwealth and the population we serve,” said Felix Schapiro.

Defining the Value Proposition and Building Networks

Reflecting on the National Student Clearinghouse’s efforts to understand the return on investment of various education and workforce credentials, Vanessa shared lessons from her organization’s partnership with Workcred and the creation of their Voluntary Data Sharing Network, established earlier this year. The network includes 30 credentialing bodies that collaborate not just on data integration and sharing, but also on co-designing the intended value proposition of the collaboration and its joint data offerings. More recently, the Clearinghouse has been able to share initial labor market outcomes of making this data available, further clarifying the value of the collaboration.

“It opened a lot of eyes and broke down a lot of barriers,” Vanessa said.

She further highlighted the importance of defining the value proposition to incentivize action from various stakeholders. New uses of data require new processes and policies, and people need to be motivated to take on this additional work.

This “incentive compatibility,” as Matt called it, was also important for Daniel. He described how critical it is to “lead with problems of practice that educators actually face and build solutions that educators are actually asking for.” That means focusing on school improvement, he argued, not just compliance or scrutinizing teacher performance.

Felix pointed to a failed data system project in Virginia that created significant, persistent transaction costs for users. He urged practitioners to collaborate early in the process of designing a data initiative to ensure costs make sense and do not disincentivize engagement. The system’s failure was not due to technological issues, but to a misalignment of costs and incentives.

Building Flexible, Extensible Infrastructure and Reducing Transaction Costs

Daniel then reflected on lessons from previous failures in the education data space, including the InBloom platform, and on how Infinite Campus supports data-driven research. His current work focuses on building flexible technical infrastructure that alleviates the need for significant investment in ensuring the interoperability of various data systems, standards, and formats. According to Daniel, Infinite Campus aims to create unified, national technical infrastructure and common, automated business processes while minimizing the requirements on researchers seeking ways to use that data.

“We want to reduce the marginal cost of every new research question,” he noted.

Vanessa pointed to work being done with the network of credentialing bodies to create flexible infrastructure from a legal perspective rather than a technical one. The group is working to establish reusable templates for data-sharing agreements and similar legal instruments to lower the transaction costs of establishing a new collaborative or data-sharing initiative.

Daniel agreed and noted the importance of investing in legal data protections when establishing research data collaboratives. More automated, repeatable processes, such as legal templates, are critical for scaling.

Impacts of COVID-19

Daniel then highlighted the new challenges and demands facing Infinite Campus and other education data providers since the start of the pandemic. He noted that teachers are being forced to learn and switch between different learning apps while students are engaging with data and technology in new ways, including through hybrid online–classroom models.

As a result, funders are more interested in supporting pandemic-focused research. Needs and expectations are changing across the landscape of demand for education data, and educators and data custodians like Infinite Campus are navigating that shifting demand in real time.

These challenges, he argued, cement the case for creating modular data infrastructure that allows for new analyses of data that already exists.

Felix described similar challenges in the face of COVID-19, noting that, “When you’re falling out of an airplane, it is a bad time to design a parachute.”

Final Thoughts: Building the Human Infrastructure

The conversation ended with a clear call for building and connecting human capacity across the data landscape.

Daniel argued that, despite the focus on technology, the two biggest challenges in computer science are people and convincing developers that the biggest challenge is in fact people. Not every problem, he added, can be solved through a technical fix.

For Vanessa, evangelists are key. Data providers need people who can accurately assess stakeholders’ needs and communicate the incentives and value proposition for engaging.

Felix encouraged people to seek out collaborators within their institutions eager to make progress on purpose-driven data use in the public interest.

“Don’t lose hope. There are more allies than you realize,” said Felix.

Next Steps

This panel marks the conclusion of the Summer of Open Data. The series tapped into the wisdom and experience of 31 open data experts from around the world.

This initiative provided clear lessons for accelerating and maximizing the public value of the Third Wave of Open Data. The GovLab and its partners synthesized and shared learnings from across the series in their work on the Third Wave of Open Data.
