The Center for Data Innovation: Data Innovation Day

Yesterday, I had the privilege of attending the second annual Data Innovation Day, sponsored by the Center for Data Innovation. The event took place in downtown D.C., and in keeping with its capital location, half of the talks were devoted to the use of open data in local and federal governments. Here’s a brief look at some of the highlights of the day:

Data Innovation Day kicked off with a talk from Nick Sinai, the U.S. Deputy Chief Technology Officer, White House Office of Science and Technology Policy. He discussed the strides that the Obama administration has made to promote data-driven innovation, including the Open Data Policy and public-private collaborations around data.

The ultimate goal of the administration’s open data movement is to allow everyone (not just the government) to reap the benefits of the open data revolution. This means that we will see new, clever uses of open federal and local data sets — successful startups Trulia, Zillow, and the Climate Corporation offer great examples of how private companies can use public data in exciting ways that benefit society. Data, after all, informs more than just apps and software; it inspires innovation and service and promotes market transparency and competition.

Cameras and road data are being used to reduce traffic; cell phones passively report potholes; free tax records enable efficient mortgage acquisitions; and acoustic sensors and GIS are being used to pinpoint gunshots. These are all examples of innovative ways people are using open federal data — and a good explanation of why the White House believes it’s so crucial to make more government data available while at the same time making it easier for people to find and use.

Ben Balter’s lightning talk also discussed the White House’s open data initiative, this time in terms of the open data policy itself. The White House open data policy is a living, breathing document hosted on GitHub. Unlike many policies, which are assembled in committee and outdated by the time they’re published, this document is a collaborative piece that invites comments from both experts and the general public. It is constantly up to date, since it lives on the web, and options are given to “help improve this content,” “view revision history,” and “discuss” the content. In this way, the open data policy document gives citizens the chance to eavesdrop on and participate in what used to be a closed-door conversation. It’s an innovative approach to shaping and publicizing a very innovative policy and one that will hopefully set a precedent for future lawmaking.

Two other lightning talks focused on personalized public policy and data visualization methods. The biggest impacts of the day, however, came from the panels the Center for Data Innovation had assembled, which covered the data economy, health care analytics, and the smart city movement.

Author Joel Gurin, Francine Berman of the Research Data Alliance, and Paul Zolfaghari of Virginia-based MicroStrategy first discussed the U.S. data economy. Ownership and cost of data were the topics of the hour, with Berman asking who pays for data collection and storage. In some disciplines, data provides a competitive advantage, so it’s hard to promote sharing among firms. The task becomes even more difficult when companies are asked to fund the collection of data that is then shared freely with competitors. Industries like health care, however, have shown that there’s an enormous benefit in using open data to improve treatment, bringing down patient costs in the process and opening the door for further medical innovations.

Because of this, Zolfaghari said, the next generation’s leadership is tied to its abilities around data, and managerial positions like chief analytics officer and chief data officer are springing up to bridge the gap between tech/data and business. In order for the U.S. to remain at the forefront of the data economy, Zolfaghari believes, colleges and universities must produce data scientists who are able to work across disciplines. In the health care industry, this would mean physicians who are able to use data science tools to better understand their patients’ records and data scientists who understand the inner workings of health care. The recent Moore Foundation grant for UC Berkeley, the University of Washington, and New York University is a good example of this kind of collaborative trailblazing.

The health care analytics panel with Russ Cucina, Marcia Kean, and Nina Preuss spoke to this need as well, explaining that physicians are interested in evidence and want better data to make better decisions. At the same time, consumers are crowdsourcing their medical and genetic data, looking for answers outside of traditional health systems, whether by collecting data with wearable technology or having their genomes sequenced. Ultimately, Kean said, health care is looking for the perfect marriage between traditional clinical trials and the kind of mass data collection that patient assistance makes possible. We need to collect all sources of information, including implantable and wearable devices, and tie them to a secure electronic health record. Then, and only then, will the health care industry be able to take full advantage of the open data revolution.

The health care panel pointed out that there are all sorts of things we can currently imagine doing with data, but an even larger universe of applications we can’t even dream of at this moment. The field of open data is wide open and there for anyone who wants to take advantage.

Did you attend the Data Innovation Day conference? What was your favorite session? Did you learn anything new or eye-opening? Let us know in the comments!