The Challenge


How can we harness data and information for the health of communities?


Open Humans Network: Find equitable research studies and donate your personal data for public benefit

This project aims to reimagine health discovery as a collaborative effort through the creation of a portal populated with research studies pre-screened for data sharing practices, participants willing to share their data, and public data results.
Solving some of the most pressing problems in health and medicine requires resources that bring together people, ideas, and data in open research. Collective intelligence cannot be leveraged when data is confined to a single lab, commercial company, or health care system. Information technology and the Internet have lowered the technical barriers to sharing data while making the value of sharing abundantly clear.

Typically, research studies are islands unto themselves. After design and funding, each study recruits and enrolls a cohort, collects samples, then generates private data to produce publications and intellectual property. The resulting data silos are economically inefficient and fail to enable the integration of data, participants and insights. Our proposed research portal is an alternative paradigm that centers on open data and collaboration from the outset.  

Highly integrated, longitudinal health data are extremely valuable for science and the advancement of human health. Because detailed, individual-level data can be impossible to anonymize, traditional privacy assurances are problematic. These assurances create structures that reduce sharing, inhibit data aggregation and collaboration, and compromise discovery. Anonymization practices also often isolate participants, preventing them from contributing their full value and from learning about and participating in the dissemination of results.

The Open Humans Network is inspired by and will collaborate with the Harvard Personal Genome Project (PGP). Founded in 2005, the PGP has championed a model that returns data to participants and enables them to share it. To promote data aggregation, integration, and discovery, the PGP has turned the privacy problem on its head: rather than jeopardize the trust relationship with weak promises about anonymity, this study specifically recruits people comfortable with public sharing and the potential for re-identifiability -- a practice called "open consent". To date, approximately 3000 Harvard PGP participants have contributed extensive data and tissue specimens with the goal of advancing scientific knowledge and biodiscovery.

The goal of this project is to create the Open Humans Research Portal, connecting participants willing to publicly share data about themselves with researchers interested in using and adding to that public data. The portal will showcase this public data, facilitating exploration and providing tools for unrestricted download of publicly available machine-readable data sets. Researchers interested in working with participants to create additional data (e.g. performing "microbiome profiling" for "participants with genotype data") must agree to guidelines to become network members.

Initially, this project will specifically target researchers and research participants in Boston and New York.  These two cities are among the most densely populated academic-industrial centers in the world for health and biomedical research.  We will incrementally expand the network to other regions in the United States.

Who is working on the project? Who are your partners?
We will seed this effort through collaborations with several leading examples of equitable research:
  • Harvard Personal Genome Project (George Church, Harvard Medical School)
  • American Gut (Rob Knight, University of Colorado, Boulder / HHMI)
  • Flu Near You Research Participants (Rumi Chunara, Boston Children’s Hospital / HMS)
  • Chronic Fatigue Syndrome Study (Eric Schadt, Mount Sinai School of Medicine)

Other contributors and advisors include:
  • Computational biologist: Madeleine Ball (Harvard)
  • User interface design: Involution Studios (Boston)
  • Secure data protocols: Jeremie Miller (Jabber, Singly, Telehash)
  • Open devices: Charles Fracchia (MIT Center for Bits & Atoms / HMS)
  • Citizen-Scientist Advocate: Abigail Wark (Harvard)
  • Outreach: Celia Fulton Walden (GET Conference & GET Labs)

How do you know there is demand for this project?
Participant Demand: While there has been much progress in health care with patient access to their own medical records, the same cannot be said about the research enterprise, despite repeated evidence that people strongly favor getting access to their data [ref 1, ref 2]. Our direct experience growing the Harvard PGP and American Gut communities (~10,000 members combined) shows that many people are already engaged in the open health ecosystem. Because that ecosystem is still early in its development, there are many unmet needs and plenty of opportunity to make public engagement in health research fabulous.

Researcher Demand: Researchers benefit greatly from public data resources but, frankly, have trouble figuring out how to create them because of complex issues around potential re-identifiability, appropriate consent, and governance. Returning data to participants and allowing them to manage its public sharing cuts this Gordian knot and creates a vital community resource of health data that can be leveraged in many ways. It also creates considerable value for researchers, who gain access to a cohort of well-characterized people who may serve as controls for many different studies.

How is your project different from what already exists?
Participatory research studies that share computable data with their enrolled volunteers are a new and important phenomenon: they enable more meaningful collaboration in research and create new opportunities for participant-managed data sharing. Most health research studies today fail to provide much, if any, agency to research participants on these matters, due in part to simple inertia (“we’ve always withheld data from volunteers”). Public data sharing is stymied further by governance models that only allow one-size-fits-all decision-making around privacy. As a result, access to valuable community health data resources is mediated by a small number of individuals who get to choose what to investigate and publish, and sharing, when it does happen, occurs through restrictive and cumbersome controlled-access databases. Our project will make it possible for people to easily find equitable research studies, aggregate their data over time, and increase the impact of their contribution by facilitating public access to the data.

How will the data or information you use or create be made open?
The whole endeavor will be open. We will seed the portal with four amazing research studies that already return computable data to participants. We will enable individuals enrolled in one or more of these studies to choose to publicly share their data through the OpenHumans portal using a CC0 waiver or equivalent public domain license. The website will have an interface where that data can be explored and downloaded in standard formats.

What will you make or do in this project?
We will create an online portal that has the following three components:
  1. Participant profiles: Participant-facing system for managing public data profiles, based on our experiences with Harvard PGP (see example public profile).
  2. Public data explorer: A public interface for accessing, exploring, and downloading computable data, initiated with data from the four seed projects.
  3. Design Guidelines: We want to raise the visibility and success of IRB-approved research studies that share data with participants, so we will feature these four seed projects and use them as case studies for developing design guidelines and resources that enable more researchers to follow suit.

How can others learn from/build on what you do?
Researchers will be able to learn from our design guidelines about how to facilitate sharing of data in their own studies. Study managers and review boards will have concrete examples to support their decision making when reviewing studies that return data to participants. The individual-level, open health data that we help organize is likely to accelerate the development and evaluation of tools that manipulate human data, because current restrictions drastically reduce the number of software engineers, data scientists, and others who are able to access high-quality human data. We expect the OpenHumans community to become a vital global resource that can be used in many ways: open data can be deployed in efforts to improve biological literacy, a cohort of well-characterized people can serve as controls in numerous studies, and health discovery will be more effective when research data remains connected to the individuals it is about, so that it may be aggregated and re-used in many contexts.

How much do you think it will cost?
For this project to succeed, we believe we need a minimum of approximately $650k in funding for the first 18 months. This will allow our consortium to pursue lightweight integration across these four seed projects, establish a critical mass of OpenHumans participants who bring valuable public data, and feature these resources in a functional portal with solid web design. We also need to devote funds toward establishing relationships with researchers who want to join the network in future phases, writing guidelines for how that will work, and maintaining sufficient staffing to support high-quality interactions with community members.

How would you use News Challenge funds?
We'll spend the majority (70%) on funding people: a software engineer, an informatics expert, as well as contracts and part-time work for user interface design, participant community management, and some support for team members in our consortium. Another 20% will be spent on computational infrastructure needed for a data-intensive site (web server, colocation and bandwidth, and systems administration), and 10% will go towards travel and coordinating a conference in Boston that brings together OpenHumans researchers and participants.

Describe your project in one sentence.
The Open Humans Network connects researchers and citizens who wish to collaborate on the creation of public data resources that will improve human health and biological literacy.

Who is the audience for this project? How does it meet their needs?
Our audience is composed primarily of individuals who are interested in creating detailed public profiles through participation in qualified research activities, such as the Harvard Personal Genome Project or American Gut. The other component of our audience is a small but growing contingent of researchers who agree to return individual-level, computable datasets to participants who enroll in their IRB-approved studies.

What does success look like?
Everyone benefits from a health research enterprise that is more transparent, efficient, and equitable. Toward this end, we aim to reimagine health research and biodiscovery by initially attracting 1000 participants who agree to share data generated as a result of their enrollment in at least one of our four pilot studies.

Your Location
Boston, MA, USA



Jason Bobe

October 22, 2013, 09:04AM
For any researcher who wants to contribute to the study of the "Open Humans" cohort, or any individual who wants to join and contribute their data: We are running an event called GET Labs on April 30, 2014 in Boston (GET = Genomes Environments Traits). Please join us! Here is a poster with more info:

Or you can get in touch with the organizing committee:

Sally James

October 02, 2013, 19:01PM
You may wish to investigate the Synapse program and Bridge project of @sagebio (Sage Bionetworks), which provide a patient dashboard for providing data to researchers. They recently won a Robert Wood Johnson Foundation grant. Sounds similar to what you wish to accomplish.

Jason Bobe

October 03, 2013, 10:57AM
Hi Sally:

First and foremost, congratulations on the RWJF funding for Bridge; that is really very exciting. It is an important validation for the small but growing community of folks working in this area.

Yeah, absolutely. SageBio does great work. I've been to several Congresses and spoke at the last one. I think Synapse and Bridge are definitely potential partners, and where it makes sense, it would be silly to duplicate effort or re-invent any open source infrastructure. Instead, in this proposal we would aim to extend and strengthen those efforts, especially around public human data, an area we've been working in since 2005:

You may not know it, but our projects already collaborate and share common elements. Namely, the great work John has done around PLC was inspired by the "open consent" framework that is the foundation of the Harvard PGP's governance for data sharing. It would be great to continue in that spirit and find more ways to be mutually reinforcing. I'll take it up with Stephen!


Maura Keaney

September 21, 2013, 13:02PM
Thanks for your submission. What an innovative way to create opportunities to share data. Do you envision running into any obstacles with funders of research who may want to have more ownership of their findings?

Jason Bobe

September 22, 2013, 10:05AM
This model may not be appropriate for all funders of research, but it will be very attractive to many. Here are some examples of major funders who support data access & sharing:

Government: The NIH has had data sharing policies in place since at least 2003. With highly identifiable datasets, like genomic data, government-supported open-access databases have run into problems with sharing, mostly due to outdated informed consent practices. As a result, many data resources have become difficult to access, e.g. dbGaP. Our "open consent" model is a solution that enables sharing. An editorial in Nature Medicine this month encourages researchers to adopt "open consent".

Private Industry: Pfizer's Craig Lipset has been working on making the return of clinical trial data to participants a new standard. This is a new but encouraging idea. I think it also speaks to the possibility that business models and data sharing can co-exist.

Individuals: Access to medical records is now seen as a fundamental right in the United States. The same can't be said (yet!) about health research data. I think when taxpayers (a major funder of research) take up this issue, as has been done with medical records and open access to scientific literature, we'll start to see more data liquidity. Helping individuals identify equitable research practices (and those that are not) through the Open Humans portal will increase awareness about the importance of this issue.

Sandra Holt

September 18, 2013, 14:01PM
This would be awesome!

Althea Mcluckie

September 17, 2013, 21:26PM
Regardless of the success of your application, I wanted to try to connect. The National Participant Network might be able to support your idea with connection to our participants. If you want to explore us, we can be reached at Good luck!

Jason Bobe

September 18, 2013, 06:27AM
I'll check it out, thank you.