Swarm Survey - The collective knowledge engine
The platform will enable quick and easy deployment of new collective research tasks. It will offer researchers ways to analyze and edit information. And it will publish research to an open data network for the world to access and build on together.
Who are the users or target customers of your project, and what have you learned from them so far? Please give specific examples.
The target customers of Swarm Survey are publishers, and news organizations in particular. They already experiment with crowd-based research and collective tasks for news projects today, but the tools they use for this purpose are insufficient.
Deploying interactive research cheaply and quickly is hard. The Guardian either plans months ahead or uses SWAT teams for big stories. Both options are costly, mostly in development effort and in the time it takes to release something meaningful.
In addition, many of the data journalism efforts happening now at places like ProPublica, The New York Times, The Atlantic and now FiveThirtyEight, among others, are reliant on data scientists and uniquely talented developers. These are skilled jobs.
We believe crowd-based research projects should be easier for a wider range of professionals to deploy. All news orgs, including the leaders in the field, would benefit from technology platforms that improve the data collection, the analysis, and the output of crowd research.
- The Guardian Australia partnered externally with a team on the Detention Logs
- Satellite imagery searches can be used to cover a large area very quickly, but these tools are not readily available to news organizations
What assumptions are you making in what you propose, and how will you test them?
There are many assumptions about Swarm Survey that will have to be assessed as part of the development process. From input methods to data analysis and editing to output usability, the entire stack is new and will present many challenges.
We’re also making assumptions about a core premise of the platform: the benefits and usability of open and shareable data. We’re hopeful that the market will embrace this as it becomes real and tangible, but publishers have a poor track record of sharing content.
The project will operate in 2 distinct phases - prototype and beta. We will iterate very quickly during the prototype phase and maintain clear success criteria. We will conduct a thorough review at the end of that phase prior to progressing to the beta.
Crucially, we will rely heavily on the teams at The Guardian to trial aspects of the platform with real news stories and real end-users. That will include sessions in The Guardian’s UX lab, where we will watch its end-users working with the product in context.
How will you get your project in front of the necessary people or organizations?
The team will work at The Guardian’s offices in London, and we will sit with the relevant teams that will use Swarm Survey. This will help us design and optimize the service for a key customer that we believe to be representative of the wider market.
After the prototype is complete we will invite applications for a beta period where we will work closely with a small number of customers of the platform in a very hands-on way.
We will see the platform in use by a range of different kinds of customers and how they work with their end-users to create successful crowd-based investigations.
What are the obstacles to implementing your idea, and how will you address them?
The technology stack proposed may be too complicated. We will have to stay focused on the core needs of the customer and always be willing to throw away our work in order to get the solution right for their needs.
Working with customers’ internal platforms may be time-consuming. We will be very careful about maintaining loosely coupled integrations with things like user registration and content management.
Customers may not use the platform in meaningful ways. If a news story is uninteresting, then users won’t engage, and, as a result, Swarm Survey may appear unhelpful. Depending on progress, we may deploy some training resources to work closely with news orgs to get the most out of the service.
We may run out of money before completing a solid product. While unforeseen issues may arise, we will monitor progress against a clear set of goals that map to our budget. The prototype will be completed with significant budget remaining to adjust for the next phase.
We will also establish a board of advisors with experience in the field who can help guide our decision-making.
How much do you think your project will cost, and what are the major expenses?
The initial phases - prototype and beta - will require an investment of $250,000 ($125k for each phase).
Most of the expenses will be staff-related. We will complete the prototype with a team of 4 (CTO, Developer, Designer, Editor). We will require a small budget for hosting, software tools and minor expenses.
In addition, The Guardian will contribute to the cost of the project through shared resources, including business management and operations, office space, use of the UX lab, and editorial and development collaboration.
In the future, we may seek additional funding from other sources and partners in order to develop the product further and turn Swarm Survey into a sustainable and hopefully successful independent platform.
How will you acquire users and build a community around this product?
In Phase I (prototype) we will apply the tools in context with The Guardian’s journalism, the news desk, in particular. This will establish need and benefit for our target customers. We may engage another partner, but we only want 1 to 3 prototype partners.
As the product is refined in response to The Guardian’s use of it, we will begin Phase II (beta). We will open applications for access and reach out directly to Knight Foundation partners and others. We expect 5 to 10 beta partners.
Additional funding will be required for following phases, but we have a plan for growth if and when that happens. *
After the collaboration model is established in Phase II (beta) we will then begin Phase III (v1). The goal of v1 will be to create an active network of news publishers from around the world. We will use co-promotion tactics with our partners to establish the customer base and create a robust partner network. We will also deploy a traditional sales and marketing team for outreach and to build the commercial business.
If we are successful with news organizations then we will assess how to serve other publishers interested in open data and also organizations that want to use the platform commercially.
* We will seek additional funding if and when the product demonstrates real potential, the data provides real value in the world and customers want more from us. That may or may not include venture capital partners, foundation support or other funding sources.
The insights we gain collectively from pooling the information we hold individually have the power to change our understanding of the world and what’s happening around us.
From crowdmapping disasters to searching satellite imagery to tracking political campaign messaging to crowdsourcing the price of milk, there are many ways to use the power of active communities as a data task force to get a clearer picture of an issue that affects us all.
The problem is shared by publishers, communities and groups of active citizens, all of whom need to create clearer pictures of what’s happening in the world. Each crowdsourced data research project is typically a standalone, independent data set, with a data collection and publishing system that gets built and rebuilt for each new project.
Swarm Survey aims to platformize the crowd research process.
It will provide easy-to-use self-serve interfaces for collecting and connecting data, robust data editing and analysis tools, and output systems that make data accessible and useful.
In addition to the high-quality crowd research toolset, Swarm Survey will be an open data network.
Research projects will be open and re-usable by other research projects on this and other platforms. The political campaign messaging from one project may be useful when joined up against voting intentions in another study, for example. Or perhaps the crowdmapped price of milk from one research project may be a useful datapoint in a research project about urban supermarket development.
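As a rough illustration of the kind of cross-project reuse described above, the sketch below joins two small data sets on a shared field. Everything in it is hypothetical: the field names (`postcode`, `price_pence`, `stores`), the sample values and the helper functions are invented for this example and do not describe the platform's actual data model or API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records from two independent crowd research projects.
# All field names and values are invented for illustration only.
milk_prices = [
    {"postcode": "N1", "price_pence": 95},
    {"postcode": "N1", "price_pence": 105},
    {"postcode": "E2", "price_pence": 120},
]
supermarkets = [
    {"postcode": "N1", "stores": 4},
    {"postcode": "E2", "stores": 1},
    {"postcode": "SW9", "stores": 2},
]


def average_by_key(records, key, value):
    """Group records by `key` and average `value` within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return {k: mean(v) for k, v in groups.items()}


def join_projects(prices, stores):
    """Inner-join the two data sets on the shared postcode field."""
    avg_price = average_by_key(prices, "postcode", "price_pence")
    joined = []
    for row in stores:
        postcode = row["postcode"]
        # Keep only areas that appear in both projects.
        if postcode in avg_price:
            joined.append({
                "postcode": postcode,
                "stores": row["stores"],
                "avg_milk_price_pence": avg_price[postcode],
            })
    return joined


combined = join_projects(milk_prices, supermarkets)
for row in combined:
    print(row)
```

A join like this only works if separate projects agree on at least one shared key, such as a location or a date, which is one reason a common platform could make reuse easier than standalone data sets.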
The technology platform will run on a robust Elasticsearch environment with a distributed database. The service layer will be built for performance and growth from the outset. The architecture will be based on services used in production at The Guardian.
The front-end will be developed in a modular way so that it can be extended and integrated with other tools as users find new use cases for the platform. Should we be awarded funding, the first task will be to assess the front-end environment options and identify the solution that supports modular development most effectively.
The principles of freemium software services will be applied in order to create a useful platform for both small organizations with few resources and large organizations with more complex requirements.
Not all research is suitable for public access, for example, and many customers of this platform will value private data over publicly shared data. Fees will be introduced for services that restrict use and access or special services that incur costs to operate.
Similarly, integrations with third-party tools such as email services, CRM and single sign-on platforms will be available for appropriate fees.
Commercial features of the platform will be investigated after the initial proof-of-concept phase of the project is complete and clearly demonstrates value and committed customers.
Initial deployments will be trialled with publishing partners including The Guardian. Use cases will be identified and tested in partnership with key editorial desks in order to optimize all the requirements for both large and small research projects.
After achieving success in testing, we will target publishing partners with similar use cases in the US and Europe. Then we will expand marketing efforts out to global aid and development agencies, social health and academic institutions, and, finally, the larger commercial organizations and more grassroots local communities across the US and globally.
Any organization that publishes information will want to use Swarm Survey.
Our vision is to build a collective knowledge engine.
Swarm Survey will make this possible by enabling topical and issue-based, data-as-storytelling research at both large and small scales. We will create quick, easy, useful, and valuable methods for collecting and connecting data, managing and analyzing information, and publishing and visualizing research.
The result will be a new kind of open data network, a public place where expert local knowledge fuels collective insights about the world.