Workshop participants chose which of six data lifecycle tables they wanted to sit at and discussed the people, projects, tools, challenges, and opportunities they saw at the various stages, including:
- PLAN
- GATHER
- HOLD
- CLEAN & VERIFY
- ANALYZE
- ACT UPON
PLAN
- Imperative to work with end users from the very beginning. If it's not useful to the end users and they're not involved in the development process, it will fail.
- The process of developing partnerships with mobile providers is often oversimplified. It's not difficult, but it is very time consuming.
- Need to promote ownership in ministries and think about how they can get in on the planning process from the very beginning.
- Do this as part of an existing system, in which there's already multisector buy-in, already meeting annually
- Need to plan for technical capacity once the project has launched, esp. with regard to software developers who know the system and can update and adapt it as changes come to light
- Technologies and mobile applications for data collection do not replace tested methods but are seen as complementary until further notice
- Challenge: how to involve local stakeholders already in the planning stage?
- Good idea: take prototypes to the field to show and explain
GATHER
- What is data?
- Range of types of data on a spectrum, i.e., from anecdotal <-----> official
- Two types of data:
- Data that is used to evaluate the success of a project (indicators) versus data that is being collected to create a body of knowledge for other people to draw on
- Talked about the different methods of gathering data: survey, data mining, photos
- Range of tools for collecting data: from paper to dedicated data collections
- Types of audiences: a spectrum starting with the general public (weak and strong ties), then a big gap, then businesses and NGOs, and at the far end statisticians
Case studies, based on group projects
Case Study A
Tech brought to schools. Wanted to find out: is tech having a positive impact on children's learning?
People visited the schools with tech to get qualitative and quantitative data. Qualitative data acts as a supplement for quantitative data, which people don't always want to give. Need to incentivize them, but how?
Need to automate data collection, with mobile phones in the process, to reduce resources needed to collect data
BUT even with phones, you still need physical presence to get data
Case Study B
Ushahidi implementation: collecting mapping data, directed projects and grassroots projects
Volunteers as official data collectors
In both projects, end up with data that has to be reentered, always a translation phase from one type of system to another type of system
Re: the Malawi case, when working with a volunteer force, need to get them thinking about the data in the same way. Otherwise, you may face the telephone game problem.
HOLD
- Reviewed projects and tools: Ushahidi, Apstrata, and other platforms for storing data (Google App Engine, Amazon cloud computing); focused on platforms as a service, not just tools
- Identified key functions that data storage services need:
- Needs to be open
- Has to support flexible schema, easy to adapt to different needs
- Has to provide a simple query language so that people can easily extract the data
- Needs to have import and export functions
- Has to support rich media, not just talking about text data
- Challenges
- Scaling
- Infrastructure needs to grow when the demand grows
- Shouldn't cost a lot of money
- Has to be easy to implement
- That's why we only chose platforms available as a service
- Opportunities
- Standardization of storage platforms to make it easy to plug-in
- Intersections side: storage tools need to have web-services-based APIs and support open formats such as XML, KML, and GeoRSS (an RSS feed that has location-based info attached to it)
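
The key functions listed above (flexible schema, simple querying, import and export, references to rich media) can be illustrated with a minimal sketch. This is not any of the platforms the group reviewed; the `RecordStore` class, its method names, and the sample records are all hypothetical, and a real service would add persistence, access control, and scaling.

```python
import json


class RecordStore:
    """Toy in-memory store (hypothetical). Records are plain dicts,
    so each record can carry different fields: a flexible schema."""

    def __init__(self):
        self.records = []

    def add(self, record):
        # Copy the dict so later changes by the caller don't alter the store.
        self.records.append(dict(record))

    def query(self, **criteria):
        """Simple query: return records matching every key=value pair."""
        return [r for r in self.records
                if all(r.get(k) == v for k, v in criteria.items())]

    def export_json(self):
        """Export all records in an open format (JSON)."""
        return json.dumps(self.records, sort_keys=True)

    def import_json(self, text):
        """Import records previously exported as a JSON list."""
        for record in json.loads(text):
            self.add(record)


store = RecordStore()
# Sample records are invented for illustration only.
store.add({"type": "report", "district": "North", "text": "Well is dry"})
# Rich media is stored as a reference to a file rather than as text.
store.add({"type": "photo", "district": "North", "media_url": "photo-001.jpg"})
store.add({"type": "report", "district": "South", "text": "Clinic reopened"})

north = store.query(district="North")

# Round trip: export from one store and import into another.
copy = RecordStore()
copy.import_json(store.export_json())
```

Keeping records as plain dictionaries is what lets the photo record carry a `media_url` field that the text reports lack, without any schema change.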
CLEAN & VERIFY
- Challenges
- Don't have ways to verify the data and have a meaningful and objective way of presenting them to the public
- Data will always be biased
- Two Types of Data? With the new types of data collection going on, we're seeing two types of data that are hard to pull apart:
- One type where cleaning it becomes very important
- Another type that comes in so fast that it starts to clean itself
- Some data that one wants to use immediately for action; are we dealing with something new when we're getting massive amounts of experiential data and what do we do with them?
- General Questions:
- What is clean data?
- Is bias bad?
- Is all data biased?
- At what point does data start to clean itself?
- What is real-time data?
- Question of cleanliness of data:
- Pretty subjective
- Process before willing to share it
- Sharable?
- Actionable?
- Credible?
- Timely?
- Bias? Notion of bias, what does it mean? Important to source
ANALYZE & DISPLAY
- KGB human data analysis
- Analysis can be difficult, involves math
- Programs are expensive and complicated
- Is there some way to maintain the analysis but make it accessible?
- Create an open-source toolkit where some of these analytic procedures could be online; once you mashup and normalize data, can you run preset analyses?
- Wolfram Alpha
- Standardizing methodology
- Copy methodology and then compare
- Documentation
- Repository framework
- Obvious points: curve to display it
- Forming a representative sample
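
The toolkit idea above (normalize the data, then run preset analyses) might look something like the following sketch. It uses only Python's standard `statistics` module; the `PRESET_ANALYSES` table, the `normalize` and `run_presets` helpers, and the sample values are assumptions made for illustration, not a tool discussed at the table.

```python
import statistics

# Hypothetical table of preset analyses: name -> function over a list of numbers.
PRESET_ANALYSES = {
    "count": len,
    "mean": statistics.mean,
    "median": statistics.median,
}


def normalize(raw_values):
    """Convert raw entries to floats, skipping blanks and non-numeric text."""
    cleaned = []
    for value in raw_values:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            continue  # drop entries that cannot be read as numbers
    return cleaned


def run_presets(raw_values):
    """Normalize the data, then run every preset analysis on it."""
    values = normalize(raw_values)
    return {name: func(values) for name, func in PRESET_ANALYSES.items()}


# Invented sample: mixed strings, a blank, a placeholder, a number, a missing value.
results = run_presets(["4", "10", "", "n/a", 7, None])
```

Because every project would run the same named presets, results from different data sets could be compared directly, which is the point of standardizing methodology. Note that `statistics.mean` raises an error on an empty list, so a real toolkit would need to handle data sets with no usable values.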
Visualizing
Design in a vacuum problems.
No one at the table had any information so there was nothing to visualize, common problem. Generalized frameworks.
One-to-one process to take data and visualize it
Must remember that data is not reality - modeling, reviewing critically, and understanding data in context is an important part of visualization and analysis.
ACT UPON
- There's a difference between acting in a development situation versus an emergency situation
- Alert, analysis of the data that allows you to recognize where the gaps are
- Triage process that allows you to categorize and assign
- Prioritize
- Assign
- Track
- There are local, meta (programmatic and systemic) and shared responses (open up data set) to data
- Regarding tools, different tools selected by different organizations for different purposes. The important things to think about are interoperability and standards:
- The more interoperable, the better.
- What we need from others to succeed: good communication among all stakeholders.
- Action partners need to be willing, funded, and empowered.
- Talked about the types of action, what can you do with the information?
- Policy recommendation
- Prioritization, send more resources to where needed
- Help-line
- Use strong data that is displayed well to increase public awareness, hopefully
- Initiating investment
- Need horizontal sharing rather than just hierarchies
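
The triage process described above (categorize, prioritize, assign, track) can be sketched in a few lines. Everything here is hypothetical: the `Alert` class, the two categories (taken from the emergency versus development distinction in the notes), the urgency ordering, and the sample alerts.

```python
from dataclasses import dataclass, field

# Hypothetical urgency scores: lower numbers are handled first.
URGENCY = {"emergency": 0, "development": 1}


@dataclass
class Alert:
    summary: str
    category: str            # categorize: "emergency" or "development"
    assignee: str = ""
    status: str = "new"
    history: list = field(default_factory=list)


def prioritize(alerts):
    """Order alerts by urgency; unknown categories go last."""
    return sorted(alerts, key=lambda a: URGENCY.get(a.category, len(URGENCY)))


def assign(alert, partner):
    """Hand the alert to an action partner and record the hand-off."""
    alert.assignee = partner
    alert.status = "assigned"
    alert.history.append(f"assigned to {partner}")


def track(alert, status):
    """Record a status change so the response can be followed up."""
    alert.status = status
    alert.history.append(f"status -> {status}")


# Invented sample alerts.
alerts = [
    Alert("Request for school supplies", "development"),
    Alert("Flooded road cuts off clinic", "emergency"),
]
queue = prioritize(alerts)
assign(queue[0], "response team")
track(queue[0], "resolved")
```

The `history` list is the tracking piece: it keeps a record of who was assigned and what happened, which is the kind of information that could be shared horizontally between organizations.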