The Problem with the “Data Revolution”: How Bad Data Could Derail Global Anti-Poverty Efforts
2015 is set to be a formative year for the future of international development. The Millennium Development Goals, considered an achievement in global cooperation for poverty reduction, are due to expire. A new set of Sustainable Development Goals (SDGs) will take their place. As these goals are being drafted, new buzzwords are entering the development vernacular: Big data, open data, evidence-driven design and performance-based aid are all the rage.

The confluence of these events offers an unprecedented opportunity for the global community to unite around shared priorities for action. The development field’s increasing reliance on data, however, is plagued with problems of low capacity, poor technical knowledge, and bad analytics. Practitioners have an opportunity to address these challenges at the upcoming Financing for Development conference in Addis Ababa, Ethiopia, in mid-July 2015. Governments and donors must commit to dedicating political and financial resources to improving data collection and analysis.

The current draft SDGs feature a firm set of more than 150 targets, a sign that data, once the exclusive property of economists and evaluators, has taken on a life of its own. Development policymakers and practitioners, egged on by politicians and donors seeking quantified returns on investments, use all kinds of data, good and bad, to drive decision-making. In a field of limited resources, the stakes are already high: one country’s gain is another’s loss. Ignorance of the fallibility of data could soon become a missed opportunity for the development community to truly capture the realities of how people in poor countries live, work, and make decisions.

The problem of bad or misinterpreted data is not exclusive to poor countries. These countries, however, do face certain barriers to data collection and access. Informal economies lack reporting mechanisms, making it difficult to track the buying and selling of goods and services.
Legal codes, especially relating to land and agriculture, are often outdated and not enforced. National statistics bureaus experience chronic shortages in funding and personnel. Statistics meant to be comparable across countries are measured using different indicators and algorithms.[1] Fragile and post-conflict states, often those in most need, are particularly susceptible. All this adds up to incomplete and inconsistent pictures of national economies.

National economic data relies on population statistics as a multiplier to evaluate economic sectors, but the process of conducting population censuses is difficult and controversial.[2] In Nigeria, for example, this problem is pronounced. Following the 2006 Nigerian census, the chairman of Nigeria’s National Population Commission attempted to draw attention to the difficulties of census work, stating in a report that “some enumerating staff deployed by the Commission were killed while some were assaulted and chased away during the current census in certain parts of the country.” The president of Nigeria responded, stating “those who dispute the [census] results [are] ‘confusionists’… If you like [the result], use it, you don’t like, leave it.”[3] This experience is emblematic of the complex logistical, social, and political drivers of poor data that can be common in low-income countries, and that further complicate the problem of weak statistical capacity.

With such limited data, economists are forced to use alternative methods. This often means manipulating the data to obtain an estimated representation of the full population, which can subject raw data to bias. Dr. Mark Hansen, a statistician and professor of journalism at Columbia University, points out that an algorithm’s author is biased by his or her own knowledge, beliefs, and experiences.
The design of an algorithm’s software also inevitably makes certain paths easier to follow than others.[4] In the aforementioned case of Nigeria, the economist Morten Jerven notes that “any statement, whether it is the number of people below the poverty line or the number of doctors per capita, becomes guesswork.”[5] Similar problems exist in measures of land ownership, agricultural output, and business dealings. Too often, policymakers take such data as gospel, charging ahead with growth strategies based on bad numbers.

The act of data collection itself, whether for census purposes or program evaluation, is also fraught with bias. Epidemiologist James Trostle, citing studies of survey accuracy in Bangladesh and Nepal, describes the process of collecting data as a social exchange: Information is exchanged for goods and services like money or academic authorship. Survey questions often address sensitive subjects like income, health status, or education level, eliciting responses skewed by pride, shame, or altruism.[6] Errors and time lags in survey methodology can also generate misleading results. These statistical data and surveys, however misleading, form the basis of impact evaluations of development initiatives like the SDGs.

The solution is not to abandon bad data, which in many cases may represent the most accurate measure available, but to be aware of its limitations. The current draft text of SDG 17 calls for building capacity “to increase significantly the availability of high-quality, timely and reliable data.” This recognition is a start, but donors and governments must be held responsible for backing this policy with political commitment and resources. Until improved capacity and data exist, basing success or failure of the SDGs on a set of potentially unreliable data points represents a major risk to the effectiveness of the SDG agenda.

1. Jerven, Morten. Poor Numbers: How We Are Misled by African Development Statistics and What to Do About It. Cornell University Press, 2013.
2. Jerven, Poor Numbers, 56.
3. Jerven, Poor Numbers, 59.
4. Hansen, Mark. “Remarks at Data Revolution and Public Policy Event.” December 8, 2014. Columbia SIPA, New York, NY.
5. Jerven, Poor Numbers, 61.
6. Trostle, James. Epidemiology and Culture. Cambridge University Press, 2005, 85.