FirstRepository

Assignment 4b

Aravind Surumpudi

May 17th, 2021

Word count: 1811 words

Introduction

My research question is: “Can predictive models and poverty mapping work in regions where poverty is not as apparent or outright?” In the studies and papers I have read, I have noticed that poverty mapping and predictive modeling are mostly carried out in areas of extreme poverty, such as Eastern Africa. When people think of poverty, they think of Africa, but Africa accounts for only 40% of global poverty. The other 60% also needs financial aid, and my plan is to use geospatial data science methods (satellite imagery/RS data and CDR) to help direct it. Satellite imagery has been very useful in producing predictive models and has helped improve high-frequency surveys. Datasets can include night lights and the condition of roads and homes, which are used to produce “heat maps” of poverty. CDR (Call Detail Records), on the other hand, uses information from cell towers, such as the number of texts sent and received or app usage. These two methods produce predictive models of poverty, which are used to direct financial and humanitarian aid efficiently. My question aims to apply these methods to poverty in urban areas, where pockets of unrecognized and extreme poverty exist. Implementing poverty mapping and models in urban areas has many benefits, since urban areas generate more than 70 percent of the United States’ GDP. If we were to address the poverty crisis in urban areas and bring these people into the workforce, both the economic and social standing of the nation would improve. The money generated as poor people move into the working class can be put into further human development projects. This sort of trickle-down economics will end up benefiting nations in the long run. Mapping poverty in urban areas is somewhat more difficult than in areas where poverty is more outright, but it has already been done in cities such as Dhaka (the capital of Bangladesh), which shows that this is an attainable goal.

The Drawbacks

As with everything good and promising in life, there are drawbacks. A few salient harms and obstacles stand in the way of solving the poverty crisis through data science: security/privacy, cost, and fixation (bad data/analytics). Security and privacy concerns have long been associated with data science. Large datasets can include private information that is vulnerable to identity and data theft. If ordinary people are hurt in the effort to solve the poverty crisis, they will hesitate to share their data, halting the progress of data science before it even starts. Another harm is cost: storing and mining data is enormously expensive. The money would most likely come from government funding, but this could reduce the budget for areas of development such as schools or hospitals. Of course, this is more of a financial issue than a data science issue, but it is a harm nonetheless and could cause a “one step forward, two steps back” scenario. Fixation (bad data/analytics) is the last salient harm. Big data and these models have shown their prowess in predicting poverty, but this large sea of data can be blinding for some researchers. Data scientists can focus too much on one specific dataset, which can skew their broader view of the entire project. Although there are many potential obstacles and drawbacks to using data science to solve the poverty crisis in urban areas, a lot of good can come from it if done properly. Once poverty in urban areas has been identified and mapped, government aid can be directed efficiently towards those areas. The resulting infrastructure will lead to greater job opportunities for impoverished people in the area. More people in the workforce means more tax revenue, which means more money directed towards projects concerning human development and preservation.

The Argument to the Jury/Foundation

Dear Jury/Foundation, my solution builds on the plan that Asmi Kumar used to predict and map poverty in Dhaka. Kumar used information from the DHS and PPI, together with CDR and RS data, to construct a pre-trained CNN (Convolutional Neural Network) that predicts poverty in urban areas such as Dhaka. The best thing about my plan to tackle poverty in urban areas is that it is very cost-effective. CDR/RS data and information from the DHS and PPI are all readily available sources of information. Using this data to build a machine learning model capable of predictive modeling allows for substantial cost savings. Funding for this research plan will most likely be needed to pay a few programmers to lay the groundwork for the machine learning system, as well as a few data scientists dedicated to data collection and sorting. The pros of this plan include its cost-effectiveness, ease of implementation, and existing proof of concept. The cons mainly revolve around how the government or other parties use the information produced by the study. Although data scientists are meant to paint the picture of poverty, the real work lies in the actions taken in response to that information. One possible objection to my plan concerns where the money will come from when the time comes for the government to act on the results of the poverty mapping and modeling. My response is that governments currently split financial aid somewhat evenly across states, whether they need it or not. With poverty maps, they will know exactly which sector, town, or city needs it most and will be able to create more economic opportunities in that specific area.
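To make the difference between an even split and a targeted split concrete, here is a minimal sketch. The district names, poverty rates, and budget figure are hypothetical values chosen for illustration, and allocating in direct proportion to predicted poverty is only one possible rule, not a method taken from the cited studies.

```python
def even_split(budget, areas):
    """Give every area the same share, regardless of need."""
    share = budget / len(areas)
    return {name: share for name in areas}


def need_weighted_split(budget, poverty_rates):
    """Allocate the budget in proportion to each area's predicted poverty rate."""
    total = sum(poverty_rates.values())
    if total == 0:
        # No measured need anywhere: fall back to an even split.
        return even_split(budget, poverty_rates)
    return {name: budget * rate / total for name, rate in poverty_rates.items()}


# Hypothetical predicted poverty rates (illustrative values, not real data).
predicted = {"district_a": 0.10, "district_b": 0.30, "district_c": 0.60}
budget = 1_000_000  # hypothetical aid budget

even = even_split(budget, predicted)
targeted = need_weighted_split(budget, predicted)
for name in predicted:
    print(f"{name}: even={even[name]:,.0f} targeted={targeted[name]:,.0f}")
```

Under the targeted rule, the district with the highest predicted poverty rate receives the largest share, while the total amount spent stays the same.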

Methods and Plan

To look at my plan in more detail, I would like to examine the methods it uses. The first method and dataset is satellite imagery and RS (Remote Sensing) data, which play a crucial role in producing poverty maps. These images can capture features ranging from nightlights to the condition of roads and homes. In one study in Rwanda, nightlight luminosity was recorded and used to produce a map of asset scores. In urban settings, however, nightlight luminosity will not be as useful because the spaces are so congested. Daytime images of the condition of roads and homes can still prove useful for the data we are collecting. The second method and dataset comes from CDR data together with BGMs (Bayesian Geostatistical Models). CDR (Call Detail Records) relies on data produced by phone usage and cell towers. Phone usage can indirectly indicate access to financial resources, and the movement of the phones themselves can signal individual migration towards better economic opportunities. The Bayesian geostatistical model will be especially useful for making sure that the input and output data are relevant to the model we are trying to produce, because BGMs use probability to represent uncertainty in both input and output data. All of the data from these methods, combined with readily available public data (DHS, PPI), will be used to construct a pre-trained CNN (Convolutional Neural Network) capable of predicting and mapping poverty in urban areas ranging from New York City to London.
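The idea of using probability to represent uncertainty can be illustrated with a much simpler model than a full BGM. The sketch below uses a Beta-Binomial update for the poverty rate of a single neighborhood. It is a non-spatial stand-in: a real Bayesian geostatistical model would also account for correlation between nearby areas and for covariates from RS and CDR data. The neighborhood names, survey counts, and the uniform Beta(1, 1) prior are assumptions made for illustration.

```python
import math


def beta_posterior(poor, surveyed, prior_a=1.0, prior_b=1.0):
    """Update a Beta(prior_a, prior_b) prior with survey counts.

    Returns the posterior mean and standard deviation of the poverty rate.
    """
    a = prior_a + poor
    b = prior_b + (surveyed - poor)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Hypothetical survey counts: (households found poor, households surveyed).
surveys = {"well_sampled": (30, 100), "sparsely_sampled": (3, 10)}

for name, (poor, surveyed) in surveys.items():
    mean, sd = beta_posterior(poor, surveyed)
    print(f"{name}: estimated poverty rate {mean:.2f} +/- {sd:.2f}")
```

Both neighborhoods have the same observed share of poor households, but the sparsely sampled one comes out with a wider uncertainty band. That is the kind of information a probabilistic model adds on top of a single predicted number.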

The Budget

As stated before, my plan will not need the full $100,000; however, I would like to discuss where the money will be directed. The first priority is the data itself (collection and analysis). CDR data and RS data from satellite imagery are free and open to the public, so finances will not need to be directed there. However, updated surveys and information from the DHS are required to further improve the accuracy of the machine learning model. Around a quarter of the $100,000 will go towards this, while the rest will go towards the programmers and data scientists in charge of sorting the data and programming the model. My plan is focused on cities; in particular, I would like the focus to be on cities located in generally impoverished countries. Expanding cities will foster industrialization and job opportunities for impoverished people. I would focus on cities such as Nairobi and Johannesburg, as well as cities in Myanmar.
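The split described above works out as follows. This is a small sketch that assumes the full $100,000 is treated as the ceiling and that "around a quarter" means exactly 25 percent; the category labels are my own wording.

```python
TOTAL_GRANT = 100_000  # grant amount stated in the proposal, in US dollars

# Shares taken from the text: about a quarter for surveys, the rest for staff.
shares = {
    "updated surveys and DHS data": 0.25,
    "programmers and data scientists": 0.75,
}

budget = {item: TOTAL_GRANT * share for item, share in shares.items()}

for item, amount in budget.items():
    print(f"{item}: ${amount:,.0f}")
```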

Conclusion

Global poverty has been an issue for decades on end, and we finally have the technology and resources to address it efficiently. Data science is an emerging field, thanks to the vast sources of data and information around us. Everything around us is data, whether it comes from our cell phones or from credit card transactions at various shops. Data science is a cost-effective way of solving modern-day issues and will facilitate human development. With my plan, the pros significantly outweigh the cons. Satellite imagery has been shown to produce high-resolution poverty maps, and when combined with other components such as CDR data, DHS, and PPI, it results in a machine learning component capable of predicting and mapping poverty in real time. The results of various studies showed real promise, and the next step is to implement my plan in cities in impoverished nations. After seeing how these geospatial methods operate and work with each other to produce efficient poverty maps, I doubt that solving the poverty crisis will remain out of reach. These methods have shown that they can address the salient harms that come with data science: security/privacy, cost, and fixation (bad data/analytics). The data and findings addressed essential elements of my research question. It gives me great comfort that data scientists have already laid the groundwork for my plan and shown me the endless possibilities that data science has for solving real-world problems and furthering human development.

Sources

Castelan, C. R. (2019, July 9). Making a better poverty map. Retrieved March 22, 2021, from https://blogs.worldbank.org/opendata/making-better-poverty-map

Horton, M. (n.d.). Stanford scientists combine satellite data, machine learning to map poverty. Retrieved March 22, 2021, from https://pangea.stanford.edu/news/stanford-scientists-combine-satellite-data-machine-learning-map-poverty

Kumar, A. (2020, July 06). How to understand global poverty from outer space. Retrieved March 22, 2021, from https://towardsdatascience.com/how-to-understand-global-poverty-from-outer-space-442e2a5c3666

Martinez, A., Jr. (2021, March 11). Using machine learning on satellite images to map poverty. Retrieved March 22, 2021, from https://development.asia/insight/using-machine-learning-satellite-images-map-poverty

Pape, U., & Parisotto, L. (2019). Estimating poverty in a fragile context: The high frequency survey in South Sudan (Policy Research Working Paper No. 8722). World Bank. https://openknowledge.worldbank.org/handle/10986/31190

Pape, U., & Wollburg, P. (2019). Estimation of poverty in Somalia using innovative methodologies (Policy Research Working Paper No. 8735). World Bank. https://openknowledge.worldbank.org/handle/10986/31267

Steele, J. (2017, February 19). Mobile phones can create high-resolution poverty map. Retrieved February 23, 2021, from https://www.indiatoday.in/technology/news/story/mobile-phones-can-create-high-reolution-poverty-map-959791-2017-02-09

Steele, J. E., et al. (2017). Mapping poverty using mobile phone and satellite data. Journal of the Royal Society Interface, 14, 20160690. http://dx.doi.org/10.1098/rsif.2016.0690