City: | United States of America, Global. |
Organization: | National Science Foundation grant CCF-1522054 (COMPUSTNET: Expanding Horizons of Computational Sustainability). |
Project Start Date: | December 15, 2015 |
Project End Date: | November 30, 2020 (Estimated) |
Reference: | C. Robinson and B. Dilkina, ‘A Machine Learning Approach to Modeling Human Migration’, in Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies (COMPASS) - COMPASS ’18, Menlo Park and San Jose, CA, USA, 2018, pp. 1–8, doi: 10.1145/3209811.3209868. |
Problem: | Human migration has a major impact on cities. Accurately predicting the flow of people into cities is essential for city planning, infectious disease control, public policy development, and international trade. This creates a need for models that are more accurate than those currently in use (i.e., gravity and radiation models). Conventional migration prediction models use only population and distance as the basis for prediction. Moreover, because of their fixed form, they cannot capture more complicated migration dynamics. These shortcomings call for new models that draw on more variables to improve accuracy. This study developed models to predict human migration between US counties and between countries on a global scale. |
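To make the limitation of the traditional models concrete, the sketch below shows common textbook forms of the gravity and radiation models. It is illustrative only: the function names, the power-law form of the gravity model, and the default parameter values are assumptions, and the variants evaluated in the study may differ. Note that both functions take only populations and a distance-derived quantity as inputs.

```python
def gravity_flow(m_i, m_j, d_ij, C=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Power-law gravity model (assumed textbook form).

    m_i, m_j : populations of origin and destination
    d_ij     : distance between them
    C, alpha, beta, gamma : parameters normally fitted to data
                            (defaults here are placeholders, not fitted values)
    """
    return C * (m_i ** alpha) * (m_j ** beta) / (d_ij ** gamma)


def radiation_flow(T_i, m_i, n_j, s_ij):
    """Radiation model (assumed textbook form).

    T_i  : total number of migrants leaving origin i
    m_i  : population of origin i
    n_j  : population of destination j
    s_ij : population living within distance d_ij of i,
           excluding the populations of i and j themselves
    """
    return T_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Hypothetical example values, not taken from the study's datasets.
print(gravity_flow(1000, 2000, 10))
print(radiation_flow(100, 1000, 1000, 0))
```

Because the functional form is fixed, the only way to adapt either model to data is to tune a handful of parameters, which is the rigidity described above.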
Technical Solution: | The study used the following techniques to predict human migration between US counties and between countries on a global scale:
|
Datasets Used: |
|
Outcome: | Pre-Solution Performance: For the USA Migration dataset, the most accurate traditional model was the Extended Radiation model. For the Global Migration dataset, however, all traditional models had a coefficient of determination of around zero, indicating a poor fit.
Post-Solution Performance: The machine learning (ML) models outperformed the traditional models when constrained to the same conditions. The ANN performed best on the USA Migration dataset, and XGBoost performed best on the Global Migration dataset. Under extended conditions, the ML models performed even better. The results indicate that more features than those included in the traditional approach must be considered to predict migration accurately. The study also identified the ten factors most strongly correlated with human migration, which extend far beyond those used in the traditional approach and align well with intuition. For county-level migration in the USA, the ANN model was also very accurate in predicting rural migration, in contrast to the overestimation usually found in the traditional models. |
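As a reminder of what a coefficient of determination near zero means, the sketch below computes it from observed and predicted flows using the standard definition. The function name and the sample values are hypothetical and are not drawn from the study's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0]            # hypothetical migration counts
print(r_squared(observed, observed))  # a perfect model
print(r_squared(observed, [2.0, 2.0, 2.0]))  # a model that always predicts the mean
```

A value near zero therefore means a model does no better than predicting the average flow for every pair of locations.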
Issues that arose: |
|
Status: | In Development. |
Entered by: | 30-October-2020: Ahmad Al-Musa, a.almusa@mail.utoronto.ca |