Spatial and Social Media Data Analytics of Housing Prices in Shenzhen, China

City:

Shenzhen, Guangdong Province, China

Organization:

Wuhan University, Wuhan, China 


Project Start Date:

Unknown (most likely January 2016)

Project End Date:

26 October 2016 (Published)

Reference:

Wu, C., Ye, X., Ren, F., Wan, Y., Ning, P., & Du, Q. (2016). Spatial and Social Media Data Analytics of Housing Prices in Shenzhen, China. PLoS ONE, 11(10). doi:10.1371/journal.pone.0164553


Problem:

Housing prices in Shenzhen are increasing due to economic activity. Because residential property values are influenced by proximity to public activity at points of interest (POIs), such as green spaces and commercial business facilities, this study uses social media check-in data to examine whether, and how strongly, public activity at green spaces (GRE) and commercial business facilities (CBF) affects housing prices.


Technical Solution:

Using MATLAB R2012a, ArcGIS 10.2, pgAdmin and SPSS 19, the following methods were applied:

1. All datasets were pre-processed to remove abnormal values, delete invalid or nonessential records, and, where applicable, assess collinearity.

2. Spatial hot spot analysis of the check-in data was carried out in two ways:

a) Vector-based: kernel density estimation was used to analyze the general spatial distribution of the check-in data. POIs were then ranked by popularity, and the average nearest neighbour statistic was used to determine whether their spatial distribution was clustered (hot spots).

b) Raster-based: POI points were rasterized, and the Getis-Ord Gi* statistic was used to determine whether any statistically significant POI hot spots were present.

3. Evaluation of factors that influence housing prices using:

a) Hedonic pricing method (multiple linear regression analysis)  

b) Geographically weighted regression (spatial regression that estimates local parameters)
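
The average nearest neighbour test in step 2a can be illustrated with a minimal Python sketch. It is not the study's ArcGIS workflow: the point sets, the study-area size and the function name are hypothetical, and the expected distance assumes complete spatial randomness over the given area.

```python
import math
import random

def average_nearest_neighbour_ratio(points, area):
    """Return observed / expected mean nearest-neighbour distance.

    A ratio below 1 suggests clustering; a ratio above 1 suggests dispersion.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    # Expected mean distance for a random pattern with the same density.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical POI coordinates inside a 10 x 10 study area (area = 100).
random.seed(0)
clustered = [(random.gauss(5, 0.2), random.gauss(5, 0.2)) for _ in range(200)]
regular = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]

clustered_ratio = average_nearest_neighbour_ratio(clustered, area=100.0)
regular_ratio = average_nearest_neighbour_ratio(regular, area=100.0)
print(clustered_ratio, regular_ratio)
```

The tightly grouped points give a ratio well below 1, while the evenly spaced grid gives a ratio of 2. A full test would also report a z-score, which this sketch omits.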

 

Datasets Used:

  • Dataset 1: Check-in data from Sina Visitor System (social media platform) from July 2014 to June 2015. After pre-processing, 447,778 check-in records from 216,165 users at 22,670 POIs were available for analysis.
  • Dataset 2: New housing transaction data from the Shenzhen Research Centre for Digital City Engineering, covering 27,112 dwelling units (159 real-estate properties) from July to December 2015. Housing and real-estate attributes (e.g. apartment area, floor level, number of bedrooms) were collected from the SOFANG website (www.sz.fang.com/) using crawler technology.


Outcome:

Kernel density estimation revealed large clusters of check-ins in multiple district centres of Shenzhen (most likely because these are prosperous areas of the city). Getis-Ord Gi* analysis indicated five hot spots for GRE and three for CBF.
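
For readers unfamiliar with the Gi* statistic, the following Python sketch shows how a z-score can be computed for each cell of a rasterized count grid. It is an illustration only: the 6 x 6 grid of check-in counts is made up, and the binary 3 x 3 neighbourhood (including the cell itself) is an assumption, not necessarily the weighting used in the study's ArcGIS analysis.

```python
import math

def getis_ord_gi_star(grid):
    """Gi* z-score per cell, with binary weights over the 3x3 window (cell included).

    Assumes the grid is larger than the 3x3 window in at least one dimension.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    if s == 0:
        raise ValueError("all cells are equal; Gi* is undefined")
    z = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            local_sum = 0.0
            w_sum = 0  # binary weights, so sum(w) equals sum(w^2)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        local_sum += grid[rr][cc]
                        w_sum += 1
            denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            z[r][c] = (local_sum - mean * w_sum) / denom
    return z

# Hypothetical check-in counts: a 3x3 block of busy cells in a quiet grid.
counts = [[1] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        counts[r][c] = 20

z_scores = getis_ord_gi_star(counts)
hot_cells = [(r, c) for r in range(6) for c in range(6) if z_scores[r][c] > 1.96]
print(hot_cells)
```

Cells whose z-score exceeds 1.96 are flagged here as hot spots at roughly the 95% level; the centre of the busy block qualifies, while the quiet corner at (5, 5) gets a negative score.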

With respect to the factors influencing housing prices, the hedonic pricing method accounted for 72.4% of housing price variance, while geographically weighted regression accounted for 93.99%. This indicates that the second method had stronger explanatory power than the first. Both GRE and CBF had positive influences on housing prices, of up to 13.4% and 8.3% per unit increase in clustering degree, respectively. However, the relationship between GRE and housing prices did not hold throughout the whole study area, and most (not all) CBF had a positive effect on housing prices.
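
The gap between the two models can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a single predictor instead of the study's full set of housing attributes, a Gaussian distance-decay kernel with a hand-picked bandwidth, and entirely hypothetical data in which the effect of hot-spot intensity on price differs between the western and eastern halves of a transect.

```python
import math

def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def gwr_slopes(coords, x, y, bandwidth):
    """Local slope at each observation, using Gaussian distance-decay weights."""
    slopes = []
    for ui, vi in coords:
        w = [
            math.exp(-0.5 * (math.hypot(ui - uj, vi - vj) / bandwidth) ** 2)
            for uj, vj in coords
        ]
        slopes.append(weighted_linear_fit(x, y, w)[1])
    return slopes

# Hypothetical transect of 20 properties; the true slope is 0.5 in the west
# (u < 10) and 2.0 in the east (u >= 10).
coords = [(float(u), 0.0) for u in range(20)]
intensity = [float((i * 7) % 5 + 1) for i in range(20)]
price = [10.0 + (0.5 if u < 10 else 2.0) * xi
         for (u, _), xi in zip(coords, intensity)]

# Global (hedonic-style) fit: every observation gets the same weight.
_, global_slope = weighted_linear_fit(intensity, price, [1.0] * len(price))
local_slopes = gwr_slopes(coords, intensity, price, bandwidth=2.0)
print(global_slope, local_slopes[0], local_slopes[-1])
```

The global fit returns a single compromise slope of about 1.25, whereas the local fits recover roughly 0.5 at the western end and 2.0 at the eastern end. This is the kind of spatial variation a geographically weighted regression can capture and a single global equation cannot.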


Issues that arose:

-The scope of the available housing price data was small.

-Only spatial (not temporal) heterogeneity was addressed.

-Demographic bias arose from the lack of information on elderly people and children.

-Hot spots were explored only for the commercial service industry and green spaces; other types of hot spots might also have effects.

-Only one year of data for check-ins and housing prices was used.

-Not all people use the check-in function on social media.


Status:

Completed


Entered by:

Amalia Despenic; amalia.despenic@mail.utoronto.ca 




CEM1002, Civil Engineering, University of Toronto

Contact: msf@eil.utoronto.ca