Using Neural Nets to Predict Transportation Mode Choice

City:	Amsterdam, The Netherlands
Organization:	Vrije Universiteit Amsterdam; Centrum Wiskunde & Informatica (the national research institute for mathematics and computer science in the Netherlands); Regional Transportation Authority of Amsterdam; City of Amsterdam
Project Start Date:	Unknown (most likely 2019)
Project End Date:	April 9, 2020 (presented at the International Conference on Ambient Systems, Networks and Technologies (ANT))
Reference:	Buijs, R., Koch, T., & Dugundji, E. (2020, April 14). Using Neural Nets to Predict Transportation Mode Choice: An Amsterdam Case Study. https://www.sciencedirect.com/science/article/pii/S1877050920304440.
Problem:	A new metro line serving the entire north-south corridor of the city of Amsterdam opened in 2018. This infrastructure introduced a new and attractive option, primed to change travel habits across the city. The investment also had a significant impact on the existing transportation network, as bus and tram routes and schedules were changed to provide supporting east-west links to the new north-south spine. These changes also brought new restrictions for motorists travelling through the city. Given the significance of these impacts, the City and the regional transportation authority were interested in modelling the resulting changes in travel mode choice.
Technical Solution:	Travel demand models commonly represent mode choice through discrete choice analysis, with parameters estimated using statistical models. Although not typically used for behaviour analysis in transportation, Artificial Neural Networks (ANNs) were selected to classify the mode choice decisions in this case study. To establish the alternatives to the choices made by the participants, the open-source routing tool R5 (Rapid Realistic Routing on Real-world and Reimagined networks) was used. The tool was supplied with local street and transit data so that it could provide accurate alternative trip options. For each observation and alternative, the routes were classified into one of 5 non-overlapping strata: (1) walking trip; (2) car trip; (3) bicycle trip; (4) transit trip without metro; (5) transit trip with metro. The first step was to combine the observed trip data with the routing possibilities generated by R5. In the resulting dataset, the mode choice (stratum) was the dependent variable, and attributes including duration, distance, wait times, and user characteristics were the independent variables. The data was then split into a training set of approximately 50% of the data (372 entries), a validation set of approximately 20% (121 entries), and a test set of approximately 30% (212 entries). User preference variables were also obtained for each user. Together with the observed trip data and alternatives, this data was fed to an ANN as tables of 5 alternative strata, each with 17 attribute values. Each table in the training set corresponded to one mode choice scenario. The ANN was trained for 300 epochs (full passes through the training data set). The initial architecture had 2 hidden layers of 5 nodes each. Cross-entropy was used as the loss function for back-propagation.
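The architecture described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. The source specifies the 5 x 17 input table, the two hidden layers of 5 nodes, and the cross-entropy loss. Flattening the table into 85 inputs, the ReLU activation, the weight initialisation, and the random example table are all assumptions made here for illustration. Training (back-propagation over 300 epochs) is omitted.

```python
import math
import random

N_STRATA = 5        # walking, car, bicycle, transit without metro, transit with metro
N_ATTRIBUTES = 17   # attribute values per stratum, as stated in the case study
HIDDEN = 5          # nodes in each of the two hidden layers


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (assumed initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform: one output per row of the weight matrix."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    # Assumed activation; the source does not name one.
    return [max(0.0, v) for v in x]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(table, layers):
    """Map one 5 x 17 scenario table to a probability for each stratum."""
    x = [v for row in table for v in row]  # flatten to 85 inputs (assumption)
    h1 = relu(dense(x, layers[0]))
    h2 = relu(dense(h1, layers[1]))
    return softmax(dense(h2, layers[2]))


def cross_entropy(probs, chosen):
    """Loss for a single scenario whose observed choice is stratum `chosen`."""
    return -math.log(probs[chosen])


rng = random.Random(0)
layers = [
    init_layer(N_STRATA * N_ATTRIBUTES, HIDDEN, rng),
    init_layer(HIDDEN, HIDDEN, rng),
    init_layer(HIDDEN, N_STRATA, rng),
]

# Hypothetical scenario: random attribute values, observed choice = transit with metro.
table = [[rng.random() for _ in range(N_ATTRIBUTES)] for _ in range(N_STRATA)]
probs = forward(table, layers)
loss = cross_entropy(probs, chosen=4)
```

With untrained weights the five probabilities are arbitrary. Training would adjust the weights to reduce the average cross-entropy over the 372 training scenarios.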
Datasets Used:	Dataset 1: Data was collected using a GPS application installed on the smartphones of a panel of 78 participants, each of whom had used the North-South metro line at least once during the tracking period. Dataset 2: Two separate General Transit Feed Specification (GTFS) data files were used to feed the R5 tool with the correct timetables before and after the opening of the new metro line. Dataset 3: OpenStreetMap was used to feed the R5 tool with street network information.
Outcome:	Accuracy on the test data set was 94.6%. The model performed well, especially for scenarios where public transportation was chosen. However, the neural net may have been picking up on small differences in the data between the generated alternatives and the actual trips. This was evidenced by the finding that removing significant variables (such as user characteristics) did not significantly decrease model performance, whereas removing far less significant attribute combinations did. The technique was therefore biased by small differences or patterns between the two data sets.
Issues that arose:	1. The sample was very small and biased, such that each user could not be used as a single entry. Trips needed to be aggregated to allow for enough data for training and testing: (a) 21 out of 78 users had only one trip entry, meaning no predictions could be made for these users; (b) only 21 out of 78 users had more than ten trip entries. 2. Duplicates existed within the combined dataset when the observed trip was the same as one of the predicted routes. To avoid using biased data for the machine learning model, one of the duplicates was deleted in each of these instances. 3. Issues arose related to data preparation: (a) the strata classification chosen is not the desired end structure of the output; (b) some of the variables are correlated, such as bicycle_distance and bicycle_duration. The researchers performed principal components analysis to generate linearly independent vectors, with each user having a label distribution of the relative frequencies of different choice types. They then used k-means clustering with a k value larger than the number of strata to generate classes composed of entries from different strata. 4. Minor but impactful differences between the generated and observed datasets may have compromised the validity of the neural net's reported performance, as described above.
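Two of the preparation steps above, removing duplicates (issue 2) and building each user's label distribution (issue 3), can be sketched as follows. The record layout, field order, stratum names, and sample values are hypothetical. The source does not say which copy of a duplicate was deleted, so keeping the observed trip is an assumption. The PCA and k-means steps are not shown.

```python
from collections import Counter, defaultdict

STRATA = ["walk", "car", "bicycle", "transit_no_metro", "transit_metro"]

# Hypothetical records: (user_id, trip_id, source, stratum, duration_min, distance_km)
records = [
    ("u1", "t1", "observed", "transit_metro", 24.0, 6.1),
    ("u1", "t1", "generated", "transit_metro", 24.0, 6.1),  # same as the observed trip
    ("u1", "t1", "generated", "bicycle", 27.5, 5.8),
    ("u1", "t2", "observed", "bicycle", 12.0, 2.9),
    ("u2", "t3", "observed", "car", 18.0, 7.4),
    ("u2", "t3", "generated", "walk", 85.0, 6.9),
]


def drop_duplicates(rows):
    """Keep one row when an observed trip exactly matches a generated route."""
    seen = set()
    kept = []
    # Stable sort puts observed rows first, so they are the copies retained
    # (assumption: the source only says one of the duplicates was deleted).
    for row in sorted(rows, key=lambda r: r[2] != "observed"):
        user, trip, _source, stratum, duration, distance = row
        key = (user, trip, stratum, duration, distance)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def label_distribution(rows):
    """Relative frequency of each chosen stratum per user (observed trips only)."""
    counts = defaultdict(Counter)
    for user, _trip, source, stratum, *_ in rows:
        if source == "observed":
            counts[user][stratum] += 1
    return {
        user: [c[s] / sum(c.values()) for s in STRATA]
        for user, c in counts.items()
    }


deduplicated = drop_duplicates(records)
distributions = label_distribution(deduplicated)
```

In this toy data, user u1 chose bicycle once and transit with metro once, so the distribution places 0.5 on each of those strata. Vectors of this kind are what a subsequent clustering step would group.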
Status:	Terminated
Entered by:	October 30th, 2020: Matt Kussin, matt.kussin@utoronto.ca

CEM1002,
Civil Engineering, University of Toronto
Contact: msf@eil.utoronto.ca