A Case Study of Active, Continuous and Predictive Social Media Analytics for Smart City

City:	Milano, Italy
Organizations:	Politecnico di Milano - DEIB, Milano, Italy; Delft University of Technology - WIS, Delft, the Netherlands; Siemens - Corporate Technology, Munich, Germany; Paris Descartes University, Paris, France; University of Trento - DISI, Trento, Italy
Project Start Date:	Early 2014
Project End Date:	October 19, 2014 (Published)
Reference:	Balduini, M., Bocconi, S., Bozzon, A., Valle, E.D., Huang, Y., Oosterman, J., Palpanas, T., & Tsytsarau, M. (2014). A Case Study of Active, Continuous and Predictive Social Media Analytics for Smart City. S4SC’14 Proceedings of the Fifth International Conference on Semantics for Smarter Cities, 1280, 31-46. Retrieved from www.mi.parisdescartes.fr/~themisp/publications/s4sc14.pdf
Problem:	The Milano Design Week 2014 took place in the City of Milano, Italy, and was attended by half a million visitors. With thousands of events taking place in half a thousand venues, visitors may find it difficult to decide which venue to visit next while getting the most out of the 6-day, city-scale event.
Technical Solution:	The software architecture consists of four components connected by an integration middleware called the Blackboard: the Social Listener (SL), Visitor Modeller (VM), Visitor-Venue Recommender (VVR), and Visitor Engager (VE). Using supervised learning techniques, the system recommends venues to visitors attending the event. Blackboard: Facilitates communication between all components using the Resource Description Framework (RDF). Social Listener (SL): Collects, translates, decorates, and analyzes raw tweet data from Twitter’s Streaming API using C-SPARQL queries. Visitor-venue links extracted from the tweets are sent to the VVR, and the profiles of the most active visitors are sent to the VM. Visitor Modeller (VM): Using SPARQL and Twitter’s RESTful API, generates an internal profile of each visitor selected by the SL. Visitor-Venue Recommender (VVR): Predicts venues through a statistical machine learning approach based on matrix factorization. The approach was evaluated against baseline methods: random guessing, the Pearson correlation coefficient, and the most popular venues. Predictions use the visitor profiles from the VM and the visitor-venue links from the SL. Twelve data updates, equally spaced over the course of the event, are pushed to the VVR. The training data is the set of visitor-venue links and visitor profiles pushed by the current update, and the testing data is the set of visitor-venue links and visitor profiles pushed by future updates. Recommendations are evaluated against the venues that the selected visitors actually go on to visit. Visitor Engager (VE): Communicates with the selected Twitter users and provides venue recommendations to those who opt in to the service. Communication goes through an official Twitter account supported by the event organizer, Fuorisalone.
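To make the matrix factorization idea behind the VVR more concrete, the following is a minimal sketch and not the authors' actual model. It assumes visitor-venue links can be treated as a binary matrix (1 if linked, 0 otherwise) and fits latent factors by plain stochastic gradient descent. The function names (factorize, recommend), the hyperparameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import random

def factorize(links, n_visitors, n_venues, k=2, epochs=300, lr=0.05, reg=0.01, seed=0):
    """Fit visitor (P) and venue (Q) latent factors on a binary link matrix via SGD.

    Assumption for this sketch: every unobserved (visitor, venue) cell counts as 0.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_visitors)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_venues)]
    observed = set(links)
    cells = [(u, v, 1.0 if (u, v) in observed else 0.0)
             for u in range(n_visitors) for v in range(n_venues)]
    for _ in range(epochs):
        rng.shuffle(cells)
        for u, v, target in cells:
            pred = sum(P[u][f] * Q[v][f] for f in range(k))
            err = target - pred
            for f in range(k):
                pu, qv = P[u][f], Q[v][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qv - reg * pu)
                Q[v][f] += lr * (err * pu - reg * qv)
    return P, Q

def recommend(P, Q, visitor, seen, top_n=1):
    """Rank venues the visitor has not been linked to yet by predicted score."""
    scored = [(sum(p * q for p, q in zip(P[visitor], Q[v])), v)
              for v in range(len(Q)) if v not in seen]
    scored.sort(reverse=True)
    return [v for _, v in scored[:top_n]]

# Hypothetical toy data: (visitor, venue) links pushed by the "current update".
train_links = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1),
               (3, 0), (4, 2), (4, 3), (5, 2), (5, 3)]
P, Q = factorize(train_links, n_visitors=6, n_venues=4)
print(recommend(P, Q, visitor=3, seen={0}))
```

In this toy setup, visitor 3 shares venue 0 with visitors 0 to 2, so the learned factors should favor the other venue those visitors attended. In the spirit of the evaluation described above, links arriving in later updates would serve as the held-out test set for such recommendations.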
Datasets Used:	Tweet data from Twitter’s Streaming API, filtered to the Milano urban area, April 8-13, 2014 (https://dev.twitter.com/docs/api/streaming). Tweet data from selected visitor profiles, retrieved through Twitter’s public RESTful APIs, April 8-13, 2014 (https://dev.twitter.com/docs/api/1.1).
Outcome:	A venue-recommendation service was offered in real time to the visitors of the Milano Design Week 2014. The study successfully analyzed and linked visitors to venues based on the visitors’ tweets. The predictions made by the VVR were better than those of the baseline methods. Through the VE, the system sent invitations to 179 users: 2 opted in to receive venue recommendations and 5 acknowledged the service but opted out. The resulting response rate of 3.9% was in line with the 4% observed in the research team’s previous empirical study.
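The reported response rate follows from counting both the users who opted in and those who acknowledged the service but opted out. A short check of the arithmetic, using only the figures stated above (the variable and function names are illustrative):

```python
def rate(count, total):
    """Return count / total as a fraction."""
    return count / total

invited = 179
opted_in = 2
acknowledged_opted_out = 5
responded = opted_in + acknowledged_opted_out  # 7 users reacted in some way

response_rate = rate(responded, invited)
opt_in_rate = rate(opted_in, invited)

print(f"response rate: {response_rate:.1%}")  # response rate: 3.9%
print(f"opt-in rate: {opt_in_rate:.1%}")      # opt-in rate: 1.1%
```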
Issues That Arose:	Engaging Twitter users to receive the recommendations was more difficult than expected. Twitter’s anti-harassment and anti-spam policies also posed a problem: the team’s Twitter account was temporarily suspended for violating the harassment policy, even though messages were sent at 5-minute intervals and in 5 different versions of the same message.
Status:	Completed, but no longer active
Entered By:	Cedric Mosdell, cedric.mosdell@mail.utoronto.ca

CEM1002, Civil Engineering, University of Toronto

Contact: msf@eil.utoronto.ca