Sessions Predictor

Project by Yonah

Abstract

Autofleet is a Vehicle-as-a-Service platform that enables the largest fleets on the planet to maximize their revenues and margins by increasing fleet utilization through aggregating demand, controlling supply, and optimizing rides. The goal of this project is to predict the number of daily sessions, i.e. the number of times users open their car-sharing application, in each area (‘tile’) of London. The project also covers outlier detection, with a confidence interval created for each tile in London.

Challenges

  • Determining the correct features to use
  • Learning new technologies, such as Google’s Cloud ML and Keras
  • Predicting from a relatively small time-series dataset
  • Tuning the model’s hyperparameters, which is computationally expensive and time-consuming
  • Finding the best model for the task (deep learning vs. classical machine learning)

Achievements (according to KPIs)

  • Tuned the hyperparameters of the already deployed XGBoost model using grid search and Bayesian optimization, which reduced the mean absolute error by a whole session and the percentage error by 2%.
  • Produced an interactive report that visualizes errors over time at the daily, tile, and borough levels.
  • Created a tile-level sessions confidence interval that alerts the developer to daily prediction outliers.
  • Compared the model with several deep learning models and built an LSTM model that matches the performance of the current XGBoost model.
  • Researched alternative models, including alternative representations of the spatial data.
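The tile-level confidence interval can be illustrated with a minimal sketch. The write-up does not say how the interval was built, so everything here is an assumption: the interval is taken as the historical mean plus or minus z standard deviations per tile, and the names `tile_interval`, `flag_outliers`, and the tile IDs are hypothetical.

```python
from statistics import mean, stdev


def tile_interval(history, z=1.96):
    """Return (low, high) bounds for one tile's daily session counts.

    Assumption: a simple mean +/- z * standard deviation band.
    The method used in the project is not described in the source.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation, needs >= 2 points
    low = max(0.0, mu - z * sigma)  # session counts cannot be negative
    return low, mu + z * sigma


def flag_outliers(history_by_tile, predictions, z=1.96):
    """Return {tile_id: (predicted, low, high)} for predictions outside the band."""
    flagged = {}
    for tile_id, predicted in predictions.items():
        low, high = tile_interval(history_by_tile[tile_id], z)
        if not (low <= predicted <= high):
            flagged[tile_id] = (predicted, low, high)
    return flagged


# Hypothetical data: one week of daily sessions for two tiles.
history_by_tile = {
    "tile_a": [10, 12, 11, 9, 13, 10, 12],
    "tile_b": [40, 42, 38, 41, 39, 43, 40],
}
predictions = {"tile_a": 11.0, "tile_b": 70.0}
flagged = flag_outliers(history_by_tile, predictions)
```

With this made-up data only `tile_b` is flagged, because a prediction of 70 sits far outside that tile's usual range. A rolling window or a quantile-based band would be equally reasonable choices for data with strong weekly patterns.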

Further development 

A new idea for improving the predictions is to represent the spatial data as an image and treat the forecast as image prediction. Essentially, we would convert the data into a color-coded image in which, for example, an area with higher predictions has a darker color. We would then select a deep learning model to predict the next image.
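The conversion step could look roughly like the sketch below. It assumes, purely for illustration, that each tile can be addressed by a (row, column) position on a rectangular grid and that one grayscale frame is produced per day; the function name `tiles_to_image` and the sample values are hypothetical.

```python
def tiles_to_image(sessions_by_tile, n_rows, n_cols):
    """Map per-tile session counts to a grayscale grid (0 = black, 255 = white).

    Higher counts give darker pixels. Tiles without data stay white.
    Assumption: tiles are keyed by (row, col) on a rectangular grid.
    """
    image = [[255] * n_cols for _ in range(n_rows)]
    if not sessions_by_tile:
        return image
    peak = max(sessions_by_tile.values())
    for (row, col), sessions in sessions_by_tile.items():
        scaled = sessions / peak if peak > 0 else 0.0  # normalize to [0, 1]
        image[row][col] = round(255 * (1.0 - scaled))  # invert: busy = dark
    return image


# Hypothetical session counts for a 2x2 grid on a single day.
day = {(0, 0): 0, (0, 1): 5, (1, 0): 10, (1, 1): 20}
frame = tiles_to_image(day, n_rows=2, n_cols=2)
```

A sequence of such daily frames could then serve as input to a next-frame prediction model. Note that this sketch normalizes each frame by its own peak; a fixed global scale would be needed if absolute session counts must be recoverable from the predicted image.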

Supervisor Feedback

Yonah is very committed to the tasks and delivers high-quality products.
Yonah learns fast and is able to perform a wide range of data tasks.
