Create, train and test a deep learning model in order to outperform the machine learning model already implemented

Project by Sam

Abstract
Agoda uses a multiplicative machine learning model to estimate the return on a click won at auction on the Google Ads platform. The goal of the project is to create, train and test a deep learning model combining text features, simple features and a multiplicative framework in order to outperform the machine learning model already implemented.
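As a rough illustration of the multiplicative framework described above (a toy sketch with made-up feature names and sizes, not Agoda's actual model), the predicted return can be formed as the product of a factor computed from text features and a factor computed from simple features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two feature groups (names and sizes are illustrative).
n_samples, n_text_dims, n_simple_dims = 8, 16, 4
x_text = rng.normal(size=(n_samples, n_text_dims))      # e.g. hashed keyword features
x_simple = rng.normal(size=(n_samples, n_simple_dims))  # e.g. numeric auction features

# Each sub-network is reduced here to a single linear layer; softplus keeps
# both factors positive, since a predicted return should be non-negative.
w_text = rng.normal(size=n_text_dims)
w_simple = rng.normal(size=n_simple_dims)

def softplus(z):
    return np.log1p(np.exp(z))

# Multiplicative framework: predicted return = text factor * simple factor.
y_pred = softplus(x_text @ w_text) * softplus(x_simple @ w_simple)
print(y_pred.shape)
```

In a real deep model each factor would be a full sub-network rather than one linear layer, but the combination step is the same element-wise product.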

Challenges

  • Given the huge size of the data, overcome the long training time of each model: first test it on a subset of the data, then on the entire data set.
  • Determine the best features to use and how to process them.
  • Understand the hyperparameters of a deep learning model (batch size, learning rate, epochs…) and how they affect its performance. Tune them accordingly.
  • Change the architecture of the model and understand how it affects the results.
  • Decide which metrics to judge the model by, and why.
  • Outperform the machine learning model on all the relevant metrics.
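The hyperparameter knobs named in the challenges above (batch size, learning rate, epochs) appear even in the simplest mini-batch gradient-descent loop. This toy sketch fits a linear model under an MSE loss; all values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic regression data standing in for the real training set.
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=256)

# The three hyperparameters named in the text (illustrative values).
batch_size, learning_rate, epochs = 32, 0.1, 20

w = np.zeros(3)
for _ in range(epochs):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        # Gradient of the mean squared error over the mini-batch.
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
        w -= learning_rate * grad

mse = np.mean((X @ w - y) ** 2)
print(mse)
```

Changing any of the three values changes how fast (and whether) the loss converges, which is exactly the tuning trade-off the project had to explore.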

Achievements (according to KPIs)

  • Set up the environment to train and test the model.
  • Built, trained and tested several deep neural networks and compared their performance.
  • Outperformed the current machine learning model on the MSE metric.

Further development

  • Some features will need to be processed differently. For example, we should try embedding some text features instead of hash-bucketizing them.
  • Include the day of the week and the time of day as features.
  • Put the model in production and test it.
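To make the embedding-versus-hash-bucketizing point concrete, here is a toy sketch (token names, bucket counts and dimensions are illustrative assumptions): hash-bucketizing maps distinct tokens into a fixed number of buckets, so unrelated tokens can collide, whereas a learned embedding table gives every vocabulary entry its own trainable vector:

```python
import zlib

import numpy as np

tokens = ["hotel", "resort", "cheap", "luxury", "beach"]

# Hash-bucketizing: a deterministic bucket per token. With fewer buckets
# than tokens, collisions are guaranteed and the model cannot tell the
# colliding tokens apart.
n_buckets = 3
buckets = {t: zlib.crc32(t.encode()) % n_buckets for t in tokens}

# Embedding: one trainable row per vocabulary entry, no collisions.
rng = np.random.default_rng(0)
vocab = {t: i for i, t in enumerate(tokens)}
embedding_table = rng.normal(size=(len(vocab), 4))  # 4-dim vectors

vec = embedding_table[vocab["hotel"]]
print(len(set(buckets.values())), vec.shape)
```

The embedding rows are updated during training, so semantically similar keywords can end up with similar vectors, which hashed buckets cannot offer.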

Supervisor Feedback
Sam joined us for about five weeks in September 2019. She had no issues understanding the project from the mathematical side. Sam overcame obstacles in fair time and was able to replicate the results of our data scientist. Later they worked together to try to improve the model over many rounds of hyperparameter tuning.
Beyond that, Sam was able to learn how to use TensorBoard and give us a view into how the model embeds our keywords. That simple visualization showed us that our embedding strategy requires improvement (see "Further development" above), which is a great research result. Had we had an opening, I would have extended Sam an offer; sadly that's not the case right now. I wish her all the best and would be happy to supply a good recommendation for her.
