Implement a Factorization machine solution in TensorFlow for CTR prediction

Project by Ornella

Abstract

Outbrain strives to serve the best possible content to its users, and leverages various techniques to do so. Factorization machines are a widely used industry approach for recommender systems. The goal of this project was to implement a factorization machine solution in TensorFlow for CTR prediction and to measure its accuracy and performance on Outbrain’s large-scale data.
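
For readers unfamiliar with factorization machines, here is a minimal sketch of the standard second-order model in plain Python. It is not the project's TensorFlow code. The function names (`fm_score`, `fm_ctr`) and the dictionary-based sparse representation are assumptions made for illustration. The score is a bias, plus a linear term, plus pairwise interactions whose weights are dot products of per-feature latent vectors. The pairwise term uses the well-known identity that reduces its cost from quadratic to linear in the number of non-zero features.

```python
import math

def fm_score(x, w0, w, V):
    """Second-order factorization machine score for one sparse example.

    x  : dict {feature_index: value}, non-zero entries only
    w0 : global bias
    w  : dict {feature_index: linear weight}
    V  : dict {feature_index: list of k latent factors}
         (must contain every index that appears in x)
    """
    linear = sum(w.get(i, 0.0) * xi for i, xi in x.items())

    # Pairwise term: 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    k = len(next(iter(V.values())))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)

    return w0 + linear + pairwise

def fm_ctr(x, w0, w, V):
    """Squash the score through a sigmoid to read it as a click probability."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, V)))
```

Because only the non-zero entries of `x` are visited, the cost per example depends on the number of active features, not on the total size of the hashed feature space.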

Challenges

  • Working with huge amounts of data: Outbrain systems produce around 2M to 3M impressions per hour. The data I worked on contains 14 features, which expand to 4.5M features after hashing. Working with data of this size (2M by 4.5M) locally caused Python to run for many hours or even crash at times (due to memory issues).
  • Enabling the model to learn incrementally: retrain the model every hour with new data and predict the next hour.
  • Setting up and running a pipeline on a remote machine to train the model every hour (incremental learning) over a whole week.
  • Integrating the model into Outbrain’s evaluation system.
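
To make the hashing step mentioned in the first challenge concrete, here is a minimal sketch of the hashing trick. It is an illustration under assumptions, not the project's code: the field names, the choice of BLAKE2 as the hash, and the exact bucket count are all hypothetical. Each "field=value" pair is mapped to a column index in a fixed-size space, so 14 raw features become at most 14 active columns out of millions.

```python
import hashlib

# Order of magnitude taken from the text; the exact bucket count is an assumption.
NUM_BUCKETS = 4_500_000

def hash_feature(field, value, num_buckets=NUM_BUCKETS):
    """Map one 'field=value' pair to a stable column index in [0, num_buckets)."""
    key = f"{field}={value}".encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets

def hash_impression(raw, num_buckets=NUM_BUCKETS):
    """Turn a dict of raw features into a sorted list of active column indices."""
    return sorted({hash_feature(f, v, num_buckets) for f, v in raw.items()})

# Hypothetical impression with made-up field names.
example = {"publisher_id": "1234", "ad_id": "987", "platform": "mobile"}
active_columns = hash_impression(example)
```

A cryptographic hash is used here only because it is stable across processes, unlike Python's built-in `hash` for strings. Distinct pairs can collide in the same bucket, which is the usual trade-off of this technique.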

 

Achievements (according to KPIs)

Delivered a working pipeline:

  • Read and transform the data from LIBSVM format into a sparse tensor
  • Train the model every hour and save the model
  • Retrain the model from the last model (incremental learning)
  • Predict on the next hour following the training
  • Evaluate the predictions with metrics such as RMSE, MRR, and AUC
  • Write the predictions and evaluation results to files after every training run
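
The first step of the pipeline above, reading LIBSVM lines into a sparse representation, can be sketched as follows. This is a hedged illustration rather than the project's reader: `parse_libsvm` is a hypothetical helper, and column indices are assumed to be 0-based and kept exactly as they appear in the file. The output is a coordinate triple (indices, values, dense shape), which is the layout that `tf.sparse.SparseTensor` takes as its arguments.

```python
def parse_libsvm(lines, num_features):
    """Parse LIBSVM-formatted lines ("label col:val col:val ...").

    Returns (labels, indices, values, dense_shape), where indices is a list
    of [row, col] pairs in the order they were read.
    """
    labels, indices, values = [], [], []
    row = 0
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue                          # skip blank lines
        parts = line.split()
        labels.append(float(parts[0]))
        for token in parts[1:]:
            col, val = token.split(":", 1)
            indices.append([row, int(col)])
            values.append(float(val))
        row += 1
    return labels, indices, values, [row, num_features]

# Two toy impressions: a click with columns 3 and 10 active, then a non-click.
labels, indices, values, shape = parse_libsvm(
    ["1 3:1 10:1", "0 0:1 7:0.5"], num_features=16
)
```

If the columns within each line are in ascending order, the indices come out in row-major order, which some sparse operations expect.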

Given the huge amount of data (one hour contains 2.9M impressions on average), I overcame the memory issues by using sparse matrices, and I reduced training and prediction time by finding the optimal batch size.
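
A back-of-the-envelope estimate shows why sparse storage matters at this scale, and a small generator shows what batching over sparse rows can look like. Both are sketches under stated assumptions: 4-byte values, 8-byte row and column indices in coordinate format, and at most 14 non-zero entries per impression (one per raw feature). The helper `iter_batches` is hypothetical and not taken from the project.

```python
ROWS = 2_900_000           # average impressions per hour (from the text)
COLS = 4_500_000           # hashed feature space (from the text)
NONZEROS_PER_ROW = 14      # assumption: one active column per raw feature

# Dense float32 matrix: every cell is stored.
dense_bytes = ROWS * COLS * 4

# Coordinate-format sparse matrix: (row index, col index, value) per non-zero.
sparse_bytes = ROWS * NONZEROS_PER_ROW * (8 + 8 + 4)

print(f"dense : {dense_bytes / 1e12:.1f} TB")
print(f"sparse: {sparse_bytes / 1e6:.0f} MB")

def iter_batches(rows, labels, batch_size):
    """Yield consecutive (rows, labels) slices without densifying anything.

    rows is a list where each element holds one example's sparse entries,
    for instance a list of (col, value) pairs.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        end = start + batch_size
        yield rows[start:end], labels[start:end]
```

Under these assumptions the dense form is tens of terabytes per hour of data while the sparse form is under a gigabyte. The batch size then trades per-step overhead against memory per step, which is why tuning it affects training and prediction time.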

Trained the model (every hour) on 8 days of data and predicted on the next 3 days:

  • Training time: 54 seconds per hour of data
  • Prediction time: 11 seconds per hour of data
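
The hourly schedule of training on one hour and predicting the next can be sketched as a simple loop. This is only an outline under assumptions: `run_incremental`, `train_step`, and `predict` are hypothetical names, the toy model is a running click rate rather than a factorization machine, and saving the model to disk between hours is left out.

```python
def run_incremental(hours, train_step, predict, model=None):
    """Train on each hour, starting from the previous model, then predict the next hour.

    hours      : list of (hour_id, data) in chronological order
    train_step : callable (model_or_None, data) -> updated model
    predict    : callable (model, data) -> predictions
    """
    results = []
    for (_, data), (next_id, next_data) in zip(hours, hours[1:]):
        model = train_step(model, data)  # warm start: continue from last model
        results.append((next_id, predict(model, next_data)))
    return model, results

# Toy stand-in for a real model: (total clicks, total impressions) so far.
def toy_train(model, clicks):
    c, n = model or (0, 0)
    return (c + sum(clicks), n + len(clicks))

def toy_predict(model, clicks):
    c, n = model
    return [c / n] * len(clicks)

hours = [("h0", [1, 0, 0, 0]), ("h1", [1, 1, 0, 0]), ("h2", [0, 0])]
final_model, hourly_predictions = run_incremental(hours, toy_train, toy_predict)
```

The key property is that each hour's predictions come from a model that has never seen that hour's data, so the evaluation mirrors how the model would be used in practice.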

Head-to-head comparison of the results with the model in production:

  • RMSE: improved by 13%
  • MRR: improved by 2%
  • AUC: improved by 12%
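
For reference, the three metrics above can be computed as in the following sketch. These are textbook definitions written in plain Python, not Outbrain's evaluation code. In particular, the grouping used for MRR (one list of candidates per served request, with the reciprocal rank of the first clicked item) is an assumption, since the text does not say how it is defined internally.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between labels and predicted probabilities."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def auc(y_true, y_pred):
    """Chance that a random positive outscores a random negative (ties count 0.5).

    Quadratic in the number of examples; fine for a sketch, too slow at scale.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mrr(groups):
    """Mean reciprocal rank over groups of (label, score) pairs.

    Each group is ranked by descending score; a group with no positive
    label contributes 0.
    """
    total = 0.0
    for items in groups:
        ranked = sorted(items, key=lambda it: it[1], reverse=True)
        for rank, (label, _) in enumerate(ranked, start=1):
            if label == 1:
                total += 1.0 / rank
                break
    return total / len(groups)
```

Lower is better for RMSE, while higher is better for MRR and AUC, so "improved" means a decrease for the first and an increase for the other two.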

Further development 

  • Train the model on more than one week of data (perhaps one month).
  • Improve the model’s performance in terms of training time and MRR/AUC.
  • Fine-tune the model’s hyperparameters (number of epochs, batch size, learning rate, etc.).
  • Integrate the model into the production code.

Supervisor Feedback

Ornella worked as part of our CTR prediction team. She took an active part in exploring TensorFlow for that purpose by integrating TF into the existing evaluation mechanism. Ornella worked diligently and demonstrated high professional and interpersonal skills. The results of her short internship were more than satisfactory: the TF package worked with Outbrain’s huge dataset and plugged successfully into the evaluation mechanisms. As a result, we offered Ornella an extension of her Outbrain internship by 5 more months.

– Assaf Klein – Perso/NLP lead
