Contextual Click-Through Rate Prediction for Personalized Recommendations

Project by Adi


The project’s goal was to deliver an end-to-end (E2E) CTR prediction model for personalized recommendations. It involved fusing different consumption features, taking their corresponding confidences and priors into account, to devise a click prediction model usable in Outbrain’s low-latency, high-throughput serving layer. The model’s quality was measured on a predefined test set using standard supervised machine learning evaluation metrics, and its performance was then validated via A/B testing.


Challenges

  • Understanding how Outbrain works – the architecture, the software used, and the available features and labels.
  • Data collection – deciding which features to use from the original dataset and collecting them accordingly, e.g. collecting only exploitation-stage data.
  • Data exploration – examining how the observations behave; exploration revealed dirty data such as missing features and, at times, CTR > 1.
  • Categorical metadata – most of the original features were categorical, and transforming them via get_dummies created thousands of columns, which made the data difficult to work with.
  • Overfitting – the datasets behaved differently from one another, which caused certain aspects of the model to overfit.
  • Working with data outside the dataset – joining in additional data from other tables.
  • Data quantity – the dataset is huge, and working with it locally caused Jupyter to run for many hours or even crash at times.
  • Scaling the model to a bigger dataset – because the model ran locally, it had to work with smaller datasets than needed (hours instead of days).
  • Deployment difficulties – the transition from local work to A/B testing required compromises on model type, required features, etc.
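The data-exploration issues above (missing features, impossible CTR values above 1) can be handled with a short cleaning pass. This is a minimal sketch with made-up column names (`clicks`, `impressions`), not the actual Outbrain schema:

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real dataset (hypothetical columns).
df = pd.DataFrame({
    "clicks": [3, 0, 5, 2],
    "impressions": [100, 80, 4, 0],
})

# Derive CTR, treating zero impressions as missing to avoid division by zero.
df["ctr"] = df["clicks"] / df["impressions"].replace(0, np.nan)

# Drop rows with missing features or an impossible CTR (> 1).
clean = df.dropna()
clean = clean[clean["ctr"] <= 1]
```

Here the row with 5 clicks over 4 impressions (CTR = 1.25) and the row with zero impressions are both dropped before training.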
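The categorical-metadata blow-up can also be illustrated: `get_dummies` grows one column per distinct value, while a hashed representation keeps the width fixed regardless of cardinality. This is a sketch of the trade-off on toy data (the real high-cardinality columns are not shown); the hashing alternative is one common workaround, not necessarily what the project used:

```python
import pandas as pd
from sklearn.feature_extraction import FeatureHasher

# Toy categorical frame; in practice these columns had thousands of values.
df = pd.DataFrame({"publisher": ["a", "b", "c"], "campaign": ["x", "y", "x"]})

# get_dummies creates one column per distinct value -> explodes on high cardinality.
dense = pd.get_dummies(df)  # 5 columns here, thousands in the real data

# Feature hashing caps the output width no matter how many values appear.
hasher = FeatureHasher(n_features=16, input_type="string")
X = hasher.transform(df.astype(str).apply(lambda row: list(row), axis=1))
```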

Achievements (according to KPIs)

  • CTR prediction model
  • Reasonable metrics:

 – average R² score: 0.5

 – average RMSE: 0.009
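Both metrics above can be computed with scikit-learn. The values below are illustrative stand-ins, not the project’s actual predictions:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative true/predicted CTRs only; not the project's real numbers.
y_true = np.array([0.01, 0.03, 0.02, 0.05])
y_pred = np.array([0.015, 0.028, 0.022, 0.045])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # penalizes large errors
r2 = r2_score(y_true, y_pred)                       # variance explained
```

Note that because CTRs are small numbers, a small absolute RMSE (like the reported 0.009) must be read alongside R² to judge how much variance the model actually explains.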

Further development

A/B testing – the A/B test started on the afternoon of 19/03/19 with a simple linear model (Ridge regression) and a minimal set of features (signals only). Over the weekend, the test showed a stable ~1% drop compared to the control groups. The next steps will be to add more features and a more complex model.
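A Ridge regression baseline like the one shipped to the A/B test can be sketched in a few lines. The features and targets below are synthetic (the real signal set is not public), so this only shows the model shape, not the deployed weights:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the "signals only" feature matrix.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = X @ np.array([0.02, -0.01, 0.03, 0.005]) + rng.normal(0, 0.001, 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# L2-regularized linear model: cheap to serve, easy to reason about.
model = Ridge(alpha=1.0).fit(X_train, y_train)
score = model.score(X_test, y_test)  # R² on the held-out split
```

The appeal for a first A/B test is that a linear model is fast enough for a low-latency serving layer and its coefficients are directly inspectable.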

If a lift is observed, the next development step will be to automate the code so that the weights and the means are fed to the model automatically.
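One way to automate that hand-off is to export the fitted weights and feature means as a plain artifact the serving side can load. This is a hypothetical sketch (the JSON format and field names are assumptions, not the project’s actual pipeline):

```python
import json

import numpy as np
from sklearn.linear_model import Ridge

# Fit a toy Ridge model on synthetic data (stand-in for the real training job).
rng = np.random.default_rng(1)
X = rng.random((100, 3))
y = X.sum(axis=1) * 0.01

model = Ridge(alpha=1.0).fit(X, y)

# Export coefficients, intercept, and feature means so the serving layer can
# reproduce the same centering and linear scoring without scikit-learn.
artifact = {
    "weights": model.coef_.tolist(),
    "intercept": float(model.intercept_),
    "feature_means": X.mean(axis=0).tolist(),
}
payload = json.dumps(artifact)
```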
