Fraud detection using behavioral biometrics data

Project by Shai


The goal of the project is to detect fraud incidents using two main data types: device identification information and behavioral biometrics data. The project covers an end-to-end fraud detection problem: exploration, extensive feature extraction, modeling, evaluation and monitoring. It also includes time-series modeling with LSTM and GRU on the behavioral data, which creates new features for the main model.


Challenges



  1. Imbalanced data – The original data has a benign-to-fraud ratio of 100:1. Even after the initial downsampling the data is still imbalanced, so it requires dedicated handling techniques, and the predictions have to be evaluated on an upsampled representation
  2. The two data sets are very different and therefore require different approaches, reflected in two different models: a time-series model and a tree-based model. The output of the time-series model has to be an input to the final model
  3. Data exploration – The data contains dozens of raw features, which are the building blocks of the final data consumed by the models. A good understanding of the problem is vital for the feature generation step
  4. Feature generation:
    1. Behavioral data – The raw features could not be used at all, so dozens of new features had to be generated. These new features have to make sense and represent behaviors that may help detect fraud
    2. Device identification data – Generation of more than a hundred new features. The newly created features were shown to be related to fraud or benign activity
    3. Examining the distribution of all features across fraud and benign activities
  5. Large scale on a private laptop – Some of the files are larger than a few GB, so not all files could be loaded into memory in a single operation
  6. Model hyperparameter tuning – Some of the hyperparameter tuning had to be done on an external server because of the resources it requires
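
The memory constraint in item 5 is commonly handled by streaming a file in fixed-size chunks and keeping only running aggregates. The sketch below illustrates that pattern with the Python standard library. It is not the project's actual code: the column names (`session_id`, `keystroke_interval_ms`), the helper names and the chunk size are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows from any row iterator."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def mean_per_key(file_obj, key_col, value_col, chunk_size=100_000):
    """Mean of value_col per key_col, holding only one chunk in memory."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    reader = csv.DictReader(file_obj)  # streams rows lazily from the file
    for chunk in iter_chunks(reader, chunk_size):
        for row in chunk:
            sums[row[key_col]] += float(row[value_col])
            counts[row[key_col]] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Tiny in-memory stand-in for a multi-GB file (hypothetical columns).
sample = io.StringIO(
    "session_id,keystroke_interval_ms\n"
    "a,100\n"
    "b,300\n"
    "a,200\n"
    "b,500\n"
    "a,300\n"
)
session_means = mean_per_key(
    sample, "session_id", "keystroke_interval_ms", chunk_size=2
)
print(session_means)
```

Only the per-key sums and counts survive between chunks, so peak memory depends on the chunk size and the number of distinct keys rather than on the file size.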


Achievements (according to KPIs)

  1. Extensive data analysis notebooks (end to end):
    1. Identified fraud-related features (numeric and categorical) during the data exploration
    2. Developed automatic metric calculations and plots for model evaluation (matplotlib and bokeh plots)
  2. Achieving the highest recall at 0.001 FPR – Until now the recall has been around 0.4. In this project we achieved a recall of 0.5 at the required FPR
  3. Comparing multiple models and architectures – We used an LSTM/GRU RNN model on the behavioral data (represented as time series) and XGBoost as the final model on all data. The results of the time-series model were plugged into the final model as another feature. Many hyperparameters and features were evaluated during model testing in order to find the best architecture and generated features
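
The KPI in item 2, recall at a fixed false positive rate, can be computed by sweeping the decision threshold from the highest score downwards and keeping the last threshold whose FPR is still within budget. The following is a minimal sketch of that calculation, not the project's evaluation code. The function name and the toy scores are made up for illustration, and the toy example uses a budget of 0.1 rather than 0.001 because it has only ten benign samples.

```python
def recall_at_fpr(labels, scores, max_fpr):
    """Return (recall, threshold) for the lowest threshold with FPR <= max_fpr.

    labels: 1 for fraud, 0 for benign. scores: higher means more suspicious.
    A sample counts as flagged when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one fraud and one benign sample")

    # Sort by descending score and sweep the threshold downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best_recall, best_threshold = 0.0, float("inf")
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Samples tied at the same score are flagged together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_recall, best_threshold = tp / n_pos, score
    return best_recall, best_threshold


# Toy data: 4 fraud samples followed by 10 benign samples.
labels = [1, 1, 1, 1] + [0] * 10
scores = [0.95, 0.9, 0.7, 0.3,
          0.8, 0.6, 0.5, 0.4, 0.35, 0.2, 0.15, 0.1, 0.05, 0.01]
recall, threshold = recall_at_fpr(labels, scores, max_fpr=0.1)
print(recall, threshold)
```

At an FPR budget as tight as 0.001, the estimate rests on very few false positives, so a large benign evaluation set is needed for the recall figure to be stable.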


Further development

  1. Evaluating the model’s performance at a larger scale without downsampling. So far, the benign data was downsampled before modeling and upsampled afterwards (predictions and labels only) to evaluate the model’s performance in “real world” conditions
  2. Monitoring the model’s performance over a few months
  3. Adding more features to the behavioral time-series data and testing more architectures
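
The upsampled evaluation mentioned in item 1 can be approximated by weighting each retained benign prediction by the downsampling factor instead of physically replicating rows. The sketch below assumes uniform random downsampling of benign rows. The function name, the `benign_upsample_factor` parameter and the toy data are hypothetical and do not come from the project.

```python
def upsampled_metrics(labels, flagged, benign_upsample_factor):
    """Recall, FPR and precision with benign rows re-weighted.

    benign_upsample_factor: how many original benign rows each retained
    benign row stands for (hypothetical parameter; 10 means 1 in 10 kept).
    """
    tp = fn = fp = tn = 0
    for label, is_flagged in zip(labels, flagged):
        if label == 1:
            if is_flagged:
                tp += 1
            else:
                fn += 1
        elif is_flagged:
            fp += benign_upsample_factor
        else:
            tn += benign_upsample_factor
    return {
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Toy data: 4 fraud rows (3 flagged) and 10 retained benign rows (1 flagged).
labels = [1, 1, 1, 1] + [0] * 10
flagged = [True, True, True, False] + [True] + [False] * 9

downsampled = upsampled_metrics(labels, flagged, benign_upsample_factor=1)
real_world = upsampled_metrics(labels, flagged, benign_upsample_factor=10)
print(downsampled)
print(real_world)
```

With a uniform weight, recall and FPR come out the same before and after re-weighting, while precision drops sharply. That is why metrics that mix fraud and benign counts should be read on the upsampled representation.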
