
Explainable AI for financial services, Deltika

Nicolas Mai

Data Science Fellows June 2020 Cohort


Abstract

Deltika does feature explainability: it helps businesses understand why their machine learning algorithms predicted a specific outcome, and what would have to change to modify that outcome. For example, a bank’s algorithm tells a consumer whether his loan is accepted or rejected, and Deltika can explain to both the bank and the consumer why the model made that decision.

The project consisted of improving Deltika’s product and finding new ways to explain a model’s predictions.

Challenges

  1. Achieve feature explainability by resorting to counterfactuals, using autoencoders (a minimal code sketch follows this list) in order to:
    1. Identify a class in the latent layer
    2. Generate samples in the latent layer from a specific class
    3. Find the closest neighbour to the initial sample among the generated samples in the latent layer
    4. Decode this closest neighbour into the output space to obtain our counterfactual instance
  2. Generate a database of several tables sharing the same features, each following a similar pattern. The purpose was to force a correlation between the features and the target variable, so that the features can explain the evolution of the target (e.g. a business where “number of products sold” is the target variable and a created “marketing expense” feature is correlated with it). A second sketch after this list illustrates this generation step.
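To make the counterfactual workflow in challenge 1 concrete, here is a minimal sketch in Python. It assumes a trained autoencoder already split into `encoder` and `decoder` models (Keras-style `predict` calls); all function and variable names are illustrative, not Deltika’s actual code.

```python
import numpy as np

def sample_latent_class(encoder, X_class, n_samples=1000, scale=0.5):
    """Steps 1-2: identify a class in the latent layer and generate samples from it.

    Fits a simple diagonal Gaussian to the latent codes of one class and draws
    synthetic latent points from it -- one straightforward way to "generate
    samples in the latent layer from a specific class".
    """
    Z = encoder.predict(X_class)                  # latent codes of the class
    mean, std = Z.mean(axis=0), Z.std(axis=0)
    return np.random.normal(mean, scale * std, size=(n_samples, Z.shape[1]))

def find_counterfactual(x, encoder, decoder, latent_candidates):
    """Steps 3-4: nearest latent neighbour, decoded back to the output space."""
    # Project the original sample into the latent layer.
    z = encoder.predict(x[np.newaxis, :])[0]
    # Find the closest generated neighbour in the latent space.
    distances = np.linalg.norm(latent_candidates - z, axis=1)
    z_cf = latent_candidates[np.argmin(distances)]
    # Decode the neighbour: the decoded point is the counterfactual instance.
    return decoder.predict(z_cf[np.newaxis, :])[0]
```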
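And here is a minimal sketch of the table generation in challenge 2, using the example from the text: the target “number of products sold” is built to depend on a “marketing expense” feature, so an explainability model applied to the table should recover that correlation. Column names, patterns, and coefficients are all made up for illustration.

```python
import numpy as np
import pandas as pd

def make_fake_table(n_days=365, seed=0):
    """Build one synthetic table with a forced feature/target correlation."""
    rng = np.random.default_rng(seed)
    days = pd.date_range("2020-01-01", periods=n_days, freq="D")

    # Driver feature: marketing expense with a weekly pattern plus noise.
    t = np.arange(n_days)
    marketing = 100 + 20 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 5, n_days)

    # Target: forced to depend on the previous day's marketing expense,
    # so the feature "explains" the evolution of the target.
    sold = 50 + 0.8 * np.roll(marketing, 1) + rng.normal(0, 3, n_days)

    return pd.DataFrame(
        {"marketing_expense": marketing, "products_sold": sold}, index=days
    )
```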

Achievements (according to KPIs)

  1. Created an autoencoder model that can efficiently identify a counterfactual to a specific sample
  2. Generated fake tables that can be used in the future to apply feature explainability models to them and test their effectiveness
  3. Applied a neural network and an LSTM to the fake generated data, in time-series or static-data format, to predict the target variable for the next day (a sketch of the time-series version follows)
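As an illustration of the time-series variant of achievement 3, the following sketch trains a small Keras LSTM to predict the next day’s target on the synthetic table from the generation sketch above; the window size and hyperparameters are assumptions, not the ones actually used.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, window=7):
    """Slice a 1-D series into (samples, window, 1) inputs and next-day targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    return X[..., np.newaxis], series[window:]

df = make_fake_table()  # from the data-generation sketch above
X, y = make_windows(df["products_sold"].to_numpy())

model = Sequential([
    LSTM(32, input_shape=(X.shape[1], 1)),
    Dense(1),  # predicted products_sold for the next day
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```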

Future project development 

Counterfactual autoencoder part: test the model on real data and see how well it performs.

Data generation part: the fake data could be improved by making it more complex, adding more “stories” for the models to explain, more features…
