Data Science Fellows Projects 2019

Model creation for marketing automation rules, based on big data

Project by Juan

Abstract

Jeeng is a targeted audience notification service that handles large volumes of traffic as relevant data is pushed to subscribed customers. Initially, there was no way to store or manage this traffic for later analysis. To enable this, a distributed streaming pipeline, storage services, and processing services were set up with a cloud vendor.

Challenges

  • Understanding how Jeeng works: how audiences are created and how data is managed on both the client and the server side.
  • Working around the limitations of client-side executed code and different data sources which may or may not remain consistent.
  • Learning how to use the services of a cloud vendor, namely IBM Cloud Services, also known as Bluemix.
  • Creating a resilient, highly available and consistent data pipeline through Event Streams (Apache Kafka) for development, staging and production environments.
  • Assembling serverless functions and endpoints that produce messages into Event Streams so that they can be saved to a highly available store, Cloud Object Storage.
  • Deploying all these services and their updates in tandem by creating deployment scripts for each environment.
  • Evaluating whether model training and model evaluation could be implemented client-side, or at least whether the trained model could be executed client-side.
  • Using many different tools: serverless functions and their linkage to REST endpoints, Node.js, Python, Apache Spark, Bash, Apache Kafka, and Cloud Object Storage.
  • Taking ownership of the Native Mobile SDK project and finding suitable developers to build it.
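
The serverless publishing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the handler name `handle`, the `events-<environment>` topic naming, the parameter names, and the injected `publish` callable (standing in for a real Kafka producer) are all assumptions made for the example.

```python
import json

# Environments named in the write-up; the topic naming scheme is hypothetical.
ENVIRONMENTS = ("development", "staging", "production")


def handle(params, publish):
    """Validate an incoming event and hand it to a publisher.

    `params` mimics the dict a serverless action receives from a REST endpoint.
    `publish(topic, payload_bytes)` is a stand-in for a Kafka producer call.
    """
    environment = params.get("environment", "development")
    event = params.get("event")

    if environment not in ENVIRONMENTS:
        return {"statusCode": 400, "body": {"error": "unknown environment"}}
    if not isinstance(event, dict):
        return {"statusCode": 400, "body": {"error": "event must be a JSON object"}}

    topic = f"events-{environment}"  # hypothetical topic name
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    publish(topic, payload)
    return {"statusCode": 202, "body": {"topic": topic, "bytes": len(payload)}}
```

Passing the publisher in as an argument keeps the handler testable without a running broker; in a deployed function it would presumably wrap a real producer client.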

Achievements (according to KPIs)

  • The Event Streams have successfully been deployed to all environments: development, staging, and production.
  • The serverless functions and endpoints that publish to the Event Streams have been deployed to all environments.
  • Created a Bash script, runnable through npm, that specifies the environment to which the current version of the serverless functions is deployed.
  • Evaluated client-side training. It proved infeasible because of difficulties in gathering evaluation metrics and in providing the data needed for evaluation.
  • Found a possible fit for the Native Mobile SDK.
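
The environment selection performed by the deployment script can be illustrated with a short sketch. The original script is written in Bash and is not shown here; the Python version below only mirrors the idea, and the `TARGETS` table, its namespace and topic values, and the function name `resolve_target` are hypothetical.

```python
# Hypothetical per-environment settings; real values are not given in the write-up.
TARGETS = {
    "development": {"namespace": "dev", "topic": "events-development"},
    "staging": {"namespace": "stage", "topic": "events-staging"},
    "production": {"namespace": "prod", "topic": "events-production"},
}


def resolve_target(args):
    """Map a single command-line argument to the settings of one environment."""
    if len(args) != 1 or args[0] not in TARGETS:
        valid = ", ".join(sorted(TARGETS))
        raise ValueError(f"usage: deploy <environment>, where environment is one of: {valid}")
    return TARGETS[args[0]]
```

Rejecting anything other than exactly one known environment name is a deliberate choice in this sketch: it prevents an accidental deployment to production caused by a typo or a missing argument.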

Further development

Analyze the newly streamed and stored data to create real-time visualizations and to build models that may improve notification targeting. Execute the Native Mobile SDK project.
