Data Science Fellows Projects 2019

Implementation of a Named Entity Recognition and dependency parsing algorithm using BERT

Project by Guy

Abstract

At iCarbonX, a major challenge is letting users easily report everyday occurrences that could affect their health. These range from the users' sleep patterns to the food they consume and the exercise they do. Today, users enter this information as free text in an iCarbonX app. Extracting all the necessary information from these natural-language sentences accurately and at scale requires high-performing named entity recognition (NER) algorithms, which allow the pipeline to be automated without human intervention. The goal of this project was therefore to develop a new food entity recognition algorithm that accurately parses this free text.

 

Challenges

  • Understanding the mechanics of the BERT model and how to manipulate it.
  • Integrating the model into the existing system, given limited familiarity with the codebase.

 

Achievements (according to KPIs)

  1. Learned the BERT architecture in depth and adapted it to create an NER algorithm
  2. Changed the existing annotation scheme to fit off-the-shelf (OTS) tools for BERT
  3. Integrated the NER model into the system for production use
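
To illustrate what adapting an annotation scheme for BERT tooling can involve, here is a minimal sketch that converts character-span annotations into per-token BIO tags, a format commonly expected by token-classification tools. The post does not describe the original or the new scheme, so the BIO format, the FOOD label, the whitespace tokenization, and the function name spans_to_bio are all assumptions made for illustration, not the project's actual implementation.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), end exclusive; hypothetical format


def spans_to_bio(text: str, spans: List[Span]) -> Tuple[List[str], List[str]]:
    """Convert character-span annotations into whitespace tokens with BIO tags.

    Assumption: spans do not overlap each other.
    """
    tokens: List[str] = []
    tags: List[str] = []
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.start(), match.end()
        tag = "O"  # default: token is outside any entity
        for start, end, label in spans:
            # A token belongs to an entity if it overlaps the span at all.
            if tok_start < end and tok_end > start:
                # The token covering the span's first character opens the entity.
                prefix = "B-" if tok_start <= start else "I-"
                tag = prefix + label
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


# Hypothetical user report with two food mentions.
text = "I ate two scrambled eggs and toast"
spans = [(10, 24, "FOOD"), (29, 34, "FOOD")]
tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In a real BERT pipeline the word-level tags would still need to be aligned with the model's subword tokens, which this sketch leaves out.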

 

Further development

Experiment with different variations of the algorithm and augment it with other bespoke features.
