At iCarbonX, a major challenge is letting users easily report everyday occurrences that could affect their health. These reports can range from the user's sleep patterns to the food they consume and the exercise they do. Today, users enter this information as free text in an iCarbonX app. To extract all the necessary information from these natural-language sentences accurately and at scale, high-performing named entity recognition (NER) algorithms need to be employed. This allows the pipeline to be automated and removes the need for human intervention. The goal of the project was therefore to develop a new food entity recognition algorithm that accurately parses this free text.
- Understanding the mechanics of the BERT model and how to manipulate it.
- Integrating the model into the existing system, which was made harder by a lack of familiarity with the codebase.
Achievements (according to KPIs)
- Learned the BERT architecture in depth and adapted it to create an NER algorithm
- Changed the existing annotation scheme so that it fits off-the-shelf (OTS) tools for BERT
- Integrated the NER model into the system for production
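
To illustrate what adapting an annotation scheme for off-the-shelf BERT tooling can involve, the sketch below converts character-span food annotations into per-token BIO tags, a format commonly expected by token-classification tools. This is only an illustration under assumptions: the report does not state which scheme was used before or after the change, and the function name `spans_to_bio`, the `FOOD` label, the example sentence, and the whitespace tokenization are all hypothetical.

```python
import re

def spans_to_bio(text, spans, label="FOOD"):
    """Convert character-level (start, end) spans into per-token BIO tags.

    Assumption: tokens are split on whitespace. Real BERT tooling would
    further split tokens into word pieces and align the tags to them.
    """
    tokens, tags = [], []
    open_span = None  # span the previous token belonged to, if any
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.start(), match.end()
        # Find the annotated span (if any) that overlaps this token.
        hit = next(
            ((s, e) for s, e in spans if tok_start < e and tok_end > s),
            None,
        )
        if hit is None:
            tag = "O"                 # outside any entity
        elif hit == open_span:
            tag = "I-" + label        # continues the current entity
        else:
            tag = "B-" + label        # begins a new entity
        open_span = hit
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags

# Hypothetical user report with two annotated food mentions.
text = "I had brown rice and an apple for lunch"
spans = [(6, 16), (24, 29)]  # "brown rice", "apple"
tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token ends up with exactly one tag, so the output can be written as one token/tag pair per line, which is the layout many sequence-labeling tools read.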
Try different variations of the algorithm and augment it with other bespoke features.