Brian Troib
Data Science Fellows February 2021 Cohort
Abstract
Onboarding a new client at Agora can take up to a few months. This project aims to reduce that to a few
weeks by developing tools that make the data easier to process and that identify weak spots as quickly as
possible.
Challenges
1. The dataset was unlabeled, so I had to add labels to it manually.
2. We had to analyze the dataset as though we had no prior information about it, including working out
which category each feature belongs to. This is very different from how we normally received datasets
during the course.
Achievements
1. Labelled the entire dataset.
2. Wrote hard-coded rules that label the columns of a new dataset.
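
A minimal sketch of what hard-coded column labelling could look like is given below. It is an illustration only, not the project's actual code: the category names ("boolean", "numeric", "datetime", "categorical", "text"), the date formats, the `max_categories` threshold, and the function names are all assumptions made for this example.

```python
from datetime import datetime

# Hypothetical date formats; a real dataset may need others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")
# Hypothetical tokens treated as boolean values.
BOOLEAN_TOKENS = {"true", "false", "yes", "no", "0", "1"}


def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text):
    """Return True if the string matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def label_column(values, max_categories=10):
    """Assign a coarse category to one column using fixed, hand-written rules."""
    # Ignore missing and blank cells so they do not affect the decision.
    present = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not present:
        return "empty"
    # Rule order matters: a column of "0"/"1" is labelled boolean, not numeric.
    if all(v.lower() in BOOLEAN_TOKENS for v in present):
        return "boolean"
    if all(_is_number(v) for v in present):
        return "numeric"
    if all(_is_date(v) for v in present):
        return "datetime"
    # Few distinct values suggests a categorical feature; otherwise free text.
    if len(set(present)) <= max_categories:
        return "categorical"
    return "text"


def label_columns(table, max_categories=10):
    """Label every column of a table given as {column name: list of values}."""
    return {name: label_column(vals, max_categories) for name, vals in table.items()}
```

For example, `label_columns({"age": ["34", "51", ""], "active": ["yes", "no"]})` labels `age` as numeric and `active` as boolean. Rules like these are easy to read and audit, but every new data quirk needs a new rule.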
Future project development
Our next step is to create a model that determines each feature's category, running in parallel with the
hard-coded labelling. This should yield better results when labelling the columns.