Transcriber and Task Behavior – developing a model for reducing the cost of transcription E2E.

Project by Gabriel


Verbit is a company that combines automatic speech recognition technology with human transcribers to achieve 99% accuracy in its transcriptions. The project’s goal was to create a model that recognizes efficient behaviors of human transcribers and uses them to reduce the overall cost of the human transcription layer.


  • Understanding how Verbit works, including the different layers and protocols in place that end up having an (indirect) influence on the analyzed data.
  • Data engineering/processing – complex functions had to be written to transform the raw data into analyzable data. Some existing functions had to be adapted as they didn’t always fit the requirements.
  • Data cleaning – subtle outliers caused by off-the-record parts were removed, and names and other specific words were stripped out to make the analysis more precise.
  • Data exploration – many graphs and statistics were created to better understand which features potentially have the biggest impact on efficiency and to better prioritize further work. The data needed to precisely quantify certain aspects wasn’t always available, so statistical estimates had to be made from other available data.
  • Unsupervised topic classification – various unsupervised topic classification techniques were used to analyze more specific potential efficiency improvements on a per-topic basis.
  • Actionable conclusions – making sure that the model would be able to predict features with a big impact on efficiency while also allowing the company to take action.
  • Data quantity – code was first tested locally on a smaller dataset and only later applied to a bigger dataset, whose size would on certain occasions cause the code to run for long periods.
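
As an illustration of the data cleaning step above, here is a minimal sketch of one common way to drop outliers, the interquartile range (IQR) rule. The project does not state which method was actually used, so the IQR rule, the `edit_ratio` field and the sample records are all assumptions made for the example.

```python
from statistics import quantiles


def remove_iqr_outliers(records, key, k=1.5):
    """Keep records whose `key` value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[key] for r in records]
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the observed values
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in records if low <= r[key] <= high]


# Hypothetical task records: minutes of editing per minute of audio.
# Task 7 stands in for a task distorted by a long off-the-record part.
tasks = [
    {"task_id": 1, "edit_ratio": 3.1},
    {"task_id": 2, "edit_ratio": 2.8},
    {"task_id": 3, "edit_ratio": 3.4},
    {"task_id": 4, "edit_ratio": 3.0},
    {"task_id": 5, "edit_ratio": 2.9},
    {"task_id": 6, "edit_ratio": 3.3},
    {"task_id": 7, "edit_ratio": 25.0},
    {"task_id": 8, "edit_ratio": 3.2},
]

cleaned = remove_iqr_outliers(tasks, key="edit_ratio")
print([t["task_id"] for t in cleaned])
```

A rule like this only catches outliers that are extreme on a single feature. The "subtle" outliers mentioned above may well have needed domain-specific checks instead.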

Achievements (according to KPIs)

  • Editor efficiency prediction model
  • Metrics:
      ○ Average accuracy: 0.87
      ○ Average F1 score: 0.87

  • A full report was written describing the project, the process of development, the features created, the model’s quality and limitations and final conclusions
  • A list of features affecting editor efficiency
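
For readers unfamiliar with the two metrics reported above, the sketch below shows how accuracy and an averaged F1 score can be computed from predicted and true labels. The write-up does not say how the averages were taken (for example across classes or across cross-validation folds), so the macro average over classes, the `efficient`/`inefficient` labels and the sample predictions are all assumptions made for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for eight editors.
y_true = ["efficient", "efficient", "inefficient", "inefficient",
          "efficient", "inefficient", "efficient", "inefficient"]
y_pred = ["efficient", "inefficient", "inefficient", "inefficient",
          "efficient", "efficient", "efficient", "inefficient"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, macro F1={f1:.2f}")
```

Accuracy and F1 can coincide, as they do in the reported results, when the classes are roughly balanced and the errors are spread evenly between them.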

Further development

  • Repeating the analysis on datasets from more types of clients, with the help of previously unavailable user-event data.
  • Using the engineered features for other purposes such as fraud detection
  • Audio transcription difficulty prediction using deep learning techniques
  • Guiding sales focus toward categories where Verbit is more efficient
  • Audio recording device with live issue signaling
