Transcriber and Task Behavior – developing a model for reducing the cost of transcription E2E.

Project by Nitzan

April 14, 2019, 7:29 pm, Fellows 2018

Abstract

The project’s goal was to reduce transcription cost by understanding what makes some transcribers more efficient and more accurate than others. The work involved mining data from the company’s database, cleaning and processing it, engineering transcriber-related and transcription-job-related features, and finally modeling the problem and analyzing the results. The conclusions will be used to support both business and technology decisions in the company. The model’s quality was measured on only a subset of the dataset, so more testing and evaluation are required.

Challenges

How the company works – Technologies used, data types, transcription job life cycle
The problem – What exactly we are trying to achieve, how to measure a good transcription, how to measure efficiency, and how the data supports answering these questions
Infrastructure – There was no data science team at the company, so functions and tools had to be created during the project
Data collection – Understanding the database structure, what it contains, and how to retrieve the data efficiently
Feature engineering – Creating meaningful features to better understand the factors contributing to quality and efficiency
Data quantity – The goal was to get insights from the transcriptions of a specific customer, so the dataset was rather small
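
To make the question of how to measure a good transcription concrete, here is a minimal sketch of one common measure, word error rate (WER): the word-level edit distance between a reference transcript and a transcriber's output, divided by the number of reference words. The write-up does not say which quality measure the project used, so WER and the function name word_error_rate are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: the project's actual quality measure is not stated.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Toy example: one of six reference words was dropped.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value means the output is closer to the reference. The value can exceed 1 when the hypothesis contains many extra words.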

Achievements (according to KPIs)

Editor efficiency prediction model
Metrics:
- Average accuracy: 0.86
- Average F1 score: 0.86
A full report describing the project, the development process, the features created, the model’s quality and limitations, and the final conclusions
A list of features affecting editors’ efficiency
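
For readers unfamiliar with the two metrics above, the sketch below shows how accuracy and F1 can be computed and then averaged. It assumes a binary label (for example, efficient vs. not efficient) and assumes the averages were taken over cross-validation folds. Neither detail is stated in the write-up, and the labels in the example are made up and are not project data.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_over_folds(folds):
    """folds is a list of (y_true, y_pred) pairs, one per evaluation fold."""
    accuracies = [accuracy(t, p) for t, p in folds]
    f1_scores = [f1_score(t, p) for t, p in folds]
    return mean(accuracies), mean(f1_scores)


# Made-up labels for two folds, for illustration only.
toy_folds = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),
    ([0, 0, 1, 1], [0, 1, 1, 1]),
]
avg_accuracy, avg_f1 = average_over_folds(toy_folds)
print(avg_accuracy, avg_f1)
```

Reporting both numbers is useful when the classes are imbalanced, since accuracy alone can look good for a model that mostly predicts the majority class.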

Further development

Further investigation is required to verify the results. The following next steps are proposed:

Getting more data from different customers to create a more reliable model – the current model was trained on transcriptions of only one customer.
Experimenting more with the features that were created, such as the tf-idf vectorization, to fully understand them and their potential, and creating more features if needed.
Testing the findings on the editors in a controlled experiment.
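
As background for the tf-idf features mentioned above, the following is a minimal sketch of the textbook weighting: term frequency within a document multiplied by log(N / document frequency). The write-up does not describe the project's actual implementation. Libraries such as scikit-learn use a smoothed and normalized variant, so their values would differ, and the function name tf_idf and the sample texts are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Textbook variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up transcript snippets, for illustration only.
sample = ["the speaker paused", "the speaker laughed loudly"]
for vector in tf_idf(sample):
    print(vector)
```

Terms that appear in every document get a weight of zero, while terms specific to one document get the highest weights, which is what makes the vectors informative as model features.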
