
My First Step as a Data Scientist | Fellows Talk About Their Company Projects

Experience is the best way to learn anything, which is why we send our students to work on projects with leading companies in the tech industry. Our students from the February cohort just finished their projects. We asked Yoav Vollansky and Dor Meir about their projects with Fiverr and Anna Roytberg about her project with Imprint.

 

In simple words, what is the model behind your projects?

Yoav: The goal of the project was to research and develop an NLP-based model that would help Fiverr’s analytics team better understand what their users (some of whom are service buyers) are looking for when using the website’s free search. This is important for analytics purposes, since, as with all search scenarios, different users who are looking for the same thing might use different search queries.

For example, let’s say I need a new website to market my new bicycle model. In this case, I might look for “website design”, “landing page”, or maybe “new product marketing website”. Some people might even try “quickly build shop site for bicicle” (spelling mistake intended!). We want to be able to know that all of these search queries have the same intent behind them.

Anna: I was working on sentence and paragraph segmentation. 

I wanted to get practical experience in both Deep Learning and Machine Learning, so I used Neural Networks (a bidirectional LSTM) for sentence segmentation and classical machine learning for paragraph segmentation.

Dor: Clustering search queries, i.e. grouping together the searches typed into the Fiverr website’s search box that belong to buyers with the same buying intent, so that the analysts’ team can better understand current search trends and identify new ones.
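For readers curious what grouping similar queries can look like in code, here is a minimal sketch. It is not the pipeline built at Fiverr, whose details are not described in this interview. It assumes a much simpler stand-in technique: character trigram overlap (Jaccard similarity) with a greedy grouping pass. The function names and the 0.3 threshold are hypothetical. This approach can catch spelling variants such as “bicicle”, but unlike an NLP representation it will not recognise that “landing page” and “website design” share an intent.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a query, padded with spaces."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_queries(queries, threshold=0.3):
    """Greedily assign each query to the most similar existing cluster.

    A query joins a cluster when its best similarity to any member
    reaches the (hypothetical) threshold; otherwise it starts a new one.
    """
    clusters = []
    for query in queries:
        grams = char_ngrams(query)
        best, best_score = None, threshold
        for cluster in clusters:
            score = max(jaccard(grams, char_ngrams(m)) for m in cluster)
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append([query])
        else:
            best.append(query)
    return clusters


# Made-up example queries, for illustration only.
groups = cluster_queries(
    ["bicycle shop website", "bicicle shop website", "logo design"]
)
print(groups)
```

The misspelled query lands in the same group as the correctly spelled one because most of their trigrams are shared, while “logo design” shares none and starts its own group.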

 

 

Which possible use of what you are working on do you find most interesting?

Yoav: This project appealed to me on a few levels. Firstly, it involved a lot of research and experimentation and not just development. This was a great opportunity to see how real-world NLP pipelines flow into classical machine learning algorithms. Seeing how everything eventually worked nicely was very satisfying.

Additionally, building an entire pipeline from zero to a process that actually runs in production (and is used by other people) was a great opportunity to take ownership of an end-to-end product. Finally, doing a project in a big company and working with a team of people from multiple parts of the organisation was an invaluable experience business-wise.

Anna: My tasks are about how to structure raw texts – sequences of words without any punctuation marks or even sentence boundaries. It’s really difficult to read raw texts, so we need a system to postprocess speech recognition output, for example, when generating automatic subtitles.
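One common way to frame this task is as sequence labelling: every word gets a label saying whether a sentence ends after it. The sketch below is an illustration under that assumption, not Anna’s actual code. It shows how labelled examples could be derived from ordinary punctuated text by stripping the punctuation and keeping the boundaries as labels. The function name is hypothetical, and the sentence splitter is deliberately naive (it would mishandle abbreviations such as “e.g.”).

```python
import re


def to_labeled_tokens(text):
    """Turn punctuated text into (word, is_sentence_end) pairs.

    The words are lowercased and stripped of punctuation, which mimics
    raw speech recognition output; the label is 1 for the last word of
    each sentence and 0 otherwise.
    """
    pairs = []
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        for i, word in enumerate(words):
            pairs.append((word, int(i == len(words) - 1)))
    return pairs


print(to_labeled_tokens("It works. Is it fast?"))
```

A model such as a bidirectional LSTM would then be trained to predict the second element of each pair from the word sequence alone.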

Dor: I find it very interesting that the clustering pipeline I built not only identifies the same clusters over time but also calculates the average price of the products sold in each cluster of searches, and so can automatically plot price changes that might reflect a trend of “demand higher than supply”.
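The price-tracking idea can be sketched in a few lines. This is a simplified illustration, not the Fiverr pipeline: it assumes each sale is already tagged with a time period and a cluster label, and both the record layout and the function names are hypothetical. The sample data is made up.

```python
from collections import defaultdict
from statistics import mean


def average_price_by_cluster(sales):
    """Average price per (cluster, period).

    `sales` is an iterable of (period, cluster, price) tuples.
    """
    grouped = defaultdict(list)
    for period, cluster, price in sales:
        grouped[(cluster, period)].append(price)
    return {key: mean(prices) for key, prices in grouped.items()}


def price_change(averages, cluster, earlier, later):
    """Difference in a cluster's average price between two periods."""
    return averages[(cluster, later)] - averages[(cluster, earlier)]


# Made-up sales records, for illustration only.
sales = [
    ("2021-01", "logo", 10.0),
    ("2021-01", "logo", 20.0),
    ("2021-02", "logo", 30.0),
]
averages = average_price_by_cluster(sales)
print(price_change(averages, "logo", "2021-01", "2021-02"))
```

A sustained positive change for one cluster is the kind of signal that could then be plotted and inspected by an analyst.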

 

 

Do you have a personal connection to the project? Does it connect to something from your inner world that intrigues you?

Yoav: I felt a connection to this project since Fiverr is a website I already knew and had used myself in the past, as a professional freelance musician. It was interesting to see how some of the machinery worked under the hood, as well as being exposed to the business side of a big online services marketplace.

Anna: Yes, I come from applied linguistics, and I have seen perfectly good NLP algorithms give poor results only because of the low quality of the input data. Sentence segmentation might not sound “cool”, but it’s a necessary foundation for any NLP system built on speech recognition output.

Finally, it was the most research-oriented project on the list of practice projects, so I could spend a lot of time on what I love: reading scientific papers in applied linguistics.

Dor: Since Fiverr is essentially a marketplace of buyers and sellers and I have a master’s degree in Economics, it really feels like I have a personal connection to the project. Some of the research challenges I struggled with in Fiverr are very similar to the ones I had in my academic thesis.

 

 

When you started your project, did you feel professionally prepared for it?

Yoav: I feel that the course at <itc> equipped me with the skills I needed to face unfamiliar technical and business problems, explore, research, and finally develop an end-to-end solution. Combined with the instruction and guidance of the project tech lead at Fiverr, those skills were put to use and my contribution turned out to be of real value.

Anna: Yes, I did, I felt very confident. The course at <itc> helped me a lot. For example, for the first baseline I simply adapted a BiLSTM model from one of our NLP assignments. I was sure that the model itself was correct because the assignment had been approved by the checkers during the course.

Dor: I definitely had the theoretical basis to understand the two phases of the project – NLP representation and clustering – but lacked concrete experience in those subjects.

 

What is the structure of your team in the company you work with?

Yoav: The core project team consisted of a tech lead (a senior data scientist) and two more data scientists (one of whom was me). Our team met (online) 2-3 times a week. Additionally, we had weekly meetings in a larger forum with Fiverr’s head data scientist as well as other data analytics and BI people from the extended data department.

Anna: We worked online, so I know nearly nothing about the company structure. I communicated with only a few people from the company – my supervisor, two other data scientists (all of them are <itc> alumni) and a team lead. 

Dor: Yoav and I worked together under the supervision of Eitan, a senior data scientist in the data department at Fiverr. There were other data scientists on the team whom we didn’t get the chance to meet, but we did meet quite a lot with other analysts and some team leaders who were the clients of the project.

 

Do you have any advice for those who will join <itc>?

Yoav: Personally, I really enjoyed the project, as it was held in a large company where I got a lot of guidance while still having the freedom to explore and implement my own ideas. To future <itc> students, my advice is to take this opportunity as a way to learn as much as possible, demonstrate your skills and creativity, and just enjoy doing valuable real-world work.

Anna: Don’t be shy! This is important throughout the learning process, but extremely important when you are choosing the practice project. So talk with the people presenting the projects and with the staff, and try to understand how the company plans to work with you: whether you will make your own decisions or follow a prepared plan, who will be your mentor, and so on. Figure out what is most important for you and then make your choice. Try to make full use of these 5 weeks (and of the whole course in general).

Dor: <itc> is a great place to study Data Science hands-on – clear up your schedule and come with full passion to learn. I’d recommend finishing some basic online courses and reading or listening to supplementary material prior to the course – this might help you better understand the material and achieve a higher level of knowledge. I also recommend always looking for a different partner to work with – I learned a lot from working in different groups with other people, and it was also much more fun to get to know other people.

