Data Science Fellows Projects 2019

Hard disk drive failure prediction using machine learning and deep learning methods

Project by Michael

Abstract

Dell provides HDD monitoring support to predict drive failures and anticipate replacement of the component. Given an operational database of 15 million records covering 63 features and more than 30 thousand drives, we had to build an end-to-end predictive model that forecasts whether a drive will fail within a given time window.
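
To make the prediction target concrete, here is a minimal sketch of how "failure within a time window" could be turned into per-day labels. The function name, the `horizon_days` parameter and the toy drive are assumptions made for illustration; the project's actual labelling scheme is not described here.

```python
from datetime import date, timedelta

def label_within_horizon(obs_dates, failure_date, horizon_days):
    """Return 1 for each observation day on which the drive fails within
    the next `horizon_days` days (failure day included), else 0."""
    labels = []
    for day in obs_dates:
        if failure_date is None:
            labels.append(0)  # drive never failed in the observed period
        else:
            days_to_failure = (failure_date - day).days
            labels.append(1 if 0 <= days_to_failure <= horizon_days else 0)
    return labels

# Hypothetical drive observed for 10 consecutive days, failing on the last one.
start = date(2019, 1, 1)
obs_dates = [start + timedelta(days=i) for i in range(10)]
failure_date = obs_dates[-1]

# The same drive labelled for two different time windows.
labels_3d = label_within_horizon(obs_dates, failure_date, horizon_days=3)
labels_7d = label_within_horizon(obs_dates, failure_date, horizon_days=7)
```

Calling the function with several horizons yields one label column per time window, so the same pipeline can be evaluated at different forecast ranges.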

 

Challenges

  • Data processing – a large database with many missing days and missing values.
  • Time series modelling – the data are sequential per drive, and the sequences differ widely in length and characteristics.
  • Train/test split – ensure that no future information is used to train the model, and replicate the production setting, in which the model is tested as of a specific date.
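
The first and third challenges can be sketched in a few lines. This is an illustrative example only: `missing_days`, `temporal_split`, the `(drive_id, day)` record layout and the cutoff rule are assumptions, not the project's actual code.

```python
from datetime import date, timedelta

def missing_days(obs_dates):
    """List calendar days absent between a drive's first and last observation."""
    seen = set(obs_dates)
    first, last = min(obs_dates), max(obs_dates)
    span = (last - first).days
    all_days = (first + timedelta(days=i) for i in range(span + 1))
    return [day for day in all_days if day not in seen]

def temporal_split(records, cutoff, horizon_days):
    """Split (drive_id, day) records at `cutoff`.

    A training row is kept only if its whole label window ends before the
    cutoff; otherwise its label would depend on information from the future.
    """
    train, test = [], []
    for drive_id, day in records:
        if day + timedelta(days=horizon_days) < cutoff:
            train.append((drive_id, day))
        elif day >= cutoff:
            test.append((drive_id, day))
        # Rows in between are dropped: their labels are unknown at the cutoff.
    return train, test

# Hypothetical drive "A": reported on 1-10 January 2019, except the 4th and 5th.
days = [date(2019, 1, d) for d in range(1, 11) if d not in (4, 5)]
records = [("A", day) for day in days]

gaps = missing_days(days)
train, test = temporal_split(records, cutoff=date(2019, 1, 8), horizon_days=2)
```

Dropping the rows whose label window straddles the cutoff costs some training data, but it keeps the evaluation close to what a model deployed on that date would actually have seen.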

 

Achievements (according to KPIs)

  • Feature selection and engineering.
  • Built an LSTM regression model to fill in missing values, and both an HMM and an LSTM sequence model to predict failure over multiple time windows.
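
Because drive histories differ in length, sequence models such as an LSTM are usually fed padded batches together with a mask. The following is a minimal, framework-free sketch of that preparation step; `pad_sequences` here is a hypothetical helper written for this example, and the project may have handled variable lengths differently.

```python
def pad_sequences(sequences, pad_value=0.0):
    """Right-pad variable-length sequences to a common length.

    Returns the padded sequences plus a 0/1 mask marking real time steps,
    so that a sequence model can ignore the padding.
    """
    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [pad_value] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask

# Hypothetical single-feature readings for three drives of different ages.
drive_readings = [[0.1, 0.2, 0.3], [0.5], [0.4, 0.6]]
padded, mask = pad_sequences(drive_readings)
```

Deep learning frameworks typically provide their own padding and masking utilities; the plain-Python version above only shows the idea.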

 

Further development

  • Train and test the models on the entire dataset.
  • Deepen both data exploration and data modelling.
