Celloscope aims to turn any smartphone into a personal medical monitor that gives stroke survivors more control over their lives and a better rehabilitation process, through continuous feedback and a connection with their therapist. To analyze a patient’s motion well, the application must first recognize which activity the patient is doing. The project’s goal is to detect and tag different activities such as walking, running, and climbing stairs, with the phone in the front pocket, back pocket, hand, or purse/bag.
The first challenge was to understand the gait analysis data. The gait cycle is composed of two phases for each foot:
- Stance: when the foot is on the ground
- Swing: when the foot is in the air
The full cycle can also be divided into:
- Initial double limb stance (the two feet are on the floor)
- Single limb stance (stance right and swing left for example)
- Terminal double limb stance
- Swing (swing right and stance left)
- Double limb stance
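As a minimal sketch of the phases listed above, the support state of a single sample can be derived from two foot-contact flags. The function name, the flag representation, and the sample values are illustrative assumptions, not part of the existing pipeline, and the right leg is taken as the reference limb as in the list above.

```python
def limb_support(left_on_ground: bool, right_on_ground: bool) -> str:
    """Name the support state for one sample from two foot-contact flags."""
    if left_on_ground and right_on_ground:
        return "double limb stance"
    if right_on_ground:
        return "single limb stance (stance right, swing left)"
    if left_on_ground:
        return "swing (swing right, stance left)"
    # Neither foot on the ground: this happens in running, not in walking.
    return "no contact"


# Hypothetical (left, right) contact flags, one pair per phase of a walking cycle.
samples = [(True, True), (False, True), (True, True), (True, False), (True, True)]
phases = [limb_support(left, right) for left, right in samples]
```

Here `phases` follows the same order as the list: double limb stance, single limb stance, double limb stance, swing, double limb stance.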
It was also very important to understand what a stride is (the interval between two sequential initial floor contacts by the same limb), because all the motions we analyzed were divided according to a certain number of strides. Then, I had to understand the data itself and the preprocessing that had already been done.
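To illustrate the stride definition, here is a minimal sketch that cuts a signal at the initial floor contacts of one limb and then groups the strides into fixed-size windows. The helper names, the sample indices, and the window size of two strides are assumptions made for the example; the actual preprocessing may differ.

```python
def split_into_strides(signal, initial_contacts):
    """Cut a signal into strides.

    A stride runs from one initial floor contact to the next initial
    contact of the same limb, so N contacts give N - 1 strides.
    """
    return [
        signal[start:end]
        for start, end in zip(initial_contacts, initial_contacts[1:])
    ]


def group_strides(strides, strides_per_window):
    """Group consecutive strides into windows of a fixed number of strides.

    Leftover strides that do not fill a whole window are dropped.
    """
    last_start = len(strides) - strides_per_window
    return [
        strides[i:i + strides_per_window]
        for i in range(0, last_start + 1, strides_per_window)
    ]


# Hypothetical data: a 20-sample signal and the sample indices at which
# the same foot first touches the floor.
signal = list(range(20))
contacts = [2, 7, 11, 16, 19]

strides = split_into_strides(signal, contacts)   # 4 strides
windows = group_strides(strides, 2)              # 2 windows of 2 strides each
```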
A technical challenge I encountered was that I had to use NoSQL to access the database containing the data, and I knew nothing about it. After some explanations and examples, I realized it was not very far from SQL and managed to use it. After all this data analysis and preprocessing, which took around one week and was already very challenging, I could start the modeling part. I started with very simple deep learning models, such as a fully connected network, for classification.
Then I analyzed the results by plotting some metrics (PR curve, ROC curve, confusion matrix…), checked the misclassified data, and tried to understand the reasons for the errors.
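The kind of error analysis described here can be sketched in a few lines of plain Python: count the (true, predicted) pairs, derive precision and recall per class, and list the misclassified records for inspection. The labels and predictions below are made up for illustration and the function names are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def precision_recall(counts, label):
    """Precision and recall of one class, computed from the confusion counts."""
    true_positives = counts[(label, label)]
    predicted = sum(n for (_, pred), n in counts.items() if pred == label)
    actual = sum(n for (true, _), n in counts.items() if true == label)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def misclassified(y_true, y_pred):
    """Indices of the records whose prediction differs from the label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]


# Hypothetical labels and predictions for six walking records.
y_true = ["left", "left", "right", "right", "right", "left"]
y_pred = ["left", "right", "right", "right", "left", "left"]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts, "left")
wrong = misclassified(y_true, y_pred)
```

Inspecting the records at the indices in `wrong` is the step that can reveal whether the model or the label is at fault.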
Analyzing these errors showed that the labels we were using were not always correct, so the challenge of the project changed. The goal was no longer only to build the most accurate model possible but also to improve the labels. We therefore decided to work without the labels and use unsupervised learning methods to see whether we could distinguish several clusters, each representing one activity or one position of the phone.
To plot the records, each of which has 303 dimensions, I first needed to reduce their dimensionality, and I used Principal Component Analysis (PCA). The results were not satisfying at all, so we switched to a non-linear dimensionality reduction method: the autoencoder.
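For reference, projecting 303-dimensional records onto two principal components can be sketched with NumPy alone. This is a generic PCA via singular value decomposition, not necessarily the implementation used in the project, and the random matrix below is only a stand-in for the real records.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto their first principal components."""
    X_centered = X - X.mean(axis=0)
    # The rows of Vt are the principal directions, sorted by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return X_centered @ components.T, explained_ratio


# Synthetic stand-in: 100 records with 303 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 303))

points_2d, ratio = pca_project(X)
```

A low total in `ratio` means the two plotted axes capture little of the variance, which is one reason a linear projection can fail to separate the classes visually.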
After learning what an autoencoder is and what variations exist, the next challenge was to implement one. Then, using the reduced-dimension data, I plotted the points using the labels we had, and we could easily see a clear distinction between walking with the phone in the right pocket and in the left pocket. I tried different unsupervised learning methods such as KMeans, DBSCAN, Spectral Clustering, and Gaussian mixture models to relabel the data, but none of them worked. In the end, I used an SVM.
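To give an idea of what implementing an autoencoder involves, here is a deliberately small sketch in NumPy: one tanh encoder layer down to two dimensions, one linear decoder layer, trained by plain gradient descent on the reconstruction error. The architecture, hyperparameters, and synthetic data are all assumptions for illustration; the project's actual model was likely built with a deep learning framework and may look quite different.

```python
import numpy as np


def train_autoencoder(X, latent_dim=2, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer autoencoder; return (encode, losses)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    b_enc = np.zeros(latent_dim)
    W_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    b_dec = np.zeros(n_features)
    losses = []

    for _ in range(epochs):
        # Forward pass: non-linear encoding, linear reconstruction.
        Z = np.tanh(X @ W_enc + b_enc)
        X_hat = Z @ W_dec + b_dec
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))

        # Backward pass for the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = Z.T @ grad_out
        grad_b_dec = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W_dec.T) * (1.0 - Z ** 2)  # tanh derivative
        grad_W_enc = X.T @ grad_pre
        grad_b_enc = grad_pre.sum(axis=0)

        # Gradient descent step.
        W_enc -= lr * grad_W_enc
        b_enc -= lr * grad_b_enc
        W_dec -= lr * grad_W_dec
        b_dec -= lr * grad_b_dec

    def encode(data):
        """Map records to the low-dimensional code used for plotting."""
        return np.tanh(data @ W_enc + b_enc)

    return encode, losses


# Synthetic stand-in: 10-dimensional records driven by 2 hidden factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = np.tanh(latent @ mixing) + 0.05 * rng.normal(size=(200, 10))
X = X - X.mean(axis=0)

encode, losses = train_autoencoder(X)
codes = encode(X)  # shape (200, 2), ready for a scatter plot
```

The two columns of `codes` play the same role as the two PCA components, but the tanh layer lets the mapping be non-linear.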
Finally, I put this work into production. The main challenge was to clean up all the research code I had written and combine the whole process into one function that would be sent to the server.
To do this, I had to save my models (so as not to retrain them on large amounts of data, which takes time) and produce the output from a single input.
Putting work into production was not something we had practiced during the program, and I think it greatly improved my computer science skills.
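The pattern of training once, saving the model, and serving predictions from a single function can be sketched with the standard library. The nearest-centroid "model", the file name, and the function names are hypothetical stand-ins chosen to keep the example short; they are not the models or the interface actually deployed.

```python
import math
import pickle
import tempfile
from pathlib import Path


def train_centroids(features, labels):
    """Stand-in for slow training: one mean vector per label."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def save_model(model, path):
    """Persist the trained model so the server never has to retrain it."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(record, model_path):
    """Single entry point: one input record in, one label out."""
    model = load_model(model_path)
    return min(model, key=lambda label: math.dist(record, model[label]))


# Train once offline, then save the result to a (hypothetical) file.
model_path = Path(tempfile.gettempdir()) / "position_model.pkl"
model = train_centroids(
    [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.1]],
    ["left", "left", "right", "right"],
)
save_model(model, model_path)

# On the server: only load and predict.
result = predict([0.9, 1.0], model_path)
```

Note that `pickle` should only be used to load files from a trusted source, since unpickling can execute arbitrary code.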
Achievements (according to KPIs)
I managed to recognize two positions of the phone during one activity: walking. This work laid the groundwork for the future development detailed below.
Once we progress with identifying different positions of the phone, we would like to extend the approach to other activities such as:
- Climbing down the stairs
- Climbing up the stairs
And also to other positions of the phone:
- In the hand
Julia’s project was to identify the context of motion records, meaning the user’s activity and the position of the phone on the user’s body. This task is one of the most fundamental stages of the gait analysis pipeline, which is the core of our product. Julia started by understanding the whole motion analysis pipeline, from collected motion records to analyzed gait parameters. Then she applied simple ML tools to classify records collected during walking and to recognize which side pocket the phone was placed in. Before improving the methods, she investigated the differences between her results and the results of the current process.
Next, she worked on finding an embedding that distinguishes between activities and positions, using autoencoder representations. Last, Julia examined how new activities that we do not currently handle behave through the process. For example, she examined how “going upstairs” looks in the embedded space, and how quickly and accurately we can detect a new activity like that one.
Julia’s work was impressive in both the number and the quality of its outputs. She did more than expected, and more quickly, and always asked for feedback and new instructions. Her findings improved the current process, and her work is a good basis for dealing with the challenge of the variety of activities and bodily positions. She is an excellent learner, eager to study new subjects, creative and independent. Julia has the character of a successful researcher and a great data scientist.