Abstract
This project develops methods for identifying medication adverse events in a large medical dataset comprising millions of visits. The process includes constructing case studies of established medication and adverse event pairs, and defining the relevant experiment and control groups, outcomes, covariates of interest, and time windows. The end-to-end pipeline covers data acquisition, cleaning, preparation, analysis, and visualization using SQL and Python. The resulting models will then be used to expand the capabilities of interpreting patient symptoms and to augment physician decision-making in tailoring medication use.
Challenges
- Learning the relevant medical background and the company’s terminology
- Getting familiar with the database structure, tables, and fields
- Defining a suitable research design (e.g., the temporal order of medical conditions, drugs, and adverse events)
- Finding ways to cut long running times and make the code more efficient
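
The time-order requirement in the research design can be illustrated with a minimal sketch. It assumes each patient's first prescription serves as the index date and that an adverse event counts only if it occurs strictly after that date and within a fixed window. The function name `label_first_exposures`, the tuple layout, the 30-day default, and the sample records are all hypothetical and are not taken from the project's actual data or code.

```python
from datetime import date, timedelta


def label_first_exposures(prescriptions, events, window_days=30):
    """Return {patient_id: sign} based on each patient's first prescription.

    sign is 1 if an adverse event occurs strictly after the first
    prescription date and no later than `window_days` days after it,
    otherwise 0. (Hypothetical helper; the window length is an assumption.)
    """
    # Earliest prescription date per patient, used as the index date.
    index_dates = {}
    for patient_id, rx_date in prescriptions:
        if patient_id not in index_dates or rx_date < index_dates[patient_id]:
            index_dates[patient_id] = rx_date

    window = timedelta(days=window_days)
    signs = {patient_id: 0 for patient_id in index_dates}
    for patient_id, event_date in events:
        start = index_dates.get(patient_id)
        if start is None:
            continue  # event for a patient with no prescription on record
        # Enforce time order: the event must follow the medication.
        if start < event_date <= start + window:
            signs[patient_id] = 1
    return signs


# Made-up sample records: (patient_id, date).
prescriptions = [
    ("p1", date(2020, 1, 10)),
    ("p1", date(2020, 3, 1)),
    ("p2", date(2020, 2, 5)),
    ("p3", date(2020, 4, 1)),
]
events = [
    ("p1", date(2020, 1, 25)),  # 15 days after the first prescription
    ("p2", date(2020, 1, 20)),  # before the prescription, so not counted
    ("p3", date(2020, 6, 15)),  # 75 days after, outside a 30-day window
]

signs = label_first_exposures(prescriptions, events, window_days=30)
```

Patients with no qualifying event keep a sign of 0, so the absence of an adverse event is retained as information rather than dropped. The window length is the kind of parameter that would need tuning per medication and adverse event pair.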
Achievements (according to KPIs)
- Writing efficient SQL and Python code
- Applying two end-to-end pipelines to medication and adverse event pairs
- Identifying relevant confounding variables and systematic biases
- Cutting running times from a few hours to less than a minute
- The research will proceed to further development steps
Further development
- Add matching between groups
- Run the code on the larger datasets (“Phase2”)
- Use more than one sample for each patient, potentially by using mixed-effects models with patient as a random effect
- Use the absence of an adverse event (“sign = 0”) as relevant information
- Tune different parameters of the model (e.g., the time interval between medication and adverse event)
- Identify and test more medication and adverse event pairs
- Build a generalized model for adverse event detection at the population level and adverse event classification at the individual-patient level
- Use this model to expand the capabilities of interpreting patient symptoms and to augment physician decision-making in tailoring medication use
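
The first item above, matching between groups, could take many forms, and the source does not say which method is planned. As one possibility, the sketch below shows 1:1 exact matching without replacement on a small set of covariates. The function name `exact_match`, the covariates `age_band` and `sex`, and the sample rows are hypothetical.

```python
import random
from collections import defaultdict


def exact_match(treated, controls, keys, seed=0):
    """Pair each treated row with one unused control that has identical
    values for every covariate in `keys` (1:1, without replacement).

    Treated rows with no remaining identical control are left unmatched.
    """
    rng = random.Random(seed)

    # Group controls into strata keyed by their covariate values.
    pool = defaultdict(list)
    for row in controls:
        pool[tuple(row[k] for k in keys)].append(row)
    # Shuffle within each stratum so the chosen control is not order-dependent.
    for candidates in pool.values():
        rng.shuffle(candidates)

    pairs = []
    for row in treated:
        candidates = pool.get(tuple(row[k] for k in keys))
        if candidates:
            pairs.append((row, candidates.pop()))
    return pairs


# Made-up rows; the covariate names are assumptions for illustration.
treated = [
    {"id": "t1", "age_band": "60-69", "sex": "F"},
    {"id": "t2", "age_band": "40-49", "sex": "M"},
    {"id": "t3", "age_band": "60-69", "sex": "F"},
]
controls = [
    {"id": "c1", "age_band": "60-69", "sex": "F"},
    {"id": "c2", "age_band": "60-69", "sex": "M"},
    {"id": "c3", "age_band": "60-69", "sex": "F"},
    {"id": "c4", "age_band": "50-59", "sex": "M"},
]

pairs = exact_match(treated, controls, keys=("age_band", "sex"))
```

Exact matching is easy to audit but discards treated patients who have no identical control, so with many covariates a propensity-score or caliper-based approach may be a better fit.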