Applied reinforcement learning project inspired by a medical control paper
Chemotherapy Dose Control with Q-Learning
Reinforcement learning for adaptive treatment policies under constraints
This project sits at the intersection of control, reinforcement learning, and applied modeling. A pharmacological system is simulated with differential equations, and a Q-learning agent learns dosing policies that reduce tumor burden while respecting constraints on normal-cell preservation.
Training episodes: 50k
Simulated patients: 15
Mean eradication time: 43 days
Problem
A treatment policy must balance competing objectives: reducing tumor cells quickly without destroying healthy cells or violating scenario-specific safety constraints.
Approach
Modeled the treatment dynamics, formalized the control problem as an MDP, trained a tabular Q-learning agent over 50,000 episodes, then evaluated nominal, constrained, and perturbed patient scenarios including pregnancy and comorbidity cases.
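The pipeline described above (dynamics model, MDP formulation, tabular Q-learning) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the two-population dynamics, every numeric parameter, the dose levels in `DOSES`, the bin count `N_BINS`, and the reward shape are all assumptions made up for the example.

```python
import random

DOSES = [0.0, 0.5, 1.0]   # hypothetical discrete dose levels (the action set)
N_BINS = 10               # hypothetical number of bins per state variable


def step(tumor, normal, dose, dt=1.0):
    """One Euler step of a toy tumor/normal-cell model (illustrative only)."""
    d_tumor = 0.15 * tumor * (1.0 - tumor) - 0.6 * dose * tumor
    d_normal = 0.10 * normal * (1.0 - normal) - 0.15 * dose * normal
    tumor = min(max(tumor + dt * d_tumor, 0.0), 1.0)
    normal = min(max(normal + dt * d_normal, 0.0), 1.0)
    return tumor, normal


def discretize(tumor, normal):
    """Map continuous populations in [0, 1] to a pair of bin indices."""
    def to_bin(x):
        return min(int(x * N_BINS), N_BINS - 1)
    return to_bin(tumor), to_bin(normal)


def reward(tumor, normal):
    """Penalize tumor burden, plus an assumed penalty for low normal-cell counts."""
    penalty = 5.0 if normal < 0.4 else 0.0
    return -tumor - penalty


def train(episodes=2000, horizon=60, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}  # (state, action_index) -> value; missing entries count as 0.0
    for _ in range(episodes):
        tumor, normal = 0.7, 0.9  # assumed initial condition
        state = discretize(tumor, normal)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(len(DOSES))
            else:
                action = max(range(len(DOSES)),
                             key=lambda a: q.get((state, a), 0.0))
            tumor, normal = step(tumor, normal, DOSES[action])
            next_state = discretize(tumor, normal)
            best_next = max(q.get((next_state, a), 0.0)
                            for a in range(len(DOSES)))
            target = reward(tumor, normal) + gamma * best_next
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            state = next_state
    return q
```

Calling `train()` returns the Q-table as a dictionary; a greedy dosing policy picks, for each discretized state, the dose index with the highest stored value. The dictionary-backed table keeps the sketch dependency-free; an array indexed by state and action would be the more usual choice at the scale of 50,000 episodes.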
Results
Across the 15-patient simulation, the mean tumor eradication time was about 43 days, and the learned controller remained stable under parameter perturbations ranging from -10% to +15%.
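A robustness check of this kind could be set up along the following lines. The sketch is hedged throughout: the one-population model, the `NOMINAL` parameter values, the eradication threshold, and the placeholder policy `full_dose_until_clear` are hypothetical stand-ins, and the choice to draw an independent uniform perturbation per parameter is an assumption. Only the -10% to +15% range and the 15-patient cohort size come from the text.

```python
import random

NOMINAL = {"growth": 0.15, "kill": 0.6}  # hypothetical nominal parameters


def perturb(params, rng, low=-0.10, high=0.15):
    """Scale each parameter by an independent factor in [1 + low, 1 + high]."""
    return {k: v * (1.0 + rng.uniform(low, high)) for k, v in params.items()}


def eradication_day(params, policy, tumor=0.7, threshold=1e-3, max_days=365):
    """Return the first day the tumor drops below threshold, or None if never."""
    for day in range(1, max_days + 1):
        dose = policy(tumor)
        growth = params["growth"] * tumor * (1.0 - tumor)
        tumor = min(max(tumor + growth - params["kill"] * dose * tumor, 0.0), 1.0)
        if tumor < threshold:
            return day
    return None


def full_dose_until_clear(tumor):
    """Placeholder policy standing in for the learned controller."""
    return 1.0 if tumor > 0.0 else 0.0


def robustness_sweep(n_patients=15, seed=0):
    """Simulate one perturbed parameter set per virtual patient."""
    rng = random.Random(seed)
    return [eradication_day(perturb(NOMINAL, rng), full_dose_until_clear)
            for _ in range(n_patients)]
```

Replacing `full_dose_until_clear` with a greedy policy read from a trained Q-table, and averaging the non-`None` entries of `robustness_sweep()`, would give a statistic comparable in kind to the mean eradication time quoted above. The numbers this toy model produces are not expected to match the reported 43 days.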
What is in the repository
Role and scope
Mathematical modeling, RL agent design, robustness analysis, and scenario simulation
Project context
Applied reinforcement learning project inspired by a medical control paper
Main stack