The DNA fragment assembly problem is an important and active area of research with applications across biology. Previous exploratory work has applied reinforcement learning to fragment assembly: specifically, Q-learning was used on an episodic Markov Decision Process (MDP) model of the problem. For my final project in Stat 234 (a graduate reinforcement learning seminar taught by Professor Susan Murphy), I implemented Real-Time Dynamic Programming (RTDP), a learning algorithm better suited to searching sparse MDPs, and found that it outperformed Q-learning in simulations on synthetic microgenomes. However, the memory requirements and inflexibility of the MDP formulation reaffirm that such approaches are likely infeasible at scale.