To realize the potential of autonomous underwater robots to scale up our observational capacity in the ocean, new techniques are needed. Fleets of autonomous robots could be used to study complex marine systems and animals, either through new imaging configurations or by tracking tagged animals to study their behavior. These activities can then inform and shape new policies for community conservation. The role of connectivity via the active movement of animals represents a major knowledge gap in our understanding of the distribution of deep-ocean populations. Tracking underwater targets is a major challenge for observing biological processes in situ, and methods that respond robustly to a changing environment during monitoring missions are needed. Analytical techniques for optimal sensor placement and path planning to locate underwater targets are not straightforward in such cases. The aim of this study was to investigate reinforcement learning, whose promising capabilities have been demonstrated in terrestrial scenarios, as a tool for optimizing range-only underwater target tracking. To evaluate its usefulness, a reinforcement learning method was implemented as the path planner for an autonomous surface vehicle tracking a mobile underwater target. A complete description of an open-source model, performance metrics in simulated environments, and an evaluation of the algorithms based on more than 15 hours of at-sea field experiments are presented. These efforts demonstrate that deep reinforcement learning is a powerful approach for enhancing the capabilities of autonomous robots in the ocean, and they encourage the future deployment of such algorithms for monitoring marine biological systems.
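To make the range-only tracking setup concrete, the sketch below outlines a minimal reinforcement-learning environment in the style of the Gymnasium API, in which a surface vehicle observes only slant-range measurements to a moving target and selects heading changes. This is an illustrative framing under assumed dynamics and reward, not the study's open-source model; the class name, state variables, speeds, and reward terms are hypothetical placeholders.

```python
# Hypothetical sketch: a Gymnasium-style environment for range-only tracking of a
# mobile underwater target by an autonomous surface vehicle (ASV).
# All names, dynamics, and reward terms are illustrative assumptions,
# not the paper's actual formulation.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class RangeOnlyTrackingEnv(gym.Env):
    """The ASV chooses heading changes; the only measurement is the range to the target."""

    def __init__(self, dt=1.0, asv_speed=1.5, target_speed=0.5, max_steps=500):
        self.dt, self.asv_speed, self.target_speed = dt, asv_speed, target_speed
        self.max_steps = max_steps
        # Action: discrete heading change in {-30, -15, 0, +15, +30} degrees.
        self.action_space = spaces.Discrete(5)
        # Observation: current and previous range, plus ASV heading as (sin, cos).
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,), dtype=np.float32)

    def _range(self):
        return float(np.linalg.norm(self.asv_pos - self.target_pos))

    def _obs(self):
        return np.array([self._range(), self.prev_range,
                         np.sin(self.heading), np.cos(self.heading)], dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.asv_pos = np.zeros(2)
        self.heading = 0.0
        self.target_pos = self.np_random.uniform(-200.0, 200.0, size=2)
        self.target_heading = self.np_random.uniform(0.0, 2.0 * np.pi)
        self.prev_range = self._range()
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        # Apply the selected heading change and advance both vehicles one time step.
        self.heading += np.deg2rad([-30, -15, 0, 15, 30][action])
        self.asv_pos += self.asv_speed * self.dt * np.array(
            [np.cos(self.heading), np.sin(self.heading)])
        self.target_pos += self.target_speed * self.dt * np.array(
            [np.cos(self.target_heading), np.sin(self.target_heading)])
        rng = self._range()
        # Toy reward: closing the range. A real formulation might instead reward
        # reducing the target's localization uncertainty (e.g., a filter covariance).
        reward = self.prev_range - rng
        self.prev_range = rng
        self.steps += 1
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, False, truncated, {}


if __name__ == "__main__":
    env = RangeOnlyTrackingEnv()
    obs, _ = env.reset(seed=0)
    for _ in range(10):  # random-policy rollout; a trained DRL agent would act here
        obs, reward, terminated, truncated, _ = env.step(env.action_space.sample())
```

In a framing like this, any standard deep reinforcement learning agent that supports discrete actions could be trained on the environment; the episode dynamics, reward shaping, and observation history length are the design choices that would need to be matched to the actual tracking problem.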