Nathan S. de Lara

Learning and searching, hoping to find algorithms that do that too

Toronto, Ontario

nathan[lastname]1@gmail.com

Google Scholar

I am a Master of Science (MSc) student at the University of Toronto in the Robot Vision and Learning Lab, supervised by Prof. Florian Shkurti. Previously, I obtained a B.A. at McGill University, where I was fortunate to conduct research advised by Prof. Doina Precup and Prof. Russell Steele.

My main driving question during my master's: how can reinforcement learning (RL) scale better than behavioural cloning (BC) for pre-training?

Recent work suggests that RL should outperform BC when trained on large, noisy datasets containing suboptimal demonstrations, a setting that closely resembles most internet data. Yet RL has not displaced BC for large-scale pre-training. My research goal is to help get RL to a place where it reliably surpasses BC. In my master's, I have been focusing on questions that I believe will help this push.

News

Sep 17, 2025 Our paper STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation was accepted to NeurIPS 2025 as a spotlight! 🎉
Sep 01, 2024 Started my Master of Science at the University of Toronto with Prof. Florian Shkurti
Aug 05, 2024 Presented work on the representation collapse experienced by recurrent networks when applied to Continual Reinforcement Learning at the I Can’t Believe It’s Not Better Workshop: Failure Modes of Sequential Decision-Making in Practice, hosted at RLC
Apr 15, 2024 Graduated from McGill University with a Bachelor of Arts in the Honours Mathematics and Computer Science Program!

Selected Publications

  1. STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
    Hossein Goli, Michael Gimelfarb, Nathan Samuel Lara, and 3 more authors
    2025
  2. Recurrent Policies Are Not Enough for Continual Reinforcement Learning
    Nathan Samuel Lara, Veronica Chelu, and Doina Precup
    In I Can’t Believe It’s Not Better Workshop: Failure Modes of Sequential Decision-Making in Practice (RLC 2024), 2024
  3. Towards safe mechanical ventilation treatment using deep offline reinforcement learning
    Flemming Kondrup, Thomas Jiralerspong, Elaine Lau, and 5 more authors
    In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, 2023