Cohere For AI - Guest Speaker: Costa Huang, ML Engineer @ Hugging Face


Date: Oct 02, 2023

Time: 6:30 PM - 7:30 PM

Location: Online

Join our Reinforcement Learning Group as they welcome Costa Huang, Machine Learning Engineer at Hugging Face, to present "Cleanba: A Reproducible Distributed Deep Reinforcement Learning Platform."

Description: Distributed Deep Reinforcement Learning (DRL) aims to leverage more computational resources to train autonomous agents in less time. Despite recent progress in the field, reproducibility issues have not been sufficiently explored. This paper first shows that the typical actor-learner framework can have reproducibility issues even if hyperparameters are controlled. We then introduce Cleanba, a new open-source platform for distributed DRL that proposes a highly reproducible architecture. Cleanba implements highly optimized distributed variants of PPO and IMPALA. Our Atari experiments show that these variants can obtain scores equivalent to or higher than those of moolib's and torchbeast's IMPALA, but with shorter training time and more reproducible learning curves.