NitroGen: An Open Foundation Model for Generalist Gaming Agents
NitroGen is a unified vision-to-action foundation model that plays video games directly from raw frames: it maps RGB video frames to gamepad actions. It is a generalist agent trained with large-scale behavior cloning on 40,000 hours of gameplay across more than 1,000 games.
NitroGen works best on games designed for gamepad controls (e.g., action, platformer, and racing games) and is less effective on games that rely heavily on mouse and keyboard (e.g., RTS, MOBA).
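To make the term "behavior cloning" concrete, here is a minimal sketch of the general idea: a policy is fit by supervised learning on recorded (observation, action) pairs. This is not NitroGen's architecture or training objective. The linear softmax policy, the three-feature "observations", the three discrete actions, and all function names are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_logits(weights, obs):
    """Linear policy: one weight row per action (illustrative only)."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def train_behavior_cloning(demos, n_features, n_actions, lr=0.5, epochs=200, seed=0):
    """Fit the policy to demonstrations by minimizing cross-entropy with SGD."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-0.01, 0.01) for _ in range(n_features)]
        for _ in range(n_actions)
    ]
    for _ in range(epochs):
        for obs, action in demos:
            probs = softmax(policy_logits(weights, obs))
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p(a) - 1[a == demonstrated action]
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * obs[j]
    return weights


def act(weights, obs):
    """Pick the highest-scoring action for an observation."""
    logits = policy_logits(weights, obs)
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical toy demonstrations: (observation features, demonstrated action).
# Actions are made-up labels: 0 = left, 1 = right, 2 = jump.
demos = [
    ([1.0, 0.0, 0.0], 0),
    ([0.0, 1.0, 0.0], 1),
    ([0.0, 0.0, 1.0], 2),
]

weights = train_behavior_cloning(demos, n_features=3, n_actions=3)
predicted = [act(weights, obs) for obs, _ in demos]
print(predicted)
```

In this toy setting the trained policy reproduces the demonstrated action for each observation. A real vision-to-action model replaces the hand-made features with image frames and the discrete labels with gamepad outputs, but the supervised "imitate the recorded action" structure is the same.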
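To use NitroGen, clone and install the repository, download the checkpoint, start the inference server, and then run the agent against a running game process: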
git clone https://github.com/MineDojo/NitroGen.git
cd NitroGen
pip install -e .
hf download nvidia/NitroGen ng.pt
python scripts/serve.py <path_to_ng.pt>
python scripts/play.py --process '<game_executable_name>.exe'
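The last two commands can also be driven from a script. The sketch below only wraps the `serve.py` and `play.py` invocations shown above; the helper names `build_commands` and `launch`, the `repo_dir` parameter, and the `game.exe` process name are hypothetical. It assumes it is run from an environment where the repository is installed, and it performs no readiness check, since the source does not state whether `play.py` waits for the server to finish loading.

```python
import shlex
import subprocess


def build_commands(checkpoint_path, process_name):
    """Return the serve and play command lines as argument lists."""
    serve_cmd = ["python", "scripts/serve.py", checkpoint_path]
    play_cmd = ["python", "scripts/play.py", "--process", process_name]
    return serve_cmd, play_cmd


def launch(checkpoint_path, process_name, repo_dir="."):
    """Start the server, run the agent, and stop the server afterwards.

    Assumption: play.py can be started right after serve.py. If the server
    needs time to load the checkpoint, a delay or retry may be required.
    """
    serve_cmd, play_cmd = build_commands(checkpoint_path, process_name)
    server = subprocess.Popen(serve_cmd, cwd=repo_dir)
    try:
        subprocess.run(play_cmd, cwd=repo_dir, check=True)
    finally:
        # Always shut the server down, even if the agent exits with an error.
        server.terminate()
        server.wait()


# Print the commands without running anything.
# "ng.pt" is the downloaded checkpoint; "game.exe" is a placeholder process name.
serve_cmd, play_cmd = build_commands("ng.pt", "game.exe")
print(shlex.join(serve_cmd))
print(shlex.join(play_cmd))
```

Calling `launch("<path_to_ng.pt>", "<game_executable_name>.exe", repo_dir="NitroGen")` would then run both steps in order; the placeholders are left as in the commands above.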
If you find NitroGen useful in your research, please cite:
@misc{magne2026nitrogen,
title={NitroGen: An Open Foundation Model for Generalist Gaming Agents},
author={Loïc Magne and Anas Awadalla and Guanzhi Wang and Yinzhen Xu and Joshua Belofsky and Fengyuan Hu and Joohwan Kim and Ludwig Schmidt and Georgia Gkioxari and Jan Kautz and Yisong Yue and Yejin Choi and Yuke Zhu and Linxi "Jim" Fan},
year={2026},
eprint={2601.02427},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2601.02427},
}
Governing Terms: NVIDIA License. The model uses a SigLIP 2 backbone, which is licensed under Apache 2.0.