# Off-policy RL for diffusion samplers

This repository contains the source code for the Off-policy RL for diffusion samplers website.

If you find this work useful, please cite:

```bibtex
@inproceedings{
  venkatraman2024amortizing,
  title={Amortizing intractable inference in diffusion models for vision, language, and control},
  author={Siddarth Venkatraman and Moksh Jain and Luca Scimeca and Minsu Kim and Marcin Sendera and Mohsin Hasan and Luke Rowe and Sarthak Mittal and Pablo Lemos and Emmanuel Bengio and Alexandre Adam and Jarrid Rector-Brooks and Yoshua Bengio and Glen Berseth and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=gVTkMsaaGI}
}
```

```bibtex
@inproceedings{
  sendera2024improved,
  title={Improved off-policy training of diffusion samplers},
  author={Marcin Sendera and Minsu Kim and Sarthak Mittal and Pablo Lemos and Luca Scimeca and Jarrid Rector-Brooks and Alexandre Adam and Yoshua Bengio and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=vieIamY2Gi}
}
```

This website is forked from the Understanding RLHF website.

## Website License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.