# Off-policy RL for diffusion samplers

This repository contains the source code for the Off-policy RL for diffusion samplers website.

If you find this work useful, please cite:

```bibtex
@inproceedings{
  venkatraman2024amortizing,
  title={Amortizing intractable inference in diffusion models for vision, language, and control},
  author={Siddarth Venkatraman and Moksh Jain and Luca Scimeca and Minsu Kim and Marcin Sendera and Mohsin Hasan and Luke Rowe and Sarthak Mittal and Pablo Lemos and Emmanuel Bengio and Alexandre Adam and Jarrid Rector-Brooks and Yoshua Bengio and Glen Berseth and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=gVTkMsaaGI}
}
```

```bibtex
@inproceedings{
  sendera2024improved,
  title={Improved off-policy training of diffusion samplers},
  author={Marcin Sendera and Minsu Kim and Sarthak Mittal and Pablo Lemos and Luca Scimeca and Jarrid Rector-Brooks and Alexandre Adam and Yoshua Bengio and Nikolay Malkin},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=vieIamY2Gi}
}
```

This website is forked from the Understanding RLHF website.

## Website License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.