Upcoming 2021 summer projects to look forward to #324

findmyway · 2021-06-14T02:39:03Z

findmyway
Jun 14, 2021
Maintainer

A brief summary

GSoC

This summer, we'll have one GSoC student (@Sid-Bhatia-0) working on adding more multi-agent environments to GridWorlds.jl. In the meantime, he will try to address #121 and test our existing algorithms with environments written directly on the GPU. Some ongoing work is tracked here.

OSPP

Establish a General Pipeline for Offline Reinforcement Learning Evaluation

@Mobius1D is the only student who applied to this project. He demonstrated his understanding of this package in several previous PRs, so I'd be happy to mentor him. He'll work on creating an independent offline dataset package and adding some common benchmark algorithms such as BCQ and CQL.

Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl

@pilgrimygy and @Mobius1D are the only two students who applied to this project. @pilgrimygy has worked on POMDPs before and has also contributed several meaningful PRs to this package. According to the rules, only one student can be selected for each project, so I'll select @pilgrimygy for this one. He'll cooperate with @Mobius1D and work on adding more advanced offline RL algorithms.

Implement Multi-Agent Reinforcement Learning Algorithms in Julia

This project received four proposals, from @harshit2000, Yudong Zhao, @peterchen96, and @yangzm11. However, I can't find any Julia-related contributions in any of their public information, which makes it hard for me to decide who is the most appropriate for this project. To make it fair, I encourage them to try implementing one multi-agent algorithm independently before June 20th (the last day to submit my decision). NFSP is recommended. I'll email this message to each of them separately. No matter whom I select in the end, I hope you all enjoy this process and learn something new. 💪

As for me, I'll work on a new actor-based system to improve the distributed RL algorithms, and I may cooperate with @jonathan-laurent in the meantime. Some potential outputs are:

Parameter server (A2C)
Async offline training pipeline (this may be general enough to serve as a data loader)
Model as service (this can be made very fast with adaptive memory usage; see Flux as service #154). The batched oracle in AlphaZero.jl can also benefit from it.
Easier implementation of evolutionary algorithms

Cheers!

findmyway · 2021-08-10T07:13:07Z

findmyway
Aug 10, 2021
Maintainer Author

Hi @peterchen96 @pilgrimygy @Mobius1D,

Considering that we are approaching the first evaluation of OSPP, I'd suggest you take a break from code development and write a technical report this week. Basically, it should cover the following:

What have you done so far? (With links to your code)
How is it implemented?
How can others leverage what you've done?
What have you learned during this process?
What is your plan for the next several weeks?

You should assume that readers are new to RL.jl, or even have no knowledge of RL at all. The draft can be submitted as a PR to the blog subfolder first, and I'll provide some feedback on it. Once it is merged, you should submit it through the OSPP system portal before next Monday.

As for the pending PRs, I'll try to review them ASAP.

Jun Tian

3 replies

pilgrimygy Aug 10, 2021
Collaborator

I will finish my technical report as soon as possible.

peterchen96 Aug 10, 2021
Collaborator

Got it! I'll try to upload the report before Friday. By the way, does the report have a template? @findmyway

findmyway Aug 10, 2021
Maintainer Author

I've been told there's no template this year, but you need to fill in all the necessary information in the portal. I can't see the students' side of the portal, so let me know if you have any problems submitting the report.

findmyway · 2021-08-16T06:34:34Z

findmyway
Aug 16, 2021
Maintainer Author

Hi @peterchen96 @pilgrimygy @Mobius1D,

Based on your first evaluation reports, I'd like to change our schedule slightly. Instead of posting your plans and work in GitHub Discussions, please open PRs to update your reports continuously. That will make things easier for you (and me) in the second evaluation.

Thanks for your great work!

0 replies
