The results for DST environment #19

Open
thisishale opened this issue Nov 12, 2020 · 1 comment

Comments

@thisishale

Hello, I ran the synthetic code on the FTN env, and the test results are close to those in the paper. For the DST env, however, I get F1 scores of 0.77 and 0.66 over two runs, whereas the paper reports a much higher value. Do the settings below (taken from the readme file) apply to the DST env too?
(I only changed --env-name to dst and --episode-num to 2000.)

python train.py --env-name dst --method crl-envelope --model linear --gamma 0.99 --mem-size 4000 --batch-size 256 --lr 1e-3 --epsilon 0.5 --epsilon-decay --weight-num 32 --episode-num 2000 --optimizer Adam --save crl/envelope/saved/ --log crl/envelope/logs/ --update-freq 100 --beta 0.01 --name 0

@thisishale
Author

Also, there is a part of the code that I cannot relate to the paper. The paper states that the scalarized version is similar to CN+DER. CN+DER in the reference paper outputs a Q vector for each action, but in the code (naive) the Q network outputs a single scalar for each action. What am I missing here?
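
To make the question concrete, here is a minimal sketch of the two output shapes being compared. It is not the repository's code or the reference paper's network: the linear heads, the parameter values, and the names `vector_q`, `scalarize`, and `scalar_q` are all hypothetical, and two objectives are assumed purely for illustration.

```python
# Illustrative only: toy linear Q heads, not the repository's models.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def vector_q(state, params):
    """Vector-valued head: returns Q[a][k], one value per action and objective.
    params[a][k] is a hypothetical weight vector over the state features."""
    return [[dot(state, w_k) for w_k in per_action] for per_action in params]

def scalarize(q_vectors, preference):
    """Collapse each per-action Q vector to a scalar via the preference weights."""
    return [dot(q_a, preference) for q_a in q_vectors]

def scalar_q(state, preference, params):
    """Scalar head: takes state and preference as input and returns one
    scalar per action directly, with no per-objective values exposed."""
    x = state + preference  # list concatenation
    return [dot(x, w_a) for w_a in params]

state = [1.0, 0.5]        # hypothetical state features
preference = [0.7, 0.3]   # hypothetical preference over two objectives

# 3 actions x 2 objectives x 2 state features (made-up numbers)
vec_params = [
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.5, 0.5], [-1.0, 0.0]],
    [[0.0, 2.0], [-0.5, -0.5]],
]
# 3 actions x (2 state features + 2 preference entries) (made-up numbers)
sc_params = [
    [0.1, 0.2, 0.3, 0.4],
    [0.4, 0.3, 0.2, 0.1],
    [0.0, 0.5, 0.5, 0.0],
]

q_vec = vector_q(state, vec_params)       # shape: [actions][objectives]
q_from_vec = scalarize(q_vec, preference) # shape: [actions]
q_scalar = scalar_q(state, preference, sc_params)  # shape: [actions]

# Greedy action under the scalarized vector head
greedy = max(range(len(q_from_vec)), key=lambda a: q_from_vec[a])
```

Both variants end up ranking actions by one scalar per action. The difference is whether the per-objective values are available before the preference is applied (`q_vec`) or the network emits only the final scalar (`q_scalar`).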
