The results for DST environment #19

thisishale · 2020-11-12T10:42:01Z

Hello, I tried the synthetic code on the FTN env, and the test results are close to the paper. For the DST env, however, I get F1 scores of 0.77 and 0.66 over two runs, while the paper reports a much higher value. Do the settings below (taken from the README) also apply to the DST env?
(I only changed --env-name to dst and --episode-num to 2000.)

python train.py --env-name dst --method crl-envelope --model linear --gamma 0.99 --mem-size 4000 --batch-size 256 --lr 1e-3 --epsilon 0.5 --epsilon-decay --weight-num 32 --episode-num 2000 --optimizer Adam --save crl/envelope/saved/ --log crl/envelope/logs/ --update-freq 100 --beta 0.01 --name 0
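
For readers comparing the two F1 numbers, here is a minimal sketch of how such a score can be computed. It assumes F1 is the harmonic mean of precision and recall between the returns an agent finds and a reference Pareto frontier; the function name `coverage_f1`, the `eps` tolerance, and all points below are hypothetical, and the repository's evaluation code may match points differently.

```python
def coverage_f1(found, reference, eps=0.0):
    """F1 between found returns and a reference frontier.

    Two points match when every coordinate differs by at most eps.
    Matching rule and tolerance are assumptions for illustration.
    """
    def close(a, b):
        return all(abs(x - y) <= eps for x, y in zip(a, b))

    if not found or not reference:
        return 0.0

    # Found points that coincide with some reference point.
    matched_found = sum(any(close(f, r) for r in reference) for f in found)
    # Reference points that were recovered by some found point.
    matched_ref = sum(any(close(r, f) for f in found) for r in reference)

    precision = matched_found / len(found)
    recall = matched_ref / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up (treasure, time) returns, not the real DST frontier.
reference = [(1.0, -1.0), (2.0, -3.0), (3.0, -5.0), (5.0, -7.0)]
found = [(1.0, -1.0), (3.0, -5.0), (4.0, -9.0)]

f1 = coverage_f1(found, reference)
```

Under this definition a run can score well below 1.0 either because it misses frontier points (low recall) or because it returns points that are not on the frontier (low precision), so the two components are worth inspecting separately.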

thisishale · 2020-11-12T12:46:36Z

Also, there is a part of the code that I cannot relate to the paper. The paper states that the scalarized version is similar to CN+DER. CN+DER in the reference paper outputs a vector Q for each action, but in the code (naive) the Q network outputs a scalar for each action. What am I missing here?
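
To make the distinction in this question concrete, here is a minimal sketch of the two output shapes. It is not taken from the repository: the helper names, the preference vectors, and the Q-values are all hypothetical. A vector-valued head yields one Q entry per objective for each action, and a preference `w` turns it into one scalar per action through the dot product `w . Q`. A scalar head conditioned on `w` would output that final list directly.

```python
def scalarize(q_vectors, w):
    """Collapse a per-action vector Q (one entry per objective)
    into one scalar per action via the dot product w . Q."""
    return [sum(wi * qi for wi, qi in zip(w, q)) for q in q_vectors]


def greedy_action(q_scalars):
    """Index of the action with the largest scalar Q-value."""
    return max(range(len(q_scalars)), key=lambda a: q_scalars[a])


# Vector-valued output: 3 actions x 2 objectives (made-up numbers,
# read as treasure value and time penalty).
q_vec = [[1.0, -1.0], [8.0, -5.0], [3.0, -2.0]]

# Two hypothetical preference vectors over the objectives.
w_treasure = [0.9, 0.1]
w_time = [0.1, 0.9]

# Scalar-valued view: one number per action, specific to each w.
q_scalar_treasure = scalarize(q_vec, w_treasure)
q_scalar_time = scalarize(q_vec, w_time)

best_for_treasure = greedy_action(q_scalar_treasure)
best_for_time = greedy_action(q_scalar_time)
```

The vector form keeps the per-objective values, so the same output can be re-scalarized for any `w`, whereas the scalar form has to be recomputed for each preference. Which of the two the paper's scalarized baseline is meant to follow is the open question here.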
