ICLR Reproducibility 2019: AutoLoss #147
Hi, please find below a review submitted by one of the reviewers:
Score: 3
CODE
COMMUNICATION WITH THE ORIGINAL AUTHORS
HYPERPARAMETER SEARCH
ABLATION STUDIES
DISCUSSION ON RESULTS
RECOMMENDATIONS FOR REPRODUCIBILITY
OVERALL ORGANIZATION AND CLARITY
W1 The organization of the paper is quite confusing, as it goes back and forth between a set of 3 tasks. It would have been much clearer to compare AutoLoss vs Baseline vs Hand-crafted Schedules for each task separately.
W2 It is not clear whether the baseline and hand-crafted schedules were in the original ICLR submission or were introduced by the report authors.
W3 The references to Figures 3, 4 and 5 are confusing, as each figure shows two plots.
W4 When commenting on Figure 4, the text talks about the 'training loss (blue)', but the blue line is labelled as 'classification loss'. This is confusing.
W5 Graphs have more colors in the legend than in the actual graph.
W6 The lambda hyperparameter is not defined. Maybe it is defined in the ICLR submission and, if this is the case, it should be briefly defined again in this report. The text should be self-contained.
W7 In Section 3.1, there is a claim that the model overfits, but I do not really see where or how. I would expect to see the validation loss growing while the training loss decreases, which I do not see in either of the two plots in the referred Figure 2.
W8 Section 3.1 says 'as we can see' (where?), and Section 3.2 refers to 'above' instead of naming the exact Section. All these vague references make it very hard for a reader to follow the report and need to be treated carefully.
Hi, please find below a review submitted by one of the reviewers:
Score: 3
Problem Statement: The problem statement provided in the report is clear.
Code: Both the authors of the report and the writers of the paper have released their code. The report does not mention whether the authors used the code provided by the writers. We would like to know the degree to which the code developed by the authors is based on that of the writers. The code provided by the authors is well structured and is accompanied by detailed instructions for its execution. However, we did not execute the code.
Communication with original authors: The authors communicated with the writers. However, the authors claim that the writers did not provide sufficient support for a complete review. Specifically, the authors claim that they were unable to obtain the hyper-parameters used by the writers.
Hyperparameter search: The authors perform a grid search with various settings of the regularizer lambda, as done by the authors. However, the authors do not experiment with the hyper-parameters beyond that. In particular, they do not experiment with other hyper-parameters of the controller, although in my opinion these searches are not a must.
Ablation Study:
Discussion on results: It would help me evaluate the report better.
Recommendations for reproducibility: It is unclear whether Figures 4 and 5 separately plot the AutoLoss schedules against the handcrafted and joint minimization schedules on the same task. There is a lack of clarity because lines seem to be missing from the plots (lines are plotted on top of each other). Please mention in the plot captions that the lines are plotted on top of each other.
Overall organization and clarity: I am willing to upgrade my score if the report addresses my concerns.
Confidence: 4
Hi, please find below a review submitted by one of the reviewers:
Score: 4
Some points regarding discussion:
Overall organization needs a bit more work. Instead of going back and forth, I recommend framing the main claims made in the paper and then presenting the reproducibility study alongside them.
Submission of AutoLoss reproducibility report for ICLR 2019 challenge.
Issue number: #89