Feat: Add compute_statistics subcommand #336
base: main
Signed-off-by: Francesc Marti Escofet <[email protected]>
I like how you integrated it into the CLI tools. In general, I think it would be good to have this functionality in terratorch.utils so that it can also be called from Python code. Maybe you could just add a new statistics.py and have the Trainer call this function with the train dataloader?
I am not sure how we could handle custom datasets which do not fit the expected pattern. Maybe there could be a second option to only pass a folder instead of a config?
Not sure if this could generalise better. What do you think could work well?
The main problem I had while integrating it into the CLI is the fact that a subcommand requires some parameters such as [...]. In the end, if we also allowed passing a folder, we would need to build the dataset in the method and require more parameters to build the generic dataset. This way, users can use their custom datasets (the only requirement is that the datamodule returns a dataloader on [...]). I agree with saving the output in YAML and separating it into a new file; I'll do it.
Signed-off-by: Francesc Marti Escofet <[email protected]>
Thanks for the changes! You are right, we can expect some checks by the user to make sure it's working correctly.
What do you think? About passing a dataset folder, I agree with you that it probably makes it more complicated. Users have to create a config anyway at some point, so they can just do it before computing the statistics.
Signed-off-by: Francesc Marti Escofet <[email protected]>
@blumenstiel The issue with changing it in the [...]
@fmartiescofet The datasets are built in the [...]
Signed-off-by: Francesc Marti Escofet <[email protected]>
Looks good, thanks!
@Joao-L-S-Almeida @romeokienzler I think we can merge this one.
This PR adds the compute_statistics subcommand to the terratorch CLI to compute the mean and std of a dataset. It can be called with the same config file as any other subcommand:
terratorch compute_statistics --config <file>.yaml
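For readers unfamiliar with how a dataset mean and std can be accumulated batch by batch, here is a minimal sketch in plain Python. It is not the implementation in this PR: the function name `running_mean_std` and the nested-list batch layout (batch, sample, channel, flat pixel list) are assumptions made only for illustration, and a real version would most likely operate on tensors from the train dataloader.

```python
import math


def running_mean_std(batches):
    """Accumulate per-channel mean and std over an iterable of batches.

    Hypothetical layout: each batch is a list of samples, each sample is a
    list of channels, and each channel is a flat list of pixel values.
    """
    count, total, total_sq = [], [], []
    for batch in batches:
        for sample in batch:
            if not total:
                # Size the accumulators from the first sample seen.
                n_channels = len(sample)
                count = [0] * n_channels
                total = [0.0] * n_channels
                total_sq = [0.0] * n_channels
            for c, pixels in enumerate(sample):
                count[c] += len(pixels)
                total[c] += sum(pixels)
                total_sq[c] += sum(p * p for p in pixels)
    means = [s / n for s, n in zip(total, count)]
    # Population std via E[x^2] - E[x]^2, clamped at 0 against rounding error.
    stds = [
        math.sqrt(max(sq / n - m * m, 0.0))
        for sq, n, m in zip(total_sq, count, means)
    ]
    return means, stds


# Two batches, one sample each, two channels per sample.
batches = [
    [[[1.0, 2.0], [10.0, 10.0]]],
    [[[3.0, 4.0], [10.0, 10.0]]],
]
means, stds = running_mean_std(batches)
```

The sum and sum-of-squares form needs only one pass over the data, which suits a dataloader, but it can lose precision when values are large relative to their spread; Welford's algorithm is a common alternative in that case.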
The kwargs in the method are required because the Lightning CLI requires the model parameter and we need to consume it, see here.
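The role of kwargs can be illustrated with a small plain-Python sketch. Everything here is hypothetical: `dispatch` merely stands in for a caller that always forwards a `model` argument, and neither function reflects the actual signatures in terratorch or Lightning.

```python
def compute_statistics_strict(datamodule):
    # No **kwargs: any extra argument from the caller raises TypeError.
    return {"datamodule": datamodule}


def compute_statistics(datamodule, **kwargs):
    # **kwargs absorbs arguments such as `model` that the caller always
    # passes but that this method does not need.
    return {"datamodule": datamodule, "ignored": sorted(kwargs)}


def dispatch(method, **cli_args):
    # Hypothetical stand-in for a CLI that forwards every parsed argument.
    return method(**cli_args)


result = dispatch(compute_statistics, datamodule="dm", model="m")

try:
    dispatch(compute_statistics_strict, datamodule="dm", model="m")
    strict_failed = False
except TypeError:
    strict_failed = True
```

With the catch-all parameter the call succeeds and `model` is simply ignored, while the strict variant fails on the unexpected keyword.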