Generalise scripts so any user can run them #16

Open
sadielbartholomew opened this issue Oct 29, 2020 · 1 comment

Comments

@sadielbartholomew
Contributor

The scripts under scripts/ rely on hard-coded filesystem paths, e.g. from compute_ivt.py:

#--------------Globals------------------------------------------
#-----------uflux----------------------
UFLUX_FILE='/home/guangzhi/datasets/erai_qflux/uflux_m1-60_6_2007_cln-cea-proj.nc'
UFLUX_VARID='uflux'

#-----------vflux----------------------
VFLUX_FILE='/home/guangzhi/datasets/erai_qflux/vflux_m1-60_6_2007_cln-cea-proj.nc'
VFLUX_VARID='vflux'

OUTPUTFILE='/home/guangzhi/datasets/quicksave2/THR/ivt_m1-60_6_2007_crop2.nc';

As a result, anyone other than you who tries to run them immediately hits errors about those files not being found. For example, I observe:

$ python compute_ivt.py 
Traceback (most recent call last):
  File "compute_ivt.py", line 37, in <module>
    ufluxNV=funcs.readNC(UFLUX_FILE, UFLUX_VARID)
  File "/home/sadie/IPART/ipart/utils/funcs.py", line 775, in readNC
    fin=Dataset(abpath_in, 'r')
  File "netCDF4/_netCDF4.pyx", line 2358, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1926, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/home/guangzhi/datasets/erai_qflux/uflux_m1-60_6_2007_cln-cea-proj.nc'
$ python detect_ARs.py
Traceback (most recent call last):
  File "detect_ARs.py", line 187, in <module>
    quNV=funcs.readNC(UQ_FILE_NAME, UQ_VAR)
  File "/home/sadie/IPART/ipart/utils/funcs.py", line 775, in readNC
    fin=Dataset(abpath_in, 'r')
  File "netCDF4/_netCDF4.pyx", line 2358, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1926, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/home/guangzhi/datasets/erai_qflux/uflux_m1-60_6_2007_cln-cea-proj.nc'
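
Both tracebacks above surface deep inside netCDF4, which does not tell a new user that the fix is to edit the path variables at the top of the script. A minimal sketch of an up-front check follows, assuming two hypothetical helpers, `find_missing` and `require_inputs`, that are not part of IPART:

```python
import os
import sys


def find_missing(paths):
    """Return the entries of `paths` that are not existing files."""
    return [p for p in paths if not os.path.isfile(p)]


def require_inputs(paths):
    """Exit with a readable message if any input file is missing.

    Hypothetical helper for illustration; not an existing IPART function.
    """
    missing = find_missing(paths)
    if missing:
        sys.exit("Input file(s) not found; edit the path variables at the "
                 "top of the script:\n  " + "\n  ".join(missing))
```

A script such as compute_ivt.py could call `require_inputs([UFLUX_FILE, VFLUX_FILE])` before its first `funcs.readNC` call, so the user sees which paths to change rather than a netCDF4 traceback.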

I appreciate that your intention was to include these as templates. In their current form, though, they rely on resources not included in the repo and reference your personal paths, so they error immediately and are not very useful to users. This is in contrast to the notebooks, which are very useful: there is no user-specific stipulation and all of the datasets are provided in the repo, so they should work for anyone who has installed IPART and its dependencies and has the right general environment, as they did when I tested them.

To allow users to make good use of the scripts, I suggest adding new datasets to the repo that variables such as UFLUX_FILE can point to, so that users can run the scripts as they are and explore their capability. Some brief guidance should also state what users need to change to point the scripts at their own datasets instead.
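
Independently of where the data lives, the path variables could be read from the command line or the environment rather than edited in the source. The following is a sketch under stated assumptions: the option names (`--uflux-file`, `--vflux-file`, `--output-file`), the `IPART_*` environment variable names and the `parse_paths` function are all hypothetical and not an existing IPART interface. Only `UFLUX_FILE`, `VFLUX_FILE` and `OUTPUTFILE` come from compute_ivt.py.

```python
import argparse
import os


def parse_paths(argv=None):
    """Return input/output paths from the command line or environment.

    The option names and IPART_* environment variables are hypothetical,
    chosen for illustration; they are not an existing IPART interface.
    """
    parser = argparse.ArgumentParser(
        description="Run the script on user-supplied netCDF files.")
    parser.add_argument("--uflux-file",
                        default=os.environ.get("IPART_UFLUX_FILE"),
                        help="path to the u-flux netCDF file")
    parser.add_argument("--vflux-file",
                        default=os.environ.get("IPART_VFLUX_FILE"),
                        help="path to the v-flux netCDF file")
    parser.add_argument("--output-file",
                        default=os.environ.get("IPART_OUTPUT_FILE"),
                        help="path for the output netCDF file")
    args = parser.parse_args(argv)

    # Fail early with a usage message if any path was left unset.
    missing = [name for name, value in vars(args).items() if value is None]
    if missing:
        parser.error("no path given for: " + ", ".join(missing))
    return args
```

In the script, the globals would then be assigned from the result, for example `args = parse_paths()` followed by `UFLUX_FILE = args.uflux_file`, leaving the rest of the code unchanged.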

(As noted when reviewing towards openjournals/joss-reviews#2407.)

@Xunius
Collaborator

Xunius commented Oct 29, 2020

It is not practical to add data to the repo for this type of task. The data people use for detecting such things come in GBs for real-world applications; the notebook is just a toy example. So, no, I'm not adding new datasets. I also think the scripts folder is optional to begin with: when installing via conda, one doesn't even get the scripts.
