Python Script to upload operational files #65

Open
wants to merge 21 commits into
base: main
Changes from 18 commits
Commits
7f2b46f
validate_url is set to false and timeout is set to 4 seconds which ma…
RohanSunkarapalli May 22, 2023
b5f9de1
black
May 26, 2023
ce82fbe
Merge branch 'AlabamaWaterInstitute:main' into main
RohanSunkarapalli May 30, 2023
d5c408d
The retrofiles script has been cleaned and it has functions, variable…
RohanSunkarapalli May 30, 2023
1dcff21
Merge branch 'AlabamaWaterInstitute:main' into main
RohanSunkarapalli May 31, 2023
c881048
Modified the script to make it more efficient
RohanSunkarapalli Jun 7, 2023
0bea2dc
Modified the script to run more efficiently
RohanSunkarapalli Jun 7, 2023
4dd015d
Added URL check and updated URL's input
RohanSunkarapalli Jun 8, 2023
d2dc329
Corrected year condition for forcing and model_output to generate URL's
RohanSunkarapalli Jun 8, 2023
37d81b2
format with black
Jun 13, 2023
9c08b9d
use helper file/function for multi-thread file check
Jun 13, 2023
ebb04b6
change test to use public Google URL
Jun 16, 2023
8a1c0e4
Merge branch 'AlabamaWaterInstitute:main' into main
RohanSunkarapalli Jun 16, 2023
e8f6f10
Merge branch 'AlabamaWaterInstitute:main' into main
RohanSunkarapalli Oct 11, 2023
099ec78
Python script to upload operational data into aws bucket
RohanSunkarapalli Oct 12, 2023
6588065
Delete nwm_filenames/operational_aws_api/.ipynb_checkpoints directory
RohanSunkarapalli Oct 12, 2023
6da5c26
Delete nwm_filenames/operational_aws/.ipynb_checkpoints directory
RohanSunkarapalli Oct 12, 2023
a765eff
Test cases added using pytest
RohanSunkarapalli Oct 15, 2023
c51b295
Corrected the bug in the url and added few specific test-cases
RohanSunkarapalli Oct 19, 2023
7c85e4a
Delete nwm_filenames/operational_aws/.ipynb_checkpoints directory
RohanSunkarapalli Oct 19, 2023
59ddc50
Delete nwm_filenames/operational_aws_api/.ipynb_checkpoints directory
RohanSunkarapalli Oct 19, 2023
Binary file not shown.
Binary file not shown.
Binary file not shown.
38 changes: 38 additions & 0 deletions nwm_filenames/operational_aws/filename_helpers.py
@@ -0,0 +1,38 @@
# from concurrent.futures import ThreadPoolExecutor
import gevent
import requests
from functools import partial
from tqdm import tqdm


def check_valid_urls(file_list, session=None):
    # session is not used yet; the original session setup is left disabled:
    # if not session:
    #     session = requests.Session()
    t = tqdm(total=len(file_list))
    check_url_part = partial(check_url, t)
    # Earlier thread-based variant, kept for reference:
    # with ThreadPoolExecutor(max_workers=10) as executor:
    #     valid_file_list = list(executor.map(check_url_part, file_list))
    # NOTE: requests only yields to other greenlets if gevent.monkey.patch_all()
    # ran before requests was imported; otherwise these checks run one at a time.
    jobs = [gevent.spawn(check_url_part, file_name) for file_name in file_list]
    gevent.joinall(jobs)
    results = [job.get() for job in jobs]
    return [url for url in results if url is not None]


def check_url(t, file):
    """Return file if a HEAD request answers 200, otherwise None."""
    filename = file.split("/")[-1]
    try:
        with requests.head(file) as response:
            found = response.status_code == 200
        # response = session.head(file, timeout=1)
    except requests.exceptions.RequestException:
        found = False
    status = "Found" if found else "Not Found"
    t.set_description(f"{status}: {filename}")
    t.update(1)
    t.refresh()
    return file if found else None
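
For comparison, the commented-out ThreadPoolExecutor approach in the helper above could look roughly like the sketch below. This is not the code in this PR: it swaps requests for the standard library's urllib.request so that it runs without third-party packages, and the names url_exists and filter_valid_urls, the exists parameter, and the 4-second default timeout (borrowed from the first commit message) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def url_exists(url, timeout=4):
    """Return True if a HEAD request to url answers with status 200.

    Hypothetical helper: uses urllib.request instead of requests, and the
    4-second default timeout is an assumption, not a value from this diff.
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError covers malformed URLs.
        return False


def filter_valid_urls(urls, exists=url_exists, max_workers=10):
    """Return the URLs for which exists(url) is true, in their original order.

    exists is injectable so the filtering logic can be tested offline.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order, so flags line up with urls.
        flags = list(executor.map(exists, urls))
    return [url for url, ok in zip(urls, flags) if ok]
```

Because the blocking HEAD requests run in worker threads, this variant does not depend on gevent monkey-patching to overlap the network waits. Passing a stub for exists, for example one that accepts only names ending in ".nc", exercises the filtering without touching the network.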