Next dev merge/shoal xarray (#129)
* Rewrote shoal mask to use dask

Merge/ariza (#130)

Initial implementation of the seabed masking.
ruxandra-valcu authored and beatfactor committed May 24, 2024
1 parent 3534890 commit 255591a
Showing 18 changed files with 1,097 additions and 619 deletions.
6 changes: 6 additions & 0 deletions .github/workflows/pr.yaml
@@ -6,6 +6,7 @@ on:

env:
NUM_WORKERS: 2
+ TEST_DATA_FOLDER: ${{ github.workspace }}/test_data

jobs:
test:
@@ -62,6 +63,11 @@ jobs:
# Check data endpoint
curl http://localhost:8080/data/
+ # Add the following steps for downloading and unzipping test data
+ - name: Download Test Data from Google Drive
+ run: |
+ wget --load-cookies /tmp/cookies.txt "https://drive.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://drive.google.com/uc?export=download&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8" -O test_data.zip && rm -rf /tmp/cookies.txt
+ unzip -n test_data.zip -d ${{ env.TEST_DATA_FOLDER }}
- name: Finding changed files
id: files
uses: Ana06/[email protected]
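The download step above scrapes a `confirm=` token from Google Drive's interstitial page and replays it in a second request. A minimal Python sketch of that token handling, for illustration only: `extract_confirm_token` and `build_download_url` are hypothetical names that appear neither in the workflow nor in echopype, and the sketch takes the first match in the page, which may differ from what the `sed` expression selects.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Same character class as the sed / Select-String patterns in the workflows.
CONFIRM_RE = re.compile(r"confirm=([0-9A-Za-z_]+)")


def extract_confirm_token(html: str) -> Optional[str]:
    """Return the first confirm token found in the page, or None."""
    match = CONFIRM_RE.search(html)
    return match.group(1) if match else None


def build_download_url(file_id: str, token: Optional[str]) -> str:
    """Build the uc?export=download URL, adding confirm= only when a token exists."""
    params = {"export": "download"}
    if token:
        params["confirm"] = token
    params["id"] = file_id
    return "https://drive.google.com/uc?" + urlencode(params)
```

When no token is found, the sketch leaves `confirm` out of the URL, whereas the shell command would send an empty `confirm=` value.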
27 changes: 27 additions & 0 deletions .github/workflows/windows-utils.yaml
@@ -11,6 +11,7 @@ on:

env:
CONDA_ENV: echopype
+ TEST_DATA_FOLDER: ${{ github.workspace }}\\test_data

jobs:
windows-test:
@@ -64,6 +65,32 @@ jobs:
- name: Install echopype
run: |
python -m pip install -e .
+ # Add steps for downloading and unzipping test data
+ - name: Create Test Data Directory
+ run: New-Item -ItemType Directory -Force -Path ${{ env.TEST_DATA_FOLDER }}
+ - name: Install wget
+ run: choco install wget
+ shell: powershell
+ #- name: Download Test Data from Google Drive
+ # run: |
+ # $downloadLink = "https://drive.google.com/uc?export=download&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8"
+ # $downloadPage = Invoke-WebRequest -Uri $downloadLink -OutFile "download.html"
+ # $confirmCode = (Select-String -Path "download.html" -Pattern 'confirm=([0-9A-Za-z_]+)' -AllMatches).Matches.Groups[1].Value
+ # $downloadFileLink = "https://drive.google.com/uc?export=download&confirm=$confirmCode&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8"
+ # Invoke-WebRequest -Uri $downloadFileLink -OutFile "test_data.zip"
+ - name: Download Test Data from Google Drive
+ run: |
+ Remove-Item alias:wget
+ wget --quiet --save-cookies cookies.txt --keep-session-cookies --no-check-certificate 'https://drive.google.com/uc?export=download&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8' -O- | Out-String | Select-String -Pattern 'confirm=([0-9A-Za-z_]+)' | %{ $_.Matches.Groups[1].Value } > confirm.txt
+ $confirmCode = Get-Content confirm.txt
+ wget --load-cookies cookies.txt "https://drive.google.com/uc?export=download&confirm=$confirmCode&id=1ofiSQ4zDwXfHE65tow4_jDIceBYHNW_8" -O test_data.zip
+ Remove-Item cookies.txt -Force
+ Remove-Item confirm.txt -Force
+ shell: powershell
+
+ - name: Unzip Test Data
+ run: Expand-Archive -LiteralPath "test_data.zip" -DestinationPath ${{ env.TEST_DATA_FOLDER }} -Force
+
- name: Running all Tests
run: |
pytest -vvv -rx --cov=echopype --cov-report=xml --log-cli-level=WARNING --disable-warnings echopype/tests/utils |& tee ci_${{ matrix.python-version }}_test_log.log
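The two workflows unpack the archive with different overwrite behaviour: `unzip -n` on Linux never overwrites existing files, while `Expand-Archive -Force` on Windows does. A rough standard-library Python sketch of both policies behind one flag; `extract_test_data` and its `overwrite` parameter are hypothetical names and are not part of this commit.

```python
import zipfile
from pathlib import Path


def extract_test_data(archive: str, dest: str, overwrite: bool = False) -> list:
    """Extract `archive` into `dest`; return the file names actually written."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            target = dest_path / info.filename
            # Mirror `unzip -n`: leave files that already exist untouched.
            if target.exists() and not overwrite and not info.is_dir():
                continue
            zf.extract(info, dest_path)
            if not info.is_dir():
                written.append(info.filename)
    return written
```

With `overwrite=False` a second run over an already populated folder writes nothing, which is the behaviour the Linux job relies on; `overwrite=True` corresponds to the Windows job.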
36 changes: 25 additions & 11 deletions echopype/clean/signal_attenuation.py
@@ -3,9 +3,24 @@
import numpy as np
import xarray as xr

- from ..utils.mask_transformation_xr import lin as _lin, line_to_square, log as _log
-
- DEFAULT_RYAN_PARAMS = {"r0": 180, "r1": 280, "n": 30, "thr": -6, "start": 0}
+ from ..utils.mask_transformation_xr import (
+ lin as _lin,
+ line_to_square,
+ log as _log,
+ rolling_median_block,
+ )
+
+ # import dask.array as da
+
+
+ DEFAULT_RYAN_PARAMS = {
+ "r0": 180,
+ "r1": 280,
+ "n": 30,
+ "thr": -6,
+ "start": 0,
+ "dask_chunking": {"ping_time": 100, "range_sample": 100},
+ }
DEFAULT_ARIZA_PARAMS = {"offset": 20, "thr": (-40, -35), "m": 20, "n": 50}


@@ -45,15 +60,17 @@ def _ryan(source_Sv: xr.DataArray, desired_channel: str, parameters=DEFAULT_RYAN
Returns:
xr.DataArray: boolean array with AS mask, with ping_time and range_sample dims
"""
- parameter_names = ("r0", "r1", "n", "thr", "start")
+ parameter_names = ("r0", "r1", "n", "thr", "start", "dask_chunking")
if not all(name in parameters.keys() for name in parameter_names):
raise ValueError(
- "Missing parameters - should be r0, r1, n, thr, start, are" + str(parameters.keys())
+ "Missing parameters - should be r0, r1, n, thr, start, dask_chunking are"
+ + str(parameters.keys())
)
r0 = parameters["r0"]
r1 = parameters["r1"]
n = parameters["n"]
thr = parameters["thr"]
+ dask_chunking = parameters["dask_chunking"]
# start = parameters["start"]

channel_Sv = source_Sv.sel(channel=desired_channel)
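Because `dask_chunking` is now a required key, a caller who passes a hand-built parameter dict without it hits the `ValueError` above. One way to avoid that, sketched here on the assumption that callers build their own dicts, is to start from the defaults and override only what changes. `check_ryan_params` is a hypothetical stand-in for the key check inside `_ryan`, not the function itself.

```python
DEFAULT_RYAN_PARAMS = {
    "r0": 180,
    "r1": 280,
    "n": 30,
    "thr": -6,
    "start": 0,
    "dask_chunking": {"ping_time": 100, "range_sample": 100},
}
REQUIRED = ("r0", "r1", "n", "thr", "start", "dask_chunking")


def check_ryan_params(parameters: dict) -> None:
    """Hypothetical stand-in for the key check in _ryan; names the missing keys."""
    missing = [name for name in REQUIRED if name not in parameters]
    if missing:
        raise ValueError(f"Missing parameters: {', '.join(missing)}")


# Override one value while keeping every required key from the defaults.
custom = {**DEFAULT_RYAN_PARAMS, "thr": -8}
check_ryan_params(custom)  # passes: all required keys are present
```

The merge is shallow, so `custom["dask_chunking"]` is the same dict object as the default; copy it first if it is going to be modified.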
@@ -85,13 +102,10 @@ def _ryan(source_Sv: xr.DataArray, desired_channel: str, parameters=DEFAULT_RYAN
layer_mask = (Sv["range_sample"] >= up) & (Sv["range_sample"] <= lw)
layer_Sv = Sv.where(layer_mask)

- # Creating shifted arrays for block comparison
- shifted_arrays = [layer_Sv.shift(ping_time=i) for i in range(-n, n + 1)]
- block = xr.concat(shifted_arrays, dim="shifted_ping_time")
+ layer_Sv_chunked = layer_Sv.chunk(dask_chunking)

- # Computing the median of the block and the pings
- ping_median = layer_Sv.median(dim="range_sample", skipna=True)
- block_median = block.median(dim=["range_sample", "shifted_ping_time"], skipna=True)
+ block_median = rolling_median_block(layer_Sv_chunked.data, window_half_size=n, axis=0)
+ ping_median = layer_Sv_chunked.median(dim="range_sample", skipna=True)

# Creating the mask based on the threshold
mask_condition = (ping_median - block_median) > thr
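The removed code built `2n + 1` shifted copies of the layer and took a single median over them. That amounts to a per-ping median over every range sample in pings `i-n` to `i+n`, with NaNs ignored. The implementation of `rolling_median_block` is not shown in this diff, so the NumPy loop below is only a reference sketch of the quantity the old code computed, assuming the new helper is meant to match it. `block_median_reference` is a hypothetical name.

```python
import numpy as np


def block_median_reference(sv: np.ndarray, window_half_size: int) -> np.ndarray:
    """Per-ping median over pings i-n..i+n and all range samples, ignoring NaN.

    `sv` is a 2-D array with ping_time on axis 0 and range_sample on axis 1.
    """
    n_pings = sv.shape[0]
    out = np.full(n_pings, np.nan)
    for i in range(n_pings):
        lo = max(0, i - window_half_size)
        hi = min(n_pings, i + window_half_size + 1)
        block = sv[lo:hi, :]
        # Leave NaN in place when the whole window is masked out.
        if not np.isnan(block).all():
            out[i] = np.nanmedian(block)
    return out
```

A loop like this is slow on real echograms, but it is convenient as a ground truth when testing a chunked implementation on small arrays.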
10 changes: 4 additions & 6 deletions echopype/clean/transient_noise.py
@@ -27,14 +27,15 @@
lin as _lin,
line_to_square,
log as _log,
+ rolling_median_block,
)

RYAN_DEFAULT_PARAMS = {
"m": 5,
- "n": 5,
+ "n": 20,
"thr": 20,
"excludeabove": 250,
- "operation": "mean",
+ "operation": "median",
"dask_chunking": {"ping_time": 100, "range_sample": 100},
}
FIELDING_DEFAULT_PARAMS = {
@@ -231,10 +232,7 @@ def _fielding(

ping_median = Sv_range.median(dim="range_sample", skipna=True)
ping_75q = Sv_range.reduce(np.nanpercentile, q=75, dim="range_sample")

- shifted_arrays = [Sv_range.shift(ping_time=i) for i in range(-n, n + 1)]
- block = xr.concat(shifted_arrays, dim="shifted_ping_time")
- block_median = block.median(dim=["range_sample", "shifted_ping_time"], skipna=True)
+ block_median = rolling_median_block(Sv_range.data, window_half_size=n, axis=0)

# identify columns in which noise can be found
noise_col = (ping_75q < maxts) & ((ping_median - block_median) < thr[0])
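The `_fielding` hunk combines two per-ping statistics: a ping enters `noise_col` when its 75th percentile is below `maxts` and its median exceeds the block median by less than `thr[0]`. A NumPy sketch of just that test on toy data follows. The values of `n`, `maxts` and `thr0` are illustrative and are not the `FIELDING_DEFAULT_PARAMS`, the windowed median is a plain loop rather than `rolling_median_block`, and `noise_columns` is a hypothetical name.

```python
import numpy as np


def noise_columns(sv: np.ndarray, n: int, maxts: float, thr0: float) -> np.ndarray:
    """Return one boolean per ping (axis 0 = ping_time, axis 1 = range_sample)."""
    ping_median = np.nanmedian(sv, axis=1)
    ping_75q = np.nanpercentile(sv, 75, axis=1)
    n_pings = sv.shape[0]
    # Median over pings i-n..i+n and all range samples, as in the removed code.
    block_median = np.array(
        [np.nanmedian(sv[max(0, i - n): i + n + 1, :]) for i in range(n_pings)]
    )
    return (ping_75q < maxts) & ((ping_median - block_median) < thr0)


# Toy data: four pings at -80 dB and one ping at -40 dB.
sv = np.full((5, 4), -80.0)
sv[2, :] = -40.0
flags = noise_columns(sv, n=1, maxts=-35.0, thr0=2.0)
```

In the toy data only the -40 dB ping fails the second condition, since its median sits 40 dB above the median of its block.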
1 change: 1 addition & 0 deletions echopype/mask/__init__.py
@@ -1,3 +1,4 @@
from .api import apply_mask, frequency_differencing, get_seabed_mask, get_shoal_mask

__all__ = ["frequency_differencing", "apply_mask", "get_seabed_mask", "get_shoal_mask"]

31 changes: 7 additions & 24 deletions echopype/mask/api.py
@@ -579,11 +579,6 @@ def get_shoal_mask(
mask: xr.DataArray
A DataArray containing the mask for the Sv data. Regions satisfying the thresholding
criteria are filled with ``True``, else the regions are filled with ``False``.
- mask_: xr.DataArray
- A DataArray containing the mask for areas in which shoals were searched.
- Edge regions are filled with 'False', whereas the portion
- in which shoals could be detected is 'True'
Raises
------
@@ -630,10 +625,6 @@ def get_shoal_mask_multichannel(
A DataArray containing the multichannel mask for the Sv data.
Regions satisfying the thresholding criteria are filled with ``True``,
else the regions are filled with ``False``.
- mask_: xr.DataArray
- A DataArray containing the multichannel mask for areas in which shoals were searched.
- Edge regions are filled with 'False', whereas the portion
- in which shoals could be detected is 'True'
Raises
@@ -669,10 +660,9 @@ def get_seabed_mask(
a Dataset. This input must correspond to a Dataset that has the
coordinate ``channel`` and variables ``frequency_nominal`` and ``Sv``.
desired_channel: str - channel to generate the mask for
- desired_freuency: int - desired frequency, in case the channel isn't directly specified
- method: str with either "ariza", "experimental", "blackwell_mod",
- "blackwell", "deltaSv", "maxSv"
- based on the preferred method for seabed mask generation
+ desired_frequency: int - desired frequency, in case the channel isn't directly specified
+ method: str with either "ariza", "blackwell", based on the preferred method
+ for seabed mask generation
Returns
-------
xr.DataArray
@@ -682,8 +672,7 @@ def get_seabed_mask(
Raises
------
ValueError
- If neither "ariza", "experimental", "blackwell_mod",
- "blackwell", "deltaSv", "maxSv" are given
+ If neither "ariza", "blackwell" are given
Notes
-----
@@ -696,11 +685,7 @@ def get_seabed_mask(
source_Sv = get_dataset(source_Sv)
mask_map = {
"ariza": seabed._ariza,
- "experimental": seabed._experimental,
"blackwell": seabed._blackwell,
- "blackwell_mod": seabed._blackwell_mod,
- "delta_Sv": seabed._deltaSv,
- "max_Sv": seabed._maxSv,
}

if method not in mask_map.keys():
@@ -729,9 +714,8 @@ def get_seabed_mask_multichannel(
else it specifies the path to a zarr or netcdf file containing
a Dataset. This input must correspond to a Dataset that has the
coordinate ``channel`` and variables ``frequency_nominal`` and ``Sv``.
- method: str with either "ariza", "experimental", "blackwell_mod",
- "blackwell", "deltaSv", "maxSv"
- based on the preferred method for seabed mask generation
+ method: str with either "ariza", "blackwell"
+ based on the preferred method for seabed mask generation
Returns
-------
xr.DataArray
@@ -741,8 +725,7 @@ def get_seabed_mask_multichannel(
Raises
------
ValueError
- If neither "ariza", "experimental", "blackwell_mod",
- "blackwell", "deltaSv", "maxSv" are given
+ If neither "ariza" or "blackwell" are given
Notes
-----
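After this change `get_seabed_mask` dispatches on just two method names. The lookup-and-reject pattern can be sketched in isolation as below. `_ariza` and `_blackwell` here are dummy stand-ins for the functions in the `seabed` module, and `pick_seabed_method` is a hypothetical helper that does not exist in `echopype.mask.api`.

```python
from typing import Callable, Dict


def _ariza(*args, **kwargs):  # dummy stand-in for seabed._ariza
    return "ariza mask"


def _blackwell(*args, **kwargs):  # dummy stand-in for seabed._blackwell
    return "blackwell mask"


MASK_MAP: Dict[str, Callable] = {"ariza": _ariza, "blackwell": _blackwell}


def pick_seabed_method(method: str) -> Callable:
    """Return the masking function for `method`, or raise ValueError."""
    if method not in MASK_MAP:
        raise ValueError(
            f"Unsupported method {method!r}; expected one of {sorted(MASK_MAP)}"
        )
    return MASK_MAP[method]
```

Under this pattern, a method name that the map no longer contains, such as `"experimental"`, is rejected with a `ValueError` instead of being dispatched.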