Fix(requirements): bump dependencies, mainly to eliminate third-party security issues #1383

Merged 53 commits, Jul 9, 2024

Commits
71a54ff
Fix(requirements): bump dependencies, mainly to eliminate third-party…
RollerKnobster Jun 24, 2024
4064a8e
Fix(requirements): bump test dependencies, mainly to eliminate third-…
RollerKnobster Jun 24, 2024
695c592
Fix(requirements): use scikeras to bring back keras wrappers
RollerKnobster Jun 24, 2024
9a75758
Fix: apply formatting
RollerKnobster Jun 24, 2024
01eac86
Fix: add `super().__init__(**kwargs)` call to `KerasBaseEstimator`
RollerKnobster Jun 24, 2024
d6faadc
Fix: rename `build_fn` to `model`
RollerKnobster Jun 24, 2024
d47855e
Fix: move `super().__init__(**kwargs)` to start of `KerasBaseEstimato…
RollerKnobster Jun 24, 2024
77bc40c
Fix: i'm done with comments for now
RollerKnobster Jun 24, 2024
bede872
Fix: remove redundant `BaseEstimator` inheritance in `KerasBaseEstima…
RollerKnobster Jun 24, 2024
06de56f
Fix: remove redundant `BaseWrapper` alias for `KerasRegressor`
RollerKnobster Jun 24, 2024
5eee699
Fix: add super call to KerasBaseEstimator
RollerKnobster Jun 24, 2024
3b2bca3
Fix: no comments
RollerKnobster Jun 24, 2024
78fde46
Fix: refactor `__call__` method of models to `_prepare_model`
RollerKnobster Jun 24, 2024
31a59a8
Fix: move assignment of `kind` above the `_prepare_model` call
RollerKnobster Jun 24, 2024
eaf24b2
Fix: prepare model right before calling fit in keras models
RollerKnobster Jun 24, 2024
757d207
Fix: rename `self.history` to `self._history` in `KerasBaseEstimator`
RollerKnobster Jun 24, 2024
b64dc08
Fix: rename `self.kind` and `self.kwargs` to `self._kind` and `self._…
RollerKnobster Jun 24, 2024
fe3530b
Fix: refactor `getattr(keras.optimizers, optimizer)` to `keras.optimi…
RollerKnobster Jun 24, 2024
9200d07
Fix: refactor `getattr(keras.optimizers, optimizer)` to `keras.optimi…
RollerKnobster Jun 24, 2024
3f2a313
Fix: refactor `getattr(keras.optimizers, optimizer)` to `keras.optimi…
RollerKnobster Jun 24, 2024
c520dbb
Fix: remove `save_format` param from `save_model` in `KerasBaseEstima…
RollerKnobster Jun 24, 2024
b93af57
Fix: remove `save_format` param from `save_model` in `KerasBaseEstima…
RollerKnobster Jun 24, 2024
3696b11
Fix: formatting
RollerKnobster Jun 24, 2024
92a537f
Fix: change saving to tempfile
RollerKnobster Jun 24, 2024
7c8c140
Fix: formatting
RollerKnobster Jun 24, 2024
0150f0f
Fix: save model as .keras temp file
RollerKnobster Jun 24, 2024
8f738fb
Fix: save model as .keras temp file
RollerKnobster Jun 25, 2024
a8dbb8c
Fix: save model as .keras temp file
RollerKnobster Jun 25, 2024
7f4e2b3
Fix: load model as .keras temp file
RollerKnobster Jun 25, 2024
5920784
Fix: load model as .keras temp file
RollerKnobster Jun 25, 2024
f138c27
Fix: load model as .keras temp file
RollerKnobster Jun 25, 2024
c6000a2
Fix: adjust test for argo versions
RollerKnobster Jun 25, 2024
dd2f98a
Fix: save bytes instead of bytesio to model state
RollerKnobster Jun 25, 2024
0c2aef8
Fix: skip loading uninitialized model from state
RollerKnobster Jun 25, 2024
c2dbc6b
Fix: adjust test for serializer
RollerKnobster Jun 25, 2024
0ca65e3
Fix: adjust test for serializer
RollerKnobster Jun 25, 2024
8432d16
Fix: rename `KerasLSTMBaseEstimator` attributes to underscored prefixed
RollerKnobster Jun 25, 2024
45155e4
Fix: rename `KerasLSTMBaseEstimator` attributes to underscored prefixed
RollerKnobster Jun 25, 2024
a7ba4fc
Fix: do not propagate kwargs to `super().__init__` of KerasBaseEstimator
RollerKnobster Jun 25, 2024
c97c5a5
Fix: do not propagate kwargs to `super().__init__` of KerasBaseEstimator
RollerKnobster Jun 25, 2024
ba15f75
Fix: do not propagate kwargs to `super().__init__` of KerasBaseEstimator
RollerKnobster Jun 25, 2024
17ad386
Fix: do not propagate kwargs to `super().__init__` of KerasBaseEstimator
RollerKnobster Jun 25, 2024
033d2df
Fix: propagate batch_size to `super().__init__` of KerasBaseEstimator
RollerKnobster Jun 25, 2024
aca1630
Fix: set `input_shape` to tensorflow layers definition in `KerasRawMo…
RollerKnobster Jun 25, 2024
96ba313
Fix: store history for model in `_history` and use proper `regressor_…
RollerKnobster Jun 25, 2024
796f9a1
Fix: formatting
RollerKnobster Jun 25, 2024
e25a71e
Fix: adjust `model` and `history` attributes access in `KerasBaseEsti…
RollerKnobster Jun 26, 2024
6852cbc
Fix: adjust `lstm` `optimizer_kwargs` and `input.shape` access
RollerKnobster Jun 26, 2024
6d827e6
Fix: add input_shape to Dense layers in kerasraw test
RollerKnobster Jun 26, 2024
657d6e9
Merge with main
RollerKnobster Jul 8, 2024
ad19212
Formatting
RollerKnobster Jul 8, 2024
e94c9f6
Formatting
RollerKnobster Jul 8, 2024
1fd82d5
Add gunicorn as base requirement
RollerKnobster Jul 8, 2024
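Several of the commits above mention saving and loading the model as a `.keras` temp file and keeping `bytes` rather than `BytesIO` in the model state, plus skipping the load when the model was never initialized. The diff for that code is not included on this page, so the following is only a minimal sketch of the general pattern, not gordo's implementation. The names `dump_to_bytes`, `load_from_bytes`, `save_fn` and `load_fn` are hypothetical.

```python
import os
import tempfile
from typing import Any, Callable, Optional


def dump_to_bytes(save_fn: Callable[[str], None], suffix: str = ".keras") -> bytes:
    """Run ``save_fn(path)`` against a temporary file and return its raw bytes."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "model" + suffix)
        save_fn(path)
        with open(path, "rb") as fh:
            # Plain bytes pickle cleanly, unlike an open BytesIO buffer.
            return fh.read()


def load_from_bytes(
    data: Optional[bytes], load_fn: Callable[[str], Any], suffix: str = ".keras"
) -> Any:
    """Write ``data`` to a temporary file and hand its path to ``load_fn``."""
    if data is None:
        # Nothing was saved (for example, an unfitted model), so skip loading.
        return None
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "model" + suffix)
        with open(path, "wb") as fh:
            fh.write(data)
        return load_fn(path)
```

With Keras, `save_fn` would presumably be a bound `model.save` and `load_fn` would be `keras.models.load_model`; that pairing is an assumption here, since the actual `__getstate__`/`__setstate__` code is not shown.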
9 changes: 7 additions & 2 deletions docs/_static/architecture_diagram.py
@@ -9,9 +9,14 @@
from diagrams.k8s.storage import PV
from diagrams.custom import Custom

directory=os.path.dirname(__file__)
directory = os.path.dirname(__file__)

with Diagram("Gordo flow", filename=os.path.join(directory, "architecture_diagram"), outformat="png", show=False) as diag:
with Diagram(
"Gordo flow",
filename=os.path.join(directory, "architecture_diagram"),
outformat="png",
show=False,
) as diag:
with Cluster("K8s"):
gordo = CRD("Gordo")
api = API("")
15 changes: 11 additions & 4 deletions docs/conf.py
@@ -26,7 +26,11 @@
author = "Equinor ASA"
version = gordo.__version__
_parsed_version = parse_version(version)
commit = f"{version}" if type(_parsed_version) is GordoRelease and not _parsed_version.suffix else "HEAD"
commit = (
f"{version}"
if type(_parsed_version) is GordoRelease and not _parsed_version.suffix
else "HEAD"
)

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
@@ -44,7 +48,7 @@
"IPython.sphinxext.ipython_console_highlighting",
"sphinx_copybutton",
"sphinx_click",
"nbsphinx"
"nbsphinx",
]

root_doc = "index"
@@ -59,8 +63,11 @@
_ignore_linkcode_infos = [
# caused "OSError: could not find class definition"
{"module": "gordo_core.utils", "fullname": "PredictionResult"},
{'module': 'gordo.workflow.config_elements.schemas', 'fullname': 'Model.Config.extra'},
{'module': 'gordo.reporters.postgres', 'fullname': 'Machine.DoesNotExist'}
{
"module": "gordo.workflow.config_elements.schemas",
"fullname": "Model.Config.extra",
},
{"module": "gordo.reporters.postgres", "fullname": "Machine.DoesNotExist"},
]


36 changes: 18 additions & 18 deletions gordo/machine/model/anomaly/diff.py
@@ -95,13 +95,13 @@ def get_metadata(self):
if hasattr(self, "aggregate_threshold_"):
metadata["aggregate-threshold"] = self.aggregate_threshold_
if hasattr(self, "feature_thresholds_per_fold_"):
metadata[
"feature-thresholds-per-fold"
] = self.feature_thresholds_per_fold_.to_dict()
metadata["feature-thresholds-per-fold"] = (
self.feature_thresholds_per_fold_.to_dict()
)
if hasattr(self, "aggregate_thresholds_per_fold_"):
metadata[
"aggregate-thresholds-per-fold"
] = self.aggregate_thresholds_per_fold_
metadata["aggregate-thresholds-per-fold"] = (
self.aggregate_thresholds_per_fold_
)
# Window threshold metadata
if hasattr(self, "window"):
metadata["window"] = self.window
@@ -111,23 +111,23 @@ def get_metadata(self):
hasattr(self, "smooth_feature_thresholds_")
and self.smooth_aggregate_threshold_ is not None
):
metadata[
"smooth-feature-thresholds"
] = self.smooth_feature_thresholds_.tolist()
metadata["smooth-feature-thresholds"] = (
self.smooth_feature_thresholds_.tolist()
)
if (
hasattr(self, "smooth_aggregate_threshold_")
and self.smooth_aggregate_threshold_ is not None
):
metadata["smooth-aggregate-threshold"] = self.smooth_aggregate_threshold_

if hasattr(self, "smooth_feature_thresholds_per_fold_"):
metadata[
"smooth-feature-thresholds-per-fold"
] = self.smooth_feature_thresholds_per_fold_.to_dict()
metadata["smooth-feature-thresholds-per-fold"] = (
self.smooth_feature_thresholds_per_fold_.to_dict()
)
if hasattr(self, "smooth_aggregate_thresholds_per_fold_"):
metadata[
"smooth-aggregate-thresholds-per-fold"
] = self.smooth_aggregate_thresholds_per_fold_
metadata["smooth-aggregate-thresholds-per-fold"] = (
self.smooth_aggregate_thresholds_per_fold_
)

if isinstance(self.base_estimator, GordoBase):
metadata.update(self.base_estimator.get_metadata())
@@ -241,9 +241,9 @@ def cross_validate(
smooth_aggregate_threshold_fold = (
scaled_mse.rolling(self.window).min().max()
)
self.smooth_aggregate_thresholds_per_fold_[
f"fold-{i}"
] = smooth_aggregate_threshold_fold
self.smooth_aggregate_thresholds_per_fold_[f"fold-{i}"] = (
smooth_aggregate_threshold_fold
)

smooth_tag_thresholds_fold = mae.rolling(self.window).min().max()
smooth_tag_thresholds_fold.name = f"fold-{i}"
3 changes: 1 addition & 2 deletions gordo/machine/model/factories/lstm_autoencoder.py
@@ -3,10 +3,10 @@
from typing import Tuple, Union, Dict, Any

import tensorflow
from tensorflow import keras
from tensorflow.keras.optimizers import Optimizer
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Sequential as KerasSequential
from tensorflow import keras

from gordo.machine.model.register import register_model_builder
from gordo.machine.model.factories.utils import hourglass_calc_dims, check_dim_func_len
@@ -189,7 +189,6 @@ def lstm_hourglass(
compile_kwargs: Dict[str, Any] = dict(),
**kwargs,
) -> tensorflow.keras.models.Sequential:

"""

Builds an hourglass shaped neural network, with decreasing number of neurons
12 changes: 6 additions & 6 deletions gordo/machine/model/register.py
@@ -48,22 +48,22 @@ def special_keras_model_builder(n_features, ...):
def __init__(self, type: str):
self.type = type

def __call__(self, build_fn: Callable[..., keras.models.Model]):
self._register(self.type, build_fn)
return build_fn
def __call__(self, model: Callable[..., keras.models.Model]):
self._register(self.type, model)
return model

@classmethod
def _register(cls, type: str, build_fn: Callable[[int, Any], GordoBase]):
def _register(cls, type: str, model: Callable[[int, Any], GordoBase]):
"""
Registers a given function as an available factory under
this type.
"""
cls._validate_func(build_fn)
cls._validate_func(model)

# Add function to available factories under this type
if type not in cls.factories:
cls.factories[type] = dict()
cls.factories[type][build_fn.__name__] = build_fn
cls.factories[type][model.__name__] = model

@staticmethod
def _validate_func(func):
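
The hunk above renames the decorator argument from `build_fn` to `model` but keeps the registration mechanism. As a rough illustration of that mechanism, here is a stripped-down, dependency-free sketch. `register_builder`, `toy_builder` and the `"ExampleType"` key are hypothetical names for this example, and the validation step (`_validate_func`) is left out.

```python
from typing import Callable, Dict


class register_builder:
    """Hypothetical, simplified stand-in for the decorator in this hunk."""

    # Maps a model type name to {function name: factory function}.
    factories: Dict[str, Dict[str, Callable]] = {}

    def __init__(self, type: str):
        self.type = type

    def __call__(self, model: Callable) -> Callable:
        self._register(self.type, model)
        # Return the function unchanged so it can still be called directly.
        return model

    @classmethod
    def _register(cls, type: str, model: Callable) -> None:
        # Create the per-type dict on first use, then store by function name.
        cls.factories.setdefault(type, {})[model.__name__] = model


@register_builder("ExampleType")
def toy_builder(n_features: int) -> dict:
    # Returns a plain dict instead of a Keras model to stay dependency-free.
    return {"n_features": n_features}
```

Looking up `register_builder.factories["ExampleType"]["toy_builder"]` then returns the same function object, which is what lets a config refer to a factory by name.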
16 changes: 10 additions & 6 deletions gordo/machine/model/transformers/imputer.py
@@ -71,14 +71,18 @@ def fit(self, X: Union[pd.DataFrame, np.ndarray], y=None):

# Calculate a 1d arrays of fill values for each feature
self._posinf_fill_values = _posinf_fill_values.apply(
lambda val: val + self.delta
if max_allowable_value - self.delta > val
else max_allowable_value
lambda val: (
val + self.delta
if max_allowable_value - self.delta > val
else max_allowable_value
)
)
self._neginf_fill_values = _neginf_fill_values.apply(
lambda val: val - self.delta
if min_allowable_value + self.delta < val
else min_allowable_value
lambda val: (
val - self.delta
if min_allowable_value + self.delta < val
else min_allowable_value
)
)

return self
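
The two lambdas in this hunk only gained parentheses, so behavior should be unchanged. To make the conditional easier to read, here is the same expression pulled out into plain functions. `posinf_fill_value` and `neginf_fill_value` are hypothetical names for this sketch, and the bounds are passed in explicitly rather than taken from the dtype as the surrounding `fit` method presumably does.

```python
def posinf_fill_value(val: float, delta: float, max_allowable_value: float) -> float:
    # Step above the observed value by delta, unless that would pass the ceiling.
    return val + delta if max_allowable_value - delta > val else max_allowable_value


def neginf_fill_value(val: float, delta: float, min_allowable_value: float) -> float:
    # Step below the observed value by delta, unless that would pass the floor.
    return val - delta if min_allowable_value + delta < val else min_allowable_value


print(posinf_fill_value(10.0, 2.0, 100.0))   # 12.0
print(posinf_fill_value(99.0, 2.0, 100.0))   # 100.0
print(neginf_fill_value(-10.0, 2.0, -100.0))  # -12.0
print(neginf_fill_value(-99.0, 2.0, -100.0))  # -100.0
```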
10 changes: 6 additions & 4 deletions gordo/machine/model/utils.py
@@ -111,9 +111,11 @@ def make_base_dataframe(

# Calculate the end times if possible, or also all 'None's
end_series = start_series.map(
lambda start: (start + frequency).isoformat()
if isinstance(start, datetime) and frequency is not None
else None
lambda start: (
(start + frequency).isoformat()
if isinstance(start, datetime) and frequency is not None
else None
)
)

# Convert to isoformatted string for JSON serialization.
@@ -134,7 +136,7 @@ def make_base_dataframe(
# the multiindex column dataframe, and naming their second level labels as needed.
name: str
values: np.ndarray
for (name, values) in filter(lambda nv: nv[1] is not None, names_n_values):
for name, values in filter(lambda nv: nv[1] is not None, names_n_values):

_tags = tags if name == "model-input" else target_tag_list

6 changes: 3 additions & 3 deletions gordo/serializer/from_definition.py
@@ -176,9 +176,9 @@ def _build_step(
import_str = list(step.keys())[0]

try:
StepClass: Union[
None, FeatureUnion, Pipeline, BaseEstimator
] = import_location(import_str)
StepClass: Union[None, FeatureUnion, Pipeline, BaseEstimator] = (
import_location(import_str)
)
except (ImportError, ValueError):
StepClass = None

8 changes: 5 additions & 3 deletions gordo/serializer/into_definition.py
@@ -172,9 +172,11 @@ def load_definition_from_params(params: dict, tuples_to_list: bool = True) -> di
# TODO: Make this more robust, probably via another function to parse the iterable recursively
# TODO: b/c it _could_, in theory, be a dict of {str: BaseEstimator} or similar.
definition[param] = [
_decompose_node(leaf[1], tuples_to_list=tuples_to_list)
if isinstance(leaf, tuple)
else leaf
(
_decompose_node(leaf[1], tuples_to_list=tuples_to_list)
if isinstance(leaf, tuple)
else leaf
)
for leaf in param_val
]

8 changes: 5 additions & 3 deletions gordo/server/utils.py
@@ -131,9 +131,11 @@ def dataframe_to_dict(df: pd.DataFrame) -> dict:
data.index = data.index.astype(str)
if isinstance(df.columns, pd.MultiIndex):
return {
col: data[col].to_dict()
if isinstance(data[col], pd.DataFrame)
else pd.DataFrame(data[col]).to_dict()
col: (
data[col].to_dict()
if isinstance(data[col], pd.DataFrame)
else pd.DataFrame(data[col]).to_dict()
)
for col in data.columns.get_level_values(0)
}
else:
3 changes: 1 addition & 2 deletions gordo/util/version.py
@@ -8,8 +8,7 @@

class Version(metaclass=ABCMeta):
@abstractmethod
def get_version(self):
...
def get_version(self): ...


class Special(Enum):
4 changes: 2 additions & 2 deletions gordo/workflow/config_elements/normalized_config.py
@@ -119,11 +119,11 @@ def __init__(
if gordo_version is None:
gordo_version = __version__
default_globals = self.get_default_globals(gordo_version)
default_globals["runtime"]["influx"][ # type: ignore
default_globals["runtime"]["influx"][
"resources"
] = _calculate_influx_resources( # type: ignore
len(config["machines"])
)
) # type: ignore

passed_globals = load_globals_config(
config.get("globals", dict()), join_json_paths("globals", json_path)
4 changes: 2 additions & 2 deletions pytest.ini
@@ -11,8 +11,8 @@ addopts =
--doctest-glob='*.md'
--doctest-glob='*.rst'
--junitxml=junit/junit.xml
--cov-report=xml
--cov=gordo
; --cov-report=xml
; --cov=gordo
flakes-ignore =
__init__.py UnusedImport
test_*.py UnusedImport
8 changes: 2 additions & 6 deletions requirements/full_requirements.txt
@@ -145,8 +145,6 @@ graphql-core==3.2.3
# graphql-relay
graphql-relay==3.2.0
# via graphene
greenlet==3.0.3
# via sqlalchemy
grpcio==1.64.1
# via
# tensorboard
@@ -289,7 +287,7 @@ opt-einsum==3.3.0
# via tensorflow
optree==0.11.0
# via keras
packaging==21.3
packaging==24.1
# via
# -r requirements.in
# azureml-core
Expand Down Expand Up @@ -352,9 +350,7 @@ pyopenssl==24.1.0
# azureml-core
# ndg-httpsclient
pyparsing==3.1.2
# via
# matplotlib
# packaging
# via matplotlib
pysocks==1.7.1
# via requests
python-dateutil==2.9.0.post0
2 changes: 1 addition & 1 deletion requirements/mlflow_requirements.in
@@ -1,2 +1,2 @@
mlflow~=2.14
azureml-core~=1.49
azureml-core~=1.56.0
5 changes: 2 additions & 3 deletions requirements/requirements.in
@@ -1,15 +1,14 @@
dictdiffer~=0.8
dataclasses-json~=0.3
gunicorn~=22.0
Review comment (Contributor): I would keep this dependency. I know this is a transitive dependency through mlflow, but mlflow is in extras_require, so a user can install gordo without it.

jinja2~=3.1
python-dateutil~=2.8
tensorflow~=2.16.0
scikeras~=0.13.0
gunicorn~=22.0
# There's a bug in keras 3.4.0 with loading models (https://github.com/keras-team/keras/issues/19921)
keras<3.4.0
Flask>=2.2.5,<3.0.0
simplejson~=3.17
prometheus_client~=0.7
# Due to azureml-core 1.49.0 depends on packaging<22.0
packaging>=21.0,<22.0
packaging>=24.0
gordo-client~=6.2
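
Most pins in these requirement files use the `~=` (compatible release) operator, and the number of segments changes what it allows: `azureml-core~=1.56.0` admits only `1.56.*` releases, while the old `~=1.49` admitted any `1.x` from `1.49` up. The function below is a deliberately simplified sketch of that rule for purely numeric versions. `compatible_release` is a hypothetical helper, not part of pip or `packaging`, and it ignores pre-releases, post-releases and epochs.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, ...]:
    # Numeric-only parsing; real version strings can carry more than this.
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """Simplified check for ``candidate ~= spec`` on purely numeric versions."""
    cand, base = _parse(candidate), _parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    # Everything except the last segment of the spec must match exactly.
    prefix = base[:-1]
    # Pad so that, for example, 2.16 and 2.16.0 compare as equal.
    width = max(len(cand), len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    base_padded = base + (0,) * (width - len(base))
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix


print(compatible_release("1.56.1", "1.56.0"))  # True
print(compatible_release("1.57.0", "1.56.0"))  # False
print(compatible_release("1.57.0", "1.49"))    # True
```

For real resolution work the `packaging` library's specifier classes are the usual tool; the sketch only exists to show why the three-segment pins in this PR are tighter than the two-segment ones.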
11 changes: 5 additions & 6 deletions requirements/test_requirements.in
@@ -1,6 +1,6 @@
-c full_requirements.txt
docker>=4.0,<7.0
pytest~=7.2
docker~=7.1.0
pytest~=8.2
pytest-xdist~=3.2
pytest-mock~=3.6
pytest-mypy~=0.10
@@ -9,10 +9,9 @@ pytest-cov~=4.0
pytest-benchmark~=4.0
pytest-flakes~=4.0
mock~=5.0
responses~=0.23
# Due to packaging>22.0 in black 23.0, azureml-core~=1.49 requires packaging<22.0
black>=22.0,<23.0
notebook~=6.4
responses~=0.25.3
black~=24.4.2
notebook~=7.2.1
nbconvert~=7.4
types-simplejson
types-python-dateutil