Resolve SMT Method Changes / Support for JENN Models #1217

Merged
merged 6 commits into from
Apr 30, 2024
14 changes: 10 additions & 4 deletions docs/source/chapt_surrogates/mlaiplugin.rst
@@ -86,10 +86,16 @@ https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegr

Surrogate Modeling Toolbox is an open-source Python package supporting a number of surrogate
modeling methods, including gradient-enhanced neural network (GENN) models. GENN models train
parameters by minimizing a modified Least Squares Estimator which accounts for partial derivative predictions, leading to better accuracy on fewer training points compared to non-gradient-enhanced models. Gradient methods are applicable when training use cases where
parameters by minimizing a modified Least Squares Estimator which accounts for partial
derivative predictions, leading to better accuracy on fewer training points compared to
non-gradient-enhanced models. Gradient methods are applicable when training use cases where
system data is generally known, such as continuous physics-based problems like aerodynamics.
If gradient data is not known, users may run a gradient generation tool provided within FOQUS and can consults the tool documentation here: :ref:`gengrad`. Users may find further information on GENN models within Surrogate Modeling Toolbox in
the documentation: https://smt.readthedocs.io/en/stable/_src_docs/surrogate_models/genn.html.
If gradient data is not known, users may run a gradient generation tool provided within FOQUS
and can consult the tool documentation here: :ref:`gengrad`. Users may find further information
on GENN models within Surrogate Modeling Toolbox in the documentation:
https://smt.readthedocs.io/en/stable/_src_docs/surrogate_models/genn.html. SMT GENN model training
leverages the external JENN (Jacobian-Enhanced Neural Networks) package, which may also be
used independently. Users may find further information on JENN models in the documentation:
https://pypi.org/project/jenn/.

The example files located in *FOQUS.examples.other_files.ML_AI_Plugin* show how users
may train new models or re-save loaded models with a custom layer.
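
The "modified Least Squares Estimator" described in the documentation text above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the SMT or JENN implementation: the function name `gradient_enhanced_loss` is hypothetical, the array shapes are assumed, and regularization and normalization are left out.

```python
import numpy as np


def gradient_enhanced_loss(y_true, y_pred, dydx_true, dydx_pred, gamma=1.0):
    """Squared value error plus gamma-weighted squared gradient error.

    Assumed shapes for this sketch: y_* is (n_y, m), dydx_* is (n_y, n_x, m),
    where m is the number of training points.
    """
    m = y_true.shape[1]
    value_term = np.sum((y_pred - y_true) ** 2)
    gradient_term = np.sum((dydx_pred - dydx_true) ** 2)
    # gamma = 0 recovers an ordinary least squares loss
    return (value_term + gamma * gradient_term) / (2 * m)


# toy data: one output, one input, two training points
y = np.array([[1.0, 2.0]])
dydx = np.array([[[1.0, 1.0]]])
plain_loss = gradient_enhanced_loss(y, y + 1.0, dydx, dydx + 1.0, gamma=0.0)
enhanced_loss = gradient_enhanced_loss(y, y + 1.0, dydx, dydx + 1.0, gamma=1.0)
```

With `gamma > 0`, a model is penalized for wrong slopes as well as wrong values, which is why gradient data can compensate for a smaller number of training points.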
@@ -263,7 +269,7 @@ to obtain the correct output values for the entered inputs.
To run the models, copy the appropriate model files or folders ('h5_model.h5',
'saved_model/', 'json_model.json', 'json_model_weights.h5') and any custom layer
scripts ('model_name.py') into the working directory folder 'user_ml_ai_models'.
As mentioned earlier, PyTorch, Scikit-learn and Surrogate Modeling Toolbox models only require the model file ('pt_model.pt', 'skl_model.pkl' or 'smt_model.pkl').
As mentioned earlier, PyTorch, Scikit-learn and Surrogate Modeling Toolbox models only require the model file ('pt_model.pt', 'skl_model.pkl', 'smt_model.pkl', or 'jenn_model.pkl').
For example, the model name below is 'mea_column_model' and is saved in H5 format,
and the files *FOQUS.examples.other_files.ML_AI_Plugin.TensorFlow_2-10_Models.mea_column_model.h5*
and *FOQUS.examples.other_files.ML_AI_Plugin.mea_column_model.py* should be copied to
6 changes: 5 additions & 1 deletion docs/source/references.rst
@@ -63,4 +63,8 @@ L. Buitinck, G. Louppe, M.Blondel, et al., "API design for machine learning soft

.. _Bouhlel_2019:

M. A. Bouhlel, J. T. Hwang, N. Bartoli, et al., "A Python surrogate modeling framework with derivatives." Advances in Engineering Software, Vol 135 (pp. 102662), September 2019.
M. A. Bouhlel, J. T. Hwang, N. Bartoli, et al., "A Python surrogate modeling framework with derivatives." Advances in Engineering Software, Vol 135 (pp. 102662), September 2019.

.. _Berguin_2019:

S. H. Berguin. "Gradient-Enhanced Neural Network." URL: https://github.com/shb84/JENN/blob/master/docs/theory.pdf.
Binary file not shown.
Binary file not shown.
Binary file not shown.
151 changes: 151 additions & 0 deletions examples/other_files/ML_AI_Plugin/mea_column_model_training_jenn.py
@@ -0,0 +1,151 @@
#################################################################################
# FOQUS Copyright (c) 2012 - 2024, by the software owners: Oak Ridge Institute
# for Science and Education (ORISE), TRIAD National Security, LLC., Lawrence
# Livermore National Security, LLC., The Regents of the University of
# California, through Lawrence Berkeley National Laboratory, Battelle Memorial
# Institute, Pacific Northwest Division through Pacific Northwest National
# Laboratory, Carnegie Mellon University, West Virginia University, Boston
# University, the Trustees of Princeton University, The University of Texas at
# Austin, URS Energy & Construction, Inc., et al. All rights reserved.
#
# Please see the file LICENSE.md for full copyright and license information,
# respectively. This file is also available online at the URL
# "https://github.com/CCSI-Toolset/FOQUS".
#################################################################################
import numpy as np
import pandas as pd

# from smt.utils.neural_net.model import Model
from jenn.model import NeuralNet
import pickle
from types import SimpleNamespace


# Example follows the sequence below:
# 1) Code at end of file to import data and create model
# 2) Call create_model() to check array dimensions and arrange the data
# 3) Inside create_model(), train repeated NeuralNet models and keep
#    the one with the lowest weighted SSE
# 4) Back to code at end of file to save, load and test model


# method to create model
def create_model(x_train, y_train, grad_train):

    n_m, n_x = np.shape(x_train)
    _, n_y = np.shape(y_train)

    # check dimensions using grad_train
    assert np.shape(grad_train) == (n_y, n_m, n_x)

    # transpose arrays to the layout used for training: X is (n_x, n_m),
    # Y is (n_y, n_m) and J is (n_y, n_x, n_m); np.reshape would keep the
    # flat element order and pair values with the wrong samples
    X = np.transpose(x_train)
    Y = np.transpose(y_train)
    J = np.transpose(grad_train, (0, 2, 1))

    # set up and train model
    idx = 0
    best_SSE = [0, 1e100]
    best_model = None
    best_y_pred = None

    runs = 20000  # reduce number of runs to reduce runtime
    for i in range(runs):
        idx += 1

        hidden_layer_sizes = [6, 6]
        model = NeuralNet(
            [X.shape[0]] + hidden_layer_sizes + [Y.shape[0]],
            hidden_activation="relu",
            output_activation="linear",
        )
        model.parameters.initialize()

        model.fit(
            x=X,  # input data
            y=Y,  # output data
            dydx=J,  # gradient data
            is_normalize=False,
            alpha=0.500,  # learning rate that controls optimizer step size
            lambd=0.000,  # lambd = 0. = no regularization, lambd > 0 = regularization
            gamma=1.000,  # gamma = 0. = no grad-enhancement, gamma > 0 = grad-enhancement
            beta1=0.90,  # tuning parameter to control ADAM optimization
            beta2=0.99,  # tuning parameter to control ADAM optimization
            epochs=1,  # number of passes through data
            batch_size=None,  # used to divide data into training batches (use for large data sets)
            max_iter=200,  # number of optimizer iterations per mini-batch
            shuffle=True,
            random_state=None,
            is_backtracking=False,
            is_verbose=False,
        )

        y_pred = np.transpose(model.predict(np.transpose(x_train)))
        SSE = sum((y_pred - y_train) ** 2)

        # y0 is 2.1 orders of magnitude larger than y1, so adjust the SSE check
        # CO2 capture rate and SRD should both be positive for all predictions
        # SSE = [75589.3371621 13214.64474031] from running 1000
        # SSE = [39801.73811642 436.51381078] from running 20000
        if (SSE[0] / 21 + SSE[1]) < (best_SSE[0] / 21 + best_SSE[1]) and np.all(
            y_pred > 0
        ):
            best_SSE = SSE
            best_model = model
            best_y_pred = y_pred

        print(np.round(idx / runs * 100, 3), " % complete")

    best_model.custom = SimpleNamespace(
        input_labels=xlabels,
        output_labels=zlabels,
        input_bounds=xdata_bounds,
        output_bounds=ydata_bounds,
        normalized=False,  # JENN models are normalized during training, this should always be False
    )

    return best_model, best_y_pred, best_SSE


# Main code

# import data
data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0) # ignore 1st col
grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0) # ignore 1st col

xdata = data.iloc[:, :6] # there are 6 input variables/columns
ydata = data.iloc[:, 6:] # the rest are output variables/columns
xlabels = xdata.columns.tolist() # set labels as a list (default) from pandas
zlabels = ydata.columns.tolist() # is a set of IndexedDataSeries objects
xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata} # x bounds
ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata} # y bounds

xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
xdata, ydata = np.array(xdata), np.array(ydata) # (n_m, n_x) and (n_m, n_y)
gdata = np.stack([np.array(grad0_data), np.array(grad1_data)]) # (2, n_m, n_x)

model_data = np.concatenate(
    (xdata, ydata), axis=1
)  # JENN requires a Numpy array as input

# define x and y data, not used but will add to variable dictionary
xdata = model_data[:, :-2]
ydata = model_data[:, -2:]

# create model
model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata, grad_train=gdata)

with open("mea_column_model_jenn.pkl", "wb") as file:
    pickle.dump(model, file)

# load model as pickle format
with open("mea_column_model_jenn.pkl", "rb") as file:
    loaded_model = pickle.load(file)


print(y_pred)
print("SSE = ", SSE)
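
A note on array layout: the training arrays start sample-major, (n_m, n_x), and are converted to a feature-major (n_x, n_m) layout. The sketch below, on made-up toy arrays, shows that `np.reshape` and `np.transpose` produce the same shape but different contents, so only the transpose keeps each value attached to its sample.

```python
import numpy as np

# toy data: 3 samples (rows) of 2 features (columns), i.e. (n_m, n_x) = (3, 2)
x_train = np.array([[1, 10], [2, 20], [3, 30]])

# reshape keeps the flat element order, so features and samples get mixed
reshaped = np.reshape(x_train, (2, 3))  # [[1, 10, 2], [20, 3, 30]]

# transpose swaps the axes, so row 0 holds feature 0 for all samples
transposed = np.transpose(x_train)  # [[1, 2, 3], [10, 20, 30]]

# the same applies to gradients: (n_y, n_m, n_x) -> (n_y, n_x, n_m)
grad_train = np.arange(12).reshape(2, 3, 2)
grad_swapped = np.transpose(grad_train, (0, 2, 1))
```

After the axis swap, `grad_swapped[k, j, i]` equals `grad_train[k, i, j]` for every output `k`, input `j`, and sample `i`.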
Comment on lines +112 to +151
Contributor


Would it be possible to wrap this code in a main() function, so that it's not run on import?

def main():
    # move code here


if __name__ == "__main__":
    main()

Please disregard if the current structure is required (e.g. the code needs to be at the module scope for the dynamic module import to work correctly).

Contributor Author


That should be possible for the code that calls the training method, as all of the training examples are only used to generate the model files. The training methods that create the models may need to be on the module level - Keras for sure needs this to register and load the model class object.

Some of the Keras example files are vehicles for the class objects, e.g. mea_column_model_customnormform.py must be copied to the working directory during testing to load the registered model class in the plugin code, and those non-training files don't have any "main" code to execute.

I can update that here, or in another PR.
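
The arrangement discussed in this thread, with the training function kept at module level and only the driver code moved under a guard, might look like the sketch below. It is a simplified stand-in rather than the actual example file: `create_model` here fits a one-parameter slope instead of a neural network, and the file name and temporary directory are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def create_model(x_train, y_train):
    """Module-level training function (hypothetical stand-in for the real one)."""
    slope = sum(x * y for x, y in zip(x_train, y_train)) / sum(
        x * x for x in x_train
    )
    return SimpleNamespace(slope=slope)


def main(output_dir=None):
    """Train, save and reload the model; nothing here runs on import."""
    output_dir = Path(output_dir or tempfile.mkdtemp())
    model = create_model([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

    model_path = output_dir / "example_model.pkl"
    with open(model_path, "wb") as file:
        pickle.dump(model, file)

    with open(model_path, "rb") as file:
        loaded_model = pickle.load(file)

    return loaded_model


if __name__ == "__main__":
    print(main().slope)
```

Importing this module defines `create_model` and `main` without training anything, while running it as a script executes the full save and load round trip.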

130 changes: 74 additions & 56 deletions examples/other_files/ML_AI_Plugin/mea_column_model_training_smtgenn.py
@@ -14,7 +14,10 @@
#################################################################################
import numpy as np
import pandas as pd
from smt.utils.neural_net.model import Model

# from smt.utils.neural_net.model import Model
from jenn.model import NeuralNet
from smt.surrogate_models import GENN
import pickle
from types import SimpleNamespace

@@ -30,93 +33,108 @@


# method to create model
def create_model(x_train, z_train, grad_train):

# already have X, Y and J, don't need to create and populate GENN() to
# load SMT data into Model(); GENN() doesn't support multiple outputs
def create_model(x_train, y_train):

# Model() does support multiple outputs, so we just need to reshape the
# arrays so that Model() can use them
# we have x_train = (n_m, n_x), z_train = (n_m, n_y) and grad_train = (n_y, n_m, n_x)
n_m, n_x = np.shape(x_train)
_, n_y = np.shape(z_train)

# check dimensions using grad_train
assert np.shape(grad_train) == (n_y, n_m, n_x)
_, n_y = np.shape(y_train)

# reshape arrays
X = np.reshape(x_train, (n_x, n_m))
Y = np.reshape(z_train, (n_y, n_m))
J = np.reshape(grad_train, (n_y, n_x, n_m))
X = np.reshape(x_train, (n_m, n_x))
Y = np.reshape(y_train, (n_m, n_y))

# set up and train model

# Train neural net
model = Model.initialize(
X.shape[0], Y.shape[0], deep=2, wide=6
) # 2 hidden layers with 6 neurons each
model.train(
X=X, # input data
Y=Y, # output data
J=J, # gradient data
num_iterations=25, # number of optimizer iterations per mini-batch
mini_batch_size=int(
np.floor(n_m / 5)
), # used to divide data into training batches (use for large data sets)
num_epochs=20, # number of passes through data
alpha=0.15, # learning rate that controls optimizer step size
beta1=0.99, # tuning parameter to control ADAM optimization
beta2=0.99, # tuning parameter to control ADAM optimization
lambd=0.1, # lambd = 0. = no regularization, lambd > 0 = regularization
gamma=0.0001, # gamma = 0. = no grad-enhancement, gamma > 0 = grad-enhancement
seed=None, # set to value for reproducibility
silent=True, # set to True to suppress training output
)

model.custom = SimpleNamespace(
idx = 0
best_SSE = [0, 1e100]
best_model = None
best_y_pred = None

runs = 20000 # reduce number of runs to reduce runtime
for i in range(runs):
idx += 1

model = GENN()

options = {
"hidden_layer_sizes": [6, 6], # 2 layer with 6 nodes each
"num_iterations": 200, # number of optimizer iterations per mini-batch
# "mini_batch_size": None, # used to divide data into training batches (use for large data sets)
"num_epochs": 1, # number of passes through data
"alpha": 0.500, # learning rate that controls optimizer step size
"beta1": 0.900, # tuning parameter to control ADAM optimization
"beta2": 0.990, # tuning parameter to control ADAM optimization
"lambd": 0.000, # lambd = 0. = no regularization, lambd > 0 = regularization
"gamma": 0.000, # gamma = 0. = no grad-enhancement, gamma > 0 = grad-enhancement
# "seed": None, # set to value for reproducibility
"is_print": False, # set to False to suppress training output
}

for key in options:
model.options[key] = options[key]

model.load_data(X, Y)
model.train()

y_pred = model.predict_values(X)
SSE = sum((y_pred - Y) ** 2)

# y0 is 2.1 orders of magnitude larger than y1, so adjust the SSE check
# CO2 capture rate and SRD should both be positive for all predictions
# SSE = [26344.00972484 65.81581468] from running 1000
# SSE = [23809.9512858 64.66887398] from running 20000
if (SSE[0] / 21 + SSE[1]) < (best_SSE[0] / 21 + best_SSE[1]) and np.all(
y_pred > 0
):
best_SSE = SSE
best_model = model
best_y_pred = y_pred

print(np.round(idx / runs * 100, 3), " % complete")

best_model.custom = SimpleNamespace(
input_labels=xlabels,
output_labels=zlabels,
output_labels=ylabels,
input_bounds=xdata_bounds,
output_bounds=zdata_bounds,
output_bounds=ydata_bounds,
normalized=False, # SMT GENN models are normalized during training, this should always be False
)

return model
return best_model, best_y_pred, best_SSE


# Main code

# import data
data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0) # ignore 1st col
grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0) # ignore 1st col

xdata = data.iloc[:, :6] # there are 6 input variables/columns
zdata = data.iloc[:, 6:] # the rest are output variables/columns
ydata = data.iloc[:, 6:] # the rest are output variables/columns
xlabels = xdata.columns.tolist() # set labels as a list (default) from pandas
zlabels = zdata.columns.tolist() # is a set of IndexedDataSeries objects
ylabels = ydata.columns.tolist() # is a set of IndexedDataSeries objects
xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata} # x bounds
zdata_bounds = {j: (zdata[j].min(), zdata[j].max()) for j in zdata} # z bounds
ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata} # y bounds

xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
zmax, zmin = zdata.max(axis=0), zdata.min(axis=0)
xdata, zdata = np.array(xdata), np.array(zdata) # (n_m, n_x) and (n_m, n_y)
gdata = np.stack([np.array(grad0_data), np.array(grad1_data)]) # (2, n_m, n_x)
ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
xdata, ydata = np.array(xdata), np.array(ydata) # (n_m, n_x) and (n_m, n_y)

model_data = np.concatenate(
(xdata, zdata), axis=1
(xdata, ydata), axis=1
) # Surrogate Modeling Toolbox requires a Numpy array as input

# define x and z data, not used but will add to variable dictionary
# define x and y data, not used but will add to variable dictionary
xdata = model_data[:, :-2]
zdata = model_data[:, -2:]
ydata = model_data[:, -2:]

# create model
model = create_model(x_train=xdata, z_train=zdata, grad_train=gdata)
model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata)

with open("mea_column_model_smt.pkl", "wb") as file:
with open("mea_column_model_smtgenn.pkl", "wb") as file:
pickle.dump(model, file)

# load model as pickle format
with open("mea_column_model_smt.pkl", "rb") as file:
with open("mea_column_model_smtgenn.pkl", "rb") as file:
loaded_model = pickle.load(file)


print(y_pred)
print("SSE = ", SSE)
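
The training loop is meant to accept a model only when every prediction is positive, so the placement of the comparison inside or outside `np.all` matters. A small sketch on made-up predictions shows the difference between the two spellings:

```python
import numpy as np

# toy predictions containing one negative value
y_pred = np.array([[5.0, -1.0], [3.0, 2.0]])

# np.all(y_pred) tests that every entry is nonzero, then compares True > 0
nonzero_check = np.all(y_pred) > 0

# np.all(y_pred > 0) tests that every entry is strictly positive
positivity_check = np.all(y_pred > 0)
```

Here `nonzero_check` is true even though a prediction is negative, while `positivity_check` is false, which is the behavior a positivity constraint calls for.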
Comment on lines 104 to +140
Contributor


See previous comment for using if __name__ == "__main__":, if possible.

3 changes: 2 additions & 1 deletion foqus_lib/conftest.py
@@ -166,7 +166,8 @@ def install_ml_ai_model_files(
ts_models_base_path / "mea_column_model_customnormform_json_weights.h5",
other_models_base_path / "mea_column_model_customnormform_pytorch.pt",
other_models_base_path / "mea_column_model_customnormform_scikitlearn.pkl",
other_models_base_path / "mea_column_model_smt.pkl",
other_models_base_path / "mea_column_model_smtgenn.pkl",
other_models_base_path / "mea_column_model_jenn.pkl",
]:
shutil.copy2(path, models_dir)
# unzip the zip file (could be generalized later to more files if needed)