Resolve SMT Method Changes / Support for JENN Models #1217

bpaul4 · 2024-04-10T22:57:08Z

Fixes/Addresses:

Resolves #1205 by updating SMT GENN syntax to the latest supported version, and by adding usage of JENN (a new dependency of SMT GENN) as well.

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

I agree my contributions are submitted under the copyright and license terms described in the LICENSE.md file at the top level of this directory.
I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

lbianchi-lbl

A minor question about the possibility of using if __name__ == "__main__": for scripts; otherwise, this looks good as far as I can tell.

lbianchi-lbl · 2024-04-23T23:41:22Z

examples/other_files/ML_AI_Plugin/mea_column_model_training_jenn.py

+# Main code
+
+# import data
+data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
+grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0)  # ignore 1st col
+grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0)  # ignore 1st col
+
+xdata = data.iloc[:, :6]  # there are 6 input variables/columns
+ydata = data.iloc[:, 6:]  # the rest are output variables/columns
+xlabels = xdata.columns.tolist()  # set labels as a list (default) from pandas
+zlabels = ydata.columns.tolist()  # is a set of IndexedDataSeries objects
+xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata}  # x bounds
+ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata}  # y bounds
+
+xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
+ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
+xdata, ydata = np.array(xdata), np.array(ydata)  # (n_m, n_x) and (n_m, n_y)
+gdata = np.stack([np.array(grad0_data), np.array(grad1_data)])  # (2, n_m, n_x)
+
+model_data = np.concatenate(
+    (xdata, ydata), axis=1
+)  # JENN requires a Numpy array as input
+
+# define x and y data, not used but will add to variable dictionary
+xdata = model_data[:, :-2]
+ydata = model_data[:, -2:]
+
+# create model
+model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata, grad_train=gdata)
+
+with open("mea_column_model_jenn.pkl", "wb") as file:
+    pickle.dump(model, file)
+
+# load model as pickle format
+with open("mea_column_model_jenn.pkl", "rb") as file:
+    loaded_model = pickle.load(file)
+
+
+print(y_pred)
+print("SSE = ", SSE)


Would it be possible to wrap this code in a main() function, so that it's not run on import?

def main():
    # move code here

if __name__ == "__main__":
    main()

Please disregard if the current structure is required (e.g. the code needs to be at the module scope for the dynamic module import to work correctly).

bpaul4 · 2024-04-24

That should be possible for the code that calls the training method, since the training examples are only used to generate the model files. The training methods that create the models may need to stay at the module level; Keras certainly needs this to register and load the model class object.

Some of the Keras example files are vehicles for the class objects. For example, mea_column_model_customnormform.py must be copied to the working directory during testing so that the plugin code can load the registered model class, and those non-training files don't have any "main" code to execute.

I can update that here, or in another PR.
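
A minimal sketch of the layout discussed in this thread, using only the standard library: the class that pickle must be able to find on load stays at module level, while the training, saving, and reloading steps move into main(). TinyModel, this create_model, the synthetic data, and the file name tiny_model.pkl are hypothetical stand-ins for illustration; they are not the FOQUS, SMT, JENN, or Keras code, and the Keras registration requirement mentioned above is not reproduced here.

```python
import pickle
import tempfile
from pathlib import Path


class TinyModel:
    """Hypothetical stand-in for a trained surrogate (not SMT/JENN/Keras).

    Defined at module level so pickle can locate the class when loading.
    """

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, x):
        return [self.slope * xi + self.intercept for xi in x]


def create_model(x_train, y_train):
    """Fit y = a*x + b by least squares; return model, predictions, and SSE."""
    n = len(x_train)
    x_mean = sum(x_train) / n
    y_mean = sum(y_train) / n
    sxx = sum((x - x_mean) ** 2 for x in x_train)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_train, y_train))
    slope = sxy / sxx
    model = TinyModel(slope, y_mean - slope * x_mean)
    y_pred = model.predict(x_train)
    sse = sum((yp - y) ** 2 for yp, y in zip(y_pred, y_train))
    return model, y_pred, sse


def main(out_dir):
    """Train, pickle, and reload the model; runs only when called."""
    x_train = [0.0, 1.0, 2.0, 3.0]
    y_train = [1.0, 3.0, 5.0, 7.0]  # synthetic data following y = 2x + 1
    model, y_pred, sse = create_model(x_train, y_train)

    model_path = Path(out_dir) / "tiny_model.pkl"  # hypothetical file name
    with open(model_path, "wb") as file:
        pickle.dump(model, file)
    with open(model_path, "rb") as file:
        loaded_model = pickle.load(file)
    return loaded_model, y_pred, sse


if __name__ == "__main__":
    # Executed when the file is run as a script, skipped on import.
    with tempfile.TemporaryDirectory() as tmp:
        loaded, y_pred, sse = main(tmp)
        print(y_pred)
        print("SSE = ", sse)
```

With this layout, importing the file only defines TinyModel, create_model, and main; nothing is trained or written to disk until the file is run directly or main() is called.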

lbianchi-lbl · 2024-04-23T23:42:01Z

examples/other_files/ML_AI_Plugin/mea_column_model_training_smtgenn.py

 # Main code

 # import data
 data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
-grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0)  # ignore 1st col
-grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0)  # ignore 1st col

 xdata = data.iloc[:, :6]  # there are 6 input variables/columns
-zdata = data.iloc[:, 6:]  # the rest are output variables/columns
+ydata = data.iloc[:, 6:]  # the rest are output variables/columns
 xlabels = xdata.columns.tolist()  # set labels as a list (default) from pandas
-zlabels = zdata.columns.tolist()  # is a set of IndexedDataSeries objects
+ylabels = ydata.columns.tolist()  # is a set of IndexedDataSeries objects
 xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata}  # x bounds
-zdata_bounds = {j: (zdata[j].min(), zdata[j].max()) for j in zdata}  # z bounds
+ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata}  # y bounds

 xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
-zmax, zmin = zdata.max(axis=0), zdata.min(axis=0)
-xdata, zdata = np.array(xdata), np.array(zdata)  # (n_m, n_x) and (n_m, n_y)
-gdata = np.stack([np.array(grad0_data), np.array(grad1_data)])  # (2, n_m, n_x)
+ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
+xdata, ydata = np.array(xdata), np.array(ydata)  # (n_m, n_x) and (n_m, n_y)

 model_data = np.concatenate(
-    (xdata, zdata), axis=1
+    (xdata, ydata), axis=1
 )  # Surrogate Modeling Toolbox requires a Numpy array as input

-# define x and z data, not used but will add to variable dictionary
+# define x and y data, not used but will add to variable dictionary
 xdata = model_data[:, :-2]
-zdata = model_data[:, -2:]
+ydata = model_data[:, -2:]

 # create model
-model = create_model(x_train=xdata, z_train=zdata, grad_train=gdata)
+model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata)

-with open("mea_column_model_smt.pkl", "wb") as file:
+with open("mea_column_model_smtgenn.pkl", "wb") as file:
    pickle.dump(model, file)

 # load model as pickle format
-with open("mea_column_model_smt.pkl", "rb") as file:
+with open("mea_column_model_smtgenn.pkl", "rb") as file:
    loaded_model = pickle.load(file)
+
+
+print(y_pred)
+print("SSE = ", SSE)


See previous comment for using if __name__ == "__main__":, if possible.

codecov · 2024-04-24T15:20:48Z

Codecov Report

Attention: Patch coverage is 64.10256%, with 14 lines in your changes missing coverage. Please review.

Project coverage is 38.72%. Comparing base (8da0631) to head (f17afdd).

Files	Patch %	Lines
foqus_lib/framework/graph/node.py	64.10%	10 Missing and 4 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1217      +/-   ##
==========================================
+ Coverage   38.54%   38.72%   +0.17%     
==========================================
  Files         164      164              
  Lines       37032    37067      +35     
  Branches     6132     6140       +8     
==========================================
+ Hits        14274    14354      +80     
+ Misses      21619    21569      -50     
- Partials     1139     1144       +5
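
The figures in this report can be cross-checked by hand. The sketch below assumes coverage is hits divided by total lines, with partial lines counted as not covered. The patch size of 39 lines is inferred from the reported 64.10256% and the 14 uncovered lines; it is not stated in the report. The displayed percentages match truncation to two decimals rather than rounding (rounding would give 38.55% for the base), which is an observation from these numbers, not a documented Codecov rule.

```python
from fractions import Fraction


def coverage(hits, lines):
    """Coverage as an exact fraction of covered lines."""
    return Fraction(hits, lines)


def truncated_percent(ratio, decimals=2):
    """Express a non-negative ratio as a percentage, truncated (not rounded)."""
    scale = 10 ** decimals
    return int(ratio * 100 * scale) / scale


# Counts copied from the coverage diff table above.
base = {"lines": 37032, "hits": 14274, "misses": 21619, "partials": 1139}
head = {"lines": 37067, "hits": 14354, "misses": 21569, "partials": 1144}

# In these numbers, every line is exactly one of hit, miss, or partial.
for row in (base, head):
    assert row["hits"] + row["misses"] + row["partials"] == row["lines"]

base_cov = coverage(base["hits"], base["lines"])
head_cov = coverage(head["hits"], head["lines"])

# Patch coverage: 10 missing + 4 partial lines are uncovered.
# ASSUMPTION: the patch total of 39 lines is inferred, not reported.
patch_lines = 39
patch_uncovered = 10 + 4
patch_cov = coverage(patch_lines - patch_uncovered, patch_lines)

print(truncated_percent(base_cov), truncated_percent(head_cov))
print(truncated_percent(head_cov - base_cov))
print(truncated_percent(patch_cov, decimals=5))
```

Using Fraction keeps the ratios exact, so the truncation step is not affected by floating-point error before the final division.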

bpaul4 added 4 commits April 10, 2024 15:49

update smt genn to new syntax, create example of jenn and retrain models

0cd0ac5

update code, tests and dependency list

90c88e2

update documentation

d80f62d

Merge branch 'fix-smt' of https://github.com/bpaul4/FOQUS into fix-smt

61eda40

bpaul4 self-assigned this Apr 10, 2024

run black

5cd98bc

bpaul4 requested a review from lbianchi-lbl April 10, 2024 23:05

ksbeattie added the Priority:High High Priority Issue or PR label Apr 16, 2024

lbianchi-lbl previously approved these changes Apr 23, 2024

Update copyright year in new file

f17afdd

bpaul4 dismissed lbianchi-lbl’s stale review via f17afdd April 24, 2024 14:49

lbianchi-lbl self-requested a review April 30, 2024 19:58

lbianchi-lbl approved these changes Apr 30, 2024

lbianchi-lbl merged commit 0b7f3c8 into CCSI-Toolset:master Apr 30, 2024
31 checks passed

bpaul4 deleted the fix-smt branch October 1, 2024 14:29
