Resolve SMT Method Changes / Support for JENN Models #1217
A minor question about the possibility of using `if __name__ == "__main__":` for scripts; otherwise looks good as far as I can tell.
# Main code

# import data
data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0)  # ignore 1st col
grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0)  # ignore 1st col

xdata = data.iloc[:, :6]  # there are 6 input variables/columns
ydata = data.iloc[:, 6:]  # the rest are output variables/columns
xlabels = xdata.columns.tolist()  # set labels as a list (default) from pandas
zlabels = ydata.columns.tolist()  # is a set of IndexedDataSeries objects
xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata}  # x bounds
ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata}  # y bounds

xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
xdata, ydata = np.array(xdata), np.array(ydata)  # (n_m, n_x) and (n_m, n_y)
gdata = np.stack([np.array(grad0_data), np.array(grad1_data)])  # (2, n_m, n_x)

model_data = np.concatenate(
    (xdata, ydata), axis=1
)  # JENN requires a Numpy array as input

# define x and y data, not used but will add to variable dictionary
xdata = model_data[:, :-2]
ydata = model_data[:, -2:]

# create model
model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata, grad_train=gdata)

with open("mea_column_model_jenn.pkl", "wb") as file:
    pickle.dump(model, file)

# load model as pickle format
with open("mea_column_model_jenn.pkl", "rb") as file:
    loaded_model = pickle.load(file)

print(y_pred)
print("SSE = ", SSE)
Would it be possible to wrap this code in a `main()` function, so that it's not run on import?

    def main():
        # move code here

    if __name__ == "__main__":
        main()

Please disregard if the current structure is required (e.g. the code needs to be at the module scope for the dynamic module import to work correctly).
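For illustration, here is a minimal sketch of what that restructuring could look like for a training script of this shape. It is not the PR's code: `create_model` below is a hypothetical stand-in that returns a trivial dictionary so the sketch runs without SMT or JENN installed, and the data and output path are placeholders.

```python
import pickle
import tempfile
from pathlib import Path


def create_model(x_train, y_train):
    """Hypothetical stand-in for the example's training function.

    Returns a trivial "model" (the mean of each output column) so that
    the sketch runs without SMT or JENN installed.
    """
    n_outputs = len(y_train[0])
    means = [sum(row[j] for row in y_train) / len(y_train) for j in range(n_outputs)]
    return {"mean": means}


def main(output_path):
    # Everything that previously ran at import time now lives here.
    x_train = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]  # placeholder inputs
    y_train = [[1.0], [2.0], [3.0]]  # placeholder outputs
    model = create_model(x_train, y_train)

    # Save the model in pickle format.
    with open(output_path, "wb") as file:
        pickle.dump(model, file)

    # Load it back to confirm the file is readable.
    with open(output_path, "rb") as file:
        return pickle.load(file)


if __name__ == "__main__":
    # Executed only when the file is run as a script, not when it is imported.
    with tempfile.TemporaryDirectory() as tmp_dir:
        print(main(Path(tmp_dir) / "model.pkl"))
```

With this layout, importing the module only defines `create_model` and `main`; no data is read and no file is written until `main()` is called.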
That should be possible for the code that calls the training method, as all of the training examples are only used to generate the model files. The training methods that create the models may need to be on the module level - Keras for sure needs this to register and load the model class object.
Some of the Keras example files are vehicles for the class objects. For example, `mea_column_model_customnormform.py` must be copied to the working directory during testing so that the plugin code can load the registered model class, and those non-training files don't have any "main" code to execute.
I can update that here, or in another PR.
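The module-level constraint can be illustrated with a small sketch. Note that it uses `pickle` (which these example scripts also use to save models) rather than Keras: the Keras registration mechanism itself is not shown here, and the class and function names below are hypothetical. The point is only the general Python behaviour that an object's class must be reachable by name at module scope to be serialized and restored.

```python
import pickle


class ModuleLevelModel:
    """Defined at module scope, so it can be looked up again by name."""

    def __init__(self, weights):
        self.weights = weights


def make_local_model(weights):
    # A class defined inside a function has no importable module-level name.
    class LocalModel:
        def __init__(self, weights):
            self.weights = weights

    return LocalModel(weights)


def can_round_trip(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        restored = pickle.loads(pickle.dumps(obj))
    except (AttributeError, pickle.PicklingError):
        # Raised when the object's class cannot be located by name.
        return False
    return restored.weights == obj.weights


if __name__ == "__main__":
    print(can_round_trip(ModuleLevelModel([1.0, 2.0])))
    print(can_round_trip(make_local_model([1.0, 2.0])))
```

When run as a script, the module-level class is expected to round-trip and the function-local one is not. This is why class and training-method definitions can stay at module scope while only the code that calls them moves under the `__main__` guard.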
# Main code

# import data
data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")
grad0_data = pd.read_csv(r"gradients_output0.csv", index_col=0)  # ignore 1st col
grad1_data = pd.read_csv(r"gradients_output1.csv", index_col=0)  # ignore 1st col

xdata = data.iloc[:, :6]  # there are 6 input variables/columns
zdata = data.iloc[:, 6:]  # the rest are output variables/columns
ydata = data.iloc[:, 6:]  # the rest are output variables/columns
xlabels = xdata.columns.tolist()  # set labels as a list (default) from pandas
zlabels = zdata.columns.tolist()  # is a set of IndexedDataSeries objects
ylabels = ydata.columns.tolist()  # is a set of IndexedDataSeries objects
xdata_bounds = {i: (xdata[i].min(), xdata[i].max()) for i in xdata}  # x bounds
zdata_bounds = {j: (zdata[j].min(), zdata[j].max()) for j in zdata}  # z bounds
ydata_bounds = {j: (ydata[j].min(), ydata[j].max()) for j in ydata}  # y bounds

xmax, xmin = xdata.max(axis=0), xdata.min(axis=0)
zmax, zmin = zdata.max(axis=0), zdata.min(axis=0)
xdata, zdata = np.array(xdata), np.array(zdata)  # (n_m, n_x) and (n_m, n_y)
gdata = np.stack([np.array(grad0_data), np.array(grad1_data)])  # (2, n_m, n_x)
ymax, ymin = ydata.max(axis=0), ydata.min(axis=0)
xdata, ydata = np.array(xdata), np.array(ydata)  # (n_m, n_x) and (n_m, n_y)

model_data = np.concatenate(
    (xdata, zdata), axis=1
    (xdata, ydata), axis=1
)  # Surrogate Modeling Toolbox requires a Numpy array as input

# define x and z data, not used but will add to variable dictionary
# define x and y data, not used but will add to variable dictionary
xdata = model_data[:, :-2]
zdata = model_data[:, -2:]
ydata = model_data[:, -2:]

# create model
model = create_model(x_train=xdata, z_train=zdata, grad_train=gdata)
model, y_pred, SSE = create_model(x_train=xdata, y_train=ydata)

with open("mea_column_model_smt.pkl", "wb") as file:
with open("mea_column_model_smtgenn.pkl", "wb") as file:
    pickle.dump(model, file)

# load model as pickle format
with open("mea_column_model_smt.pkl", "rb") as file:
with open("mea_column_model_smtgenn.pkl", "rb") as file:
    loaded_model = pickle.load(file)

print(y_pred)
print("SSE = ", SSE)
See previous comment for using `if __name__ == "__main__":`, if possible.
Codecov Report
Attention: Patch coverage is

@@            Coverage Diff             @@
##           master    #1217      +/-   ##
==========================================
+ Coverage   38.54%   38.72%   +0.17%
==========================================
  Files         164      164
  Lines       37032    37067      +35
  Branches     6132     6140       +8
==========================================
+ Hits        14274    14354      +80
+ Misses      21619    21569      -50
- Partials     1139     1144       +5

View full report in Codecov by Sentry.
Fixes/Addresses:
Resolves #1205 by updating SMT GENN syntax to the latest supported version, and by adding usage of JENN (a new dependency of SMT GENN) as well.
Legal Acknowledgement
By contributing to this software project, I agree to the following terms and conditions for my contribution: