Training a Gaussian Process model in MOOSE #26267
Hello! I am trying to use the stochastic tools module to train a Gaussian process model, but I am having issues with the hyperparameters. My basic setup is this: I load my training data from CSV files (using CSVSampler for X and VectorPostprocessor for Y) and specify the "Signal Variance", "Noise Variance", and "Length Factor" for a Matern covariance function with p = 1 (nu = 3/2). I also run the same model (same inputs, hyperparameters, and covariance function) in Python using sklearn as a reference point.

I can get a good, general model working in Python, but when I use the same hyperparameters in MOOSE and evaluate the model on my testing data, it only predicts a vector of ~0's (the values are ~4*10^-301, which is effectively zero). I have gone into both codes and checked the hyperparameters actually used for these evaluations, and they match. I believe the issue has to do with how I am training the model, but I am not sure where.

One thing that may be wrong is the length factor hyperparameter. MOOSE requires an input for each feature, but since I have 102 features, I added a few lines that take the input length factor and copy it into a 1x102 vector, which is then used as the input. I am not sure whether this is causing an unforeseen consequence somewhere, so I thought I'd mention it.

I have also tried the 'adam' tuning algorithm, but it does not give any better results. (I have tried batch sizes from 1-50, iterations from 10-1000, and learning rates from 0.001-10.) Here is the training input file:
To make my life simpler, I already standardized my X, which is why those options are false. I also run this through a shell script that passes in my hyperparameters, so you can ignore the values at the top of the file; they are just there to initialize the variables.

Any advice on this? I have tried a large range of hyperparameters for the MOOSE GP model, but nothing has given results better than a vector of zeros (my actual Y's are in the ~2000 range). I am pretty stuck, so I am willing to try just about anything. Thanks! Charlie
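Since the Y values here are around 2000 while a GP with a zero prior mean falls back to zero away from the training data, it may help to see what standardizing Y does. The following is a minimal sketch in plain Python, not MOOSE's implementation; the helper names `standardize` and `destandardize` and the sample values are made up for illustration, and the use of the population standard deviation is an assumption.

```python
import statistics


def standardize(values):
    """Return z-scored values plus the mean and std needed to undo the scaling."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; an assumption for this sketch
    return [(v - mean) / std for v in values], mean, std


def destandardize(z_values, mean, std):
    """Map standardized values back to the original units."""
    return [z * std + mean for z in z_values]


# Made-up training outputs in the ~2000 range (illustrative only).
y_train = [1950.0, 2010.0, 2075.0, 1988.0, 2032.0]
z_train, y_mean, y_std = standardize(y_train)

# A zero-mean GP that has nothing to go on predicts 0 in the space it was
# trained in. In standardized space that 0 maps back to the mean of Y.
prior_prediction_z = [0.0]
prior_prediction_y = destandardize(prior_prediction_z, y_mean, y_std)

# Round trip: standardized training data maps back to the original values.
y_roundtrip = destandardize(z_train, y_mean, y_std)
print(prior_prediction_y, y_roundtrip)
```

With standardized Y, the model's fallback prediction lands near the data mean instead of near zero, and any predictions made in standardized space have to be mapped back to the original units before comparing them with the raw Y values.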
Replies: 2 comments 8 replies
Hi Som! To alter the length factor, I changed the code in MaternHalfIntCovariance.C to be:
(Before, it just took _length_factor to be the input value.) Is this done correctly? In the evaluation stage, I print all the hyperparameters that are loaded, and this one appears to be loaded correctly as a vector of 102 identical values. My feature count is high, but we plan to keep it at this scale, and we do have a large number of training points (~2000). I also tried a much smaller dataset (8 features, ~1000 points), and it still predicted zeros, so the error persists. The output of both is in the form gp_surrogate, gp_surrogate_std and really just repeats like that for all points. So for some reason, it is only predicting zeros with zero standard deviation. I assume it has to do with the code I added (as this is the only difference between my code and the original), but I am not sure where the error would arise. Thanks! Charlie
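For reference, the textbook Matern nu = 3/2 kernel with one length factor per feature can be sketched in plain Python. This is not the code from MaternHalfIntCovariance.C, and it is only assumed to match MOOSE's p = 1 form; the function names and sample points are made up for illustration. The sketch suggests two things: replicating a scalar length factor across all features is mathematically the same as an isotropic kernel, and when the scaled distance between points is very large, the covariance underflows to zero, in which case a zero-mean GP predicts values near zero. Whether that is what happens in this case is not established here.

```python
import math


def matern32(x1, x2, length_factors, signal_variance=1.0):
    """Matern nu=3/2 covariance: sf2 * (1 + sqrt(3) r) * exp(-sqrt(3) r),
    where r is the distance after dividing each feature by its length factor."""
    r = math.sqrt(sum(((a - b) / l) ** 2
                      for a, b, l in zip(x1, x2, length_factors)))
    s = math.sqrt(3.0) * r
    return signal_variance * (1.0 + s) * math.exp(-s)


def matern32_isotropic(x1, x2, length_factor, signal_variance=1.0):
    """Same kernel with a single scalar length factor for all features."""
    s = math.sqrt(3.0) * math.dist(x1, x2) / length_factor
    return signal_variance * (1.0 + s) * math.exp(-s)


n_features = 102
length_factors = [1.0] * n_features  # scalar length factor copied per feature

origin = [0.0] * n_features
near = [0.1] * n_features   # every feature differs by 0.1 (illustrative)
far = [50.0] * n_features   # every feature differs by 50, e.g. unscaled inputs

k_same = matern32(origin, origin, length_factors)
k_near = matern32(origin, near, length_factors)
k_far = matern32(origin, far, length_factors)
k_near_iso = matern32_isotropic(origin, near, 1.0)

print(k_same, k_near, k_near_iso, k_far)
```

If the covariance between every test point and every training point looks like `k_far`, the predictive mean of a zero-mean GP collapses toward zero regardless of the training outputs, so printing a few kernel values between test and training points could be a quick diagnostic.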
I was on personal leave and hence my responses have been delayed so far.
For the Matern Half Int covariance, p should be a positive integer. See the documentation here: https://mooseframework.inl.gov/source/surrogates/MaternHalfIntCovariance.html
As far as the zero predictions go, this typically relates to a problem with your training data. Some problems could be:
standardize_params = 'true' and standardize_data = 'true'
Standardize the data and params and see how it goes. Also, when doing predicti…