
Error when using loaded trained model #244

Open
mirdones opened this issue Nov 12, 2019 · 5 comments

Comments

@mirdones

I am training and saving a PPO agent as in https://github.com/tensorflow/agents/blob/master/tf_agents/agents/ppo/examples/v2/train_eval.py

The code used is:

    if global_step_val % policy_checkpoint_interval == 0:
        policy_checkpointer.save(global_step=global_step_val)
        saved_model_path = os.path.join(
            saved_model_dir, 'policy_' + ('%d' % global_step_val).zfill(9))
        if not os.path.exists(saved_model_path):
            os.makedirs(saved_model_path)
        saved_model.save(saved_model_path)

After training I am trying to load and use the model as in:

policy = tf.saved_model.load(model_path)
timestep = eval_env.reset()
action = policy.action(timestep)

I am getting the following error:

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * TimeStep(step_type=<tf.Tensor 'time_step:0' shape=() dtype=int32>, reward=<tf.Tensor 'time_step_1:0' shape=() dtype=float32>, discount=<tf.Tensor 'time_step_2:0' shape=() dtype=float32>, observation=<tf.Tensor 'time_step_3:0' shape=(520,) dtype=float32>)
    * ()
    * None
  Keyword arguments: {}

Expected these arguments to match one of the following 2 option(s):

Option 1:
  Positional arguments (3 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='discount'), observation=TensorSpec(shape=(None, 520), dtype=tf.float32, name='observation'))
    * ([TensorSpec(shape=(None, 2048), dtype=tf.float32, name='0/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='0/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='1/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='1/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='2/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='2/1')])
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='time_step/step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/discount'), observation=TensorSpec(shape=(None, 520), dtype=tf.float32, name='time_step/observation'))
    * ([TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/0/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/0/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/1/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/1/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/2/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/2/1')])
    * None
  Keyword arguments: {}
@sguada
Member

sguada commented Nov 12, 2019

Policies expect batched time_steps, e.g. TimeStep(step_type=TensorSpec(shape=(None,), ...).

So make sure to wrap the eval_env with BatchedPyEnvironment or with TFPyEnvironment.
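The shape mismatch can also be seen without tf_agents at all. Below is a minimal pure-Python mock (the `TimeStep` namedtuple and `batch_time_step` helper are illustrative stand-ins, not the real tf_agents API): a spec with shape `(None,)` expects a leading batch axis, so every field of an unbatched time step must gain an outer dimension of size 1, which is what the environment wrappers do for you.

```python
from collections import namedtuple

# Illustrative stand-in for tf_agents' TimeStep (hypothetical mock,
# not the real class).
TimeStep = namedtuple("TimeStep", ["step_type", "reward", "discount", "observation"])

def batch_time_step(ts):
    """Add a leading batch dimension of size 1 to every field,
    mimicking what BatchedPyEnvironment / TFPyEnvironment do."""
    return TimeStep(
        step_type=[ts.step_type],      # shape () -> (1,)
        reward=[ts.reward],            # shape () -> (1,)
        discount=[ts.discount],        # shape () -> (1,)
        observation=[ts.observation],  # shape (520,) -> (1, 520)
    )

unbatched = TimeStep(step_type=0, reward=0.0, discount=1.0,
                     observation=[0.0] * 520)
batched = batch_time_step(unbatched)
print(len(batched.observation), len(batched.observation[0]))  # 1 520
```

This is exactly the difference between the `shape=()` tensors in the error and the `shape=(None,)` specs the SavedModel advertises.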

@mirdones
Author

I added the line:

eval_env = BatchedPyEnvironment((eval_env,))

and got the error:

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * TimeStep(step_type=<tf.Tensor 'time_step:0' shape=(1,) dtype=int32>, reward=<tf.Tensor 'time_step_1:0' shape=(1,) dtype=float32>, discount=<tf.Tensor 'time_step_2:0' shape=(1,) dtype=float32>, observation=<tf.Tensor 'time_step_3:0' shape=(1, 520) dtype=float32>)
    * ()
    * None
  Keyword arguments: {}

Expected these arguments to match one of the following 2 option(s):

Option 1:
  Positional arguments (3 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='discount'), observation=TensorSpec(shape=(None, 520), dtype=tf.float32, name='observation'))
    * ([TensorSpec(shape=(None, 2048), dtype=tf.float32, name='0/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='0/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='1/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='1/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='2/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='2/1')])
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='time_step/step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/discount'), observation=TensorSpec(shape=(None, 520), dtype=tf.float32, name='time_step/observation'))
    * ([TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/0/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/0/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/1/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/1/1')], [TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/2/0'), TensorSpec(shape=(None, 2048), dtype=tf.float32, name='policy_state/2/1')])
    * None
  Keyword arguments: {}

But my problem persists, as I intend to run the trained agent on inputs from the real world, not from the simulated environment I use to speed up training.
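Note that in the error above the time_step is now batched (shape `(1,)`), but the second positional argument is still `()` while the SavedModel expects LSTM state tensors of shape `(None, 2048)`: the policy was built from an RNN actor, so it needs a policy_state. If I recall the tf_agents PolicySaver interface correctly, the fix is `state = policy.get_initial_state(batch_size=1)` followed by `policy.action(time_step, state)`. The sketch below mocks that interface in pure Python (`MockRNNPolicy` is hypothetical; the sizes mirror the specs in the error message):

```python
# Hypothetical mock of a loaded RNN policy's interface; not the real
# tf_agents SavedModel object, just an illustration of the contract.
class MockRNNPolicy:
    STATE_SIZE = 2048   # width of each LSTM state vector, per the error
    NUM_LAYERS = 3      # three [h, c] pairs appear in the expected specs

    def get_initial_state(self, batch_size):
        # One [h, c] pair of zero tensors per layer, shape (batch, 2048),
        # matching the (None, 2048) state specs in the error message.
        def zeros():
            return [[0.0] * self.STATE_SIZE for _ in range(batch_size)]
        return tuple([zeros(), zeros()] for _ in range(self.NUM_LAYERS))

    def action(self, time_step, policy_state=()):
        # The loaded concrete function rejects an empty () state
        # because it does not match the RNN state spec.
        if policy_state == ():
            raise ValueError("Could not find matching function to call: "
                             "policy_state () does not match the RNN state spec")
        return 0, policy_state  # dummy action and next state

policy = MockRNNPolicy()
state = policy.get_initial_state(batch_size=1)  # instead of the default ()
action, state = policy.action("batched_time_step", state)
```

With a feedforward policy the state really is `()`, which is why the error only appears for RNN-based agents.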

@sibyjackgrove

I am getting a similar error when I try to load a model that uses an RNN, but not when I use the feedforward model (#246).

@jtermaat

jtermaat commented Apr 8, 2021

I am getting a similar issue. Does anyone know of a workaround for saving and loading a policy with tf_agents?

@username123062

When I tried to convert a TF 2.8 pb file to TF 1.15, I got the same error. Could you help me, please? Here is the code:

=========================code start===============================

import os  # tf22_saved_model_to_tf115_pb.py
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import pdb

TF2_model_input = 'CLIP_RN50'  # the input path
TF115_model_output = 'dir_for_saving_TF1.15_pb'
#os.makedirs(TF115_model_output, exist_ok=True)

model = tf.keras.models.load_model(TF2_model_input)  # read the TF 2.8 model checkpoint

full_model = tf.function(lambda x, y: model(x, y))  # wrap the model as a concrete function
full_model = full_model.get_concrete_function(
    tf.TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='input/0'),
    tf.TensorSpec(shape=(None, None, None), dtype=tf.int64, name='input/1'))
# Note: if the input information is not in the inherited tf.keras.Model, the
# concrete function needs the input information defined via TensorSpec.

frozen_func = convert_variables_to_constants_v2(full_model)  # fold the variables into constants

frozen_func.graph.as_graph_def()  # get the graph def of the frozen graph

# Debug: check the operations and their names (needed in TF 1.15).
layers = [op.name for op in frozen_func.graph.get_operations()]
for layer in layers:
    print(layer)
print("Frozen model inputs: ", frozen_func.inputs)
print("Frozen model outputs: ", frozen_func.outputs)

tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir=TF115_model_output,  # was the undefined `frozen_out_path`
                  name="tf1.15_frozen_graph_model.pb",
                  as_text=False)  # save the frozen graph for TF 1.15

=========================code end===============================

=========================Error start ================================
WARNING:tensorflow:No training configuration found in save file, so the model was not compiled. Compile it manually.
Traceback (most recent call last):
File "convert_tf2pb_to_tf115pb.py", line 19, in
full_model = full_model.get_concrete_function(tf.TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='input/0'),tf.TensorSpec(shape=(None, None, None), dtype=tf.int64, name='input/1'))
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1264, in get_concrete_function
concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1244, in _get_concrete_function_garbage_collected
self._initialize(args, kwargs, add_initializers_to=initializers)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 786, in _initialize
*args, **kwds))
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2983, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3292, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3140, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1161, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
out = weak_wrapped_fn().wrapped(*args, **kwds)
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

File "convert_tf2pb_to_tf115pb.py", line 18, in None  *
    lambda serving_default_image,serving_default_text: model(serving_default_image,serving_default_text))
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
    raise e.with_traceback(filtered_tb) from None
File "/218019043/software/anaconda3/envs/dassl/lib/python3.7/site-packages/keras/saving/saved_model/utils.py", line 166, in replace_training_and_call
    return wrapped_call(*args, **kwargs)

ValueError: Exception encountered when calling layer "clip" (type CLIP).

Could not find matching concrete function to call loaded from the SavedModel. Got:
  Positional arguments (2 total):
    * Tensor("input:0", shape=(None, None, None, 3), dtype=float32)
    * True
  Keyword arguments: {}

 Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (2 total):
    * (TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='input/0'), TensorSpec(shape=(None, None, None), dtype=tf.int64, name='input/1'))
    * False
  Keyword arguments: {}

Option 2:
  Positional arguments (2 total):
    * (TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='input/0'), TensorSpec(shape=(None, None, None), dtype=tf.int64, name='input/1'))
    * True
  Keyword arguments: {}

Option 3:
  Positional arguments (2 total):
    * (TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='image'), TensorSpec(shape=(None, None, None), dtype=tf.int64, name='text'))
    * False
  Keyword arguments: {}

Option 4:
  Positional arguments (2 total):
    * (TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='image'), TensorSpec(shape=(None, None, None), dtype=tf.int64, name='text'))
    * True
  Keyword arguments: {}

Call arguments received:
  • args=('tf.Tensor(shape=(None, None, None, 3), dtype=float32)', 'tf.Tensor(shape=(None, None, None), dtype=int64)')
  • kwargs=<class 'inspect._empty'>

=========================Error end ================================
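Reading the four options in the error: each expects ONE positional input that is a `(image, text)` tuple, plus a boolean training flag, while `tf.function(lambda x, y: model(x, y))` passes image and text as two separate positional arguments (so the text tensor lands in the training slot). A likely fix is `tf.function(lambda x: model(x, training=False))` with a single tuple of TensorSpecs. The stdlib sketch below illustrates the arity mismatch (`saved_call` is a hypothetical stand-in for the loaded CLIP layer's call signature, not the real Keras API):

```python
# Hypothetical stand-in for the SavedModel's concrete function: it takes
# ONE positional input that is an (image, text) tuple, plus a training flag.
def saved_call(inputs, training=False):
    image, text = inputs  # expects a single 2-tuple argument
    return ("ok", training)

# Mirrors `tf.function(lambda x, y: model(x, y))` -- wrong arity: the
# text tensor is passed where the training flag belongs.
bad = lambda x, y: saved_call(x, y)

# Mirrors `tf.function(lambda x: model(x, training=False))` -- matches
# the saved signature: one tuple input, explicit training flag.
good = lambda x: saved_call(x, training=False)

result = good((("img",), ("txt",)))  # -> ("ok", False)
```

Calling `bad(("img",), ("txt",))` fails when `saved_call` tries to unpack a single-element tuple into image and text, which is the same class of mismatch the traceback reports.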
