
ValueError: actor_network output spec does not match action spec #548

Closed
Fabien-Couthouis opened this issue Feb 2, 2021 · 4 comments

Fabien-Couthouis commented Feb 2, 2021

Hello,
I am trying to train a PPO agent with the default actor_distribution_network but I get this error:

Traceback (most recent call last):
  File "run_shake_training_ppo.py", line 285, in <module>
    run_training()
  File "run_shake_training_ppo.py", line 83, in run_training
    agent = load_agent(train_env)
  File "run_shake_training_ppo.py", line 124, in load_agent
    agent = PPOClipAgent(
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1069, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1046, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tf_agents\agents\ppo\ppo_clip_agent.py", line 199, in __init__
    super(PPOClipAgent, self).__init__(
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1069, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1046, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tf_agents\agents\ppo\ppo_agent.py", line 346, in __init__
    ppo_policy.PPOPolicy(
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1069, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\gin\config.py", line 1046, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tf_agents\agents\ppo\ppo_policy.py", line 116, in __init__
    distribution_utils.assert_specs_are_compatible(
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tf_agents\distributions\utils.py", line 633, in assert_specs_are_compatible
    tf.nest.map_structure(compare_output_to_spec, event_spec, spec)
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tensorflow\python\util\nest.py", line 635, in map_structure
    structure[0], [func(*x) for x in entries],
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tensorflow\python\util\nest.py", line 635, in <listcomp>
    structure[0], [func(*x) for x in entries],
  File "C:\Users\username\Miniconda3\envs\tf-agents\lib\site-packages\tf_agents\distributions\utils.py", line 630, in compare_output_to_spec
    raise ValueError("{}:\n{}\nvs.\n{}".format(message_prefix, event_spec,
ValueError: actor_network output spec does not match action spec:
TensorSpec(shape=(), dtype=tf.int64, name=None)
vs.
BoundedTensorSpec(shape=(1,), dtype=tf.int64, name='action', minimum=array(0, dtype=int64), maximum=array(126, dtype=int64))

In my Python env, I have: self._action_spec = BoundedArraySpec((1,), dtype=np.int64, name="action", minimum=0, maximum=126). Then I wrap my Python env in a TFPyEnvironment.

Note that training is working when I comment the following lines in tf_agents.agents.ppo.ppo_policy.py:

distribution_utils.assert_specs_are_compatible(
            actor_output_spec, action_spec,
            'actor_network output spec does not match action spec')

Does anyone have an idea to fix the error above?
Thanks!

Versions:

  • OS: Windows 10 version 2004 (build 19041.746)
  • Python 3.8.5
  • tf_agents: 0.7.1
  • tensorflow: 2.4.1
  • tensorflow-probability: 0.11.1
  • numpy: 1.20.0

egonina commented Feb 4, 2021

Hi, sorry you are having issues running PPO training. When you say "training is working", do you mean it trains to the expected value? Can you dig into the network/action spec mismatch a bit more to see why the output layer spec doesn't match the action_spec? It looks like the shapes are mismatched: () vs. (1,).
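To see why the assertion trips, it can help to mimic the comparison in plain Python. The `Spec` class below is a mock stand-in, not the real `TensorSpec` or tf_agents API; it only reproduces the shape comparison that the traceback shows failing, e.g. () vs (1,):

```python
# Simplified stand-in for tf_agents' assert_specs_are_compatible.
# "Spec" is a mock, not the real TensorSpec; only shapes are compared here.
from dataclasses import dataclass

@dataclass
class Spec:
    shape: tuple  # e.g. () for a scalar action, (1,) for a length-1 vector

def assert_specs_compatible(event_spec, action_spec, message_prefix):
    # tf_agents compares the distribution's event spec against the env's
    # action spec and raises ValueError on any mismatch.
    if event_spec.shape != action_spec.shape:
        raise ValueError(f"{message_prefix}:\n{event_spec}\nvs.\n{action_spec}")

# The failing case from the traceback: event shape () vs action spec (1,).
network_output = Spec(shape=())   # event shape of a scalar Categorical
env_action = Spec(shape=(1,))     # BoundedTensorSpec(shape=(1,), ...)

try:
    assert_specs_compatible(network_output, env_action,
                            "actor_network output spec does not match action spec")
except ValueError as e:
    print(e)
```

Running this prints the same kind of `()` vs `(1,)` message as the original traceback.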


Fabien-Couthouis commented Feb 5, 2021

Hi and thanks for helping!

First, my action specs defined in my env are: self._action_spec = BoundedArraySpec((1,), dtype=np.int64, name="action", minimum=0, maximum=126).

When I say "training is working", I mean training starts without error, and the logits output by the actor model has the correct shape (i.e. <tf.Tensor: shape=(1, 1, 127), dtype=float32>).
However, the event_shape of the same output action is TensorShape([]), and it seems this event_shape is compared to my action shape in distributions.utils.assert_specs_are_compatible, which produces the error:

nest_utils.assert_same_structure(
        event_spec,
        spec,
        message=("{}:\n{}\nvs.\n{}".format(message_prefix, event_spec, spec)))

event_spec: dtype:tf.int64 name:None shape:TensorShape([])

spec: dtype:tf.int64 maximum:array(126, dtype=int64) minimum:array(0, dtype=int64) name:'action' shape:TensorShape([1])

What is this event_spec, and why are we comparing the action shape to the model output shape, given that the model outputs logits that are converted into actions in the PPO policy? Is the event_shape the shape of the actions taken by the policy? If so, where is this event_spec computed?

I am pretty new to tf-agents so I am probably wrong somewhere.

Edit: It was my bad. I solved the issue by changing my action spec from self._action_spec = BoundedArraySpec(shape=(1,), dtype=np.int64, name="action", minimum=0, maximum=126) to self._action_spec = BoundedArraySpec(shape=(), dtype=np.int64, name="action", minimum=0, maximum=126), since each action is a single integer.
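For reference, a minimal sketch of the before/after shapes, using plain tuples instead of the real BoundedArraySpec so it runs without TensorFlow: a Categorical distribution over 127 logits samples a scalar, so its event shape is (), and the action spec must be scalar too.

```python
# Sketch of the fix: shapes only, plain tuples instead of BoundedArraySpec.
# A Categorical over 127 classes samples a single integer, so its event
# shape is (), not (1,).
num_actions = 127  # minimum=0, maximum=126 -> 127 discrete actions

broken_spec_shape = (1,)  # BoundedArraySpec(shape=(1,), ...) -> mismatch
fixed_spec_shape = ()     # BoundedArraySpec(shape=(), ...)   -> matches

categorical_event_shape = ()  # event shape of one Categorical sample

assert fixed_spec_shape == categorical_event_shape
assert broken_spec_shape != categorical_event_shape
print("scalar action spec matches the Categorical event shape")
```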

@cedavidyang

I'm having the same issue. In my case, each action is not a single integer; instead, it is represented by an array with two entries: action_spec = BoundedArraySpec(shape=(2,), dtype=np.int32, name="action", minimum=0, maximum=4). Any suggestion for this use case?
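I can't speak to exactly how tf_agents handles a (2,) spec (see the follow-up in #656), but the shape bookkeeping the assertion performs can be sketched without TF installed. Everything below uses plain tuples, not the real API: for the check to pass, the distribution's event shape must equal the action spec's shape, and one common workaround is splitting the action into a tuple of scalar sub-specs, each matched by its own scalar distribution.

```python
# Shape bookkeeping only; plain tuples, no tf_agents required.
# The assertion passes when the event shape equals the spec shape
# component-by-component.

action_spec_shape = (2,)  # BoundedArraySpec(shape=(2,), dtype=np.int32, ...)

# Option 1: the network's output distribution must have event shape (2,)
# to match the vector spec directly.
matching_event_shape = (2,)
assert matching_event_shape == action_spec_shape

# Option 2 (workaround sketched here): split the action into a tuple of
# scalar sub-specs, so each component compares against a scalar
# distribution with event shape ().
split_spec_shapes = ((), ())   # two scalar sub-actions
split_event_shapes = ((), ())  # one scalar Categorical each
assert split_spec_shapes == split_event_shapes
print("each scalar sub-spec matches a scalar event shape")
```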

@summer-yue

@cedavidyang I responded in #656; let's follow up there.
