
[RLlib] Getting Started first example executed with TF2 is followed by an error #45821

Open
Deonixlive opened this issue Jun 8, 2024 · 2 comments
Labels
bug Something that is supposed to be working; but isn't P2 Important issue, but not time-critical rllib RLlib related issues

Comments

Deonixlive commented Jun 8, 2024

What happened + What you expected to happen

I tried the Getting Started commands at https://docs.ray.io/en/latest/rllib/rllib-training.html
after installing with pip install tensorflow[and-cuda] followed by pip install "ray[rllib]".

Then I ran the example: rllib train --algo DQN --env CartPole-v1 --framework tf2 --stop '{"training_iteration": 30}'
This fails with a ValueError instead of saving a checkpoint with a trained model.

(DQN pid=210953) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::DQN.__init__() (pid=210953, ip=192.168.1.207, actor_id=94093f51e79110d273a302e501000000, repr=DQN)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 554, in __init__
(DQN pid=210953) super().__init__(
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/tune/trainable/trainable.py", line 158, in __init__
(DQN pid=210953) self.setup(copy.deepcopy(self.config))
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 640, in setup
(DQN pid=210953) self.workers = EnvRunnerGroup(
(DQN pid=210953) ^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 169, in __init__
(DQN pid=210953) self._setup(
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 260, in _setup
(DQN pid=210953) self._local_worker = self._make_worker(
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/env/env_runner_group.py", line 1108, in _make_worker
(DQN pid=210953) worker = cls(
(DQN pid=210953) ^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 532, in __init__
(DQN pid=210953) self._update_policy_map(policy_dict=self.policy_dict)
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1737, in _update_policy_map
(DQN pid=210953) self._build_policy_map(
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1848, in _build_policy_map
(DQN pid=210953) new_policy = create_policy_for_framework(
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
(DQN pid=210953) return policy_class(observation_space, action_space, merged_config)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
(DQN pid=210953) super(TracedEagerPolicy, self).__init__(*args, **kwargs)
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/policy/eager_tf_policy.py", line 429, in __init__
(DQN pid=210953) self.model = make_model(self, observation_space, action_space, config)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/dqn_tf_policy.py", line 181, in build_q_model
(DQN pid=210953) q_model = ModelCatalog.get_model_v2(
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
(DQN pid=210953) return wrapper(
(DQN pid=210953) ^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/distributional_q_tf_model.py", line 165, in __init__
(DQN pid=210953) q_out = build_action_value(name + "/action_value/", self.model_out)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/ray/rllib/algorithms/dqn/distributional_q_tf_model.py", line 135, in build_action_value
(DQN pid=210953) logits = tf.expand_dims(tf.ones_like(action_scores), -1)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/tensorflow/python/ops/weak_tensor_ops.py", line 88, in wrapper
(DQN pid=210953) return op(*args, **kwargs)
(DQN pid=210953) ^^^^^^^^^^^^^^^^^^^
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
(DQN pid=210953) raise e.with_traceback(filtered_tb) from None
(DQN pid=210953) File "/home/dime/miniconda3/envs/rllib/lib/python3.11/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor
(DQN pid=210953) raise ValueError(
(DQN pid=210953) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(DQN pid=210953)
(DQN pid=210953) x = Input(...)
(DQN pid=210953) ...
(DQN pid=210953) tf_fn(x)  # Invalid.
(DQN pid=210953)
(DQN pid=210953) What you should do instead is wrap tf_fn in a layer:
(DQN pid=210953)
(DQN pid=210953) class MyLayer(Layer):
(DQN pid=210953)     def call(self, x):
(DQN pid=210953)         return tf_fn(x)
(DQN pid=210953)
(DQN pid=210953) x = MyLayer()(x)
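For context on what the trace shows: TensorFlow 2.16 made Keras 3 the default, and Keras 3 no longer accepts symbolic KerasTensors as inputs to plain tf.* functions, which is what RLlib's ModelV2 stack does here (the tf.ones_like call in distributional_q_tf_model.py). A minimal sketch of the failing pattern and of the layer-wrapping fix the error message recommends (illustration only, not RLlib's actual code):

import tensorflow as tf
from tensorflow import keras

x = keras.Input(shape=(4,))  # a symbolic KerasTensor under Keras 3

# tf.ones_like(x)  # raises the ValueError shown above under Keras 3 (TF >= 2.16)

# The fix the message suggests: route the tf.* calls through a Keras layer.
ones = keras.layers.Lambda(lambda t: tf.expand_dims(tf.ones_like(t), -1))(x)
model = keras.Model(inputs=x, outputs=ones)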

Versions / Dependencies

  • Ubuntu 22.04.4 LTS
  • Python 3.11.0
  • TensorFlow 2.16.1
  • Ray 2.24.0

Reproduction script

pip install tensorflow[and-cuda]
pip install "ray[rllib]"

rllib train --algo DQN --env CartPole-v1 --framework tf2 --stop '{"training_iteration": 30}'

Issue Severity

Medium: It is a significant difficulty but I can work around it.

@Deonixlive Deonixlive added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jun 8, 2024
@anyscalesam anyscalesam added the rllib RLlib related issues label Jun 12, 2024
@simonsays1980 simonsays1980 added P2 Important issue, but not time-critical and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jun 28, 2024
RocketRider (Contributor) commented Jul 2, 2024

This can probably be fixed easily; it looks similar to #45562.
Currently it only works if we set Keras 3 to legacy mode.
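If "legacy mode" means the tf-keras compatibility package, a sketch of that workaround for the CLI repro above (assumptions: a tf-keras build matching your TF version is available, and the variable is exported before TensorFlow is first imported):

pip install tf-keras                  # Keras 2 compatibility package
export TF_USE_LEGACY_KERAS=1          # must be set before TensorFlow is imported
rllib train --algo DQN --env CartPole-v1 --framework tf2 --stop '{"training_iteration": 30}'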

Niqnil commented Jul 18, 2024

I think I ran into the same error.

Versions / Dependencies
Ubuntu 22.04.4
Python 3.10.12
TensorFlow 2.17.0
Ray 2.32.0

import gymnasium as gym
import ray
from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig


class CartPoleEnv(gym.Env):
    def __init__(self, config):
        self.env = gym.make("CartPole-v1")
        self.action_space = self.env.action_space
        self.observation_space = self.env.observation_space

    def reset(self, *, seed=None, options=None):
        # gymnasium resets take keyword-only seed/options and return (obs, info)
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        return self.env.step(action)


ray.init()

config = (
    PPOConfig()
    .environment(CartPoleEnv)
    .framework("tf2")
    .training(model={"use_lstm": True})
)

tune.run(
    "PPO",
    config=config.to_dict(),
    stop={"training_iteration": 1},
)

ray.shutdown()

Error

2024-07-18 18:28:56,348 INFO worker.py:1779 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/spaces/box.py:130: UserWarning: WARN: Box bound precision lowered by casting to float32
gym.logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/utils/passive_env_checker.py:164: UserWarning: WARN: The obs returned by the reset() method was expecting numpy array dtype to be float32, actual type: float64
logger.warn(
/home/user/Code/proj/.venv/lib/python3.10/site-packages/gymnasium/utils/passive_env_checker.py:188: UserWarning: WARN: The obs returned by the reset() method is not within the observation space.
logger.warn(f"{pre} is not within the observation space.")
╭────────────────────────────────────────────────────────────╮
│ Configuration for experiment PPO_2024-07-18_18-28-57 │
├────────────────────────────────────────────────────────────┤
│ Search algorithm BasicVariantGenerator │
│ Scheduler FIFOScheduler │
│ Number of trials 1 │
╰────────────────────────────────────────────────────────────╯

View detailed results here: /home/user/ray_results/PPO_2024-07-18_18-28-57
To visualize your results with TensorBoard, run: tensorboard --logdir /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts
2024-07-18 18:28:57,598 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_env_runner instead. This will raise an error in the future!
2024-07-18 18:28:57,598 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_env_runner instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_learner_workers has been deprecated. Use AlgorithmConfig.num_learners instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_learner instead. This will raise an error in the future!
2024-07-18 18:28:57,599 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_learner instead. This will raise an error in the future!

Trial status: 1 PENDING
Current time: 2024-07-18 18:28:57. Total running time: 0s
Logical resource usage: 0/16 CPUs, 0/0 GPUs
╭────────────────────────────────────────╮
│ Trial name status │
├────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000 PENDING │
╰────────────────────────────────────────╯
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_env_runner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_env_runner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_learner_workers has been deprecated. Use AlgorithmConfig.num_learners instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,592 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_cpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_cpus_per_learner instead. This will raise an error in the future!
(PPO pid=29645) 2024-07-18 18:29:00,593 WARNING deprecation.py:50 -- DeprecationWarning: AlgorithmConfig.num_gpus_per_learner_worker has been deprecated. Use AlgorithmConfig.num_gpus_per_learner instead. This will raise an error in the future!
2024-07-18 18:29:04,542 ERROR tune_controller.py:1331 -- Trial task failed for trial PPO_CartPoleEnv_8a5aa_00000
Traceback (most recent call last):
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
result = ray.get(future)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 2656, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 873, in get_objects
raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 241, in _setup
self.add_workers(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 801, in add_workers
raise result.get()
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 500, in _fetch_result
result = ray.get(ready)
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 521, in __init__
self._update_policy_map(policy_dict=self.policy_dict)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map
self._build_policy_map(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map
new_policy = create_policy_for_framework(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
return policy_class(observation_space, action_space, merged_config)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
super(TracedEagerPolicy, self).__init__(*args, **kwargs)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo_tf_policy.py", line 81, in __init__
base.__init__(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 120, in __init__
self.model = self.make_model()
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model
return ModelCatalog.get_model_v2(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
return wrapper(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__
mask=tf.sequence_mask(seq_in),
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor
raise ValueError(
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:

x = Input(...)
...
tf_fn(x)  # Invalid.

What you should do instead is wrap tf_fn in a layer:

class MyLayer(Layer):
    def call(self, x):
        return tf_fn(x)

x = MyLayer()(x)

During handling of the above exception, another exception occurred:

ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 532, in __init__
super().__init__(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 158, in __init__
self.setup(copy.deepcopy(self.config))
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 618, in setup
self.workers = EnvRunnerGroup(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 193, in __init__
raise e.args[0].args[2]
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:

x = Input(...)
...
tf_fn(x)  # Invalid.

What you should do instead is wrap tf_fn in a layer:

class MyLayer(Layer):
    def call(self, x):
        return tf_fn(x)

x = MyLayer()(x)

Trial PPO_CartPoleEnv_8a5aa_00000 errored after 0 iterations at 2024-07-18 18:29:04. Total running time: 6s
Error file: /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts/PPO_CartPoleEnv_8a5aa_00000_0_2024-07-18_18-28-57/error.txt
2024-07-18 18:29:04,547 INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/home/user/ray_results/PPO_2024-07-18_18-28-57' in 0.0037s.

Trial status: 1 ERROR
Current time: 2024-07-18 18:29:04. Total running time: 6s
Logical resource usage: 0/16 CPUs, 0/0 GPUs
╭────────────────────────────────────────╮
│ Trial name status │
├────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000 ERROR │
╰────────────────────────────────────────╯

Number of errored trials: 1
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Trial name # failures error file │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PPO_CartPoleEnv_8a5aa_00000 1 /tmp/ray/session_2024-07-18_18-28-54_522281_28467/artifacts/2024-07-18_18-28-57/PPO_2024-07-18_18-28-57/driver_artifacts/PPO_CartPoleEnv_8a5aa_00000_0_2024-07-18_18-28-57/error.txt │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/user/Code/proj/proj/rllib_ex.py", line 29, in <module>
tune.run(
File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/tune/tune.py", line 1035, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_CartPoleEnv_8a5aa_00000])
(PPO pid=29645) 2024-07-18 18:29:04,525 ERROR actor_manager.py:523 -- Ray error, taking actor 1 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 521, in __init__
(PPO pid=29645) self._update_policy_map(policy_dict=self.policy_dict)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map
(PPO pid=29645) self._build_policy_map(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map
(PPO pid=29645) new_policy = create_policy_for_framework(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
(PPO pid=29645) return policy_class(observation_space, action_space, merged_config)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
(PPO pid=29645) super(TracedEagerPolicy, self).__init__(*args, **kwargs)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo_tf_policy.py", line 81, in __init__
(PPO pid=29645) base.__init__(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 120, in __init__
(PPO pid=29645) self.model = self.make_model()
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model
(PPO pid=29645) return ModelCatalog.get_model_v2(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
(PPO pid=29645) return wrapper(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__
(PPO pid=29645) mask=tf.sequence_mask(seq_in),
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
(PPO pid=29645) raise e.with_traceback(filtered_tb) from None
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor
(PPO pid=29645) raise ValueError(
(PPO pid=29645) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(PPO pid=29645)
(PPO pid=29645) x = Input(...)
(PPO pid=29645) ...
(PPO pid=29645) tf_fn(x)  # Invalid.
(PPO pid=29645)
(PPO pid=29645) What you should do instead is wrap tf_fn in a layer:
(PPO pid=29645)
(PPO pid=29645) class MyLayer(Layer):
(PPO pid=29645)     def call(self, x):
(PPO pid=29645)         return tf_fn(x)
(PPO pid=29645)
(PPO pid=29645) x = MyLayer()(x)
(PPO pid=29645) 2024-07-18 18:29:04,526 ERROR actor_manager.py:523 -- Ray error, taking actor 2 out of service. The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29733, ip=192.168.1.58, actor_id=5bfd0c39db45fb7efe6bcfb501000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7aabd5b52c80>)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 521, in __init__
(PPO pid=29645) self._update_policy_map(policy_dict=self.policy_dict)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map
(PPO pid=29645) self._build_policy_map(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map
(PPO pid=29645) new_policy = create_policy_for_framework(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
(PPO pid=29645) return policy_class(observation_space, action_space, merged_config)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
(PPO pid=29645) super(TracedEagerPolicy, self).__init__(*args, **kwargs)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo_tf_policy.py", line 81, in __init__
(PPO pid=29645) base.__init__(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 120, in __init__
(PPO pid=29645) self.model = self.make_model()
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model
(PPO pid=29645) return ModelCatalog.get_model_v2(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
(PPO pid=29645) return wrapper(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__
(PPO pid=29645) mask=tf.sequence_mask(seq_in),
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
(PPO pid=29645) raise e.with_traceback(filtered_tb) from None
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor
(PPO pid=29645) raise ValueError(
(PPO pid=29645) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(PPO pid=29645)
(PPO pid=29645) x = Input(...)
(PPO pid=29645) ...
(PPO pid=29645) tf_fn(x)  # Invalid.
(PPO pid=29645)
(PPO pid=29645) What you should do instead is wrap tf_fn in a layer:
(PPO pid=29645)
(PPO pid=29645) class MyLayer(Layer):
(PPO pid=29645)     def call(self, x):
(PPO pid=29645)         return tf_fn(x)
(PPO pid=29645)
(PPO pid=29645) x = MyLayer()(x)
(PPO pid=29645) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 241, in _setup
(PPO pid=29645) self.add_workers(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 801, in add_workers
(PPO pid=29645) raise result.get()
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 500, in _fetch_result
(PPO pid=29645) result = ray.get(ready)
(PPO pid=29645) ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 521, in __init__
(PPO pid=29645) self._update_policy_map(policy_dict=self.policy_dict)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map
(PPO pid=29645) self._build_policy_map(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map
(PPO pid=29645) new_policy = create_policy_for_framework(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework
(PPO pid=29645) return policy_class(observation_space, action_space, merged_config)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy.py", line 167, in __init__
(PPO pid=29645) super(TracedEagerPolicy, self).__init__(*args, **kwargs)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo_tf_policy.py", line 81, in __init__
(PPO pid=29645) base.__init__(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 120, in __init__
(PPO pid=29645) self.model = self.make_model()
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model
(PPO pid=29645) return ModelCatalog.get_model_v2(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2
(PPO pid=29645) return wrapper(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__
(PPO pid=29645) mask=tf.sequence_mask(seq_in),
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
(PPO pid=29645) raise e.with_traceback(filtered_tb) from None
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor
(PPO pid=29645) raise ValueError(
(PPO pid=29645) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(PPO pid=29645)
(PPO pid=29645) x = Input(...)
(PPO pid=29645) ...
(PPO pid=29645) tf_fn(x)  # Invalid.
(PPO pid=29645)
(PPO pid=29645) What you should do instead is wrap tf_fn in a layer:
(PPO pid=29645)
(PPO pid=29645) class MyLayer(Layer):
(PPO pid=29645)     def call(self, x):
(PPO pid=29645)         return tf_fn(x)
(PPO pid=29645)
(PPO pid=29645) x = MyLayer()(x)
(PPO pid=29645)
(PPO pid=29645) During handling of the above exception, another exception occurred:
(PPO pid=29645)
(PPO pid=29645) ray::PPO.__init__() (pid=29645, ip=192.168.1.58, actor_id=deea0a06320955d8b84b3e7e01000000, repr=PPO)
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 532, in __init__
(PPO pid=29645) super().__init__(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 158, in __init__
(PPO pid=29645) self.setup(copy.deepcopy(self.config))
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 618, in setup
(PPO pid=29645) self.workers = EnvRunnerGroup(
(PPO pid=29645) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/env/env_runner_group.py", line 193, in __init__
(PPO pid=29645) raise e.args[0].args[2]
(PPO pid=29645) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like:
(PPO pid=29645)
(PPO pid=29645) x = Input(...)
(PPO pid=29645) ...
(PPO pid=29645) tf_fn(x)  # Invalid.
(PPO pid=29645)
(PPO pid=29645) What you should do instead is wrap tf_fn in a layer:
(PPO pid=29645)
(PPO pid=29645) class MyLayer(Layer):
(PPO pid=29645)     def call(self, x):
(PPO pid=29645)         return tf_fn(x)
(PPO pid=29645)
(PPO pid=29645) x = MyLayer()(x)
(RolloutWorker pid=29732) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29732, ip=192.168.1.58, actor_id=fa44599552da1c71bf3027bd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7e80b3636d10>)
(RolloutWorker pid=29732)
(RolloutWorker pid=29732)
(RolloutWorker pid=29732)
(RolloutWorker pid=29732)
(RolloutWorker pid=29733)
(RolloutWorker pid=29733)
(RolloutWorker pid=29733)
(RolloutWorker pid=29733)
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/tf/recurrent_net.py", line 195, in __init__ [repeated 10x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(RolloutWorker pid=29733) self._update_policy_map(policy_dict=self.policy_dict) [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1728, in _update_policy_map [repeated 2x across cluster]
(RolloutWorker pid=29733) self._build_policy_map( [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1839, in _build_policy_map [repeated 2x across cluster]
(RolloutWorker pid=29733) new_policy = create_policy_for_framework( [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/utils/policy.py", line 138, in create_policy_for_framework [repeated 2x across cluster]
(RolloutWorker pid=29733) return policy_class(observation_space, action_space, merged_config) [repeated 2x across cluster]
(RolloutWorker pid=29733) super(TracedEagerPolicy, self).__init__(*args, **kwargs) [repeated 2x across cluster]
(RolloutWorker pid=29733) base.__init__( [repeated 2x across cluster]
(RolloutWorker pid=29733) self.model = self.make_model() [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/policy/eager_tf_policy_v2.py", line 271, in make_model [repeated 2x across cluster]
(RolloutWorker pid=29733) return ModelCatalog.get_model_v2( [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/ray/rllib/models/catalog.py", line 799, in get_model_v2 [repeated 2x across cluster]
(RolloutWorker pid=29733) return wrapper( [repeated 2x across cluster]
(RolloutWorker pid=29733) mask=tf.sequence_mask(seq_in), [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler [repeated 2x across cluster]
(RolloutWorker pid=29733) raise e.with_traceback(filtered_tb) from None [repeated 2x across cluster]
(RolloutWorker pid=29733) File "/home/user/Code/proj/.venv/lib/python3.10/site-packages/keras/src/backend/common/keras_tensor.py", line 91, in tf_tensor [repeated 2x across cluster]
(RolloutWorker pid=29733) raise ValueError( [repeated 2x across cluster]
(RolloutWorker pid=29733) ValueError: A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). You are likely doing something like: [repeated 2x across cluster]
(RolloutWorker pid=29733) ``` [repeated 8x across cluster]
(RolloutWorker pid=29733) x = Input(...) [repeated 2x across cluster]
(RolloutWorker pid=29733) ... [repeated 2x across cluster]
(RolloutWorker pid=29733) tf_fn(x) # Invalid. [repeated 2x across cluster]
(RolloutWorker pid=29733) What you should do instead is wrap tf_fn in a layer: [repeated 2x across cluster]
(RolloutWorker pid=29733) class MyLayer(Layer): [repeated 2x across cluster]
(RolloutWorker pid=29733) def call(self, x): [repeated 2x across cluster]
(RolloutWorker pid=29733) return tf_fn(x) [repeated 2x across cluster]
(RolloutWorker pid=29733) x = MyLayer()(x) [repeated 2x across cluster]
(RolloutWorker pid=29733) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=29733, ip=192.168.1.58, actor_id=5bfd0c39db45fb7efe6bcfb501000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7aabd5b52c80>)
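For script-based repros like the one above, the legacy-Keras workaround can also be applied programmatically. A sketch, under the assumptions that the tf-keras compatibility package is installed and that the variable is set before TensorFlow (or anything that imports it, such as ray.rllib) is loaded:

import os

# Assumption: tf-keras is installed (pip install tf-keras). This switch is
# only honored if it is set before TensorFlow is first imported.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

import ray
from ray.rllib.algorithms.ppo import PPOConfig  # now resolves against Keras 2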
