Running a subscriber on Humble with publishing from Jazzy / Rolling uses up all memory #797

Open
urfeex opened this issue Jan 15, 2025 · 5 comments

Comments

urfeex commented Jan 15, 2025

Bug report

While I am aware that inter-distribution traffic isn't supported, I would at least expect systems not to crash when it occurs. However, we noticed that running both Humble and Jazzy nodes on the same network can cause the machines running the Humble nodes to run out of memory, probably because of discovery traffic. It is sufficient to have a Humble subscriber and run ros2 topic list from Jazzy to (sometimes) trigger the issue.
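To make that concrete, the minimal trigger looks roughly like this (just a sketch; the two commands run on two machines or containers that share the same network and ROS_DOMAIN_ID):

# machine / container A: ROS 2 Humble subscriber
ros2 topic echo /chatter std_msgs/String

# machine / container B: ROS 2 Jazzy or Rolling, same network and ROS_DOMAIN_ID
ros2 topic list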

Required Info:

  • Operating System: Ubuntu 22.04 / 24.04
  • Installation type: binary
  • Version or commit hash: latest
  • DDS implementation:
    • default -> fastrtps
  • Client library (if applicable):
    • Checked using rclpy

Steps to reproduce issue

The following docker-compose file illustrates the issue. Running this will make your system run out of memory!!!

---
version: '2'

networks:
  rosdocker:
    driver: bridge

services:
  lister:  # Jazzy container that only runs ros2 topic list
    image: ros:jazzy
    container_name: lister
    hostname: lister
    networks:
      - rosdocker
    environment:
      - "ROS_DOMAIN_ID=13"
    command: "ros2 topic list"
    restart: always  # It seems not to happen every time, hence the restart

  listener:  # Humble container subscribing to /chatter
    image: ros:humble
    container_name: listener
    hostname: listener
    networks:
      - rosdocker
    environment:
      - "ROS_DOMAIN_ID=13"
    command: "ros2 topic echo /chatter std_msgs/String"

Expected behavior

Error messages, silent ignoring, or things magically just working. Note: As said earlier, I do not expect cross-distro communication to simply work, but I would expect stable behavior.

Actual behavior

The ros2 topic echo seems to cause unbounded memory consumption and makes the system run out of memory (OOM) within seconds.

Additional information

  • We haven't investigated further down into the RMW code, but we do see the error output added by Capture std::bad_alloc on deserializeROSmessage. (backport #665) #737 on the Humble node.
  • I have tried the same thing with rmw_cyclonedds_cpp, which throws warnings on the list and errors on the echo but doesn't crash or run out of memory. So, this seems to be an rmw_fastrtps_cpp issue (a sketch of how to switch the RMW implementation follows below).
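For reference, switching the RMW implementation for such a test works roughly like this (a sketch; it assumes the CycloneDDS RMW package, e.g. ros-humble-rmw-cyclonedds-cpp, is installed):

# select CycloneDDS instead of the default Fast DDS for this shell
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
ros2 topic echo /chatter std_msgs/String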
@fujitatomoya (Collaborator)

@urfeex thanks for creating the issue.

Note: As said earlier, I do not expect cross-distro communication to simply work, but I would expect stable behavior.

as you already mentioned, cross-distro communication is not supported (interfaces are not guaranteed to be compatible, including breaking ABI/API changes).
that said, unfortunately we do not really expect stable behavior in that case...

you can keep this open, but i will not investigate this issue any further.

urfeex commented Jan 17, 2025

As I said earlier, this issue is not about cross-distro communication. It's about taking down an entire computer as soon as some random jazzy / rolling node shows up on the network sending discovery messages.

In my opinion it would be fine to silently ignore those, print errors all over the place, whatever. But having a PC use up all its memory does not seem like something that should be considered "just not supported". It was my impression that #737 was also created with the motivation to prevent crashes in that scenario.

@fujitatomoya (Collaborator)

As I said earlier, this issue is not about cross-distro communication.

i think what you mean is the application data plane.

as soon as some random jazzy / rolling node shows up on the network sending discovery messages.

ROS 2 nodes already communicate during discovery, so cross-distro communication is already taking place at the discovery level to establish endpoint connectivity.

In my opinion it would be fine to silently ignore those, print errors all over the place, whatever. But having a PC use up all its memory does not seem like something that should be considered "just not supported".

good point, i totally agree with this.

one thing i would like to ask you about as a possible workaround: can you set a different ROS_DOMAIN_ID for jazzy and rolling? https://docs.ros.org/en/eloquent/Tutorials/Configuring-ROS2-Environment.html#the-ros-domain-id-variable

this should provide a logical partition for the discovery process, which means no discovery between jazzy and rolling at all.
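for example, something like this (just a sketch, the concrete values do not matter as long as they differ):

# humble machine / container
export ROS_DOMAIN_ID=13
ros2 topic echo /chatter std_msgs/String

# jazzy / rolling machine / container: different domain id, so no mutual discovery
export ROS_DOMAIN_ID=42
ros2 topic list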

urfeex commented Jan 17, 2025

ROS 2 nodes already communicate during discovery, so cross-distro communication is already taking place at the discovery level to establish endpoint connectivity.

Yes, that is clear to me. What I wanted to say is: We do not actively try to do any cross-distro communication or expect any cross-distro communication to work. We just want systems not to go down because a participant with the same domain ID becomes active on the same network. But I think that has become clear by now :-)

one thing i would like to ask you about as a possible workaround: can you set a different ROS_DOMAIN_ID for jazzy and rolling?

Yes, setting the ROS_DOMAIN_ID has already been identified as a workaround; I should have mentioned that. Unfortunately, this only makes it less likely to happen.

In my opinion it would be fine to silently ignore those, print errors all over the place, whatever. But having a PC use up all its memory does not seem like something that should be considered "just not supported".

good point, i totally agree with this.

Does that mean you think searching for a solution for this might be the way to go? Can we support this in any way? I cannot promise any resources at the moment, though.

@fujitatomoya (Collaborator)

What I wanted to say is: We do not actively try to do any cross-distro communication or expect any cross-distro communication to work. We just want systems not to go down because a participant with the same domain ID becomes active on the same network.

yeah, this is not a good user experience, silently causing the problem. if that is not supported, disallowing it or a warning notification would be much better for the user.

Yes, setting the ROS_DOMAIN_ID has already been identified as a workaround; I should have mentioned that.

no worries, good to know that works.

Does that mean you think searching for a solution for this might be the way to go?

i do not think so; as far as i know, nobody is planning to work on that.
