how to gracefully terminate a python node #1383

Closed
christophfroehlich opened this issue Nov 22, 2024 · 2 comments · Fixed by #1400

christophfroehlich commented Nov 22, 2024

I'm a little bit puzzled that the code in the finally branch has zero coverage. Maybe rclpy.ok() is always false here. Should destroy_node() be called even if the context is no longer valid? @saikishor any idea?

Originally posted by @christophfroehlich in #1369 (comment)

This does not always happen, but when execution does enter that branch, a different error occurs:

2024-11-22T12:11:30.3702266Z [publisher_joint_trajectory_controller-2] Keyboard interrupt received. Shutting down node.
2024-11-22T12:11:30.3703121Z [publisher_joint_trajectory_controller-2] Traceback (most recent call last):
2024-11-22T12:11:30.3704880Z [publisher_joint_trajectory_controller-2]   File "/home/runner/work/ros2_controllers/ros2_controllers/.work/target_ws/install/ros2_controllers_test_nodes/lib/ros2_controllers_test_nodes/publisher_joint_trajectory_controller", line 33, in <module>
2024-11-22T12:11:30.3706968Z [publisher_joint_trajectory_controller-2]     sys.exit(load_entry_point('ros2-controllers-test-nodes==4.16.0', 'console_scripts', 'publisher_joint_trajectory_controller')())
2024-11-22T12:11:30.3708246Z [publisher_joint_trajectory_controller-2]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-11-22T12:11:30.3710170Z [publisher_joint_trajectory_controller-2]   File "/home/runner/work/ros2_controllers/ros2_controllers/.work/target_ws/install/ros2_controllers_test_nodes/lib/python3.12/site-packages/ros2_controllers_test_nodes/publisher_joint_trajectory_controller.py", line 196, in main
2024-11-22T12:11:30.3711758Z [publisher_joint_trajectory_controller-2]     rclpy.shutdown()
2024-11-22T12:11:30.3712778Z [publisher_joint_trajectory_controller-2]   File "/opt/ros/rolling/lib/python3.12/site-packages/rclpy/__init__.py", line 170, in shutdown
2024-11-22T12:11:30.3713790Z [publisher_joint_trajectory_controller-2]     _shutdown(context=context)
2024-11-22T12:11:30.9113697Z [publisher_joint_trajectory_controller-2]   File "/opt/ros/rolling/lib/python3.12/site-packages/rclpy/utilities.py", line 82, in shutdown
2024-11-22T12:11:30.9115644Z [publisher_joint_trajectory_controller-2]     context.shutdown()
2024-11-22T12:11:30.9117325Z [publisher_joint_trajectory_controller-2]   File "/opt/ros/rolling/lib/python3.12/site-packages/rclpy/context.py", line 179, in shutdown
2024-11-22T12:11:30.9118481Z [publisher_joint_trajectory_controller-2]     self._cleanup()
2024-11-22T12:11:30.9119840Z [publisher_joint_trajectory_controller-2]   File "/opt/ros/rolling/lib/python3.12/site-packages/rclpy/context.py", line 170, in _cleanup
2024-11-22T12:11:30.9121292Z [publisher_joint_trajectory_controller-2]     self.__context.shutdown()
2024-11-22T12:11:30.9122665Z [publisher_joint_trajectory_controller-2] rclpy._rclpy_pybind11.RCLError: failed to shutdown: rcl_shutdown already called on the given context, at ./src/rcl/init.c:290
2024-11-22T12:11:30.9125963Z [ERROR] [publisher_joint_trajectory_controller-2]: process has died [pid 30093, exit code 1, cmd '/home/runner/work/ros2_controllers/ros2_controllers/.work/target_ws/install/ros2_controllers_test_nodes/lib/ros2_controllers_test_nodes/publisher_joint_trajectory_controller --ros-args --params-file /home/runner/work/ros2_controllers/ros2_controllers/.work/target_ws/install/ros2_controllers_test_nodes/share/ros2_controllers_test_nodes/test/

https://github.com/ros-controls/ros2_controllers/actions/runs/11972094365/job/33378268325?pr=1314
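The traceback above ends in `rclpy.shutdown()` failing because `rcl_shutdown` had already been called on the context. A minimal sketch of a cleanup path that avoids the second call is shown below. It is not the code from the test node: the function name `spin_and_cleanup` and the idea of passing the `rclpy` module in as the `ros` argument are assumptions of this sketch, chosen so the logic can be exercised without a ROS installation.

```python
def spin_and_cleanup(ros, node):
    """Spin `node`, then clean up without shutting the context down twice.

    `ros` is expected to be the rclpy module. Passing it in is an assumption
    of this sketch so that a stand-in object can be used in tests.
    """
    try:
        ros.spin(node)
    except (KeyboardInterrupt, ros.executors.ExternalShutdownException):
        # On Ctrl-C the signal handler may already have shut the context down.
        pass
    finally:
        node.destroy_node()
        # Only shut down if the context is still valid; otherwise rcl reports
        # "rcl_shutdown already called on the given context".
        if ros.ok():
            ros.shutdown()


def main(args=None):
    # Imports are kept local so the helper above stays importable without ROS.
    import rclpy
    import rclpy.executors  # provides rclpy.executors.ExternalShutdownException
    from rclpy.node import Node

    rclpy.init(args=args)
    spin_and_cleanup(rclpy, Node("example_node"))  # node name is hypothetical
```

`main()` would be the entry point of the node. Recent rclpy releases also offer `rclpy.try_shutdown()`, which could stand in for the `ok()` check plus `shutdown()` pair.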

@saikishor

Thanks for creating an issue!

firesurfer commented Nov 26, 2024

I did some research while looking into this (ros2/rclpy#1287). If I understood it correctly, the context becomes invalid after leaving the spin with an exception.

In our main Python node we have this weird-looking construct:

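    # Requires: import traceback, rclpy, rclpy.executors
    retry_counter = 0
    while rclpy.ok():
        try:
            rclpy.spin(node)
        except (KeyboardInterrupt, rclpy.executors.ExternalShutdownException):
            # Regular termination: leave the loop without retrying.
            break
        except Exception as ex:
            node.get_logger().error(str(ex))
            node.get_logger().error(traceback.format_exc())

            # Count failed spins so the loop gives up after repeated errors.
            retry_counter += 1
            if retry_counter > 10:
                break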

The retry_counter was (and maybe still is) a workaround for an issue with rclpy that occurs only once in a while, where it would throw an exception although the context was still okay. By the way, the loggers remain operational afterwards, but they no longer log to rosout, only to stdout.

TL;DR:
I would say: do not call shutdown when exiting spin with an exception, as the context is already destroyed.
If you need ROS communication afterwards, you can create a new context and a new node in that context.
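The second suggestion, creating a new context and a new node in it, could look roughly like the sketch below. The helper names `node_in_fresh_context` and `release` are hypothetical, and the `rclpy` module is again passed in as `ros` (an assumption of this sketch) so the logic can be checked without ROS. The calls used are `rclpy.context.Context()`, `rclpy.init(context=...)`, `rclpy.create_node(..., context=...)`, `rclpy.ok(context=...)` and `rclpy.shutdown(context=...)`.

```python
def node_in_fresh_context(ros, node_name):
    """Create a new, initialized context and a node that lives in it.

    `ros` is expected to be the rclpy module (passed in as an assumption of
    this sketch); `node_name` is whatever name the new node should get.
    """
    context = ros.context.Context()
    ros.init(context=context)
    node = ros.create_node(node_name, context=context)
    return context, node


def release(ros, context, node):
    """Destroy the node and shut down its context if it is still valid."""
    node.destroy_node()
    if ros.ok(context=context):
        ros.shutdown(context=context)
```

With the real module this would be called as `context, node = node_in_fresh_context(rclpy, "cleanup_node")` after `import rclpy`, where the node name is only an example. Spinning such a node would presumably need an executor created for the same context, for example `SingleThreadedExecutor(context=context)`, since the global executor belongs to the default context.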
