Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hotfix/memfile-ack-logic-deadlock #1795

Merged
merged 5 commits into from
Nov 20, 2024

Conversation

rex-schilasky
Copy link
Contributor

Description

If the shm data transport layer acknowledgment feature is used and the ack time is set to a large value the publisher send may wait for a this long time on the ack event from the subscriber even the subscriber is no more existing (process is terminated for example). The SHM Datawriter Connect/Disconnect API is locked as long the Datawriter is waiting for the ack signal. That leads to a lock of the Registration layer finally.

Solution:

  1. Add RemoveSubscription to CDataWriterSHM to react on un-registrations from the registration layer in general
  2. Decouple Connect/Disconnect functionality of CSyncMemoryFile from SyncContent() to be able to modify the map of the memfile connection events even SyncContent() is waiting on a sync event.

… to "deadlock" in case of very long ack timeout setting
@rex-schilasky rex-schilasky added the cherry-pick-to-NONE Don't cherry-pick these changes label Nov 14, 2024
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

ecal/core/src/readwrite/shm/ecal_writer_shm.h Outdated Show resolved Hide resolved
rex-schilasky and others added 3 commits November 14, 2024 12:10
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
void CDataWriterSHM::RemoveSubscription(const std::string& host_name_, const int32_t process_id_, const std::string& topic_id_)
{
// we accept local disconnections only
if (host_name_ != m_attributes.host_name) return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this is correct. If we communicate through docker images, we might have different host_name, but identical host_group.
Anyways, who applies the RemoveSubscription?
If you remove only subscriptions from IDs which have previously been subscribed, you don't need to do any checks here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a general issue, and will be adressed with #1804

Copy link
Contributor

@KerstinKeller KerstinKeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@rex-schilasky rex-schilasky merged commit 50ddf9f into master Nov 20, 2024
20 checks passed
@rex-schilasky rex-schilasky deleted the hotfix/memfile-ack-logic-deadlock branch November 20, 2024 09:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-to-NONE Don't cherry-pick these changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants