Fix permission denied on workload socket #5705
Merged
Port of #5698
Restarting iotedge many times leads to permission issues.
This doesn't seem to happen in 1.1.6: I tried many times but could not get it to fail. However, some customers experienced the same symptoms, so this is most likely an issue across versions.
We could not explain why it seems to fail more often on some setups.
Tests:
Tried all of the scenarios below: a few times manually (3-5 times each), then in a cycle of hundreds of runs each with a script (Ubuntu + CentOS).
For each test we check that all modules are up and running, that permissions and user are correct, and that there is a listener on the socket:
1.1 sudo iotedge system restart
1.2 sudo iotedge system stop + delete all containers + sudo iotedge system restart
1.3 sudo iotedge system stop + delete /var/lib/aziot/edged/mnt/ folder + sudo iotedge system restart
1.4 sudo iotedge system stop + delete all containers + delete /var/lib/aziot/edged/mnt/ folder + sudo iotedge system restart
1.5 sudo iotedge system stop + delete workloads inside /var/lib/aziot/edged/mnt + create directories inside with sudo, named after the sockets + sudo iotedge system restart (NOT on Windows: sockets are directories there, and without the folder Docker refuses to create the container and won't create a default one)
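The per-restart check described above (socket exists, owner and permissions are correct, something is listening) could be sketched roughly as follows. This is a minimal illustration for Linux with Python 3, not the attached test script: `check_workload_socket` and its parameters are hypothetical names, and the expected owner and mode are assumptions the caller must supply.

```python
import os
import socket
import stat


def check_workload_socket(path, expected_uid=None, expected_mode=None):
    """Return a list of problems found for the Unix socket at `path`.

    An empty list means every check passed. `expected_uid` and
    `expected_mode` are optional; a check is skipped when its value is None.
    """
    try:
        st = os.stat(path)
    except OSError as exc:
        return [f"cannot stat {path}: {exc}"]

    # Scenario 1.5 leaves a directory where the socket should be.
    if not stat.S_ISSOCK(st.st_mode):
        return [f"{path} is not a socket"]

    problems = []
    if expected_uid is not None and st.st_uid != expected_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {expected_uid}")

    mode = stat.S_IMODE(st.st_mode)
    if expected_mode is not None and mode != expected_mode:
        problems.append(f"mode is {oct(mode)}, expected {oct(expected_mode)}")

    # A successful connect means a listener exists and we are allowed to use it.
    # A permission error or a refused connection is reported as a problem.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(path)
    except OSError as exc:
        problems.append(f"cannot connect to {path}: {exc}")
    finally:
        client.close()

    return problems
```

A test loop would call this after each restart for every socket file found under /var/lib/aziot/edged/mnt/ and stop at the first non-empty result, so the failing iteration can be inspected.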
Script ubuntu: script_ubuntu.txt
Test result: ubuntu18.txt_release1_1.txt
Script windows: script_windows.txt
Test result: windows_result.txt
Note: the edgeHub container is missing after one of the restarts. However, it is not in a failed state (so no issue with the socket). From experience, edgeAgent sometimes takes a while to get the deployment config on Windows.
E2E test: https://dev.azure.com/msazure/One/_build/results?buildId=47950182&view=results
package#: https://dev.azure.com/msazure/One/_build/results?buildId=47915899&view=results
Azure IoT Edge PR checklist:
This checklist is used to make sure that common guidelines for a pull request are followed.
General Guidelines and Best Practices
Testing Guidelines
Draft PRs
Note: We use the kodiakhq bot to merge PRs once the necessary checks and approvals are in place. When it merges a PR, kodiakhq converts the PR title to the commit title and the PR description to the commit description, and squashes all the commits in the PR into a single commit. The net effect is that the entire PR becomes a single commit. Please follow the best practices mentioned here for the PR title and description.