Move container build infrastructure to Ansible #1009
Using a dockerfile to build, install and test the code can be problematic, as we can't capture the log files to check what went wrong when a step fails. This PR converts the Fedora dockerfile to Ansible, an open source IT automation tool. The tool can be used on developers' machines and on the CI system to check whether a piece of code can be built, installed and tested. This is the first patch in a series, where I will convert the existing PR workflows to use Ansible instead of dockerfiles. Signed-off-by: Iker Pedrosa <[email protected]>
These ansible(1) playbooks still build Docker images, so I guess under the hood they still do the same thing that a Dockerfile and docker(1) could do. Can't we achieve the same with just docker(1) and a Dockerfile? That is, can you explain how ansible(1) is necessary (or better than docker(1)) to achieve this?
Yes, they do the same, and if I could I would have stuck with the docker and dockerfile architecture, but it isn't possible. At least not if we want to obtain the logs without having to do some nasty things, like 1 and 2 (which in reality don't work, because when something fails the container build never reaches that point). In addition, when a dockerfile build fails the container is destroyed, which prevents us from accessing it to see what happened. That brings us back to the previous point, getting the logs. Using Ansible allows us to keep an approach similar to what we had until now: build this project for different distributions, run the workflow locally, maintain some independence from GitHub Actions, etc. On top of that, we'll be able to obtain all the logs, and if we are running the workflow locally and anything fails, we'll be able to access the containers to debug what's happening.
This does actually work (I could see the logs). I've seen it fail once (not exactly this; it was the same trick, but in GitHub Actions), and documented it here: And I wrote a small Dockerfile to test this with a docker build:
$ sudo docker build .
Sending build context to Docker daemon 23.55kB
Step 1/2 : FROM debian
---> 2a033a8c6371
Step 2/2 : RUN bash -c "trap 'cat </etc/os-release >&2' ERR; false;"
---> Running in 3c6f20ae7c6b
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
The command '/bin/sh -c bash -c "trap 'cat </etc/os-release >&2' ERR; false;"' returned a non-zero code: 1
Or do you have any case where it failed and it didn't work?
Edit: Now I realize you probably meant the
I want to avoid the complexity of using ansible(1) if possible, and if trap(1) works, I prefer it.
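The trap trick can also be tried without Docker. The following is a minimal sketch, assuming bash is installed; the log file and its contents are made up for illustration and stand in for something like tests/unit/test-suite.log.

```shell
#!/bin/sh
# Hypothetical log file standing in for a real test-suite log.
log=$(mktemp)
echo "FAIL: example_test" >"$log"

# The ERR trap fires when `false` fails, so the log is dumped to
# stderr, and bash still exits with the failing command's status.
status=0
output=$(LOG="$log" bash -c "trap 'cat <\"\$LOG\" >&2' ERR; false" 2>&1) || status=$?

echo "exit status: $status"
echo "captured: $output"
rm -f "$log"
```

This only shows that the log reaches stderr before the shell exits; it says nothing about whether the output survives in every CI environment.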
As an alternative, we could consider that
The Regarding Finally, this would also give us the opportunity of capturing all the test logs when #835 is implemented and running.
    make check
  args:
    chdir: /usr/local/src/shadow/
  ignore_errors: true
What does this exactly mean? Why do we want to ignore errors?
By default, if an Ansible action fails, the Ansible execution stops. I want to ignore the errors and continue, so that the last action runs and the logs are copied from the container to the host system. This way we can gather them for inspection.
But does it report the error later? How will we know if ansible failed, if we ignore the errors? Sorry if this is obvious; I never used Ansible before.
The error is reported, but the execution continues. At the end of the Ansible execution there's the PLAY RECAP, where we'll be able to check if anything failed. I created #1014 with an intentional failure to show how it works.
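For readers who have not used Ansible, the same pattern can be sketched in plain shell: note the failure, keep going, copy the logs, and report at the end. This is only an analogy and not how Ansible is implemented; `run_tests`, the directories and the log contents are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical stand-ins: the two directories play the container
# and the host, and run_tests plays the role of `make check`.
container_dir=$(mktemp -d)
host_dir=$(mktemp -d)

run_tests() {
    echo "FAIL: example_test" >"$container_dir/test-suite.log"
    return 1
}

# Record the failure instead of stopping, like ignore_errors: true.
ignored=0
run_tests || ignored=$((ignored + 1))

# This step always runs, so the logs reach the host even on failure.
cp "$container_dir/test-suite.log" "$host_dir/"

# Summarize at the end, loosely like the PLAY RECAP.
echo "recap: ignored=$ignored"
```

The design point is that the copy step is not skipped when the test step fails, which is what a failing Dockerfile RUN step cannot offer.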
Thanks!
My bad, I forgot to set `if: always()` in the GitHub Action. It's fine now.
Thanks! It seems to work now. BTW, dumb question: how do I find and read the logs (artifacts)?
You need to open the action that failed, then click on summary, and finally scroll down to find all the artifacts
Ughhh, and then download a .zip and extract it to find the logs. Can we (also) have a copy on stderr? I very much prefer scrolling. :)
That will work until we start running the new tests and have several files to review
share/ansible/roles/ci_run/README.md
Outdated
Usage example:

- hosts: builder
  connection: podman
  become: true
  roles:
    - role: ci_run
I don't understand this usage. Where is this YAML code supposed to be used?
In the ansible playbook: https://github.com/shadow-maint/shadow/pull/1009/files#diff-1968d6dc5c6937169e73de5fa2e83bf0d9810afcec288aa45b6cfabb40a5bf15R12-R17
Roles are self-contained units in Ansible. They are used for grouping tasks and other resources in a known file structure.
Create `build_container` and `ci_run` roles and move the fedora target to them. Signed-off-by: Iker Pedrosa <[email protected]>
c8ceb95 to db3c30c
Distribution to run can be selected when running `ansible-playbook` by appending `-e 'distribution=fedora'` to the command. Signed-off-by: Iker Pedrosa <[email protected]>
7a426fe to ef6fbcf
- name: Build container
  run: |
    docker buildx build -f ./share/containers/${{ matrix.os }}.dockerfile . --output build-out
BTW, what if we would map some volume to write the log files to it with docker(1)?
RUN bash -c "trap 'cat <tests/unit/test-suite.log | tee /some/mapped/volume/test-suite.log >&2' ERR; make check;"
That would allow storing the artifacts without needing Ansible.
You can't mount volumes while building a container image, and that's exactly what we are doing with dockerfile. I have tried to solve this problem in various ways and with the technology we use it is not possible 😓
Hmmm; understood.
How about something like this?
docker build ... | tee >(sed -n '/BEGIN MARKER/,/END MARKER/p' >/host/log/file)
(probably needs redirecting stderr too (or only))
We would only need to find some consistent markers in the log.
So, the full docker logs would go to the output, and the specific error logs that we want, which would be delimited by those delimiters, would go to a specific file in the host (so the runner).
Same answer as in #1009 (comment)
If each log output has a different marker, you can parse each one separately:
docker build ... \
| tee \
>(sed -n '/BEGIN A/,/END A/p' >/host/A.log) \
>(sed -n '/BEGIN B/,/END B/p' >/host/B.log) \
>(sed -n '/BEGIN C/,/END C/p' >/host/C.log);
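The marker idea can be tried without Docker. Below is a minimal sketch in which `fake_build` and the marker strings are hypothetical stand-ins for real `docker build` output. It deliberately swaps the process substitutions for a simpler variant: tee saves the full log to a file, and sed extracts each section afterwards, so there is no need to wait for background sed processes to finish.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Simulated build output; a real run would pipe `docker build` here.
fake_build() {
    printf '%s\n' \
        'Step 1/2 : FROM debian' \
        'BEGIN A' 'unit test log line' 'END A' \
        'Step 2/2 : RUN make check' \
        'BEGIN B' 'system test log line' 'END B'
}

# Keep the full log on stdout and also save it to a file.
fake_build | tee "$workdir/build.log"

# Extract each delimited section once the build has finished.
sed -n '/BEGIN A/,/END A/p' "$workdir/build.log" >"$workdir/A.log"
sed -n '/BEGIN B/,/END B/p' "$workdir/build.log" >"$workdir/B.log"
```

Each extracted file keeps its BEGIN and END marker lines, which makes it easy to spot a truncated section.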
I think this is ready for review, so I'm moving it out of draft state.
Cool, thanks.
Using dockerfiles to build the containers in the CI prevented us from getting the logs when any of the build steps failed. Using Ansible will help us tackle this problem while still maintaining a certain independence from GitHub Actions.