Matomo github/docker repos #3602

Open
halkeye opened this issue May 27, 2023 · 21 comments

Comments

@halkeye
Member

halkeye commented May 27, 2023

Service(s)

infra.ci.jenkins.io, Docker Hub, GitHub

Summary

Original request I never finished - #2684 (comment)
I recreated the repo - https://github.com/jenkins-infra/docker-mamoto
Build is working - https://infra.ci.jenkins.io/job/docker-jobs/job/docker-mamoto/job/main/55/pipeline-console/?selected-node=113
But I guess the container repo was also removed, so it needs to be re-created

Also related to #3530

Reproduction steps

No response

@dduportal
Contributor

dduportal commented Jun 26, 2023

Why install Matomo in our own cluster

What

  • Matomo (https://matomo.org/) is a PHP-based application to collect analytics data
  • Its persistent storage is MariaDB / MySQL: https://matomo.org/faq/on-premise/matomo-requirements/
  • Initial assessment for storage and scaling: migrate google analytics to v4 #3530 (comment)
  • It needs an entrypoint (e.g. domain + HTTP + TLS). Proposal: analytics.jenkins.io domain which would be a CNAME to matomo.jenkins.io (which would point to the service itself)
    • 1 level of indirection to allow switching platforms in the future in a blue-green deployment
  • Baseline: hosting in the publick8s cluster with a managed MySQL database

How

  • Need a Docker image to have an immutable "Matomo + set of plugins" artefacts - https://github.com/jenkins-infra/docker-mamoto
    • Avoid any need for a Persistent Volume
  • Need a Helm chart (either existing or a custom one)
  • Need a managed MySQL database (in azure)
  • Need DNS set up
  • Need a deployment configuration release
    • cluster: publick8s
    • Setup and credentials to define (including database connection)
    • TLS enabled

@dduportal
Contributor

@halkeye do you see anything else?

@halkeye
Member Author

halkeye commented Jun 28, 2023

nothing else comes to mind

The first two "How" items are already done by me. https://github.com/halkeye-docker/matomo is way more up to date, as I didn't have enough permissions to iterate on infra.ci

@halkeye
Member Author

halkeye commented Jun 28, 2023

Not sure about the goals of this addition, but it's probably worth noting that another reason to switch is the built-in GDPR support. If you use the existing database I have, you'll have most things anonymized and, in particular, referrals disabled (I noticed after a few days that people's private Jenkins installs were being logged to plugins.jenkins.io stats as referrals).

dduportal added a commit to jenkins-infra/azure that referenced this issue Oct 3, 2023
as per jenkins-infra/helpdesk#3602

need jenkins-infra/azure-net#132 from azure-net

---------

Signed-off-by: Damien Duportal <[email protected]>
Co-authored-by: Damien Duportal <[email protected]>
@dduportal
Contributor

Update:

WiP:

  • Support of ARM64 for the image
  • Image updates tracking
  • Helm chart

@dduportal
Contributor

Update:

  • ARM64 image is available

@dduportal
Contributor

WiP on the helm chart: jenkins-infra/kubernetes-management#4032

@dduportal
Contributor

Update:

WiP:

@dduportal
Contributor

Update: database creation in jenkins-infra/azure#497

dduportal added a commit to jenkins-infra/azure that referenced this issue Oct 10, 2023
Related to jenkins-infra/helpdesk#3602

This PR adds a managed MySQL database for matomo with an associated user
and password.

The grants are also applied to this user as per
https://matomo.org/faq/how-to-install/faq_23484/. Note that the `FILE`
grant is not added because it would be global to the `public-db`
instance while we're not even sure it is needed (or if the mentioned
file load extension is present on Azure flexible instances)
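
For reference, the grants described above could look roughly like the following. This is a sketch and not the content of the PR: the database name `matomo`, the user `matomo` and the `%` host are hypothetical, and the privilege list is the one I recall from the Matomo FAQ linked above, so check it against the FAQ before relying on it. The `FILE` grant is left out on purpose, since MySQL only allows it as a global privilege (`ON *.*`).

```shell
#!/bin/sh
# Print the SQL that grants a Matomo user its per-database privileges.
# Assumptions (not taken from the PR): database name, user name, '%' host.
# The FILE privilege is intentionally omitted.
matomo_grants_sql() {
  db_name="$1"
  db_user="$2"
  cat <<EOF
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, INDEX, DROP, ALTER,
  CREATE TEMPORARY TABLES, LOCK TABLES
  ON ${db_name}.* TO '${db_user}'@'%';
EOF
}

matomo_grants_sql matomo matomo
```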


(edit)
Note: the updatecli check is failing as usual when introducing a new
dependency. In order to validate it, I ran it locally (with the `scmid`
commented in the target) which updated the hcl file as expected.

Signed-off-by: Damien Duportal <[email protected]>
@dduportal
Contributor

Update:

  • After a few misconfigurations, we succeeded in creating the database using Terraform.
  • Secrets updated with the database credentials.

WiP: installing an initial release without ingress

@dduportal
Contributor

Update: issue while starting the pod (cc @halkeye if it rings a bell or if you have some thoughts) with the following error:

+ '[' '!' -e matomo.php ']'
+ tar cf - --one-file-system -C /usr/src/matomo .
+ tar xf -
tar: .: Cannot utime: Operation not permitted
tar: .: Cannot change mode to rwxr-xr-t: Operation not permitted
tar: Exiting with failure status due to previous errors

=> There are quite some differences between both explaining our error

Easy to reproduce:

$ docker run --rm -u 1001 docker.io/jenkinsciinfra/matomo:0.1.1 ls -la
+ '[' '!' -e matomo.php ']'
+ tar cf - --one-file-system -C /usr/src/matomo .
+ tar xf -
tar: .: Cannot utime: Operation not permitted
tar: .: Cannot change mode to rwxr-xr-t: Operation not permitted
tar: Exiting with failure status due to previous errors

Gotta try with the 33 user
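
The `Cannot utime` / `Cannot change mode` errors most likely come from `tar` trying to restore the metadata of the target directory itself (`.`), which a UID that does not own the directory is not allowed to change. A possible workaround, sketched below, is GNU tar's `--no-overwrite-dir` option when not running as root. This is an assumption about how the entrypoint could be adapted, not what the image currently does; `copy_tree` is a hypothetical helper name.

```shell
#!/bin/sh
# Copy a source tree into a target directory through a tar pipe,
# mirroring the entrypoint's "tar cf - | tar xf -" step.
# copy_tree is a hypothetical helper, not part of the image.
copy_tree() {
  src="$1"
  dst="$2"
  extra=""
  # Assumption: GNU tar. When not root, keep the metadata of directories
  # that already exist (such as the target "." itself) instead of
  # trying to chmod/utime them.
  if [ "$(id -u)" != "0" ]; then
    extra="--no-overwrite-dir"
  fi
  tar cf - --one-file-system -C "$src" . | tar xf - $extra -C "$dst"
}

# Example (paths as seen in the trace above, target path assumed):
# copy_tree /usr/src/matomo /var/www/html
```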

@dduportal
Contributor

Specifying the user 33 goes further but fails due to the Apache configuration file being protected:

+ perl -pi -e 's/Listen\s+80$/Listen 8080/g; s{<VirtualHost *:80>}{<VirtualHost *:8080>}g' /etc/apache2/ports.conf /etc/apache2/sites-enabled/000-default.conf
Can't do inplace edit on /etc/apache2/ports.conf: Cannot make temp name: Permission denied.
Can't do inplace edit on /etc/apache2/sites-enabled/000-default.conf: Cannot make temp name: Permission denied

I believe we should try switching to the bitnami image

@halkeye
Member Author

halkeye commented Oct 12, 2023

I was trying to make the docker image diskless by including plugins in the Dockerfile, which is especially good for renovate/updatecli type things, but PHP apps have everything nested.

I'm pretty sure bitnami needs persistent disk and you do updates in the app, not the docker image.

I think most of the stuff in the entrypoint can get moved to the Dockerfile instead.
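
As a sketch of that idea, the port rewrite that fails at startup could run once at build time instead (for example from a `RUN` step executed as root), so the entrypoint no longer needs write access to `/etc/apache2`. The helper name `patch_apache_ports` is hypothetical, and `sed` is swapped in here for the `perl` one-liner shown in the error trace; the literal `*` is escaped in the pattern.

```shell
#!/bin/sh
# Rewrite Apache's listening port from 80 to 8080 in the given files.
# Intended to run at image build time, as root, not in the entrypoint.
patch_apache_ports() {
  for conf in "$@"; do
    tmp="$(mktemp)"
    sed -e 's/^Listen[[:space:]]\{1,\}80$/Listen 8080/' \
        -e 's/<VirtualHost \*:80>/<VirtualHost *:8080>/' \
        "$conf" > "$tmp"
    # Write back into the same file so its ownership and mode are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
  done
}

# Example (paths taken from the error trace):
# patch_apache_ports /etc/apache2/ports.conf /etc/apache2/sites-enabled/000-default.conf
```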

honestly the only part of entrypoint that needs to be on startup is envsubst to allow database env variables.
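
A minimal sketch of such a startup step, assuming a template file with `${...}` placeholders is baked into the image. The variable names (`MATOMO_DATABASE_*`), the file paths and the `render_config` helper are assumptions for illustration; `envsubst` comes from GNU gettext and has to be installed in the image.

```shell
#!/bin/sh
# Render config.ini.php from a template, substituting only the
# database variables so other "$" characters in the file are left alone.
# Hypothetical names: render_config and the MATOMO_DATABASE_* variables.
render_config() {
  template="$1"
  target="$2"
  envsubst '${MATOMO_DATABASE_HOST} ${MATOMO_DATABASE_USERNAME} ${MATOMO_DATABASE_PASSWORD} ${MATOMO_DATABASE_DBNAME}' \
    < "$template" > "$target"
}

# Example template content:
#   [database]
#   host = "${MATOMO_DATABASE_HOST}"
#   username = "${MATOMO_DATABASE_USERNAME}"
#   password = "${MATOMO_DATABASE_PASSWORD}"
#   dbname = "${MATOMO_DATABASE_DBNAME}"
```

Passing the explicit variable list to `envsubst` keeps it from blanking out any other `$name` sequences that happen to appear in the config file.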

@dduportal
Contributor

I'm still not sure which way to go:

  • The bitnami helm chart has a lot of sane defaults for production (non root, config, etc.)
  • but the official matomo image looks better.

Gotta try your suggestion because the bitnami image errors with

matomo 17:55:21.17 WARN  ==> The Apache configuration file '/opt/bitnami/apache/conf/httpd.conf' is not writable. Configurations based on environment variables will not be applied.
mkdir: cannot create directory '/opt/bitnami/apache/conf/bitnami/certs': Permission denied

@dduportal
Contributor

Gotcha, it's running now (had to remove the 33 UID). Now: connectivity from the cluster to the MySQL DB. Almost there!

Thanks for the explanation @halkeye , it helps!

@dduportal
Contributor

OK, too many errors everywhere, I give up.

@halkeye do you have a recent installation working? I don't understand how the helm chart of bitnami can work with the setup in jenkins-infra/kubernetes-management#4032 : are you using this helm chart in your cluster?

It can't connect to the database even though I can connect with the mysql client, and the custom entrypoint script of bitnami seems to have its own way of "health checking", but I'm unable to find a way to debug it.
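
A quick way to confirm that the pod itself can reach the database port, independently of the chart's own health checks, is a plain TCP check run from inside the container. The sketch below assumes `bash` and `timeout` are available there; the helper name and the example hostname are hypothetical.

```shell
#!/bin/sh
# Return 0 when a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
can_reach() {
  host="$1"
  port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (hypothetical hostname):
# can_reach my-db.mysql.database.azure.com 3306 && echo "TCP OK"
```

If this succeeds while the application still fails, the problem is more likely in credentials, TLS settings or the chart's configuration than in cluster networking.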

It seems there are a lot of things: plugins in matomo, cronjobs that do whatever tasks. It's really hard to start this service without help.

@halkeye
Member Author

halkeye commented Oct 16, 2023

https://github.com/halkeye-docker/matomo is the version i eventually got running locally. I'm pretty sure i mentioned it in the task, but I eventually went to my own fork because it was too hard to test with infra.ci (I think it wasn't building or something, or i needed to actually tag it, can't remember now)

https://github.com/jenkins-infra/kubernetes-management/compare/helpdesk-3530-matomo-2?expand=1 is what I had locally when I was testing it. I don't know if it's any different from what you have so far.

@dduportal
Contributor

Thanks Gavin! The problem is that this looks like a LOT of concepts (there is inline YAML for cronjobs doing things, plugins preinstalled in the helm chart, etc.). It's way more complicated to deploy and maintain in a (public) production context with so many moving pieces and no prior knowledge of the tooling.

Was there a particular reason to avoid the persistent volume? I would like to understand this as it might not be a constraint at all in the Jenkins case (I mean, azurefile volumes are cheap and perform really well most of the time).

Besides, we are having weird issues with AKS, MySQL and ARM (on one side) and the bitnami helm-chart and MySQL on the other.

What it means is that it will take quite some time to bootstrap this in a "good enough for production" context, and it needs some planning and back and forth to gain the required knowledge.

@halkeye
Member Author

halkeye commented Oct 18, 2023

Honestly, the persistent volume was avoided because originally at work we don't have access to storage in our internal deployments, but also because the bitnami image extracts the tarball into the persistent storage (kinda like Jenkins does for plugins). So to upgrade, you need to click something in the UI that triggers an upgrade, which means you can't use updatecli or renovate to trigger upgrades. Plus, really all you need to persist is config.ini.php.

Speaking of config.ini.php, you may want to use the one from my fork, I believe matomo is a bit picky about the config file, and if keys are missing, triggers a config page instead of the application itself.
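
A small pre-flight check along these lines could catch a missing key before Matomo falls back to the setup page. The list of keys passed in is an assumption for illustration (this thread does not say which keys Matomo insists on), and `missing_keys` is a hypothetical helper.

```shell
#!/bin/sh
# Print every key from the argument list that has no "key = ..." line
# in the given INI file. Returns 1 if at least one key is missing.
missing_keys() {
  file="$1"
  shift
  status=0
  for key in "$@"; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}

# Example (key names are illustrative):
# missing_keys config/config.ini.php host username password dbname
```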

There's also the matomo cloud option.

@lemeurherve
Member

Note: we should set up stats for every jenkins.io subdomain not analysed yet.
