Panic due to incorrect agent config #3809

Closed · sbaier1 opened this issue Mar 4, 2025 · 1 comment

sbaier1 commented Mar 4, 2025

Describe the bug
The Telepresence traffic-manager goes into a crashloop when a Pod on the cluster carries a telepresence.getambassador.io/inject-container-ports annotation that names a nonexistent port.

Context: I'm trying to get Telepresence working with knative-serving (see also #1029) by injecting the agent via the Knative Service object. That avoids reconcile conflicts with telepresence intercept, which fails to adjust the deployment directly because knative-serving just overwrites the change again.

However, the agent installed by injection alone (telepresence.getambassador.io/inject-traffic-agent: enabled) targets the wrong port, so I tried specifying the name of the port on the deployment as well. This failed catastrophically: the traffic-manager is now crashlooping and refuses to return to working order.
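For reference, a minimal sketch of the annotation setup (assuming the annotations go on the Knative Service's revision template so knative-serving propagates them to the generated deployment; the service name is illustrative, and h2c is the port name from the log below):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service                  # illustrative name
spec:
  template:
    metadata:
      annotations:
        telepresence.getambassador.io/inject-traffic-agent: enabled
        # Naming a port that exists on no container ("h2c" here)
        # is what triggers the crashloop below.
        telepresence.getambassador.io/inject-container-ports: h2c
```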

Interestingly, none of the pods on the cluster contain an annotation referencing the port mentioned in the log below anymore, nor do any of the ConfigMaps, so I'm not sure where Telepresence is getting this state from. It might be retries on the mutating webhook; I checked with the commands below.
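Roughly the check I ran (a sketch, assuming kubectl access; h2c is the port name from the log below):

```sh
# No live pod carries the annotation anymore:
kubectl get pods -A -o yaml | grep -n 'inject-container-ports'
# ...and no configmap references the port name either:
kubectl get configmaps -A -o yaml | grep -n 'h2c'
```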

Update: I had to remove old Knative Revision objects to prune the existing deployments/services that still contained the incorrect annotation (a sketch follows below). It's interesting that Telepresence apparently even considers deployments that are scaled to 0.
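A sketch of that cleanup, assuming kubectl access (revision and namespace names are illustrative):

```sh
# Old Knative revisions keep their scaled-to-zero deployments around,
# and those deployments still carry the bad annotation.
kubectl get revisions.serving.knative.dev -A

# Deleting a stale revision lets Knative garbage-collect its
# deployment and service.
kubectl delete revisions.serving.knative.dev my-service-00001 -n my-namespace
```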

2025-03-04 13:11:32.0719 error   agent-configs : found no container port that matches port annotation h2c
E0304 13:11:32.072164       1 panic.go:262] "Observed a panic" panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" stacktrace=<
	goroutine 249 [running]:
	k8s.io/apimachinery/pkg/util/runtime.logPanic({0x3834f50, 0x4e0b860}, {0x30c5a80, 0x4ce4590})
		k8s.io/[email protected]/pkg/util/runtime/runtime.go:107 +0xbc
	k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x3834f50, 0x4e0b860}, {0x30c5a80, 0x4ce4590}, {0x4e0b860, 0x0, 0x1251ee5?})
		k8s.io/[email protected]/pkg/util/runtime/runtime.go:82 +0x5e
	k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000e63c00?})
		k8s.io/[email protected]/pkg/util/runtime/runtime.go:59 +0x108
	panic({0x30c5a80?, 0x4ce4590?})
		runtime/panic.go:785 +0x132
	github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator.(*configWatcher).store(0xc0001fd0a0, {0x3834fc0, 0xc0000b05a0}, {0x0, 0x0})
		github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator/watcher.go:707 +0x42
	github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator.(*configWatcher).updateWorkload(0xc0001fd0a0, {0x3834fc0, 0xc0000b05a0}, {0x3866500, 0xc0006985f8}, {0x0, 0x0}, 0xc000c14240?)
		github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator/workload_watcher.go:114 +0x88d
	github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator.(*configWatcher).watchWorkloads.func1({0x3471300?, 0xc000e19908?})
		github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/mutator/workload_watcher.go:25 +0x95
	k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
		k8s.io/[email protected]/tools/cache/controller.go:246
	k8s.io/client-go/tools/cache.(*processorListener).run.func1()
		k8s.io/[email protected]/tools/cache/shared_informer.go:978 +0x139
	k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
		k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x33
	k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001187f70, {0x380dc00, 0xc000dcd5f0}, 0x1, 0xc000640e70)
		k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xaf
	k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000e5cf70, 0x3b9aca00, 0x0, 0x1, 0xc000640e70)
		k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x7f
	k8s.io/apimachinery/pkg/util/wait.Until(...)
		k8s.io/[email protected]/pkg/util/wait/backoff.go:161
	k8s.io/client-go/tools/cache.(*processorListener).run(0xc0011547e0)
		k8s.io/[email protected]/tools/cache/shared_informer.go:972 +0x5a
	k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
		k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0x4c
	created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start in goroutine 176
		k8s.io/[email protected]/pkg/util/wait/wait.go:70 +0x73
 >
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x2d59ae2]
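Reading the trace, updateWorkload (workload_watcher.go:114) appears to hand a nil config (the {0x0, 0x0} argument) to (*configWatcher).store (watcher.go:707) after the port lookup fails, and store dereferences it without a nil check. Below is a simplified, hypothetical illustration of that failure pattern, not the actual Telepresence source:

```go
package main

import (
	"errors"
	"fmt"
)

// agentConfig is a stand-in for the traffic-agent config; illustrative only.
type agentConfig struct {
	AgentName string
}

// findContainerPort stands in for the annotation lookup; it fails when no
// container port matches the annotated name (e.g. "h2c") and returns nil.
func findContainerPort(name string) (*agentConfig, error) {
	return nil, errors.New("found no container port that matches port annotation " + name)
}

// store dereferences the config without a nil check, mirroring the
// pattern the stack trace suggests.
func store(ac *agentConfig) {
	fmt.Println("storing config for", ac.AgentName)
}

func main() {
	ac, err := findContainerPort("h2c")
	if err != nil {
		fmt.Println("error:", err) // logged, but execution continues
	}
	store(ac) // panic: invalid memory address or nil pointer dereference
}
```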


To Reproduce
Steps to reproduce the behavior:

See the context above: annotate a workload with a telepresence.getambassador.io/inject-container-ports value that names a port not present on any container, and let the mutating webhook process it.

Expected behavior
The traffic-manager should log an error for the unmatched port annotation and skip injecting that workload, rather than panicking and crashlooping.

Versions (please complete the following information):

  • Output of telepresence version (preferably while telepresence is connected)
telepresence version
OSS Client     : v2.21.3
OSS Root Daemon: v2.21.3
OSS User Daemon: v2.21.3
Traffic Manager: not connected

(Not connected because the manager is crashlooping; its startup log reports: 2025-03-04 13:11:28.9447 info OSS Traffic Manager v2.21.3 [uid:1000,gid:0])

  • Operating system of workstation running telepresence commands

macOS 15.3.1, installed via Homebrew (telepresence-oss)

  • Kubernetes environment and Version [e.g. Minikube, bare metal, Google Kubernetes Engine]

K8s Server Version: v1.29.11 on AKS

Additional context

telepresence_logs_2025-03-04T14:14:17+01:00.zip

@thallgren (Member) commented
This problem is addressed in the upcoming 2.22.0 release. If you want, you can test it using 2.22.0-test.8.
