Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix "failed to parse ringbuf skbdata" and oom killed #6

Closed
wants to merge 1 commit into from

Conversation

mozillazg
Copy link
Contributor

@mozillazg mozillazg commented Jan 11, 2025

before:

  1. attach a tc program to block send icmp to 1.1.1.1
  2. run sudo ./ktcpdump -v -i '__dev_queue_xmit+*' -d /usr/lib/debug/boot/vmlinux-`uname -r` 'dst host 1.1.1.1 and icmp'
  3. run ping 1.1.1.1
  4. ktcpdump will raise failed to parse ringbuf skbdata and killed by system due to oom.

after:

without failed to parse ringbuf skbdata, no longer killed due to oom.

Copy link
Owner

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened another pr #7 to fix the same issue.

The root cause is bpf didn't get has_mac correctly: __dev_queue_xmit doesn't reset skb->mac_header until https://elixir.bootlin.com/linux/v6.8/source/net/core/dev.c#L4267 is executed. The new fix takes that into consideration by comparing skb->network_header to skb->mac_header to make sure we won't read uninitialized skb->mac_header (0xffff).

Also I want to explain why I split skb_data from event: skb_data may have variable length (skb->data_end - skb->data), always transmitting a fixed length from bpf kernel to userspace is no doubt a waste of resource. The design can be seen as HTTP header Content-Length followed by payload of variable length, I believe this is a way more efficient to pass data.

Feel free to ask questions and review #7. Appreciate your contribution, let me treat you to a dinner next time we meet 😋

@mozillazg
Copy link
Contributor Author

@jschwinger233 got it. thanks for your explain.

@mozillazg mozillazg closed this Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants