[VPP-115] VPP crashes during start with RSS enabled for enic #1374

vvalderrv · 2025-01-31T06:53:36Z

Description

If RSS is enabled using new approach, where number of rx/tx rings can be specified per interface, enic driver segfauts. Stack-trace-1 shows more information. If this exercise is repeated couple of more times, then VPP hangs during start-up. Stack-trace-2 shows where it hangs.

I'm using VPP debug build, and code is current as of 06/06/2016 9:00AM.

Below is a snippet of my config file. If I remove enic (0000:09:00.0) and just leave IXGBE (0000:0e:00.0) there, VPP starts up fine. It used to work with "rss 4" earlier.

dpdk {

socket-mem 1024

dev 0000:09:00.0

{
num-rx-queues 4
num-tx-queues 4
}

dev 0000:0e:00.0 { num-rx-queues 4 num-tx-queues 4 }
}

*--------------------------

STACK-TRACE-1

-------------------------*

Program received signal SIGSEGV, Segmentation fault.

[Switching to Thread 0x7fff9d400700 (LWP 2752)]

0x000000000055ebbf in vnic_dev_priv (vdev=0x0)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98

98 return vdev->priv;

(gdb) where

#0 0x000000000055ebbf in vnic_dev_priv (vdev=0x0)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98

#1 0x0000000000559658 in enic_recv_pkts (rx_queue=0x7fff8fbcc6c8, rx_pkts=0x7fffc4db1cc0, nb_pkts=256)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/enic_rxtx.c:248

#2 0x00007ffff6fbc653 in rte_eth_rx_burst (port_id=0 '\000', queue_id=1, rx_pkts=0x7fffc4db1cc0, nb_pkts=256)

at /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ethdev.h:2641

#3 0x00007ffff6fbc8c2 in dpdk_rx_burst (dm=0xb28640 <dpdk_main>, xd=0x7fffc4b97ac0, queue_id=1)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/dpdk_priv.h:65

#4 0x00007ffff6fbdd97 in dpdk_device_input (dm=0xb28640 <dpdk_main>, xd=0x7fffc4b97ac0, node=0x7fffc4e83100, cpu_index=2, queue_id=1,

use_efd=0) at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:511

#5 0x00007ffff6fbee54 in dpdk_input_rss (vm=0x7fffc4e90f14, node=0x7fffc4e83100, f=0x0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:822

#6 0x00007ffff74e4bc5 in dispatch_node (vm=0x7fffc4e90f14, node=0x7fffc4e83100, type=VLIB_NODE_TYPE_INPUT,

dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, last_time_stamp=2595560569848974)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/main.c:996

#7 0x00007ffff6fc336d in dpdk_worker_thread_internal (vm=0x7fffc4e90f14, callback=0x0, have_io_threads=0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:209

#8 0x00007ffff6fc3598 in dpdk_worker_thread (w=0x7fffc521ca50, io_name=0x7ffff70aae8d "io", callback=0x0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:265

#9 0x00007ffff6fc35fa in dpdk_worker_thread_fn (arg=0x7fffc521ca50)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:272

#10 0x00007ffff6233584 in clib_calljmp () at /scratch/localadmin/openvpp/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110

#11 0x00007fff9d3ffbb0 in ?? ()

#12 0x00007ffff75260b0 in vlib_worker_thread_bootstrap_fn (arg=0x7fffc521ca50)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/threads.c:492

Backtrace stopped: previous frame inner to this frame (corrupt stack?)

*--------------------------

STACK-TRACE-2

-------------------------*

Starting program: /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp -c vppconfigs/vpp_startup.conf

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

vlib_plugin_early_init:201: plugin path /usr/lib/vpp_plugins

^C

Program received signal SIGINT, Interrupt.

__lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135

135 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.

(gdb) where

#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135

#1 0x00007ffff5fcb649 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0

#2 0x00007ffff5fcb470 in _GI__pthread_mutex_lock (mutex=0x30008008) at ../nptl/pthread_mutex_lock.c:79

#3 0x00007ffff68f8577 in region_lock (rp=0x30008000, tag=2) at /scratch/localadmin/openvpp/vpp/build-data/../svm/svm.c:62

#4 0x00007ffff68fa54b in svm_map_region (a=0x7fffc571cce0) at /scratch/localadmin/openvpp/vpp/build-data/../svm/svm.c:590

#5 0x00007ffff68fa8b5 in svm_region_init_internal (root_path=0x0, uid=-1, gid=-1)

at /scratch/localadmin/openvpp/vpp/build-data/../svm/svm.c:653

#6 0x00007ffff68fac6f in svm_region_init_chroot_uid_gid (root_path=0x0, uid=-1, gid=-1)

at /scratch/localadmin/openvpp/vpp/build-data/../svm/svm.c:689

#7 0x0000000000468bac in gmon_init (vm=0xb28840 <vlib_global_main>) at /scratch/localadmin/openvpp/vpp/build-data/../vpp/api/gmon.c:174

#8 0x00007ffff74d911d in vlib_call_init_exit_functions (vm=0xb28840 <vlib_global_main>, head=0xaecda0 <_vlib_init_function.21287>,

call_once=1) at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/init.c:57

#9 0x00007ffff74d91a8 in vlib_call_all_init_functions (vm=0xb28840 <vlib_global_main>)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/init.c:74

#10 0x00007ffff74e738e in vlib_main (vm=0xb28840 <vlib_global_main>, input=0x7fffc571cfb0)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/main.c:1576

#11 0x00007ffff7784a89 in thread0 (arg=11700288) at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/unix/main.c:425

#12 0x00007ffff6233584 in clib_calljmp () at /scratch/localadmin/openvpp/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110

#13 0x00007fffffffd3d0 in ?? ()

#14 0x00007ffff7784f13 in vlib_unix_main (argc=46, argv=0xbbab00)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/unix/main.c:485

#15 0x000000000040b1e2 in main (argc=46, argv=0xbbab00) at /scratch/localadmin/openvpp/vpp/build-data/../vpp/vnet/main.c:246

Assignee

Shesha Sreenivasamurthy

Reporter

Shesha Sreenivasamurthy

Comments

neescoba (Fri, 29 Jul 2016 21:40:59 +0000): The patch made it into upstream DPDK, (and into the 16.07 release), but the patch didn't make it into the vpp patch set. Shesha Sreenivasamurthy is in the process of adding the patch to the vpp patch set. Re-assigning this issue to him.
neescoba (Wed, 15 Jun 2016 01:03:31 +0000): Sent patch to Shesha, with some caveats. The fix is going upstream as part of a revised scatter rx patch. Re-assigning this issue to him.
neescoba (Sat, 11 Jun 2016 00:51:13 +0000): I was able to reproduce the exact problem yesterday, and found the problem and fixed it today. I expect to send Sesha a patch on Monday.
neescoba (Thu, 9 Jun 2016 01:30:01 +0000): It turns out that my problem looked similar but wasn't the same as the one Shesha reported. I will be trying to replicate Shesha's exact setup tomorrow morning to see if I can replicate his error.
dmarion (Wed, 8 Jun 2016 18:38:30 +0000): Nelson. 2nd issue is fixed yesterday in the following commit:

https://git.fd.io/cgit/vpp/commit/?id=599839d12d13ffed46717256d2f4ec190ff798fc

shesha (Wed, 8 Jun 2016 18:15:16 +0000): I'm glad that you could independently verify the problem. As suggested by Damjan, VPP hang can be overcome by removing SHM files in /dev/shm. You are right, the new patches that were sent to me via separate email was not applied. I just tested with the new patch and it still crashes. However the memzone error is definitely fixed with the new patch.

PMD: rte_enic_pmd: vNIC resources used: wq 4 rq 4 cq 8 intr 8

PMD: ixgbe_dev_configure(): >>

neescoba (Wed, 8 Jun 2016 17:49:36 +0000): I independently ran into the same problem yesterday evening after doing a 'git pull' to get the latest vpp code.

There are two issues, the segfault and that vpp hang. The first is likely an issue with the enic pmd driver, but I don't know the cause of the second issue. Does anyone know why vpp seems to get stuck like that and how to get around that second issue without having to reboot the machine?

Shesha, was the patch John Daley sent you applied to this code? The log you included seems to still have the error message suggesting that the patch was not applied:

PMD: rte_enic_pmd: enic_alloc_consistent : Failed to allocate memory requested for rss_key-0000:09:00.0

PMD: rte_enic_pmd: RSS disabled, Failed to set RSS key.

shesha (Wed, 8 Jun 2016 00:51:02 +0000): I would not close it as complete. There definitely exists an issue and we need a way to track it. Some change has currently made VPP unusable for a particular configuration. Let me send an email on vpp-dev alias.
dmarion (Tue, 7 Jun 2016 19:39:39 +0000):
I dont think anybody else in fd.io community posses familiarity to debug ENIC driver issues. Everything indicates that this is a issue in ENIC driver so I suggest that you raise this issue with ENIC driver maintainer.

Also, as long as this is not a issue in the VPP code I would recommend that we close this case or at least I should be removed as an Assignee....

shesha (Tue, 7 Jun 2016 19:18:17 +0000): I am not thinking or saying VPP-core has an issue. I am reporting an issue in using VPP. It can very well be an issue in external bits, like DPDK, that VPP is dependent on.

Enic is getting its input from VPP. Either VPP is configuring enic incorrectly or enic is misbehaving. The person who works on the bug should investigate and decide where the problem is. If it is analyzed that the issue is in DPDK's enic driver, then it should be forward to them. Please assign it someone who recently touched or is familiar with the code in that area to further investigate.

I would be more than happy to provide any information that will aid in the investigation but do not possess familiarity in that area of the code to analyze the problem myself.

dmarion (Tue, 7 Jun 2016 17:46:23 +0000):
OK, so this crash is happening in the ENIC PMD code, when we are trying to poll queue number 3.

Startup log shows that queue number 3 is initialised by VPP (or et least VPP tried to initialise it).

Why do you think that this is a VPP issue?

shesha (Tue, 7 Jun 2016 17:06:59 +0000):
Requested info

--------------------

vppconfigs/vpp_startup.conf :

unix {

nodaemon

log /tmp/vpe.log

cli-listen localhost:5002

full-coredump

startup-config /scratch/vppconfigs/vcgn.setip.conf

}

api-trace {

on

}

cpu {

main-core 0

corelist-workers 4-11

thread-prefix vcgn

}

dpdk {

socket-mem 1024

dev 0000:09:00.0

{
num-tx-queues 4
num-rx-queues 4
}

dev 0000:0e:00.0 { num-tx-queues 4 num-rx-queues 4 }

}

#####################################

/scratch/vppconfigs/vcgn.setip.conf:

set interface state TenGigabitEthernet9/0/0 up

set interface ip address TenGigabitEthernet9/0/0 2.0.0.1/24

set ip arp TenGigabitEthernet9/0/0 2.0.0.2 0000.001c.bf1a

set interface state TenGigabitEthernete/0/0 up

set interface ip address TenGigabitEthernete/0/0 3.0.0.1/24

set ip arp TenGigabitEthernete/0/0 3.0.0.2 0000.0283.acee

#####################################

(gdb) run -c vppconfigs/vpp_startup.conf

Starting program: /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp -c vppconfigs/vpp_startup.conf

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

vlib_plugin_early_init:201: plugin path /usr/lib/vpp_plugins

EAL: Detected lcore 0 as core 0 on socket 0

EAL: Detected lcore 1 as core 1 on socket 0

EAL: Detected lcore 2 as core 2 on socket 0

EAL: Detected lcore 3 as core 3 on socket 0

EAL: Detected lcore 4 as core 4 on socket 0

EAL: Detected lcore 5 as core 5 on socket 0

EAL: Detected lcore 6 as core 8 on socket 0

EAL: Detected lcore 7 as core 9 on socket 0

EAL: Detected lcore 8 as core 10 on socket 0

EAL: Detected lcore 9 as core 11 on socket 0

EAL: Detected lcore 10 as core 12 on socket 0

EAL: Detected lcore 11 as core 13 on socket 0

EAL: Detected lcore 12 as core 0 on socket 1

EAL: Detected lcore 13 as core 1 on socket 1

EAL: Detected lcore 14 as core 2 on socket 1

EAL: Detected lcore 15 as core 3 on socket 1

EAL: Detected lcore 16 as core 4 on socket 1

EAL: Detected lcore 17 as core 5 on socket 1

EAL: Detected lcore 18 as core 8 on socket 1

EAL: Detected lcore 19 as core 9 on socket 1

EAL: Detected lcore 20 as core 10 on socket 1

EAL: Detected lcore 21 as core 11 on socket 1

EAL: Detected lcore 22 as core 12 on socket 1

EAL: Detected lcore 23 as core 13 on socket 1

EAL: Detected lcore 24 as core 0 on socket 0

EAL: Detected lcore 25 as core 1 on socket 0

EAL: Detected lcore 26 as core 2 on socket 0

EAL: Detected lcore 27 as core 3 on socket 0

EAL: Detected lcore 28 as core 4 on socket 0

EAL: Detected lcore 29 as core 5 on socket 0

EAL: Detected lcore 30 as core 8 on socket 0

EAL: Detected lcore 31 as core 9 on socket 0

EAL: Detected lcore 32 as core 10 on socket 0

EAL: Detected lcore 33 as core 11 on socket 0

EAL: Detected lcore 34 as core 12 on socket 0

EAL: Detected lcore 35 as core 13 on socket 0

EAL: Detected lcore 36 as core 0 on socket 1

EAL: Detected lcore 37 as core 1 on socket 1

EAL: Detected lcore 38 as core 2 on socket 1

EAL: Detected lcore 39 as core 3 on socket 1

EAL: Detected lcore 40 as core 4 on socket 1

EAL: Detected lcore 41 as core 5 on socket 1

EAL: Detected lcore 42 as core 8 on socket 1

EAL: Detected lcore 43 as core 9 on socket 1

EAL: Detected lcore 44 as core 10 on socket 1

EAL: Detected lcore 45 as core 11 on socket 1

EAL: Detected lcore 46 as core 12 on socket 1

EAL: Detected lcore 47 as core 13 on socket 1

EAL: Support maximum 256 logical core(s) by configuration.

EAL: Detected 48 lcore(s)

EAL: Probing VFIO support...

EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)

EAL: VFIO modules not loaded, skipping VFIO support...

EAL: Setting up physically contiguous memory...

EAL: Ask a virtual area of 0xc00000 bytes

EAL: Virtual area found at 0x7fff98c00000 (size = 0xc00000)

EAL: Ask a virtual area of 0x800000 bytes

EAL: Virtual area found at 0x7fff98200000 (size = 0x800000)

EAL: Ask a virtual area of 0x7c00000 bytes

EAL: Virtual area found at 0x7fff90400000 (size = 0x7c00000)

EAL: Ask a virtual area of 0x37000000 bytes

EAL: Virtual area found at 0x7fff59200000 (size = 0x37000000)

EAL: Ask a virtual area of 0x40000000 bytes

EAL: Virtual area found at 0x7fff19000000 (size = 0x40000000)

EAL: Requesting 512 pages of size 2MB from socket 0

EAL: TSC frequency is ~2593741 KHz

EAL: Master lcore 0 is ready (tid=f7fe0900;cpuset=[0])

PMD: rte_igbvf_pmd_init(): >>

PMD: rte_i40e_pmd_init(): >>

PMD: rte_i40evf_pmd_init(): >>

PMD: rte_ixgbe_pmd_init(): >>

PMD: rte_ixgbevf_pmd_init(): >>

PMD: rte_vmxnet3_pmd_init(): >>

[New Thread 0x7fff9e401700 (LWP 10533)]

[New Thread 0x7fff9dc00700 (LWP 10534)]

EAL: lcore 4 is ready (tid=9dc00700;cpuset=[4])

[New Thread 0x7fff9d3ff700 (LWP 10535)]

EAL: lcore 5 is ready (tid=9d3ff700;cpuset=[5])

[New Thread 0x7fff9cbfe700 (LWP 10536)]

EAL: lcore 6 is ready (tid=9cbfe700;cpuset=[6])

[New Thread 0x7fff9c3fd700 (LWP 10537)]

[New Thread 0x7fff9bbfc700 (LWP 10538)]

EAL: lcore 7 is ready (tid=9c3fd700;cpuset=[7])

EAL: lcore 8 is ready (tid=9bbfc700;cpuset=[8])

[New Thread 0x7fff9b3fb700 (LWP 10539)]

[New Thread 0x7fff9abfa700 (LWP 10540)]

EAL: lcore 9 is ready (tid=9b3fb700;cpuset=[9])

EAL: lcore 10 is ready (tid=9abfa700;cpuset=[10])

[New Thread 0x7fff9a3f9700 (LWP 10541)]

EAL: lcore 11 is ready (tid=9a3f9700;cpuset=[11])

EAL: PCI device 0000:09:00.0 on NUMA socket 0

EAL: probe driver: 1137:43 rte_enic_pmd

EAL: PCI memory mapped at 0x7fff99800000

EAL: PCI memory mapped at 0x7fff99808000

PMD: rte_enic_pmd: Initializing ENIC PMD version 1.0.0.6

PMD: rte_enic_pmd: vNIC MAC addr e8:65:49:1f:09:22 wq/rq 256/512 mtu 1500

PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000

PMD: rte_enic_pmd: vNIC resources avail: wq 8 rq 8 cq 16 intr 8

EAL: PCI device 0000:0e:00.0 on NUMA socket 0

EAL: probe driver: 8086:10fb rte_ixgbe_pmd

EAL: PCI memory mapped at 0x7fff9980a000

EAL: PCI memory mapped at 0x7fff9988a000

PMD: eth_ixgbe_dev_init(): >>

PMD: ixgbe_disable_intr(): >>

PMD: ixgbe_pf_host_init(): >>

PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 12, SFP+: 3

PMD: eth_ixgbe_dev_init(): port 1 vendorID=0x8086 deviceID=0x10fb

DPDK physical memory layout:

Segment 0: phys:0x35800000, len:12582912, virt:0x7fff98c00000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0

Segment 1: phys:0x36800000, len:8388608, virt:0x7fff98200000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0

Segment 2: phys:0x6d800000, len:130023424, virt:0x7fff90400000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0

Segment 3: phys:0x1f8d400000, len:922746880, virt:0x7fff59200000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0

RING: Cannot reserve memory

[New Thread 0x7fff591ff700 (LWP 10542)]

PMD: rte_enic_pmd: WQ 0 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)

PMD: rte_enic_pmd: WQ 1 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)

PMD: rte_enic_pmd: WQ 2 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)

PMD: rte_enic_pmd: WQ 3 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)

PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum

PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512

PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)

PMD: rte_enic_pmd: Set queue_id:0 free thresh:32

PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum

PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512

PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)

PMD: rte_enic_pmd: Set queue_id:1 free thresh:32

PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum

PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512

PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)

PMD: rte_enic_pmd: Set queue_id:2 free thresh:32

PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum

PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512

PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)

PMD: rte_enic_pmd: Set queue_id:3 free thresh:32

PMD: rte_enic_pmd: vNIC resources used: wq 4 rq 4 cq 8 intr 8

EAL: memzone_reserve_aligned_thread_unsafe(): memzone <rss_key-0000:09:00.0> already exists

PMD: rte_enic_pmd: enic_alloc_consistent : Failed to allocate memory requested for rss_key-0000:09:00.0

PMD: rte_enic_pmd: RSS disabled, Failed to set RSS key.

PMD: ixgbe_dev_configure(): >>

PMD: ixgbe_dev_tx_queue_setup(): >>

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984e8f40 hw_ring=0x7fff984f0f80 dma_addr=0x36af0f80

PMD: ixgbe_set_tx_function(): Using full-featured tx code path

PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]

PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]

PMD: ixgbe_dev_tx_queue_setup(): >>

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984d0ec0 hw_ring=0x7fff984d8f00 dma_addr=0x36ad8f00

PMD: ixgbe_set_tx_function(): Using full-featured tx code path

PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]

PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]

PMD: ixgbe_dev_tx_queue_setup(): >>

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984b8e40 hw_ring=0x7fff984c0e80 dma_addr=0x36ac0e80

PMD: ixgbe_set_tx_function(): Using full-featured tx code path

PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]

PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]

PMD: ixgbe_dev_tx_queue_setup(): >>

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984a0dc0 hw_ring=0x7fff984a8e00 dma_addr=0x36aa8e00

PMD: ixgbe_set_tx_function(): Using full-featured tx code path

PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]

PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]

PMD: ixgbe_dev_rx_queue_setup(): >>

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff9848cbc0 sw_sc_ring=0x7fff98488a80 hw_ring=0x7fff98490d00 dma_addr=0x36a90d00

PMD: ixgbe_dev_rx_queue_setup(): >>

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff984748c0 sw_sc_ring=0x7fff98470780 hw_ring=0x7fff98478a00 dma_addr=0x36a78a00

PMD: ixgbe_dev_rx_queue_setup(): >>

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff9845c5c0 sw_sc_ring=0x7fff98458480 hw_ring=0x7fff98460700 dma_addr=0x36a60700

PMD: ixgbe_dev_rx_queue_setup(): >>

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff984442c0 sw_sc_ring=0x7fff98440180 hw_ring=0x7fff98448400 dma_addr=0x36a48400

PMD: rte_enic_pmd: queue 0, allocating 512 rx queue mbufs

PMD: rte_enic_pmd: port=0, qidx=0, Write 511 posted idx, 0 sw held

PMD: rte_enic_pmd: queue 2, allocating 512 rx queue mbufs

PMD: rte_enic_pmd: port=0, qidx=2, Write 511 posted idx, 0 sw held

PMD: rte_enic_pmd: queue 4, allocating 512 rx queue mbufs

PMD: rte_enic_pmd: port=0, qidx=4, Write 511 posted idx, 0 sw held

PMD: rte_enic_pmd: queue 6, allocating 512 rx queue mbufs

PMD: rte_enic_pmd: port=0, qidx=6, Write 511 posted idx, 0 sw held

Program received signal SIGSEGV, Segmentation fault.

[Switching to Thread 0x7fff9c3fd700 (LWP 10537)]

0x000000000055ebbf in vnic_dev_priv (vdev=0x0)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98

98 return vdev->priv;

(gdb)

(gdb) where

#0 0x000000000055ebbf in vnic_dev_priv (vdev=0x0)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98

#1 0x0000000000559658 in enic_recv_pkts (rx_queue=0x7fff985c0738, rx_pkts=0x7fffc4ce1f80, nb_pkts=256)

at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/enic_rxtx.c:248

#2 0x00007ffff6fbb85f in rte_eth_rx_burst (port_id=0 '\000', queue_id=3, rx_pkts=0x7fffc4ce1f80, nb_pkts=256)

at /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ethdev.h:2641

#3 0x00007ffff6fbbace in dpdk_rx_burst (dm=0xb28640 <dpdk_main>, xd=0x7fffc4ba1240, queue_id=3)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/dpdk_priv.h:65

#4 0x00007ffff6fbcfa3 in dpdk_device_input (dm=0xb28640 <dpdk_main>, xd=0x7fffc4ba1240, node=0x7fffc4dfbc48, cpu_index=4, queue_id=3,

use_efd=0) at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:511

#5 0x00007ffff6fbe060 in dpdk_input_rss (vm=0x7fffc4e4ea94, node=0x7fffc4dfbc48, f=0x0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:822

#6 0x00007ffff74e4bc5 in dispatch_node (vm=0x7fffc4e4ea94, node=0x7fffc4dfbc48, type=VLIB_NODE_TYPE_INPUT,

dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, last_time_stamp=2795634055528941)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/main.c:996

#7 0x00007ffff6fc2579 in dpdk_worker_thread_internal (vm=0x7fffc4e4ea94, callback=0x0, have_io_threads=0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:209

#8 0x00007ffff6fc27a4 in dpdk_worker_thread (w=0x7fffc521bba0, io_name=0x7ffff70aa10d "io", callback=0x0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:265

#9 0x00007ffff6fc2806 in dpdk_worker_thread_fn (arg=0x7fffc521bba0)

at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:272

#10 0x00007ffff6232584 in clib_calljmp () at /scratch/localadmin/openvpp/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110

#11 0x00007fff9c3fcbb0 in ?? ()

#12 0x00007ffff75260b0 in vlib_worker_thread_bootstrap_fn (arg=0x7fffc521bba0)

at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/threads.c:492

Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb)

dmarion (Tue, 7 Jun 2016 09:12:57 +0000): Can you provide full startup log?
shesha (Tue, 7 Jun 2016 09:02:18 +0000): VPP master code with what ever patches exist in the repo crashes.
dmarion (Tue, 7 Jun 2016 08:12:37 +0000): Dear Shesha,

Is this with new patch from John Daley?

If yes, have you verified that this is not the issue with that patch?

Regarding, your 2nd issue, it doesn't have anything with 1st one, you just need to do "sudo rm /dev/shm/

{vpe-api,db,global_vm}
" after VPP crashes.

Original issue: https://jira.fd.io/browse/VPP-115

The text was updated successfully, but these errors were encountered:

vvalderrv · 2025-01-31T06:53:37Z

The patch made it into upstream DPDK, (and into the 16.07 release), but the patch didn't make it into the vpp patch set. Shesha Sreenivasamurthy is in the process of adding the patch to the vpp patch set. Re-assigning this issue to him.

vvalderrv · 2025-01-31T06:53:39Z

Sent patch to Shesha, with some caveats. The fix is going upstream as part of a revised scatter rx patch. Re-assigning this issue to him.

vvalderrv · 2025-01-31T06:53:40Z

I was able to reproduce the exact problem yesterday, and found the problem and fixed it today. I expect to send Sesha a patch on Monday.

vvalderrv · 2025-01-31T06:53:42Z

It turns out that my problem looked similar but wasn't the same as the one Shesha reported. I will be trying to replicate Shesha's exact setup tomorrow morning to see if I can replicate his error.

vvalderrv · 2025-01-31T06:53:43Z

Nelson. 2nd issue is fixed yesterday in the following commit:

https://git.fd.io/cgit/vpp/commit/?id=599839d12d13ffed46717256d2f4ec190ff798fc

vvalderrv · 2025-01-31T06:53:45Z

I'm glad that you could independently verify the problem. As suggested by Damjan, VPP hang can be overcome by removing SHM files in /dev/shm. You are right, the new patches that were sent to me via separate email was not applied. I just tested with the new patch and it still crashes. However the memzone error is definitely fixed with the new patch.

PMD: rte_enic_pmd: vNIC resources used: wq 4 rq 4 cq 8 intr 8
PMD: ixgbe_dev_configure(): >>

vvalderrv · 2025-01-31T06:53:46Z

I independently ran into the same problem yesterday evening after doing a 'git pull' to get the latest vpp code.

There are two issues, the segfault and that vpp hang. The first is likely an issue with the enic pmd driver, but I don't know the cause of the second issue. Does anyone know why vpp seems to get stuck like that and how to get around that second issue without having to reboot the machine?

Shesha, was the patch John Daley sent you applied to this code? The log you included seems to still have the error message suggesting that the patch was not applied:
PMD: rte_enic_pmd: enic_alloc_consistent : Failed to allocate memory requested for rss_key-0000:09:00.0
PMD: rte_enic_pmd: RSS disabled, Failed to set RSS key.

vvalderrv · 2025-01-31T06:53:48Z

I would not close it as complete. There definitely exists an issue and we need a way to track it. Some change has currently made VPP unusable for a particular configuration. Let me send an email on vpp-dev alias.

vvalderrv · 2025-01-31T06:53:49Z

I dont think anybody else in fd.io community posses familiarity to debug ENIC driver issues. Everything indicates that this is a issue in ENIC driver so I suggest that you raise this issue with ENIC driver maintainer.
Also, as long as this is not a issue in the VPP code I would recommend that we close this case or at least I should be removed as an Assignee....

vvalderrv · 2025-01-31T06:53:51Z

I am not thinking or saying VPP-core has an issue. I am reporting an issue in using VPP. It can very well be an issue in external bits, like DPDK, that VPP is dependent on.

Enic is getting its input from VPP. Either VPP is configuring enic incorrectly or enic is misbehaving. The person who works on the bug should investigate and decide where the problem is. If it is analyzed that the issue is in DPDK's enic driver, then it should be forward to them. Please assign it someone who recently touched or is familiar with the code in that area to further investigate.

I would be more than happy to provide any information that will aid in the investigation but do not possess familiarity in that area of the code to analyze the problem myself.

vvalderrv · 2025-01-31T06:53:52Z

OK, so this crash is happening in the ENIC PMD code, when we are trying to poll queue number 3.
Startup log shows that queue number 3 is initialised by VPP (or et least VPP tried to initialise it).

Why do you think that this is a VPP issue?

vvalderrv · 2025-01-31T06:53:54Z

Requested info
--------------------

vppconfigs/vpp_startup.conf :

unix {
nodaemon
log /tmp/vpe.log
cli-listen localhost:5002
full-coredump
startup-config /scratch/vppconfigs/vcgn.setip.conf
}

api-trace {
on
}

cpu {
main-core 0
corelist-workers 4-11
thread-prefix vcgn
}

dpdk {
socket-mem 1024
dev 0000:09:00.0

{ num-tx-queues 4 num-rx-queues 4 }
dev 0000:0e:00.0 { num-tx-queues 4 num-rx-queues 4 }

}
#####################################
/scratch/vppconfigs/vcgn.setip.conf:

set interface state TenGigabitEthernet9/0/0 up
set interface ip address TenGigabitEthernet9/0/0 2.0.0.1/24
set ip arp TenGigabitEthernet9/0/0 2.0.0.2 0000.001c.bf1a

set interface state TenGigabitEthernete/0/0 up
set interface ip address TenGigabitEthernete/0/0 3.0.0.1/24
set ip arp TenGigabitEthernete/0/0 3.0.0.2 0000.0283.acee

#####################################
(gdb) run -c vppconfigs/vpp_startup.conf
Starting program: /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp -c vppconfigs/vpp_startup.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
vlib_plugin_early_init:201: plugin path /usr/lib/vpp_plugins
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 8 on socket 0
EAL: Detected lcore 7 as core 9 on socket 0
EAL: Detected lcore 8 as core 10 on socket 0
EAL: Detected lcore 9 as core 11 on socket 0
EAL: Detected lcore 10 as core 12 on socket 0
EAL: Detected lcore 11 as core 13 on socket 0
EAL: Detected lcore 12 as core 0 on socket 1
EAL: Detected lcore 13 as core 1 on socket 1
EAL: Detected lcore 14 as core 2 on socket 1
EAL: Detected lcore 15 as core 3 on socket 1
EAL: Detected lcore 16 as core 4 on socket 1
EAL: Detected lcore 17 as core 5 on socket 1
EAL: Detected lcore 18 as core 8 on socket 1
EAL: Detected lcore 19 as core 9 on socket 1
EAL: Detected lcore 20 as core 10 on socket 1
EAL: Detected lcore 21 as core 11 on socket 1
EAL: Detected lcore 22 as core 12 on socket 1
EAL: Detected lcore 23 as core 13 on socket 1
EAL: Detected lcore 24 as core 0 on socket 0
EAL: Detected lcore 25 as core 1 on socket 0
EAL: Detected lcore 26 as core 2 on socket 0
EAL: Detected lcore 27 as core 3 on socket 0
EAL: Detected lcore 28 as core 4 on socket 0
EAL: Detected lcore 29 as core 5 on socket 0
EAL: Detected lcore 30 as core 8 on socket 0
EAL: Detected lcore 31 as core 9 on socket 0
EAL: Detected lcore 32 as core 10 on socket 0
EAL: Detected lcore 33 as core 11 on socket 0
EAL: Detected lcore 34 as core 12 on socket 0
EAL: Detected lcore 35 as core 13 on socket 0
EAL: Detected lcore 36 as core 0 on socket 1
EAL: Detected lcore 37 as core 1 on socket 1
EAL: Detected lcore 38 as core 2 on socket 1
EAL: Detected lcore 39 as core 3 on socket 1
EAL: Detected lcore 40 as core 4 on socket 1
EAL: Detected lcore 41 as core 5 on socket 1
EAL: Detected lcore 42 as core 8 on socket 1
EAL: Detected lcore 43 as core 9 on socket 1
EAL: Detected lcore 44 as core 10 on socket 1
EAL: Detected lcore 45 as core 11 on socket 1
EAL: Detected lcore 46 as core 12 on socket 1
EAL: Detected lcore 47 as core 13 on socket 1
EAL: Support maximum 256 logical core(s) by configuration.
EAL: Detected 48 lcore(s)
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0xc00000 bytes
EAL: Virtual area found at 0x7fff98c00000 (size = 0xc00000)
EAL: Ask a virtual area of 0x800000 bytes
EAL: Virtual area found at 0x7fff98200000 (size = 0x800000)
EAL: Ask a virtual area of 0x7c00000 bytes
EAL: Virtual area found at 0x7fff90400000 (size = 0x7c00000)
EAL: Ask a virtual area of 0x37000000 bytes
EAL: Virtual area found at 0x7fff59200000 (size = 0x37000000)
EAL: Ask a virtual area of 0x40000000 bytes
EAL: Virtual area found at 0x7fff19000000 (size = 0x40000000)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~2593741 KHz
EAL: Master lcore 0 is ready (tid=f7fe0900;cpuset=[0])
PMD: rte_igbvf_pmd_init(): >>
PMD: rte_i40e_pmd_init(): >>
PMD: rte_i40evf_pmd_init(): >>
PMD: rte_ixgbe_pmd_init(): >>
PMD: rte_ixgbevf_pmd_init(): >>
PMD: rte_vmxnet3_pmd_init(): >>
[New Thread 0x7fff9e401700 (LWP 10533)]
[New Thread 0x7fff9dc00700 (LWP 10534)]
EAL: lcore 4 is ready (tid=9dc00700;cpuset=[4])
[New Thread 0x7fff9d3ff700 (LWP 10535)]
EAL: lcore 5 is ready (tid=9d3ff700;cpuset=[5])
[New Thread 0x7fff9cbfe700 (LWP 10536)]
EAL: lcore 6 is ready (tid=9cbfe700;cpuset=[6])
[New Thread 0x7fff9c3fd700 (LWP 10537)]
[New Thread 0x7fff9bbfc700 (LWP 10538)]
EAL: lcore 7 is ready (tid=9c3fd700;cpuset=[7])
EAL: lcore 8 is ready (tid=9bbfc700;cpuset=[8])
[New Thread 0x7fff9b3fb700 (LWP 10539)]
[New Thread 0x7fff9abfa700 (LWP 10540)]
EAL: lcore 9 is ready (tid=9b3fb700;cpuset=[9])
EAL: lcore 10 is ready (tid=9abfa700;cpuset=[10])
[New Thread 0x7fff9a3f9700 (LWP 10541)]
EAL: lcore 11 is ready (tid=9a3f9700;cpuset=[11])
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL: probe driver: 1137:43 rte_enic_pmd
EAL: PCI memory mapped at 0x7fff99800000
EAL: PCI memory mapped at 0x7fff99808000
PMD: rte_enic_pmd: Initializing ENIC PMD version 1.0.0.6
PMD: rte_enic_pmd: vNIC MAC addr e8:65:49:1f:09:22 wq/rq 256/512 mtu 1500
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 8 rq 8 cq 16 intr 8
EAL: PCI device 0000:0e:00.0 on NUMA socket 0
EAL: probe driver: 8086:10fb rte_ixgbe_pmd
EAL: PCI memory mapped at 0x7fff9980a000
EAL: PCI memory mapped at 0x7fff9988a000
PMD: eth_ixgbe_dev_init(): >>
PMD: ixgbe_disable_intr(): >>
PMD: ixgbe_pf_host_init(): >>
PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 12, SFP+: 3
PMD: eth_ixgbe_dev_init(): port 1 vendorID=0x8086 deviceID=0x10fb
DPDK physical memory layout:
Segment 0: phys:0x35800000, len:12582912, virt:0x7fff98c00000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
Segment 1: phys:0x36800000, len:8388608, virt:0x7fff98200000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
Segment 2: phys:0x6d800000, len:130023424, virt:0x7fff90400000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
Segment 3: phys:0x1f8d400000, len:922746880, virt:0x7fff59200000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
RING: Cannot reserve memory
[New Thread 0x7fff591ff700 (LWP 10542)]
PMD: rte_enic_pmd: WQ 0 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: WQ 1 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: WQ 2 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: WQ 3 - number of tx desc in cmd line (2048)is greater than that in the UCSM/CIMC adapterpolicy. Applying the value in the adapter policy (256)
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: Set queue_id:0 free thresh:32
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: Set queue_id:1 free thresh:32
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: Set queue_id:2 free thresh:32
PMD: rte_enic_pmd: Number of rx_descs too high, adjusting to maximum
PMD: rte_enic_pmd: For mtu 1500 and mbuf size 2048 valid rx descriptor range is 64 to 512
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: Set queue_id:3 free thresh:32
PMD: rte_enic_pmd: vNIC resources used: wq 4 rq 4 cq 8 intr 8
EAL: memzone_reserve_aligned_thread_unsafe(): memzone <rss_key-0000:09:00.0> already exists
PMD: rte_enic_pmd: enic_alloc_consistent : Failed to allocate memory requested for rss_key-0000:09:00.0
PMD: rte_enic_pmd: RSS disabled, Failed to set RSS key.
PMD: ixgbe_dev_configure(): >>
PMD: ixgbe_dev_tx_queue_setup(): >>
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984e8f40 hw_ring=0x7fff984f0f80 dma_addr=0x36af0f80
PMD: ixgbe_set_tx_function(): Using full-featured tx code path
PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]
PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
PMD: ixgbe_dev_tx_queue_setup(): >>
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984d0ec0 hw_ring=0x7fff984d8f00 dma_addr=0x36ad8f00
PMD: ixgbe_set_tx_function(): Using full-featured tx code path
PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]
PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
PMD: ixgbe_dev_tx_queue_setup(): >>
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984b8e40 hw_ring=0x7fff984c0e80 dma_addr=0x36ac0e80
PMD: ixgbe_set_tx_function(): Using full-featured tx code path
PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]
PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
PMD: ixgbe_dev_tx_queue_setup(): >>
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7fff984a0dc0 hw_ring=0x7fff984a8e00 dma_addr=0x36aa8e00
PMD: ixgbe_set_tx_function(): Using full-featured tx code path
PMD: ixgbe_set_tx_function(): - txq_flags = f00 [IXGBE_SIMPLE_FLAGS=f01]
PMD: ixgbe_set_tx_function(): - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
PMD: ixgbe_dev_rx_queue_setup(): >>
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff9848cbc0 sw_sc_ring=0x7fff98488a80 hw_ring=0x7fff98490d00 dma_addr=0x36a90d00
PMD: ixgbe_dev_rx_queue_setup(): >>
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff984748c0 sw_sc_ring=0x7fff98470780 hw_ring=0x7fff98478a00 dma_addr=0x36a78a00
PMD: ixgbe_dev_rx_queue_setup(): >>
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff9845c5c0 sw_sc_ring=0x7fff98458480 hw_ring=0x7fff98460700 dma_addr=0x36a60700
PMD: ixgbe_dev_rx_queue_setup(): >>
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fff984442c0 sw_sc_ring=0x7fff98440180 hw_ring=0x7fff98448400 dma_addr=0x36a48400
PMD: rte_enic_pmd: queue 0, allocating 512 rx queue mbufs
PMD: rte_enic_pmd: port=0, qidx=0, Write 511 posted idx, 0 sw held
PMD: rte_enic_pmd: queue 2, allocating 512 rx queue mbufs
PMD: rte_enic_pmd: port=0, qidx=2, Write 511 posted idx, 0 sw held
PMD: rte_enic_pmd: queue 4, allocating 512 rx queue mbufs
PMD: rte_enic_pmd: port=0, qidx=4, Write 511 posted idx, 0 sw held
PMD: rte_enic_pmd: queue 6, allocating 512 rx queue mbufs
PMD: rte_enic_pmd: port=0, qidx=6, Write 511 posted idx, 0 sw held

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9c3fd700 (LWP 10537)]
0x000000000055ebbf in vnic_dev_priv (vdev=0x0)
at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98
98 return vdev->priv;
(gdb)
(gdb)
(gdb) where
#0 0x000000000055ebbf in vnic_dev_priv (vdev=0x0)
at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/base/vnic_dev.c:98
#1 0x0000000000559658 in enic_recv_pkts (rx_queue=0x7fff985c0738, rx_pkts=0x7fffc4ce1f80, nb_pkts=256)
at /scratch/localadmin/openvpp/vpp/build-root/build-vpp_debug-native/dpdk/dpdk-16.04/drivers/net/enic/enic_rxtx.c:248
#2 0x00007ffff6fbb85f in rte_eth_rx_burst (port_id=0 '\000', queue_id=3, rx_pkts=0x7fffc4ce1f80, nb_pkts=256)
at /scratch/localadmin/openvpp/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ethdev.h:2641
#3 0x00007ffff6fbbace in dpdk_rx_burst (dm=0xb28640 <dpdk_main>, xd=0x7fffc4ba1240, queue_id=3)
at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/dpdk_priv.h:65
#4 0x00007ffff6fbcfa3 in dpdk_device_input (dm=0xb28640 <dpdk_main>, xd=0x7fffc4ba1240, node=0x7fffc4dfbc48, cpu_index=4, queue_id=3,
use_efd=0) at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:511
#5 0x00007ffff6fbe060 in dpdk_input_rss (vm=0x7fffc4e4ea94, node=0x7fffc4dfbc48, f=0x0)
at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/node.c:822
#6 0x00007ffff74e4bc5 in dispatch_node (vm=0x7fffc4e4ea94, node=0x7fffc4dfbc48, type=VLIB_NODE_TYPE_INPUT,
dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, last_time_stamp=2795634055528941)
at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/main.c:996
#7 0x00007ffff6fc2579 in dpdk_worker_thread_internal (vm=0x7fffc4e4ea94, callback=0x0, have_io_threads=0)
at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:209
#8 0x00007ffff6fc27a4 in dpdk_worker_thread (w=0x7fffc521bba0, io_name=0x7ffff70aa10d "io", callback=0x0)
at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:265
#9 0x00007ffff6fc2806 in dpdk_worker_thread_fn (arg=0x7fffc521bba0)
at /scratch/localadmin/openvpp/vpp/build-data/../vnet/vnet/devices/dpdk/threads.c:272
#10 0x00007ffff6232584 in clib_calljmp () at /scratch/localadmin/openvpp/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110
#11 0x00007fff9c3fcbb0 in ?? ()
#12 0x00007ffff75260b0 in vlib_worker_thread_bootstrap_fn (arg=0x7fffc521bba0)
at /scratch/localadmin/openvpp/vpp/build-data/../vlib/vlib/threads.c:492
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

vvalderrv · 2025-01-31T06:53:55Z

Can you provide full startup log?

vvalderrv · 2025-01-31T06:53:57Z

VPP master code with what ever patches exist in the repo crashes.

vvalderrv · 2025-01-31T06:53:58Z

Dear Shesha,

Is this with new patch from John Daley?
If yes, have you verified that this is not the issue with that patch?

Regarding, your 2nd issue, it doesn't have anything with 1st one, you just need to do "sudo rm /dev/shm/

{vpe-api,db,global_vm}

" after VPP crashes.

vvalderrv closed this as completed Jan 31, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[VPP-115] VPP crashes during start with RSS enabled for enic #1374

[VPP-115] VPP crashes during start with RSS enabled for enic #1374

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025

vvalderrv commented Jan 31, 2025