[VPP-171] [vhost-user] vhost traffic flowing to a wrong tx virtQueue #1427

Closed
vvalderrv opened this issue Jan 31, 2025 · 6 comments

Comments

@vvalderrv
Contributor

Description

There is a case where vhost traffic flows to the wrong tx virtQueue.

This issue happens while reusing inactive vhost-user interfaces.

Here’s a scenario:

    - create vhost virtual interfaces between 2 VMs: ping works.
    - delete all vhost virtual interfaces
    - create loopback interface (this introduces a new interface, which is a normal way to create a different subnet)
    - create vhost virtual interfaces between 2 VMs: ping fails.

    Assignee

    Dave Barach

    Reporter

    Steve Shin

    Comments

    • shesha (Fri, 1 Jul 2016 18:03:28 +0000): Thanks Dave. Calling vlib_worker_thread_node_runtime_update(); was the missing piece. It works now.
    • dbarach (Fri, 1 Jul 2016 16:37:11 +0000): I suspect we may be back on this topic, but I've cleaned up several issues.
    • shesha (Fri, 1 Jul 2016 00:10:33 +0000): Steve can you check with the code that is at least as recent as below:

    DBGvpp# show version verbose

    Version: v16.09-rc0~161-gea3e1fc

    Compiled by: localadmin

    Compile host: ubuntu

    Compile date: Thu Jun 30 17:03:05 PDT 2016

    Compile location: /scratch/localadmin/openvpp/vpp.new

    Compiler: GCC 4.8.4

    CPU model name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz

    CPU microarchitecture: Haswell (Haswell-E)

    CPU flags: sse3 ssse3 sse41 sse42 avx avx2 aes

    Current PID: 6472

    DPDK Version: DPDK 16.04.0

    DPDK EAL init args: -c 3 -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w 0000:0a:00.0 --master-lcore 0 --socket-mem 512,512

    ------------------

    It looks like some sort of corruption is happening in vec_foreach (f, feature_vector) loop in find_config_with_features(). The reason I say that is, the vlib_node_runtime_t in vnet_interface_output_node_no_flatten_inline looks spooky after the execution of the above mentioned loop.

    =========

    BEFORE

    =========

    (gdb) p {vlib_node_runtime_t} 0x7fffc4d3d4e4

    $204 = {

    function = 0x7ffff6d33346 <vnet_interface_output_node_no_flatten>,

    errors = 0x7fffc4df8dec,

    clocks_since_last_overflow = 315980,

    max_clock = 128513,

    max_clock_n = 1,

    calls_since_last_overflow = 8,

    vectors_since_last_overflow = 8,

    next_frame_index = 677,

    node_index = 187,

    input_main_loops_per_call = 0,

    main_loop_count_last_dispatch = 162883242,

    main_loop_vector_stats = {0, 1},

    flags = 0,

    state = 0,

    n_next_nodes = 3,

    cached_next_index = 2,

    cpu_index = 0,

    runtime_data = {25769803782, 1, 0, 0, 0, 0, 0}

    =======

    AFTER

    =======

    (gdb) p {vlib_node_runtime_t} 0x7fffc4d3d4e4

    $209 = {

    function = 0x7ffff6d33346 <vnet_interface_output_node_no_flatten>,

    errors = 0x7fffc4df8dec,

    clocks_since_last_overflow = 0,

    max_clock = 0,

    max_clock_n = 0,

    calls_since_last_overflow = 0,

    vectors_since_last_overflow = 0,

    next_frame_index = 690,

    node_index = 187,

    input_main_loops_per_call = 0,

    main_loop_count_last_dispatch = 0,

    main_loop_vector_stats = {0, 0},

    flags = 0,

    state = 0,

    n_next_nodes = 3,

    cached_next_index = 0,

    cpu_index = 0,

    runtime_data = {25769803782, 4294967297, 0, 0, 0, 0, 0}

    Just for kicks, to see how the system would behave if this corruption had not happened, I added the following:

    --- a/vnet/vnet/interface_output.c
    +++ b/vnet/vnet/interface_output.c
    @@ -420,6 +420,11 @@ vnet_interface_output_node_no_flatten_inline (vlib_main_t * vm,

    from = vlib_frame_args (frame);

    + if (rt->is_deleted)
    + {
    + printf("RESET DELETE FLAG\n");
    + rt->is_deleted = 0;
    + }

    Now the VMs were pingable after the vhost interface ADD-DEL-ADD sequence.

    • jonshin (Thu, 30 Jun 2016 22:01:31 +0000):

      On my setup, I ran several vhost interface add/delete cycles while pinging between 2 VMs, but I didn't see any packet drops.

    Here's my configuration:

    Thread 0 vpp_main (lcore 0)

    Thread 1 vpp_wk_0 (lcore 1)

    cpu {

    main-core 0

    corelist-workers 1

    }

    Your issue seems to be related to your test environment.

    • shesha (Thu, 30 Jun 2016 01:03:02 +0000): Steve Shin and I actively debugged this issue and found that the following fixes the issue when VPP is running in single threaded mode. However, the problem persists in multi-threaded mode. Below, I have provided some information that I know as of now.

    ==================================

    Following fix works for single thread mode

    ==================================

    --- a/vnet/vnet/interface.c
    +++ b/vnet/vnet/interface.c
    @@ -656,6 +656,15 @@ vnet_register_interface (vnet_main_t * vnm,
       rt = vlib_node_get_runtime_data (vm, hw->output_node_index);
       ASSERT (rt->is_deleted == 1);
       rt->is_deleted = 0;
    +  rt->hw_if_index = hw_index;
    +  rt->sw_if_index = hw->sw_if_index;
    +  rt->dev_instance = hw->dev_instance;
    +
    +  rt = vlib_node_get_runtime_data (vm, hw->tx_node_index);
    +  rt->is_deleted = 0;
    +  rt->hw_if_index = hw_index;
    +  rt->sw_if_index = hw->sw_if_index;
    +  rt->dev_instance = hw->dev_instance;

       _vec_len (im->deleted_hw_interface_nodes) -= 1;
     }
    

    ================

    Multi threaded case

    ================

    Setup is very simple. VPP has two vhost interfaces with a VM attached to each one. Boot those VMs, and delete the interfaces and add them back. Reboot VMs to reconnect vhost file descriptors. (Steve has a patch in QEMU that does not require this reboot). After reboot, ping fails because one of the interfaces is showing as still deleted in app (show err).

    ===============

    Steps to reproduce

    ===============

    create vhost-user socket /tmp/sock0

    create vhost-user socket /tmp/sock1

    set interface state VirtualEthernet0/0/0 up

    set interface state VirtualEthernet0/0/1 up

    set interface l2 bridge VirtualEthernet0/0/0 23

    set interface l2 bridge VirtualEthernet0/0/1 23

    Boot VMs

    set interface state VirtualEthernet0/0/0 down

    set interface state VirtualEthernet0/0/1 down

    delete vhost-user sw_if_index <index1>

    delete vhost-user sw_if_index <index2>

    create vhost-user socket /tmp/sock0

    create vhost-user socket /tmp/sock1

    set interface state VirtualEthernet0/0/0 up

    set interface state VirtualEthernet0/0/1 up

    set interface l2 bridge VirtualEthernet0/0/0 23

    set interface l2 bridge VirtualEthernet0/0/1 23

    Problem

    ========

    The problem can be noticed when the first vhost interface is created after deletion. Packets are dropped because vnet_interface_output_node_no_flatten_inline() sees rt->is_deleted as 1. This is strange, as that variable is always zero and is never changed "explicitly". (The rt modified in vnet_register_interface() is not the one accessed in vnet_interface_output_node_no_flatten_inline(); their addresses are different.) I have verified this hundreds of times in GDB while debugging. What I noticed is that the 'rt' variable accessed in vnet_interface_output_node_no_flatten_inline() is getting updated as a side effect in the following function chain:

    #0 find_config_with_features (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>, feature_vector=0x7fffc4d6d244)

    at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:118
    

    #1 0x00007ffff6d0f515 in vnet_config_add_feature (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>,

    config_string_heap_index=1, feature_index=4, feature_config=0x0, n_feature_config_bytes=0)
    
    at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:276
    

    #2 0x00007ffff6eda86b in ip6_sw_interface_add_del (vnm=0xc691c0 <vnet_main>, sw_if_index=7, is_add=1)

    at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/ip/ip6_forward.c:1297
    

    #3 0x00007ffff6d1dc75 in call_elf_section_interface_callbacks (vnm=0xc691c0 <vnet_main>, if_index=7, flags=1,

    elt=0x7ffff74a5b00 <init_function>) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:219
    

    #4 0x00007ffff6d1de11 in call_sw_interface_add_del_callbacks (vnm=0xc691c0 <vnet_main>, sw_if_index=7, is_create=1)

    at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:252
    

    #5 0x00007ffff6d1e0d0 in vnet_sw_interface_set_flags_helper (vnm=0xc691c0 <vnet_main>, sw_if_index=7, flags=0, helper_flags=1)

    at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:335
    

    #6 0x00007ffff6d1fc15 in vnet_register_interface (vnm=0xc691c0 <vnet_main>, dev_class_index=6, dev_instance=2, hw_class_index=16,

    hw_instance=2) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:757
    

    #7 0x00007ffff6d80ff5 in ethernet_register_interface (vnm=0xc691c0 <vnet_main>, dev_class_index=6, dev_instance=2,

    After the execution of vec_foreach (f, feature_vector) loop in find_config_with_features(), the variable rt gets updated.

    (gdb) b vnet/vnet/config.c:116

    Breakpoint 26 at 0x7ffff6d0e310: file /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c, line 116.

    (gdb) p {vnet_interface_output_runtime_t} 0x7fffc4d3f5ec

    $171 = {

    hw_if_index = 6,

    sw_if_index = 6,

    dev_instance = 1,

    is_deleted = 0

    }

    (gdb) c

    Continuing.

    Breakpoint 26, find_config_with_features (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>,

    feature_vector=0x7fffc4d6d244) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:118
    

    118 if (last_node_index == ~0 || last_node_index != cm->end_node_index)

    (gdb) p {vnet_interface_output_runtime_t} 0x7fffc4d3f5ec

    $172 = {

    hw_if_index = 6,

    sw_if_index = 6,

    dev_instance = 1,

    is_deleted = 1

    }

    At this point I have run out of ideas, and it would be helpful if someone with more knowledge of VPP chipped in.

    • jonshin (Wed, 29 Jun 2016 19:56:49 +0000):

      If you look at the following packet trace capture, on VirtualEthernet0/0/0-tx node, it is trying to send to the wrong virtual QUEUE - VirtualEthernet0/0/1 tx queue 0.


    00:14:02:645617: dpdk-input

    VirtualEthernet0/0/1 rx queue 0

    buffer 0xfbd49da: current data 0, length 42, free-list 0, totlen-nifb 0, trace 0x2

    PKT MBUF: port 255, nb_segs 1, pkt_len 42

    buf_len 2176, data_len 42, ol_flags 0x0,
    
    packet_type 0x0
    

    ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff

    request, type ethernet/IP4, address size 6/4

    fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141

    00:14:02:645627: ethernet-input

    ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff

    00:14:02:645633: l2-input

    l2-input: sw_if_index 10 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67

    00:14:02:645634: arp-term-l2bd

    request, type ethernet/IP4, address size 6/4

    fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141

    00:14:02:645637: l2-flood

    l2-flood: sw_if_index 10 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67 bd_index 1

    00:14:02:645639: l2-output

    l2-output: sw_if_index 9 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67

    00:14:02:645643: VirtualEthernet0/0/0-output

    VirtualEthernet0/0/0

    ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff

    request, type ethernet/IP4, address size 6/4

    fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141

    00:14:02:645645: VirtualEthernet0/0/0-tx

    VirtualEthernet0/0/1 tx queue 0 ----> This should be VirtualEthernet0/0/0.

    buffer 0xfbd49da: current data 0, length 42, free-list 1, totlen-nifb 0, trace 0x2

    ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff

    request, type ethernet/IP4, address size 6/4

    fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141

    00:14:02:645661: l2-flood

    l2-flood: sw_if_index 10 dst 00:01:08:00:06:04 src 00:01:fa:16:3e:ad bd_index 1

    00:14:02:645664: arp-input

    request, type ethernet/IP4, address size 6/4

    fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141

    00:14:02:645668: error-drop

    arp-input: IP4 destination address not local to subnet

    -------

    This is where the problem happens:

    open-vpp/vnet/vnet/devices/dpdk/device.c

    dpdk_interface_tx (vlib_main_t * vm,

           vlib_node_runtime_t * node,
    
           vlib_frame_t * f)
    

    {

    dpdk_main_t * dm = &dpdk_main;

    vnet_interface_output_runtime_t * rd = (void *) node->runtime_data;

    dpdk_device_t * xd = vec_elt_at_index (dm->devices, rd->dev_instance); /* xd is looked up using dev_instance, which comes from the node's runtime_data; that data is set during vnet_register_interface(). */

    u32 n_packets = f->n_vectors;

    Original issue: https://jira.fd.io/browse/VPP-171

@vvalderrv
Contributor Author

Thanks Dave. Calling vlib_worker_thread_node_runtime_update(); was the missing piece. It works now.
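
The differing `rt` addresses reported in this thread suggest that worker threads dispatch from their own copies of the node runtime data. The sketch below is a toy model of that situation, not VPP code: every type and function name in it (`toy_rt_t`, `toy_main_thread_set`, `toy_sync_workers`, `toy_is_deleted`) is hypothetical, and `toy_sync_workers` only stands in for the role that `vlib_worker_thread_node_runtime_update()` appears to play here.

```c
#include <stdint.h>
#include <string.h>

#define TOY_N_THREADS 2		/* thread 0 = main, thread 1 = worker */
#define TOY_N_NODES 4

/* Hypothetical, reduced view of a node's per-interface runtime data. */
typedef struct
{
  uint32_t dev_instance;
  uint32_t is_deleted;
} toy_rt_t;

/* Assumption: each thread holds its own copy of every node's runtime data. */
static toy_rt_t toy_runtimes[TOY_N_THREADS][TOY_N_NODES];

/* The main thread updates only its own copy (thread 0). */
static void
toy_main_thread_set (uint32_t node_index, toy_rt_t value)
{
  toy_runtimes[0][node_index] = value;
}

/* Stand-in for the refresh step: copy the main thread's runtime data
   into every worker's private copy. */
static void
toy_sync_workers (void)
{
  for (int t = 1; t < TOY_N_THREADS; t++)
    memcpy (toy_runtimes[t], toy_runtimes[0], sizeof (toy_runtimes[0]));
}

/* What a given thread sees when it dispatches the node. */
static uint32_t
toy_is_deleted (int thread_index, uint32_t node_index)
{
  return toy_runtimes[thread_index][node_index].is_deleted;
}
```

In this model, clearing `is_deleted` on the main thread leaves the worker's copy stale until `toy_sync_workers()` runs, which is consistent with the fix working in single-threaded mode only.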

@vvalderrv
Contributor Author

I suspect we may be back on this topic, but I've cleaned up several issues.

@vvalderrv
Contributor Author

Steve can you check with the code that is at least as recent as below:

DBGvpp# show version verbose
Version: v16.09-rc0~161-gea3e1fc
Compiled by: localadmin
Compile host: ubuntu
Compile date: Thu Jun 30 17:03:05 PDT 2016
Compile location: /scratch/localadmin/openvpp/vpp.new
Compiler: GCC 4.8.4
CPU model name: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
CPU microarchitecture: Haswell (Haswell-E)
CPU flags: sse3 ssse3 sse41 sse42 avx avx2 aes
Current PID: 6472
DPDK Version: DPDK 16.04.0
DPDK EAL init args: -c 3 -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w 0000:0a:00.0 --master-lcore 0 --socket-mem 512,512
------------------

It looks like some sort of corruption is happening in vec_foreach (f, feature_vector) loop in find_config_with_features(). The reason I say that is, the vlib_node_runtime_t in vnet_interface_output_node_no_flatten_inline looks spooky after the execution of the above mentioned loop.

=========
BEFORE
=========
(gdb) p {vlib_node_runtime_t} 0x7fffc4d3d4e4
$204 = {
function = 0x7ffff6d33346 <vnet_interface_output_node_no_flatten>,
errors = 0x7fffc4df8dec,
clocks_since_last_overflow = 315980,
max_clock = 128513,
max_clock_n = 1,
calls_since_last_overflow = 8,
vectors_since_last_overflow = 8,
next_frame_index = 677,
node_index = 187,
input_main_loops_per_call = 0,
main_loop_count_last_dispatch = 162883242,
main_loop_vector_stats = {0, 1},
flags = 0,
state = 0,
n_next_nodes = 3,
cached_next_index = 2,
cpu_index = 0,
runtime_data = {25769803782, 1, 0, 0, 0, 0, 0}

=======
AFTER
=======
(gdb) p {vlib_node_runtime_t} 0x7fffc4d3d4e4
$209 = {
function = 0x7ffff6d33346 <vnet_interface_output_node_no_flatten>,
errors = 0x7fffc4df8dec,
clocks_since_last_overflow = 0,
max_clock = 0,
max_clock_n = 0,
calls_since_last_overflow = 0,
vectors_since_last_overflow = 0,
next_frame_index = 690,
node_index = 187,
input_main_loops_per_call = 0,
main_loop_count_last_dispatch = 0,
main_loop_vector_stats = {0, 0},
flags = 0,
state = 0,
n_next_nodes = 3,
cached_next_index = 0,
cpu_index = 0,
runtime_data = {25769803782, 4294967297, 0, 0, 0, 0, 0}
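
The two `runtime_data` dumps above are easier to compare once the 64-bit words are split into 32-bit fields. The sketch below does that under two assumptions: the runtime data starts with four `u32` fields in the order shown by the `vnet_interface_output_runtime_t` print in this thread, and the host is little-endian (x86-64). The type and function names (`output_runtime_view_t`, `decode_runtime_words`) are made up for illustration.

```c
#include <stdint.h>

/* Assumed layout: four u32 fields, in the order gdb prints them for
   vnet_interface_output_runtime_t in this thread. */
typedef struct
{
  uint32_t hw_if_index;
  uint32_t sw_if_index;
  uint32_t dev_instance;
  uint32_t is_deleted;
} output_runtime_view_t;

/* Split the first two 64-bit words of runtime_data into 32-bit fields.
   On a little-endian host the low half of each word is the field that
   comes first in memory. */
static output_runtime_view_t
decode_runtime_words (uint64_t w0, uint64_t w1)
{
  output_runtime_view_t v;
  v.hw_if_index = (uint32_t) (w0 & 0xffffffffu);
  v.sw_if_index = (uint32_t) (w0 >> 32);
  v.dev_instance = (uint32_t) (w1 & 0xffffffffu);
  v.is_deleted = (uint32_t) (w1 >> 32);
  return v;
}
```

With these assumptions, `decode_runtime_words (25769803782, 1)` gives hw_if_index 6, sw_if_index 6, dev_instance 1, is_deleted 0, and `decode_runtime_words (25769803782, 4294967297)` differs only in is_deleted, which becomes 1. That matches the `is_deleted` flip seen in the `vnet_interface_output_runtime_t` prints.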

Just for kicks, to see how the system would behave if this corruption had not happened, I added the following:

--- a/vnet/vnet/interface_output.c
+++ b/vnet/vnet/interface_output.c
@@ -420,6 +420,11 @@ vnet_interface_output_node_no_flatten_inline (vlib_main_t * vm,

from = vlib_frame_args (frame);

+ if (rt->is_deleted)
+ {
+ printf("RESET DELETE FLAG\n");
+ rt->is_deleted = 0;
+ }

Now the VMs were pingable after the vhost interface ADD-DEL-ADD sequence.

@vvalderrv
Contributor Author

On my setup, I ran several vhost interface add/delete cycles while pinging between 2 VMs, but I didn't see any packet drops.
Here's my configuration:
Thread 0 vpp_main (lcore 0)
Thread 1 vpp_wk_0 (lcore 1)

cpu {
main-core 0
corelist-workers 1
}

Your issue seems to be related to your test environment.

@vvalderrv
Contributor Author

Steve Shin and I actively debugged this issue and found that the following fixes the issue when VPP is running in single threaded mode. However, the problem persists in multi-threaded mode. Below, I have provided some information that I know as of now.

==================================
Following fix works for single thread mode
==================================
--- a/vnet/vnet/interface.c
+++ b/vnet/vnet/interface.c
@@ -656,6 +656,15 @@ vnet_register_interface (vnet_main_t * vnm,
rt = vlib_node_get_runtime_data (vm, hw->output_node_index);
ASSERT (rt->is_deleted == 1);
rt->is_deleted = 0;
+ rt->hw_if_index = hw_index;
+ rt->sw_if_index = hw->sw_if_index;
+ rt->dev_instance = hw->dev_instance;
+
+ rt = vlib_node_get_runtime_data (vm, hw->tx_node_index);
+ rt->is_deleted = 0;
+ rt->hw_if_index = hw_index;
+ rt->sw_if_index = hw->sw_if_index;
+ rt->dev_instance = hw->dev_instance;

_vec_len (im->deleted_hw_interface_nodes) -= 1;
}

================
Multi threaded case
================
Setup is very simple. VPP has two vhost interfaces with a VM attached to each one. Boot those VMs, and delete the interfaces and add them back. Reboot VMs to reconnect vhost file descriptors. (Steve has a patch in QEMU that does not require this reboot). After reboot, ping fails because one of the interfaces is showing as still deleted in app (show err).

===============
Steps to reproduce
===============
create vhost-user socket /tmp/sock0
create vhost-user socket /tmp/sock1
set interface state VirtualEthernet0/0/0 up
set interface state VirtualEthernet0/0/1 up
set interface l2 bridge VirtualEthernet0/0/0 23
set interface l2 bridge VirtualEthernet0/0/1 23

Boot VMs

set interface state VirtualEthernet0/0/0 down
set interface state VirtualEthernet0/0/1 down
delete vhost-user sw_if_index <index1>
delete vhost-user sw_if_index <index2>
create vhost-user socket /tmp/sock0
create vhost-user socket /tmp/sock1
set interface state VirtualEthernet0/0/0 up
set interface state VirtualEthernet0/0/1 up
set interface l2 bridge VirtualEthernet0/0/0 23
set interface l2 bridge VirtualEthernet0/0/1 23

Problem
========
The problem can be noticed when the first vhost interface is created after deletion. Packets are dropped because vnet_interface_output_node_no_flatten_inline() sees rt->is_deleted as 1. This is strange, as that variable is always zero and is never changed "explicitly". (The rt modified in vnet_register_interface() is not the one accessed in vnet_interface_output_node_no_flatten_inline(); their addresses are different.) I have verified this hundreds of times in GDB while debugging. What I noticed is that the 'rt' variable accessed in vnet_interface_output_node_no_flatten_inline() is getting updated as a side effect in the following function chain:

#0 find_config_with_features (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>, feature_vector=0x7fffc4d6d244)
at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:118
#1 0x00007ffff6d0f515 in vnet_config_add_feature (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>,
config_string_heap_index=1, feature_index=4, feature_config=0x0, n_feature_config_bytes=0)
at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:276
#2 0x00007ffff6eda86b in ip6_sw_interface_add_del (vnm=0xc691c0 <vnet_main>, sw_if_index=7, is_add=1)
at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/ip/ip6_forward.c:1297
#3 0x00007ffff6d1dc75 in call_elf_section_interface_callbacks (vnm=0xc691c0 <vnet_main>, if_index=7, flags=1,
elt=0x7ffff74a5b00 <init_function>) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:219
#4 0x00007ffff6d1de11 in call_sw_interface_add_del_callbacks (vnm=0xc691c0 <vnet_main>, sw_if_index=7, is_create=1)
at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:252
#5 0x00007ffff6d1e0d0 in vnet_sw_interface_set_flags_helper (vnm=0xc691c0 <vnet_main>, sw_if_index=7, flags=0, helper_flags=1)
at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:335
#6 0x00007ffff6d1fc15 in vnet_register_interface (vnm=0xc691c0 <vnet_main>, dev_class_index=6, dev_instance=2, hw_class_index=16,
hw_instance=2) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/interface.c:757
#7 0x00007ffff6d80ff5 in ethernet_register_interface (vnm=0xc691c0 <vnet_main>, dev_class_index=6, dev_instance=2,

After the execution of vec_foreach (f, feature_vector) loop in find_config_with_features(), the variable rt gets updated.

(gdb) b vnet/vnet/config.c:116
Breakpoint 26 at 0x7ffff6d0e310: file /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c, line 116.
(gdb) p {vnet_interface_output_runtime_t} 0x7fffc4d3f5ec
$171 = {
hw_if_index = 6,
sw_if_index = 6,
dev_instance = 1,
is_deleted = 0
}
(gdb) c
Continuing.

Breakpoint 26, find_config_with_features (vm=0xc68a80 <vlib_global_main>, cm=0x7ffff74a93f8 <ip6_main+312>,
feature_vector=0x7fffc4d6d244) at /scratch/localadmin/openvpp/vpp.new/build-data/../vnet/vnet/config.c:118
118 if (last_node_index == ~0 || last_node_index != cm->end_node_index)
(gdb) p {vnet_interface_output_runtime_t} 0x7fffc4d3f5ec
$172 = {
hw_if_index = 6,
sw_if_index = 6,
dev_instance = 1,
is_deleted = 1
}

At this point I have run out of ideas, and it would be helpful if someone with more knowledge of VPP chipped in.

@vvalderrv
Contributor Author

If you look at the following packet trace capture, on VirtualEthernet0/0/0-tx node, it is trying to send to the wrong virtual QUEUE - VirtualEthernet0/0/1 tx queue 0.
-------
00:14:02:645617: dpdk-input
VirtualEthernet0/0/1 rx queue 0
buffer 0xfbd49da: current data 0, length 42, free-list 0, totlen-nifb 0, trace 0x2
PKT MBUF: port 255, nb_segs 1, pkt_len 42
buf_len 2176, data_len 42, ol_flags 0x0,
packet_type 0x0
ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff
request, type ethernet/IP4, address size 6/4
fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141
00:14:02:645627: ethernet-input
ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff
00:14:02:645633: l2-input
l2-input: sw_if_index 10 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67
00:14:02:645634: arp-term-l2bd
request, type ethernet/IP4, address size 6/4
fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141
00:14:02:645637: l2-flood
l2-flood: sw_if_index 10 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67 bd_index 1
00:14:02:645639: l2-output
l2-output: sw_if_index 9 dst ff:ff:ff:ff:ff:ff src fa:16:3e:ad:f2:67
00:14:02:645643: VirtualEthernet0/0/0-output
VirtualEthernet0/0/0
ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff
request, type ethernet/IP4, address size 6/4
fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141
00:14:02:645645: VirtualEthernet0/0/0-tx
VirtualEthernet0/0/1 tx queue 0 ----> This should be VirtualEthernet0/0/0.
buffer 0xfbd49da: current data 0, length 42, free-list 1, totlen-nifb 0, trace 0x2
ARP: fa:16:3e:ad:f2:67 -> ff:ff:ff:ff:ff:ff
request, type ethernet/IP4, address size 6/4
fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141
00:14:02:645661: l2-flood
l2-flood: sw_if_index 10 dst 00:01:08:00:06:04 src 00:01:fa:16:3e:ad bd_index 1
00:14:02:645664: arp-input
request, type ethernet/IP4, address size 6/4
fa:16:3e:ad:f2:67/51.51.51.142 -> 00:00:00:00:00:00/51.51.51.141
00:14:02:645668: error-drop
arp-input: IP4 destination address not local to subnet
-------

This is where the problem happens:
open-vpp/vnet/vnet/devices/dpdk/device.c
dpdk_interface_tx (vlib_main_t * vm,
vlib_node_runtime_t * node,
vlib_frame_t * f)
{
dpdk_main_t * dm = &dpdk_main;
vnet_interface_output_runtime_t * rd = (void *) node->runtime_data;
dpdk_device_t * xd = vec_elt_at_index (dm->devices, rd->dev_instance); /* xd is looked up using dev_instance, which comes from the node's runtime_data; that data is set during vnet_register_interface(). */
u32 n_packets = f->n_vectors;
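
The snippet above picks the device purely from `rd->dev_instance`, so a reused node whose runtime data was not rewritten will transmit on whatever device the previous interface used. The following is a minimal toy model of that failure and of the single-threaded fix quoted earlier in this thread. It is not VPP code: the `toy_*` names are hypothetical, and the "stale" path assumes, based on the diff context, that reuse previously cleared only `is_deleted`.

```c
#include <stdint.h>

#define TOY_N_DEVICES 4

/* Hypothetical stand-ins for the runtime data and device structures. */
typedef struct
{
  uint32_t hw_if_index;
  uint32_t sw_if_index;
  uint32_t dev_instance;
  uint32_t is_deleted;
} toy_runtime_t;

typedef struct
{
  uint32_t tx_packets;
} toy_device_t;

static toy_device_t toy_devices[TOY_N_DEVICES];

/* Mirrors the lookup in dpdk_interface_tx: the device is chosen from
   the node's runtime data and nothing else. */
static toy_device_t *
toy_tx (toy_runtime_t * rd)
{
  toy_device_t *xd = &toy_devices[rd->dev_instance];
  xd->tx_packets++;
  return xd;
}

/* Reuse a deleted node for a new interface. With refresh == 0 only the
   is_deleted flag is cleared (assumed pre-fix behaviour); with
   refresh != 0 every field is rewritten, as in the proposed patch. */
static void
toy_reuse_node (toy_runtime_t * rt, uint32_t hw_if_index,
		uint32_t sw_if_index, uint32_t dev_instance, int refresh)
{
  rt->is_deleted = 0;
  if (refresh)
    {
      rt->hw_if_index = hw_if_index;
      rt->sw_if_index = sw_if_index;
      rt->dev_instance = dev_instance;
    }
}
```

For example, if a node last served `dev_instance` 1 and is reused for `dev_instance` 0 without a refresh, `toy_tx` still returns `&toy_devices[1]`, which is the same shape as the VirtualEthernet0/0/0-tx node sending on the VirtualEthernet0/0/1 queue in the trace.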
