Errors on ARM CPU #51

Open
ThetaDR opened this issue Jul 15, 2022 · 2 comments

Comments

@ThetaDR

ThetaDR commented Jul 15, 2022

NSM's forwarder-vpp fails to start in a k8s cluster running on an ARM machine (Apple M1).
I have tried starting it with and without Rosetta; both cases give the same error.
Here is the log from the forwarder:

Jul 15 15:11:30.600 [INFO] [cmd:/bin/forwarder] Setting env variable DLV_LISTEN_FORWARDER to a valid dlv '--listen' value will cause the dlv debugger to execute this binary and listen as directed.
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] there are 9 phases which will be executed followed by a success message:
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] the phases include:
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 1: get config from environment
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 2: run vpp and get a connection to it
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 3: get SR-IOV config from file
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 4: init pools
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 5: start device plugin server
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 6: retrieve spiffe svid
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 7: create xconnect network service endpoint
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 8: create grpc server and register xconnect
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 9: register xconnectns with the registry
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] a final success message with start time duration
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] executing phase 1: get config from environment (time since start: 805.75µs)
This application is configured via the environment. The following environment
variables can be used:

KEY                          TYPE                                           DEFAULT                                                REQUIRED    DESCRIPTION
NSM_NAME                     String                                         forwarder                                                          Name of Endpoint
NSM_LABELS                   Comma-separated list of String:String pairs    p2p:true                                                           Labels related to this forwarder-vpp instance
NSM_NSNAME                   String                                         forwarder                                                          Name of Network Service to Register with Registry
NSM_CONNECT_TO               URL                                            unix:///connect.to.socket                                          url to connect to
NSM_LISTEN_ON                URL                                            unix:///listen.on.socket                                           url to listen on
NSM_MAX_TOKEN_LIFETIME       Duration                                       10m                                                                maximum lifetime of tokens
NSM_LOG_LEVEL                String                                         INFO                                                               Log level
NSM_DIAL_TIMEOUT             Duration                                       100ms                                                              Timeout for the dial the next endpoint
NSM_OPENTELEMETRYENDPOINT    String                                         otel-collector.observability.svc.cluster.local:4317                OpenTelemetry Collector Endpoint
NSM_TUNNEL_IP                String                                                                                                            IP to use for tunnels
NSM_VXLAN_PORT               Unsigned Integer                               0                                                                  VXLAN port to use
NSM_VPP_API_SOCKET           String                                         /var/run/vpp/external/vpp-api.sock                                 filename of socket to connect to existing VPP instance.  If empty a VPP instance is run in forwarder
NSM_VPP_INIT                 Func                                           NONE                                                               type of VPP initialization. Must be NONE or AF_PACKET
NSM_RESOURCE_POLL_TIMEOUT    Duration                                       30s                                                                device plugin polling timeout
NSM_DEVICE_PLUGIN_PATH       String                                         /var/lib/kubelet/device-plugins/                                   path to the device plugin directory
NSM_POD_RESOURCES_PATH       String                                         /var/lib/kubelet/pod-resources/                                    path to the pod resources directory
NSM_DEVICE_SELECTOR_FILE     String                                                                                                            config file for device name to label matching
NSM_SRIOV_CONFIG_FILE        String                                                                                                            PCI resources config path
NSM_PCI_DEVICES_PATH         String                                         /sys/bus/pci/devices                                               path to the PCI devices directory
NSM_PCI_DRIVERS_PATH         String                                         /sys/bus/pci/drivers                                               path to the PCI drivers directory
NSM_CGROUP_PATH              String                                         /host/sys/fs/cgroup/devices                                        path to the host cgroup directory
NSM_VFIO_PATH                String                                         /host/dev/vfio                                                     path to the host VFIO directory
Jul 15 15:11:30.621 [INFO] [cmd:/bin/forwarder] Config: &config.Config{Name:"forwarder-vpp-9mgxg", Labels:map[string]string{"p2p":"true"}, NSName:"forwarder", ConnectTo:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/var/lib/networkservicemesh/nsm.io.sock", RawPath:"", ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, ListenOn:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/listen.on.sock", RawPath:"", ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, MaxTokenLifetime:600000000000, LogLevel:"TRACE", DialTimeout:100000000, OpenTelemetryEndpoint:"otel-collector.observability.svc.cluster.local:4317", TunnelIP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xac, 0x13, 0x0, 0x3}, VxlanPort:0x0, VppAPISocket:"/var/run/vpp/external/vpp-api.sock", VppInit:vppinit.Func{f:(func(context.Context, api.Connection, net.IP) (net.IP, error))(0xc875e0)}, ResourcePollTimeout:30000000000, DevicePluginPath:"/var/lib/kubelet/device-plugins/", PodResourcesPath:"/var/lib/kubelet/pod-resources/", DeviceSelectorFile:"", SRIOVConfigFile:"", PCIDevicesPath:"/sys/bus/pci/devices", PCIDriversPath:"/sys/bus/pci/drivers", CgroupPath:"/host/sys/fs/cgroup/devices", VFIOPath:"/host/dev/vfio"}
Jul 15 15:11:30.622 [INFO] [cmd:/bin/forwarder] [duration:18.244709ms] completed phase 1: get config from environment
Jul 15 15:11:30.622 [INFO] [cmd:/bin/forwarder] executing phase 2: run vpp and get a connection to it (time since start: 19.806667ms)
Jul 15 15:11:30.623 [INFO] Configuration file: "/etc/vpp/helper/vpp.conf" not found, using defaults
Jul 15 15:11:30.633 [INFO] [cmd:/bin/forwarder] local vpp is being used
Jul 15 15:11:30.634 [INFO] [cmd:/bin/forwarder] [duration:11.589958ms] completed phase 2: run vpp and get a connection to it
Jul 15 15:11:30.634 [WARN] [cmd:/bin/forwarder] skipping phases 3-5: no PCI resources config
Jul 15 15:11:30.634 [WARN] [cmd:/bin/forwarder] SR-IOV is not enabled
Jul 15 15:11:30.634 [INFO] [cmd:/bin/forwarder] executing phase 6: retrieving svid, check spire agent logs if this is the last line you see (time since start: 32.209ms)
Jul 15 15:11:30.898 [INFO] SVID: "spiffe://example.org/ns/nsm-system/pod/forwarder-vpp-9mgxg"
Jul 15 15:11:30.906 [INFO] [cmd:/bin/forwarder] [duration:271.382458ms] completed phase 6: retrieving svid
Jul 15 15:11:30.906 [INFO] [cmd:/bin/forwarder] executing phase 7: create xconnect network service endpoint (time since start: 304.018959ms)
Jul 15 15:11:30.624 [INFO] [cmd:vpp] vpp[3500]: clib_sysfs_prealloc_hugepages:262: pre-allocating 64 additional 2048K hugepages on numa node 0
Jul 15 15:11:30.624 [INFO] [cmd:vpp] vpp[3500]: buffer: numa[0] falling back to non-hugepage backed buffer pool (vlib_physmem_shared_map_create: pmalloc_map_pages: failed to mmap 64 pages at 0x1000000000 fd 5 numa 0 flags 0x11: Invalid argument)
Jul 15 15:11:33.219 [DEBU] /var/run/vpp/api.sock was created after 2.587428293s
Jul 15 15:11:33.330 [DEBU] successfully connected to /var/run/vpp/api.sock after 110.855416ms and 1 attempts
panic: error: VPPApiError: System call error #1 (-11)

goroutine 1 [running]:
github.com/networkservicemesh/cmd-forwarder-vpp/internal/vppinit.Must(...)
	/build/internal/vppinit/vppinit.go:68
main.main()
	/build/main.go:239 +0x2f45

The same log is attached as a file:
forwarder-vpp-9mgxg.log

@richardstone

Hello!

I have the same issue. Can we follow up on it? How could this be solved?

/Richard

@edwarnicke
Owner

edwarnicke commented Sep 6, 2022

@richardstone I think the issue here is we need to do proper ARM builds.

My current thoughts are to look at ARM runners on Equinix Metal.
