subnet_prefix not honored with recent kernels #36
According to the spec, 0xFE80::/64 is the only link-local subnet prefix that should be supported. See section 4.1.1, GID USAGE AND PROPERTIES, Vol 1 Release 1.7.
For the described use case, a site-local subnet prefix (0xFEC0::6) should be used.
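To make the distinction concrete, here is a minimal Python sketch that classifies a 64-bit subnet prefix. The function name and the classification rule are assumptions for illustration: the rule follows the IPv6-style convention that link-local starts with the bits 1111111010 and site-local with 1111111011. The spec section cited above remains the authority.

```python
# Hypothetical helper, not part of opensm: classify a 64-bit subnet prefix.
LINK_LOCAL_PREFIX = 0xFE80000000000000  # FE80::/64, all lower bits zero


def classify_prefix(prefix: int) -> str:
    """Return a rough class for a 64-bit GID subnet prefix."""
    if not 0 <= prefix < (1 << 64):
        raise ValueError("subnet prefix must fit in 64 bits")
    if prefix == LINK_LOCAL_PREFIX:
        return "link-local"
    top10 = prefix >> 54  # leading 10 bits of the prefix
    if top10 == 0b1111111011:  # FEC0::/10 (assumed site-local range)
        return "site-local"
    if top10 == 0b1111111010:  # starts like FE80 but is not FE80::/64
        return "link-local range, but not FE80::/64"
    return "other"


for value in (0xFE80000000000000, 0xFE80000000000006, 0xFEC0000000000006):
    print(hex(value), "->", classify_prefix(value))
```

Under these assumptions, 0xfe80000000000006 falls outside FE80::/64 because its low bits are not zero, while 0xfec0000000000006 lands in the site-local range.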
From: Alexandre DENIS ***@***.***>
Sent: Tuesday, May 21, 2024 4:13 PM
To: linux-rdma/opensm ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [linux-rdma/opensm] subnet_prefix not honored with recent kernels (Issue #36)
Hi,
I noticed that with recent kernels, the subnet_prefix has no effect.
In my opensm.conf, I have the following line:
subnet_prefix 0xfe80000000000006
(with the 6 at the end)
When I boot with the old Linux 4.19, it behaves as expected:
ibv_devinfo -v | grep GID
GID[ 0]: fe80:0000:0000:0006:b859:9f03:00db:f884
On the same machine, without touching anything else, if I boot with kernel 6.1 or 6.8, it is wrong:
ibv_devinfo -v | grep GID
GID[ 0]: fe80:0000:0000:0000:b859:9f03:00db:f884
even though in the log, it seems to have been taken into account:
OpenSM 3.3.23
Reading Cached Option File: /etc/opensm/opensm.conf
Loading Cached Option:subnet_prefix = 0xfe80000000000006
I do not know which kernel version broke it.
Thank you.
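The comparison described above can be automated. The following Python sketch, with hypothetical helper names, extracts the upper 64 bits of a GID as printed by ibv_devinfo and compares them with the subnet_prefix value from opensm.conf. It only parses text in the formats quoted in this report and does not talk to any device.

```python
import re
from typing import Optional


def gid_subnet_prefix(gid: str) -> int:
    """Return the upper 64 bits of a GID given as eight 16-bit hex groups."""
    groups = gid.strip().split(":")
    if len(groups) != 8:
        raise ValueError(f"expected 8 groups, got {len(groups)}: {gid!r}")
    value = 0
    for group in groups:
        value = (value << 16) | int(group, 16)
    return value >> 64  # drop the lower 64 bits (the port GUID part)


def configured_prefix(conf_text: str) -> Optional[int]:
    """Find a 'subnet_prefix 0x...' line in opensm.conf-style text."""
    match = re.search(r"^\s*subnet_prefix\s+(0x[0-9a-fA-F]+)", conf_text, re.MULTILINE)
    return int(match.group(1), 16) if match else None


# Values copied from the report above.
CONF = "subnet_prefix 0xfe80000000000006\n"
GIDS = {
    "kernel 4.19": "fe80:0000:0000:0006:b859:9f03:00db:f884",
    "kernel 6.1/6.8": "fe80:0000:0000:0000:b859:9f03:00db:f884",
}

expected = configured_prefix(CONF)
for label, gid in GIDS.items():
    actual = gid_subnet_prefix(gid)
    status = "matches" if actual == expected else "differs from"
    print(f"{label}: prefix {actual:#018x} {status} the configured value")
```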
Did you restart the subnet manager after changing the subnet prefix in opensm.conf?
From: Alexandre DENIS ***@***.***>
Sent: Tuesday, May 21, 2024 4:48 PM
To: linux-rdma/opensm ***@***.***>
Cc: Vladimir Koushnir ***@***.***>; Comment ***@***.***>
Subject: Re: [linux-rdma/opensm] subnet_prefix not honored with recent kernels (Issue #36)
I just tried with subnet_prefix 0xfec0000000000006, but the issue is the same, it's still 0xfe80000000000000.
Yes, I made the change on both hosts and restarted opensm on both. I forgot to mention: there is no switch involved. The nodes are connected back to back.
Some more details: on another pair of machines with old ConnectX-3 boards, subnet_prefix works even with kernel 6.1. The machines where it does not work use ConnectX-4 boards. This might be of some interest.
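When comparing several machines like this, it can help to read the GIDs straight from sysfs instead of grepping ibv_devinfo output. The sketch below assumes the usual Linux layout /sys/class/infiniband/<device>/ports/<port>/gids/<index>. The function name is hypothetical, and the root directory is a parameter so the code can be tried against a copy of the tree.

```python
from pathlib import Path
from typing import Dict, Tuple


def read_gid0_prefixes(sysfs_root: str = "/sys/class/infiniband") -> Dict[Tuple[str, str], str]:
    """Map (device, port) to the first four groups of GID index 0."""
    prefixes: Dict[Tuple[str, str], str] = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return prefixes  # no RDMA devices visible on this machine
    for gid_file in sorted(root.glob("*/ports/*/gids/0")):
        device, port = gid_file.parts[-5], gid_file.parts[-3]
        try:
            gid = gid_file.read_text().strip()
        except OSError:
            continue  # entry could not be read; skip it
        # The subnet prefix is the first four 16-bit groups of the GID.
        prefixes[(device, port)] = ":".join(gid.split(":")[:4])
    return prefixes


for (device, port), prefix in read_gid0_prefixes().items():
    print(f"{device} port {port}: subnet prefix {prefix}")
```

Running this on both the ConnectX-3 and ConnectX-4 hosts under the same kernel would give a side-by-side view of which ports picked up the configured prefix.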
The issue seems to have nothing to do with opensm.
Please refer to the relevant kernel forum.