subnet_prefix not honored with recent kernels #36

Open
a-denis opened this issue May 21, 2024 · 6 comments

Comments

a-denis commented May 21, 2024

Hi,

I noticed that with recent kernels, the subnet_prefix option has no effect.
In my opensm.conf, I have the following line:
subnet_prefix 0xfe80000000000006
(note the 6 at the end)
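
To compare this setting with the GID output, the 64-bit prefix can be rewritten in the colon-separated form that ibv_devinfo prints. A minimal Python sketch; the helper name `prefix_to_gid_groups` is made up for illustration and is not part of OpenSM or rdma-core:

```python
def prefix_to_gid_groups(subnet_prefix: str) -> str:
    """Format a 64-bit subnet prefix as the first four 16-bit groups of a GID."""
    value = int(subnet_prefix, 16)  # accepts the "0x..." form used in opensm.conf
    if not 0 <= value < 1 << 64:
        raise ValueError("subnet prefix must fit in 64 bits")
    hex_digits = f"{value:016x}"
    # Split the 16 hex digits into four groups of four.
    return ":".join(hex_digits[i:i + 4] for i in range(0, 16, 4))


print(prefix_to_gid_groups("0xfe80000000000006"))  # fe80:0000:0000:0006
```

The upper half of GID 0 should match this string when the configured prefix has been applied to the port.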

When I boot with the old Linux 4.19, it behaves as expected:

 ibv_devinfo -v | grep GID
			GID[  0]:		fe80:0000:0000:0006:b859:9f03:00db:f884

On the same machine, with nothing else changed, booting kernel 6.1 or 6.8 gives the default prefix instead of the configured one:

 ibv_devinfo -v | grep GID
			GID[  0]:		fe80:0000:0000:0000:b859:9f03:00db:f884
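
The manual comparison of the two outputs can be scripted. The following Python sketch pulls the subnet prefix (the first four groups) out of each GID line; the regular expression and the function name `gid_prefixes` are assumptions based only on the two samples shown in this report, not on a documented ibv_devinfo output format:

```python
import re

# Matches lines such as "GID[  0]:   fe80:0000:0000:0006:b859:9f03:00db:f884".
GID_LINE = re.compile(r"GID\[\s*(\d+)\]:\s*([0-9a-fA-F:]+)")


def gid_prefixes(devinfo_text):
    """Map each GID index to the first four groups (the subnet prefix) of its GID."""
    prefixes = {}
    for match in GID_LINE.finditer(devinfo_text):
        index, gid = int(match.group(1)), match.group(2).lower()
        prefixes[index] = ":".join(gid.split(":")[:4])
    return prefixes


# The two outputs quoted in this report.
kernel_4_19 = "\t\t\tGID[  0]:\t\tfe80:0000:0000:0006:b859:9f03:00db:f884\n"
kernel_6_x = "\t\t\tGID[  0]:\t\tfe80:0000:0000:0000:b859:9f03:00db:f884\n"
expected = "fe80:0000:0000:0006"

print(gid_prefixes(kernel_4_19)[0] == expected)  # True
print(gid_prefixes(kernel_6_x)[0] == expected)   # False
```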

even though the OpenSM log shows that the option was read:

OpenSM 3.3.23
 Reading Cached Option File: /etc/opensm/opensm.conf
 Loading Cached Option:subnet_prefix = 0xfe80000000000006
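
To check what OpenSM actually loaded, the cached options can be extracted from the log. This is only a sketch: the pattern is inferred from the three log lines quoted above, and `cached_options` is a hypothetical helper name.

```python
import re

# Pattern inferred from the line "Loading Cached Option:subnet_prefix = 0x...".
OPTION_LINE = re.compile(r"Loading Cached Option:\s*(\w+)\s*=\s*(\S+)")


def cached_options(log_text):
    """Return {option_name: value} for every 'Loading Cached Option' line."""
    return {m.group(1): m.group(2) for m in OPTION_LINE.finditer(log_text)}


log = """OpenSM 3.3.23
 Reading Cached Option File: /etc/opensm/opensm.conf
 Loading Cached Option:subnet_prefix = 0xfe80000000000006
"""

options = cached_options(log)
print(options["subnet_prefix"])                    # 0xfe80000000000006
print(int(options["subnet_prefix"], 16) & 0xFFFF)  # 6
```

The low 16 bits of the loaded value are 6, while the fourth group of the reported GID is 0000, so the value is read from the file but does not reach the port.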

I do not know which kernel version broke it.

Thank you.

Contributor

vladko1974 commented May 21, 2024 via email

Author

a-denis commented May 21, 2024

I just tried with subnet_prefix 0xfec0000000000006, but the result is the same: the reported prefix is still 0xfe80000000000000.

Contributor

vladko1974 commented May 21, 2024 via email

Author

a-denis commented Jun 20, 2024

Yes, I made the change on both hosts and restarted opensm on both.
subnet_prefix still has no effect with kernel 6.1 (Debian package).

I forgot to mention: there is no switch involved. Nodes are connected back to back.

Author

a-denis commented Jun 20, 2024

Some more details: on another pair of machines with older ConnectX-3 boards, subnet_prefix works even with kernel 6.1. The machines where it does not work use ConnectX-4 boards. This may be relevant.
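
To compare the two kinds of boards side by side, GID 0 of every port can be read from sysfs. This is a sketch assuming the usual /sys/class/infiniband/<device>/ports/<port>/gids/0 layout; `port_gid0` is a hypothetical helper, and the `root` parameter exists only so the function can be pointed at another directory:

```python
from pathlib import Path


def port_gid0(root="/sys/class/infiniband"):
    """Return {(device, port): gid0} for every RDMA port found under root."""
    result = {}
    for gid_file in sorted(Path(root).glob("*/ports/*/gids/0")):
        port_dir = gid_file.parent.parent      # .../<device>/ports/<port>
        device = port_dir.parent.parent.name   # .../<device>
        result[(device, port_dir.name)] = gid_file.read_text().strip()
    return result


if __name__ == "__main__":
    for (device, port), gid in port_gid0().items():
        print(f"{device} port {port}: {gid}")
```

ConnectX-3 boards are normally handled by the mlx4 driver and ConnectX-4 boards by mlx5, so the device names in the output would typically show which driver each port belongs to.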

Contributor

vladko1974 commented Jun 25, 2024 via email
