Installing BCM
Asked a question 7 months ago

How do I get InfiniBand working for BCM 9.1 on CentOS 8.3? I have tried to install the head node with a Mellanox ConnectX-2 interface, but the BCM 9.1 installation on CentOS 8.3 does not recognize the interface card. The head node works properly with BCM 9.1 on Ubuntu 18.04, on Ubuntu 20.04, and on CentOS 7.9 (using the default OFED stack). I noticed a difference in the loaded kernel modules between these versions, but even after manually loading the kernel modules mlx4_core, mlx4_ib, mlx4_en, ib_core, and ib_uverbs during the first installation step, I still had no success. Any suggestions?
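
For reference, the manual loading step looked roughly like this (the lsmod check at the end is just to confirm the modules actually loaded; the grep pattern is illustrative):

modprobe mlx4_core
modprobe mlx4_ib
modprobe mlx4_en
modprobe ib_core
modprobe ib_uverbs
lsmod | grep -E 'mlx4|ib_'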


trick1
Guy Who Does Stuff

Hi mini4ever 

So what I have found outside of Bright Cluster (BBC, as in I had this kind of issue under Ubuntu in my cluster Before Bright Cluster) is that you need to pay attention to the drivers that support your version of the card.

I would start by logging in via SSH and running the following:

lspci | grep Mellanox

You should find something like the following:

08:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR / 10GigE] (rev b0)
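
It also helps to check whether a kernel driver has actually bound to the device. lspci -k shows this (the bus address below comes from the example output above, and the driver lines are what you would expect for a card handled by the in-kernel mlx4 driver; your output will vary):

lspci -k -s 08:00.0

08:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR / 10GigE] (rev b0)
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_core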

At which point you can determine which version of the OFED drivers supports your card. Old cards such as the ConnectX-2 are not supported in current drivers.

Current card support matrix: https://www.mellanox.com/support/mlnx-ofed-matrix71

For older cards you will need to use an older Mellanox driver (which may be unsupported in BC).
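
If a Mellanox OFED stack is already installed on a node, you can print its version to compare against the matrix (ofed_info ships with MLNX_OFED, so it will not exist if only the distro's inbox drivers are present):

ofed_info -s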

I would work out how to stand up Linux on a node with a working driver, and then determine how to get that into the BC node deployment.
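
On the BC side, one possible route is a sketch like the following, based on how earlier BCM releases expose a per-image kernel module list in cmsh. The image name default-image is an assumption here; run "list" in softwareimage mode to find your actual image name:

cmsh -c "softwareimage; use default-image; kernelmodules; add mlx4_ib; commit"

After committing, CMDaemon should regenerate the image's ramdisk so the module is loaded at provisioning time; verify with lsmod on a rebooted node.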