Why is the Ethernet switch port-based identification not working?
The steps below help you troubleshoot why port-based identification on an Ethernet switch might not be working. If they do not solve the problem, contact the Bright Computing technical support team for more help.
Bright Cluster Manager will not generate DNS entries for your switches if they are on the externalnet. It assumes that your external DNS server knows about these switches.
Make sure that:
1. You can ping the management IP address of your switches from the head node
2. You have set the SNMP community string as the 'readstring' property for each switch. The SNMP community string works like a user ID or password that allows access to the statistics of a router or other device. It is used only by devices that support the SNMPv1 and SNMPv2c protocols.
* If the community string is correct, the device responds with the requested information.
* If the community string is incorrect, the device simply discards the request and does not respond.
3. SNMP is working properly for your switch. Try snmpwalk -v1 -c public xxx.xxx.xxx.xxx (where 'public' is the SNMP community string and xxx.xxx.xxx.xxx is the IP address of your switch). Note the use of SNMP v1. If SNMP is not working correctly, then an alternative approach is explained in http://kb.brightcomputing.com/faq/index.php?action=artikel&id=425
4. The 'device showport <MAC address>' command in cmsh is working.
5. Run cm-diagnose -j -e <your switch name> to collect useful diagnostics. Use the name of the switch, not its IP address. To find the name of the switch, list the devices in cmsh:
[root@my-head ~]# cmsh
[my-head]% device list
Type             Hostname (key) MAC                Category  Ip           Network
---------------- -------------- ------------------ --------- ------------ -----------
EthernetSwitch   myswitch01     00:00:00:00:00:00            10.141.0.253 internalnet
HeadNode         master2        00:16:3E:1B:5B:39            10.141.0.252 internalnet
HeadNode         my-head        00:16:3E:74:FC:67            10.141.0.254 internalnet
PhysicalNode     node001        00:16:3E:8F:15:37  default   10.141.0.1   internalnet
PhysicalNode     node002        00:16:3E:6D:56:84  default   10.141.0.2   internalnet
In the above example "myswitch01" is the name of the switch.
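Checks 1 and 3 from the list above can be scripted. The following is a minimal sketch, not a Bright tool: the function names (build_ping_cmd, build_snmpwalk_cmd, run_check, preflight) are hypothetical, it assumes that ping and the Net-SNMP snmpwalk utility are installed on the head node, and it restricts the walk to the standard system subtree (OID 1.3.6.1.2.1.1) so that the check finishes quickly. That restriction is an addition to the command shown in step 3.

```python
import subprocess


def build_ping_cmd(ip, count=3):
    """Command for check 1: ping the switch management IP (Linux ping syntax)."""
    return ["ping", "-c", str(count), ip]


def build_snmpwalk_cmd(ip, community="public"):
    """Command for check 3: SNMP v1 walk, limited to the system subtree."""
    # 1.3.6.1.2.1.1 is the standard MIB-2 "system" subtree (an assumption
    # added here to keep the walk short; step 3 walks the whole tree).
    return ["snmpwalk", "-v1", "-c", community, ip, "1.3.6.1.2.1.1"]


def run_check(cmd, timeout=15):
    """Return True if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, permission problem, or no answer in time.
        return False
    return result.returncode == 0


def preflight(ip, community="public", runner=run_check):
    """Run both checks and report which ones passed."""
    return {
        "ping": runner(build_ping_cmd(ip)),
        "snmp": runner(build_snmpwalk_cmd(ip, community)),
    }
```

Calling preflight("10.141.0.253", "public") on the head node returns a dictionary with one boolean per check. A failing "snmp" entry with a passing "ping" entry usually points at the community string or at SNMP being disabled on the switch.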
Verifying network connectivity through cmsh
First, make sure the switch is reported UP by CMDaemon. Using cmsh:
[mycluster]% device status switch01
switch01 ................. [ UP ]
If the switch is not reported as UP, check if the correct IP was set in CMDaemon, using cmsh:
[mycluster]% device get switch01 ip
[mycluster]% device get switch01 network
[mycluster]% network get internalnet netmaskbits
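The values returned by these commands can be cross-checked offline. The sketch below uses Python's standard ipaddress module to test whether the switch IP lies inside the network. The helper name ip_in_network is hypothetical, and the base address 10.141.0.0 with 16 netmask bits are assumed example values, not output from the commands above.

```python
import ipaddress


def ip_in_network(ip, base_address, netmask_bits):
    """Return True if ip belongs to base_address/netmask_bits."""
    # strict=False tolerates a base address that has host bits set.
    network = ipaddress.ip_network(f"{base_address}/{netmask_bits}", strict=False)
    return ipaddress.ip_address(ip) in network


# Assumed example values: switch IP from the device, base address and
# netmask bits from the network the device is assigned to.
print(ip_in_network("10.141.0.253", "10.141.0.0", 16))
```

If the function returns False for the real values, the IP address set for the switch does not belong to the network it is assigned to.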
Use the switch's native management interface to verify that it is set up with the correct IP address. If everything is configured correctly, but the switch is still not reported as UP, then the switch could be blocking the CMDaemon probes.
To check whether a device is UP, CMDaemon tries to establish a connection to TCP port 2. CMDaemon interprets both an accepted connection and a "connection refused" response as meaning that the device is reachable, and sets the state to UP. The device therefore does not need to have anything listening on port 2. However, if the connection attempt simply times out, CMDaemon sets the device to DOWN.
If unblocking the CMDaemon probes is not an option, then, as a last resort, you can set a custom ping script for the switch. Using cmsh:
[mycluster]% device set switch01 custompingscript /bin/true
[mycluster]% device commit switch01
Thu Aug 2 13:54:22 2012 [notice] mycluster: switch01 [ UP ]
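To see from the head node how a switch responds to the kind of TCP probe described above, the probe logic can be imitated with a few lines of Python. This is only a sketch of the described behaviour, not CMDaemon's actual code: the function name probe_state and the 3 second timeout are assumptions, and treating other network errors (for example "no route to host") as DOWN is also an assumption.

```python
import socket


def probe_state(host, port=2, timeout=3.0):
    """Classify a TCP connection attempt the way the text describes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "UP"  # connection accepted
    except ConnectionRefusedError:
        return "UP"  # a refusal still proves that the device answered
    except socket.timeout:
        return "DOWN"  # no answer at all within the timeout
    except OSError:
        return "DOWN"  # assumption: other network errors count as unreachable
```

For example, probe_state("10.141.0.253") returning "DOWN" while ping succeeds suggests that the switch, or a firewall in between, silently drops the TCP probe.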
Basic SNMP communication example
Make sure CMDaemon can communicate with the switch over SNMP. A simple check is to see whether CMDaemon has obtained the model information. Using cmsh:
[mycluster]% device get switch01 model
ProCurve J4904A Switch 2848
If no model information is obtained, then use the switch's native management tools to check whether SNMP is enabled. Also check that the community strings match those set in Bright. To check the Bright settings from cmsh:
[mycluster]% device get switch01 readstring
public
Using the showport command
To check if port detection works, the showport command in cmsh device mode can be used. Connect some device for which the MAC address is known to the switch and make sure it generates some traffic. (This could be a PXE booting compute node, or a laptop, etc.) Then using cmsh (replace the example MAC address!):
[mycluster]% device showport -s switch01 00:30:48:7C:F3:EC
If the above returns a port number on the specified switch, then port detection is working on that switch. The next step is then to configure the uplinks property (see below under "Configuring uplink ports correctly").
If the above does not return a port number, or returns a port number on another switch, then chances are that the uplinks property is not configured correctly. To verify port detection on a specific switch with the showport command, it is recommended to first clear the uplink settings for that switch. (Be sure to write down the old values!) Using cmsh:
[mycluster]% device get switch01 uplinks
[mycluster]% device clear switch01 uplinks
[mycluster]% device commit switch01
Then repeat the showport command as shown earlier. If, after clearing the uplinks, the showport command still does not show a port number on the specified switch, then further investigation is needed. See below in section "Verifying switch SNMP MIB support".
Important: The cmgui, in the overview tab for a selected switch, can also display some information on MAC addresses and to which ports they are connected. However, this uses different methods of obtaining the information from the switch and can therefore not be used to verify that port detection is working.