Slurmctld shows the error "we don't have select plugin type 102"
When enabling shared resources in Slurm as per the article here, you may see the following error in
/var/log/slurmctld on the headnode:
we don't have select plugin type 102
Checking through the logs you may also see:
error: Incomplete job record
fatal: Incomplete job state save file, start with '-i' to ignore this
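To check quickly whether a log contains these messages, a small shell helper along the following lines may be useful. This is only a sketch: the function name check_slurmctld_log is hypothetical, and the log path is taken as an argument, falling back to the /var/log/slurmctld location mentioned above.

```shell
# Hypothetical helper (not part of Slurm or Bright): print any lines in a
# slurmctld log that match the messages described in this article.
# Usage: check_slurmctld_log [path-to-log]
check_slurmctld_log() {
    log="${1:-/var/log/slurmctld}"
    # grep exits 0 if at least one line matches and 1 if none do,
    # so the function's exit status can be used in an "if".
    grep -E "we don't have select plugin type [0-9]+|Incomplete job (record|state save file)" "$log"
}
```

If the function prints nothing and returns a non-zero status, the log does not contain either message and this workaround is probably not the one you need.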
Occasionally, when enabling shared resources in Slurm, the job state save file becomes incomplete. To work around this issue, perform the following steps.
First, stop slurmctld in Bright:
# cmsh
% device use master
% services
% stop slurm
% quit
Next, make sure SelectType and SelectTypeParameters are set the way you want them in slurm.conf, then start
slurmctld by running the following command on your head node:
# /cm/shared/apps/slurm/current/sbin/slurmctld -i
This tells slurmctld to start while ignoring the incomplete job state save file error.
After that, kill the slurmctld process:
# killall slurmctld
Then, start slurmctld from Bright again:
# cmsh
% device use master
% services
% start slurm
slurmctld should now start properly using your desired slurm.conf settings.
You may also need to run the scontrol reconfigure command once slurmctld is started to notify the compute nodes.
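Once slurmctld is running again, it can be worth confirming that the select settings it reports match what you intended. The sketch below assumes that scontrol show config prints one "Name = value" pair per line; the function name filter_select_settings is hypothetical and simply filters whatever is piped into it.

```shell
# Hypothetical helper: keep only the SelectType and SelectTypeParameters
# lines from configuration output supplied on standard input.
filter_select_settings() {
    grep -E '^[[:space:]]*SelectType(Parameters)?[[:space:]]*='
}

# Example usage on the head node (requires a running slurmctld):
#   scontrol reconfigure
#   scontrol show config | filter_select_settings
```

Reading from standard input keeps the helper independent of where the configuration comes from, so the same filter can be tried on a saved copy of the output.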