Asked a question 4 years ago

How do I run an external Torque server with Bright?

How do I set up a Bright cluster to use an external Torque server?

1. Enable Torque using the wlm-setup utility:

[root@ma-b71-c6 ~]# wlm-setup -s -w torque
               Disabling torque services  .....   [  OK  ]
          Creating default torque config  .....   [  OK  ]
               Initializing torque setup  .....   [  OK  ]
                     Setting permissions  .....   [  OK  ]
                Enabling torque services  .....   [  OK  ]
                              Finalizing  .....   [  OK  ]

Please note that the changes in the software image(s) have not
been propagated to the running nodes. This will happen when
the node(s) is/are rebooted.  

2. Set the externalserver property of the torqueserver role to yes, so that CMDaemon does not complain about the stopped torque_server service:

[root@ma-b71-c6 ~]# cmsh
[ma-b71-c6]% device roles master 
[ma-b71-c6->device[ma-b71-c6]->roles]% use torqueserver 
[ma-b71-c6->device[ma-b71-c6]->roles[torqueserver]]% get externalserver 
[ma-b71-c6->device[ma-b71-c6]->roles[torqueserver]]% set externalserver yes
[ma-b71-c6->device*[ma-b71-c6*]->roles*[torqueserver*]]% commit

3. Freeze the Torque configuration so that CMDaemon does not overwrite the configuration files. Make sure the following directive is set in cmd.conf, then restart CMDaemon:

[root@ma-b71-c6 ~]# grep Torqu /cm/local/apps/cmd/etc/cmd.conf 

FreezeChangesToTorqueConfig = true

[root@ma-b71-c6 ~]# service cmd restart
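
The directive can also be set from a script. The following is only a sketch: set_directive is a hypothetical helper, not part of Bright, and it assumes cmd.conf uses one "Key = value" directive per line, as in the grep output above. It rewrites the directive if it exists, appends it otherwise, and leaves a .bak copy of the file when it edits in place.

```shell
# Hypothetical helper (not a Bright tool): make sure FILE contains "KEY = VALUE".
# Usage: set_directive FILE KEY VALUE
set_directive() {
    file=$1
    key=$2
    value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Directive already present: rewrite its value in place, keeping FILE.bak.
        sed -i.bak "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        # Directive missing: append it at the end of the file.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example on the head node, followed by the CMDaemon restart from step 3:
#   set_directive /cm/local/apps/cmd/etc/cmd.conf FreezeChangesToTorqueConfig true
#   service cmd restart
```

Running the helper twice leaves a single copy of the directive, so it is safe to include in a provisioning script.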

4. Replace “master.cm4.cluster” with the hostname of the external server in the following files:

[root@ma-b71-c6 ~]# cat /cm/shared/apps/torque/var/spool/torque.cfg 


[root@ma-b71-c6 ~]# cat /cm/shared/apps/torque/var/spool/server_name
ma-b70-c6
[root@ma-b71-c6 spool]# cat /cm/local/apps/torque/var/spool/server_name
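
The replacement in step 4 could be scripted roughly as follows. This is a sketch under assumptions: replace_hostname is a hypothetical helper, and ma-b70-c6 is used in the example only because it appears in the transcript above, so substitute the actual hostname of your external server. The helper escapes the dots in the old hostname so that it is matched literally, and it keeps a .bak copy of every file it edits.

```shell
# Hypothetical helper: replace hostname OLD with NEW in each FILE given.
# Usage: replace_hostname OLD NEW FILE...
replace_hostname() {
    old=$1
    new=$2
    shift 2
    # Escape dots so that "master.cm4.cluster" is matched literally by sed.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    for f in "$@"; do
        # Skip files that do not exist instead of aborting.
        [ -f "$f" ] || { echo "skipping missing file: $f" >&2; continue; }
        sed -i.bak "s/${old_re}/${new}/g" "$f"
    done
}

# Example on the head node (ma-b70-c6 is assumed to be the external server):
#   replace_hostname master.cm4.cluster ma-b70-c6 \
#       /cm/shared/apps/torque/var/spool/torque.cfg \
#       /cm/shared/apps/torque/var/spool/server_name \
#       /cm/local/apps/torque/var/spool/server_name
```

Checking each file with cat afterwards, as in the transcript, confirms that the external hostname is now in place.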


5. Add the following firewall rules to /etc/shorewall/rules on the head node to allow communication with the external torque server:

ACCEPT   net            fw              tcp     15004
ACCEPT   net            fw              udp     15004
ACCEPT   net            fw              tcp     15003
ACCEPT   net            fw              udp     15003
ACCEPT   net            fw              tcp     15002
ACCEPT   net            fw              udp     15002
ACCEPT   net            fw              tcp     15001
ACCEPT   net            fw              udp     15001

[root@ma-b71-c6 ~]# /etc/init.d/shorewall restart
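
Because the eight rules differ only in protocol and port, they can be generated instead of typed by hand. The sketch below uses a hypothetical torque_rules function that prints the rules in the same column layout as above; the ports and the net/fw zones are taken from this post, so adjust them if your setup differs.

```shell
# Hypothetical helper: print one ACCEPT rule per Torque port and protocol.
torque_rules() {
    for port in 15004 15003 15002 15001; do
        for proto in tcp udp; do
            printf 'ACCEPT   net            fw              %s     %s\n' "$proto" "$port"
        done
    done
}

# Review the rules first:
#   torque_rules
# Then append them on the head node and restart the firewall as in step 5:
#   torque_rules >> /etc/shorewall/rules
#   /etc/init.d/shorewall restart
```

Printing the rules before appending them makes it easy to compare them with the list above and to avoid adding duplicates to /etc/shorewall/rules.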

6. Restart the trqauthd and torque_mom services, and make sure that the torque_server service is stopped:

[root@ma-b71-c6 ~]# service torque_mom restart

[root@ma-b71-c6 ~]# service trqauthd restart
[root@ma-b71-c6 ~]# service torque_server stop