Asked a question last month

Hello,

We have a Bright Easy8 installation in its early stages: 4 compute nodes plus the head node. One of the compute nodes, sny02, keeps failing its healthchecks. We have re-installed Bright a number of times, but this particular node still shows the same problem. Logs from /var/log/cmdaemon are quoted here:

Mar 17 08:52:51 sunya01 cmd: [ MON ] Info: [CacheGather::task_done], key: 38654705668, ierr: 7
Mar 17 08:54:51 sunya01 cmd: [ RPC ] Info: rpc not OK: 7
Mar 17 08:54:51 sunya01 cmd: [ MON ] Info: [Mon::RPCFactory::find], error: CMMon::fetchCachedData, rpc: Failed to connect, is CMDaemon running and reachable? (7), http: 0
Mar 17 08:54:51 sunya01 cmd: [ MON ] Info: [CacheGather::task_done], key: 38654705668, ierr: 7
Mar 17 08:56:02 sunya01 cmd: [ RPC ] Info: rpc not OK: 7
Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/ldap failed
Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/mounts failed
Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/ntp failed
Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/schedulers failed
Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/defaultgateway failed
Mar 17 08:58:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/mounts failed
Mar 17 08:58:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/schedulers failed

Can somebody point us to what we should look for, and where? We have already upgraded the firmware and iLO, but the problem persists. The system is an HPE server.

If you go into cmsh and run the following, you will be able to see which healthchecks are failing and (to some degree) why:

# cmsh
% device use sny02
% latesthealthdata -v

Once you know which healthchecks are failing, you can start investigating why. If it is not clear why a healthcheck is failing, you can read the healthcheck script to see exactly what it does. The scripts are in /cm/local/apps/cmd/scripts/healthchecks. You can also run a healthcheck immediately with the 'device samplenow' command (e.g. device samplenow -n sny02 schedulers).
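
If you would rather get a quick overview from the log itself before going into cmsh, a small script can tally the failures per node and per healthcheck. The following is only a rough sketch in Python, not a Bright tool: it assumes the warning lines look exactly like the ScriptSamplerTask lines quoted in the question, and the function name count_failed_checks and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Assumption: failure lines have the same shape as the ScriptSamplerTask
# warnings quoted in the question (syslog timestamp, host, "cmd:", script path).
FAILED_CHECK = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+cmd:.*"
    r"\[ScriptSamplerTask\]\s+\S*/healthchecks/(?P<check>\S+)\s+failed"
)


def count_failed_checks(lines):
    """Count failed healthchecks per (host, check) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CHECK.match(line)
        if match:
            counts[(match.group("host"), match.group("check"))] += 1
    return counts


# A few lines copied from the log excerpt in the question.
sample = [
    "Mar 17 08:54:51 sunya01 cmd: [ RPC ] Info: rpc not OK: 7",
    "Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/ldap failed",
    "Mar 17 08:56:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/mounts failed",
    "Mar 17 08:58:10 sny02 cmd: [ MON ]Warning: [ScriptSamplerTask] /cm/local/apps/cmd/scripts/healthchecks/mounts failed",
]

# Print one line per (host, check) pair with its failure count.
for (host, check), n in sorted(count_failed_checks(sample).items()):
    print(host, check, n)
```

To use it on a real system, pass an open file handle for /var/log/cmdaemon instead of the sample list. Checks that fail on every sampling interval (mounts and schedulers in the excerpt above) are probably the ones to look at first with latesthealthdata -v.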
