strigger(1)			Slurm Commands			   strigger(1)

NAME
       strigger	- Used to set, get or clear Slurm trigger information.

SYNOPSIS
       strigger	--set	[OPTIONS...]
       strigger	--get	[OPTIONS...]
       strigger	--clear	[OPTIONS...]

DESCRIPTION
       strigger	is used	to set,	get or clear Slurm trigger information.	 Trig-
       gers  include  events  such  as a node failing, a job reaching its time
       limit or	a job terminating.  These events can cause actions such	as the
       execution of an arbitrary script.  Typical uses include notifying  sys-
       tem  administrators  of	node failures and gracefully terminating a job
       when its	time limit is approaching.   A	hostlist  expression  for  the
       nodelist	or job ID is passed as an argument to the program.

       Trigger	events	are  not processed instantly, but a check is performed
       for trigger events on a periodic	basis (currently  every	 15  seconds).
       Any  trigger  events  which occur within	that interval will be compared
       against the trigger programs set	at the end of the time interval.   The
       trigger	program	 will be executed once for any event occurring in that
       interval.  The record of	those events (e.g. nodes which	went  DOWN  in
       the  previous  15  seconds)  will then be cleared.  The trigger program
       must set	a new trigger before the end of	the next  interval  to	ensure
       that  no	 trigger events	are missed OR the trigger must be created with
       an argument of "--flags=PERM".  If desired, multiple  trigger  programs
       can be set for the same event.
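
       The two ways of keeping a trigger alive described above (re-arming
       it from the trigger program, or creating it with --flags=PERM) can
       be sketched as follows. This is only an illustration: the helper
       name set_node_down_trigger, the "perm" keyword and the STRIGGER
       override variable are assumptions made for this sketch; only the
       strigger options themselves come from this page.

```shell
#!/bin/sh
# STRIGGER is a hypothetical override, e.g. STRIGGER=echo for a dry run.
STRIGGER=${STRIGGER:-strigger}

# Hypothetical helper: register a node-DOWN trigger.
#   $1 = fully qualified path of the trigger program
#   $2 = "perm" for a permanent trigger, anything else for a one-shot one
set_node_down_trigger() {
    program=$1
    mode=${2:-once}
    set -- --set --node --down "--program=$program"
    if [ "$mode" = "perm" ]; then
        # PERM keeps the trigger registered after the event fires,
        # so the program does not need to set a new trigger itself.
        set -- "$@" --flags=PERM
    fi
    "$STRIGGER" "$@"
}
```

       Without "perm", the program named in $1 would have to call
       strigger --set again before the next check interval ends, as the
       scripts in EXAMPLES do.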

       NOTE:  This  command can	only set triggers if run by the	user SlurmUser
       unless SlurmUser	is configured as user root.  This is required for  the
       slurmctld daemon	to set the appropriate user and	group IDs for the exe-
       cuted  program.	 Also note that	the trigger program is executed	on the
       same node that the slurmctld daemon uses	 rather	 than  some  allocated
       compute node.  To check the value of SlurmUser, run the command:

	      scontrol show config | grep SlurmUser
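
       A script that sets triggers may want to check this rule before
       calling strigger. The sketch below is one possible way to do so.
       It assumes that the relevant line of "scontrol show config" looks
       like "SlurmUser = slurm(64030)"; the function names are
       hypothetical, and the permission test is our reading of the NOTE
       above rather than a documented interface.

```shell
#!/bin/sh
# Read "scontrol show config" output on stdin and print the SlurmUser name.
# Assumes a line shaped like "SlurmUser = slurm(64030)".
parse_slurm_user() {
    sed -n 's/^SlurmUser[[:space:]]*=[[:space:]]*\([^(]*\).*$/\1/p'
}

# Succeed if user $1 may set triggers given SlurmUser $2:
# either SlurmUser is root, or the caller is SlurmUser.
can_set_triggers() {
    [ "$2" = "root" ] || [ "$1" = "$2" ]
}
```

       For example: can_set_triggers "$(id -un)" "$(scontrol show config |
       parse_slurm_user)".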

ARGUMENTS
       -C, --backup_slurmctld_assumed_control
              Trigger an event when the backup slurmctld assumes control.

       -B, --backup_slurmctld_failure
	      Trigger an event when the	backup slurmctld fails.

       -c, --backup_slurmctld_resumed_operation
	      Trigger an event when the	backup slurmctld resumes operation af-
	      ter failure.

       --burst_buffer
              Trigger an event when a burst buffer error occurs.

       --clear
	      Clear  or	 delete	a previously defined event trigger.  The --id,
	      --jobid or --user	option must be specified to identify the trig-
	      ger(s) to	be cleared.  Only user root or the  trigger's  creator
	      can delete a trigger.

       -M, --clusters=<string>
	      Clusters	to  issue commands to.	Note that the SlurmDBD must be
	      up for this option to work properly.

       -d, --down
	      Trigger an event if the specified	node goes into a DOWN state.

       -D, --drained
	      Trigger an event if the  specified  node	goes  into  a  DRAINED
	      state.

       --draining
	      Trigger  an  event  if  the  specified node goes into a DRAINING
	      state, before it is DRAINED.

       -F, --fail
	      Trigger an event if the  specified  node	goes  into  a  FAILING
	      state.

       -f, --fini
	      Trigger an event when the	specified job completes	execution.

       --flags=<flag>
              Associate flags with the trigger. Multiple flags should be
              comma separated.  Valid flags include:

	      PERM   Make the trigger permanent. Do not	 purge	it  after  the
		     event occurs.

       --front_end
	      Trigger  events  based  upon changes in state of front end nodes
	      rather than compute nodes.  Use this option with either the --up
	      or --down	option.

       --get  Show registered event triggers.  Options can be used for filter-
	      ing purposes.

       -i, --id=<id>
	      Trigger ID number.

       -I, --idle
	      Trigger an event if the specified	node remains in	an IDLE	 state
	      for  at  least the time period specified by the --offset option.
	      This can be useful to hibernate a	node that remains  idle,  thus
	      reducing power consumption.

       -j, --jobid=<id>
	      Job ID of	interest.  NOTE: The --jobid option can	not be used in
	      conjunction  with	 the --node option. When the --jobid option is
	      used in conjunction with the --up	or --down  option,  all	 nodes
              allocated to that job will be considered the nodes used as a
              trigger event.

       -n, --node[=host]
	      Host name(s) of interest.	 By default, all nodes associated with
	      the job (if --jobid is specified)	or on the system  are  consid-
	      ered  for	 event	triggers.   NOTE: The --node option can	not be
	      used in conjunction with the --jobid option.  When  the  --jobid
	      option is	used in	conjunction with the --up, --down or --drained
              option, all nodes allocated to that job will be considered the
	      nodes used as a trigger event. Since this	option's  argument  is
	      optional,	 for  proper  parsing the single letter	option must be
	      followed immediately with	the value and not include a space  be-
	      tween them. For example "-ntux" and not "-n tux".
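
              When building the command in a script, one way to avoid the
              parsing problem is to always emit the long form with "=".
              The helper below is a hypothetical sketch; its name is not
              part of Slurm.

```shell
#!/bin/sh
# Print the node-selection option for strigger.
#   no argument      -> all nodes
#   host expression  -> value attached with "=", never separated by a space
node_option() {
    if [ -z "${1:-}" ]; then
        printf '%s\n' '--node'
    else
        printf '%s\n' "--node=$1"
    fi
}
```

              The short form would be "-n$1", again with no space before
              the value.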

       -N, --noheader
	      Do not print the header when displaying a	list of	triggers.

       -o, --offset=<seconds>
              The specified action should follow the event by this time
              interval. Specify a negative value if the action should precede
              the event. The default value is zero if no --offset option is
              specified. The resolution of this time is about 20 seconds, so
              to execute a script not less than five minutes prior to a job
              reaching its time limit, specify --offset=-320 (5 minutes plus
              20 seconds, negated because the action precedes the event).
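
              The arithmetic can be sketched as a small helper. The
              function name is hypothetical; the 20 second padding is the
              resolution quoted above, and the result is negative because
              a negative offset means the action precedes the event.

```shell
#!/bin/sh
# Print an --offset value (seconds) that fires at least $1 minutes
# before the event, padded by the approximate 20 second resolution.
offset_before_event() {
    minutes=$1
    resolution=20   # seconds, as stated in the text above
    echo "-$(( minutes * 60 + resolution ))"
}
```

              For instance, --offset="$(offset_before_event 5)" requests
              an action at least five minutes before a job's time limit.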

       -h, --primary_database_failure
	      Trigger an event when the	primary	database fails.	This event  is
	      triggered	 when the accounting plugin tries to open a connection
	      with mysql and it	fails and the slurmctld	needs the database for
	      some operations.

       -H, --primary_database_resumed_operation
	      Trigger an event when the	primary	database resumes operation af-
	      ter failure.  It happens when the	connection to mysql  from  the
	      accounting plugin	is restored.

       -g, --primary_slurmdbd_failure
	      Trigger an event when the	primary	slurmdbd fails.	The trigger is
              launched by slurmctld whenever it tries to connect to slurmdbd
              but receives no response on the socket.

       -G, --primary_slurmdbd_resumed_operation
              Trigger an event when the primary slurmdbd resumes operation
              after failure.  This event is triggered when opening the
              connection from slurmctld to slurmdbd results in a response.
              This can happen in several situations: periodically (every 15
              seconds) when checking the connection status, when saving
              state, when the agent queue is filling, and so on.

       -e, --primary_slurmctld_acct_buffer_full
              Trigger an event when the primary slurmctld accounting buffer
              is full.

       -a, --primary_slurmctld_failure
	      Trigger an event when the	primary	slurmctld fails.

       -b, --primary_slurmctld_resumed_control
              Trigger an event when the primary slurmctld resumes control.

       -A, --primary_slurmctld_resumed_operation
              Trigger an event when the primary slurmctld resumes operation
              after failure.

       -p, --program=<path>
	      Execute the program at the specified  fully  qualified  pathname
	      when the event occurs.  You may quote the	path and include extra
	      program  arguments  if desired.  The program will	be executed as
	      the user who sets	the trigger.  If the program fails  to	termi-
	      nate  within 5 minutes, it will be killed	along with any spawned
	      processes.
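
              Passing extra arguments means the whole --program value must
              reach strigger as a single, quoted word. A minimal sketch,
              assuming a hypothetical helper name and a STRIGGER override
              variable that are not part of Slurm:

```shell
#!/bin/sh
# STRIGGER is a hypothetical override, e.g. STRIGGER=echo for a dry run.
STRIGGER=${STRIGGER:-strigger}

# Register a job-completion trigger whose program takes extra arguments.
#   $1   = job ID
#   $2.. = fully qualified program path followed by its arguments
set_fini_trigger() {
    jobid=$1
    shift
    # "$*" joins the path and its arguments with spaces; the surrounding
    # quotes keep them together as one --program value.
    "$STRIGGER" --set "--jobid=$jobid" --fini "--program=$*"
}
```

              For example, set_fini_trigger 1237 /home/joe/job_fini
              --archive, where --archive stands for any argument the
              program itself understands.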

       -Q, --quiet
	      Do not report non-fatal errors.  This can	 be  useful  to	 clear
	      triggers which may have already been purged.

       -r, --reconfig
	      Trigger an event when the	system configuration changes.  This is
	      triggered	when the slurmctld daemon reads	its configuration file
	      or when a	node state changes.

       -R, --resume
	      Trigger  an  event  if  the  specified node is set to the	RESUME
	      state.

       --set  Register an event	 trigger  based	 upon  the  supplied  options.
              NOTE: An event is only triggered once unless the trigger was
              created with --flags=PERM. Otherwise, a new event trigger must
              be established for future events of the same type to be
              processed. Triggers can only be set if the command is run by
              the user SlurmUser unless SlurmUser is configured as user root.

       -t, --time
	      Trigger an event when the	specified job's	time limit is reached.
	      This must	be used	in conjunction with the	--jobid	option.

       -u, --up
	      Trigger an event if the specified	node is	 returned  to  service
	      from a DOWN state.

       --user=<user_name_or_id>
	      Clear  or	get triggers created by	the specified user.  For exam-
	      ple, a trigger created by	user root for a	job  created  by  user
	      adam  could  be cleared with an option --user=root.  Specify ei-
	      ther a user name or user ID.

       -v, --verbose
	      Print detailed event logging. This includes time-stamps on  data
	      structures, record counts, etc.

       -V, --version
	      Print version information	and exit.

OUTPUT FIELD DESCRIPTIONS
       TRIG_ID
	      Trigger ID number.

       RES_TYPE
	      Resource type: job or node

       RES_ID Resource ID: job ID or host names	or "*" for any host

       TYPE   Trigger type: time or fini (for jobs only), down or up (for jobs
	      or nodes), or drained, idle or reconfig (for nodes only)

       OFFSET Time offset in seconds. Negative numbers indicate the action
              should occur before the event (if possible)

       USER   Name of the user requesting the action

       PROGRAM
	      Pathname of the program to execute when the event	occurs

PERFORMANCE
       Executing strigger sends	a  remote  procedure  call  to	slurmctld.  If
       enough calls from strigger or other Slurm client	commands that send re-
       mote  procedure	calls  to the slurmctld	daemon come in at once,	it can
       result in a degradation of performance of the slurmctld daemon,	possi-
       bly resulting in	a denial of service.

       Do  not	run  strigger  or other	Slurm client commands that send	remote
       procedure calls to slurmctld from loops in shell	scripts	or other  pro-
       grams. Ensure that programs limit calls to strigger to the minimum nec-
       essary for the information you are trying to gather.
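
       One way to follow this advice is to call strigger --get once and
       filter its output locally instead of issuing one call per trigger.
       The sketch below assumes the column order listed under OUTPUT FIELD
       DESCRIPTIONS (TRIG_ID, RES_TYPE, RES_ID first); the function name
       is hypothetical.

```shell
#!/bin/sh
# Read "strigger --get --noheader" output on stdin and print the IDs of
# the triggers attached to job $1, one per line.
# Assumes columns: TRIG_ID RES_TYPE RES_ID ...
trigger_ids_for_job() {
    awk -v job="$1" '$2 == "job" && $3 == job { print $1 }'
}
```

       Typical use would be: strigger --get --noheader |
       trigger_ids_for_job 1235, which sends a single remote procedure
       call regardless of how many triggers exist.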

ENVIRONMENT VARIABLES
       Some strigger options may be set	via environment	variables. These envi-
       ronment	variables,  along with their corresponding options, are	listed
       below.  (Note: Command line options will	 always	 override  these  set-
       tings.)

       SLURM_CONF	   The location	of the Slurm configuration file.

       SLURM_DEBUG_FLAGS   Specify  debug  flags  for strigger to use. See De-
			   bugFlags in the slurm.conf(5) man page for  a  full
			   list	 of  flags.  The  environment  variable	 takes
			   precedence over the setting in the slurm.conf.

EXAMPLES
       Execute the program "/usr/sbin/primary_slurmctld_failure" whenever the
       primary slurmctld fails.

	      $	cat /usr/sbin/primary_slurmctld_failure
	      #!/bin/bash
	      #	Submit trigger for next	primary	slurmctld failure event
	      strigger --set --primary_slurmctld_failure \
		       --program=/usr/sbin/primary_slurmctld_failure
	      #	Notify the administrator of the	failure	using e-mail
	      /bin/mail	slurm_admin@site.com -s	Primary_SLURMCTLD_FAILURE

	      $	strigger --set --primary_slurmctld_failure \
			 --program=/usr/sbin/primary_slurmctld_failure

       Execute the program "/usr/sbin/slurm_admin_notify" whenever any node in
       the cluster goes	down. The subject line will include the	node names
       which have entered the down state (passed as an argument	to the script
       by Slurm).

	      $	cat /usr/sbin/slurm_admin_notify
	      #!/bin/bash
	      #	Submit trigger for next	event
	      strigger --set --node --down \
		       --program=/usr/sbin/slurm_admin_notify
              # Notify administrator by e-mail
              /bin/mail slurm_admin@site.com -s "NodesDown:$*"

	      $	strigger --set --node --down \
			 --program=/usr/sbin/slurm_admin_notify

       Execute the program "/usr/sbin/slurm_suspend_node" whenever any node in
       the cluster remains in the idle state for at least 600 seconds.

	      $	strigger --set --node --idle --offset=600 \
			 --program=/usr/sbin/slurm_suspend_node

       Execute the program "/home/joe/clean_up"	when job 1234 is within	10
       minutes of reaching its time limit.

	      $	strigger --set --jobid=1234 --time --offset=-600 \
			 --program=/home/joe/clean_up

       Execute the program "/home/joe/node_died" when any node allocated to
       job 1234	enters the DOWN	state.

	      $	strigger --set --jobid=1234 --down \
			 --program=/home/joe/node_died

       Show all	triggers associated with job 1235.

	      $	strigger --get --jobid=1235
	      TRIG_ID RES_TYPE RES_ID TYPE OFFSET USER PROGRAM
		  123	   job	 1235 time   -600  joe /home/bob/clean_up
		  125	   job	 1235 down	0  joe /home/bob/node_died

       Delete event trigger 125.

	      $	strigger --clear --id=125

       Execute /home/joe/job_fini upon completion of job 1237.

	      $	strigger --set --jobid=1237 --fini --program=/home/joe/job_fini

COPYING
       Copyright (C) 2007 The Regents of the University	of  California.	  Pro-
       duced at	Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2008-2010 Lawrence	Livermore National Security.
       Copyright (C) 2010-2022 SchedMD LLC.

       This  file  is  part  of	Slurm, a resource management program.  For de-
       tails, see <https://slurm.schedmd.com/>.

       Slurm is	free software; you can redistribute it and/or modify it	 under
       the  terms  of  the GNU General Public License as published by the Free
       Software	Foundation; either version 2 of	the License, or	(at  your  op-
       tion) any later version.

       Slurm  is  distributed  in the hope that	it will	be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of  MERCHANTABILITY  or
       FITNESS	FOR  A	PARTICULAR PURPOSE. See	the GNU	General	Public License
       for more	details.

SEE ALSO
       scontrol(1), sinfo(1), squeue(1)

January	2024			Slurm Commands			   strigger(1)
