pbs_mom(8B)			      PBS			   pbs_mom(8B)

NAME
       pbs_mom - start a pbs batch execution mini-server

SYNOPSIS
       pbs_mom [-a alarm] [-C chkdirectory] [-c config] [-d directory]
       [-H hostname] [-L logfile] [-M MOMport] [-R RPPport] [-S port]
       [-p|-P|-q|-r] [-x]

DESCRIPTION
       The pbs_mom command starts the operation of a batch Machine Oriented
       Mini-server, MOM, on the local host.  Typically, this command will be
       in a local boot file such as /etc/rc.local.  To ensure that the
       pbs_mom command is not runnable by the general user community, the
       server will only execute if its real and effective uid is zero.

       One function of pbs_mom is to place jobs into execution as directed by
       the server, establish resource usage limits, monitor the job's usage,
       and notify the server when the job completes.  If they exist, pbs_mom
       will execute a prologue script before executing a job and an epilogue
       script after executing the job.  A second function of pbs_mom is to
       respond to resource monitor requests.  This was done by a separate
       process in previous versions of PBS but has now been combined into one
       process.  The resource monitor function is provided mainly for the PBS
       scheduler.  It provides information about the status of running jobs,
       available memory, and so on.  A third function of pbs_mom is to respond
       to task manager requests.  This involves communicating with running
       tasks over a TCP socket as well as communicating with other MOMs within
       a job (also known as a "sisterhood").

       Pbs_mom will record a diagnostic	message	in a log file  for  any	 error
       occurrence.  The	log files are maintained in the	mom_logs directory be-
       low the home directory of the  server.	If  the	 log  file  cannot  be
       opened, the diagnostic message is written to the	system console.

OPTIONS
       -a alarm	       Used  to	 specify the alarm timeout in seconds for com-
		       puting a	resource.  Every time a	 resource  request  is
		       processed,  an  alarm  is  set  for the given amount of
		       time.  If the request  has  not	completed  before  the
		       given  time, an alarm signal is generated.  The default
		       is 5 seconds.

       -C chkdirectory Specifies the path of the directory used to hold
                       checkpoint files.  [Currently this is only valid on
                       Cray systems.]  The default directory is
                       PBS_HOME/spool/checkpoint; see the -d option.  The
                       directory specified with the -C option must be owned by
                       root and accessible (rwx) only by root to protect the
                       security of the checkpoint files.

       -c config       Specifies an alternative configuration file; see the
                       CONFIGURATION FILE section below.  If this is a
                       relative file name, it will be relative to
                       PBS_HOME/mom_priv; see the -d option.  If the specified
                       file cannot be opened, pbs_mom will abort.  If the -c
                       option is not supplied, pbs_mom will attempt to open
                       the default configuration file "config" in
                       PBS_HOME/mom_priv.  If this file is not present,
                       pbs_mom will log the fact and continue.

       -H hostname     Set  MOM's hostname.  This can be useful	on multi-homed
		       networks.

       -d directory    Specifies the path of the directory which is the home
                       of the server's working files, PBS_HOME.  This option
                       is typically used along with -M when debugging MOM.
                       The default directory is given by $PBS_SERVER_HOME,
                       which is typically /usr/spool/PBS.

       -L logfile      Specify an absolute path	name for use as	the log	 file.
		       If  not	specified,  MOM	will open a file named for the
		       current date in the  PBS_HOME/mom_logs  directory,  see
		       the -d option.

       -M port	       Specifies  the  port  number  on	 which the mini-server
		       (MOM) will listen for batch requests.

       -R port	       Specifies the port  number  on  which  the  mini-server
		       (MOM)  will  listen for resource	monitor	requests, task
		       manager requests	and inter-MOM messages.	  Both	a  UDP
		       and a TCP port of this number will be used.

       -p              (Default after version 2.4.0) (Preserve running jobs)
                       -- Specifies the impact on jobs which were in execution
                       when the mini-server shut down.  The -p option tries
                       to preserve any running jobs when the MOM restarts.
                       The new mini-server will not be the parent of any
                       running jobs; MOM has lost control of her offspring
                       (not a new situation for a mother).  The MOM will allow
                       the jobs to continue to run and monitor them indirectly
                       via polling.  All recovered jobs will report an exit
                       code of 0 when they are complete.  The -p option is
                       mutually exclusive with the -r, -P and -q options.

       -P              (Terminate all jobs and remove them from the queue) --
                       Specifies the impact on jobs which were in execution
                       when the mini-server shut down.  With the -P option, it
                       is assumed that either the entire system has been
                       restarted or the MOM has been down so long that it can
                       no longer guarantee that the pid of any running process
                       is the same as the recorded job process pid of a
                       recovering job.  Unlike the -p option, no attempt is
                       made to preserve or recover running jobs.  All jobs are
                       terminated and removed from the queue.  The -P option
                       is mutually exclusive with the -p, -q and -r options.

       -q              (Requeue all jobs; this is the default behavior in
                       versions prior to 2.4.0) -- Specifies the impact on
                       jobs which were in execution when the mini-server shut
                       down.  Running processes are not terminated.  With the
                       -q option, it is assumed that either the entire system
                       has been restarted or the MOM has been down so long
                       that it can no longer guarantee that the pid of any
                       running process is the same as the recorded job process
                       pid of a recovering job.  No attempt is made to kill
                       job processes.  The MOM will mark the jobs as
                       terminated and notify the batch server which owns the
                       job.  Re-runnable jobs will be requeued.  The -q option
                       is mutually exclusive with the -p, -P and -r options.

       -r              (Terminate running processes and requeue all jobs) --
                       Specifies the impact on jobs which were in execution
                       when the mini-server shut down.  With the -r option,
                       MOM will kill any processes belonging to running jobs,
                       mark the jobs as terminated and notify the batch server
                       that owns the job.  Re-runnable jobs are reset to a
                       queued state so they can be run again.  The -r option
                       is mutually exclusive with the -p, -P and -q options.

		       If the -r option	is used	following  a  reboot,  process
		       IDs  (pids)  may	 be  reused and	MOM may	kill a process
		       that is not a batch session.

       -S port	       Specifies the port number on which  the	pbs_server  is
		       listening  for requests.	 If pbs_server is started with
		       a -p option, pbs_mom will need to use the -S option and
		       match   the   port   value  which  was  used  to	 start
		       pbs_server.

       -x	       Disables	the check for privileged port resource monitor
		       connections.  This is used mainly for testing since the
		       privileged port is the only mechanism used  to  prevent
		       any ordinary user from connecting.

CONFIGURATION FILE
       The  configuration file may be specified	on the command line at program
       start with the -c flag.	The use	of this	file  is  to  provide  several
       types  of  run  time  information to pbs_mom: static resource names and
       values, external	resources provided by a	program	to be run  on  request
       via  a shell escape, and	values to pass to internal set up functions at
       initialization (and re-initialization).

       Each item type is on a single line with the component  parts  separated
       by  white  space.  If the line starts with a hash mark (pound sign, #),
       the line	is considered to be a comment and is skipped.

       Static Resources
              For static resource names and values, the configuration file
              contains a list of resource name/value pairs, one pair per
              line, with name and value separated by white space.  For
              example, the number of tape drives of different types could be
              specified by

	      tape3480	    4
	      tape3420	    2
	      tapedat	    1
	      tape8mm	    1

       Shell Commands
	      If the first character of	the value is an	exclamation mark  (!),
	      the  entire rest of the line is saved to be executed through the
	      services of the system(3)	standard library routine.

	      The shell	escape provides	a means	for the	 resource  monitor  to
	      yield arbitrary information to the scheduler.  Parameter substi-
	      tution is	done such that the value of any	 qualifier  sent  with
	      the  query,  as explained	below, replaces	a token	with a percent
	      sign (%) followed	by the name of the  qualifier.	 For  example,
	      here is a	configuration file line	which gives a resource name of
	      "escape":

	      escape	 !echo %xxx %yyy

              If a query for "escape" is sent with no qualifiers, the command
              executed would be "echo %xxx %yyy".  If one qualifier is sent,
              "escape[xxx=hi there]", the command executed would be "echo hi
              there %yyy".  If two qualifiers are sent,
              "escape[xxx=hi][yyy=there]", the command executed would be
              "echo hi there".  If a qualifier is sent with no matching token
              in the command line, "escape[zzz=snafu]", an error is reported.
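
              The substitution rule can be illustrated with a short Python
              sketch.  It is not taken from pbs_mom; the function name
              expand_shell_escape and the choice of ValueError for the
              error case are assumptions made for this example only.

```python
import re

def expand_shell_escape(command, qualifiers):
    """Replace each %name token in `command` with the matching qualifier.

    Hypothetical helper that mirrors the rule described in the text;
    it is not pbs_mom's implementation.
    """
    tokens = set(re.findall(r"%(\w+)", command))
    unknown = [name for name in qualifiers if name not in tokens]
    if unknown:
        # A qualifier with no matching token is reported as an error.
        raise ValueError("no matching token for: " + ", ".join(unknown))
    # Tokens that have no qualifier are left untouched.
    return re.sub(
        r"%(\w+)",
        lambda match: qualifiers.get(match.group(1), match.group(0)),
        command,
    )

print(expand_shell_escape("echo %xxx %yyy", {}))
print(expand_shell_escape("echo %xxx %yyy", {"xxx": "hi there"}))
print(expand_shell_escape("echo %xxx %yyy", {"xxx": "hi", "yyy": "there"}))
```

              The three calls correspond to the three successful queries
              described above; passing {"zzz": "snafu"} raises the error.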

       size[fs=<FS>]
              Specifies that the available and configured disk space in the
              <FS> filesystem is to be reported to the pbs_server and
              scheduler.  NOTE: To request disk space on a per job basis,
              specify the file resource as in
              'qsub -l nodes=1,file=1000kb'.  For example, the following
              line causes the available and configured disk space in the
              /localscratch filesystem to be reported:

	      size[fs=/localscratch]

       Initialization Value
	      An initialization	value directive	has a name which starts	with a
	      dollar sign ($) and must be known	to MOM via an internal	table.
	      The entries in this table	now are:

	      auto_ideal_load
                     if jobs are running, sets ideal_load based on a simple
                     expression.  The expression starts with the variable 't'
                     (total assigned CPUs) or 'c' (existing CPUs), followed by
                     an operator (+ - / *) and a float constant.

		     $auto_ideal_load t-0.2

	      auto_max_load
                     if jobs are running, sets max_load based on a simple
                     expression.  The expression starts with the variable 't'
                     (total assigned CPUs) or 'c' (existing CPUs), followed by
                     an operator (+ - / *) and a float constant.
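
                     The expression form used by auto_ideal_load and
                     auto_max_load can be evaluated as in the Python sketch
                     below.  The grammar is inferred from the description
                     only; the helper name eval_auto_load and its error
                     handling are assumptions, not pbs_mom code.

```python
import re

# Hypothetical grammar: variable (t or c), one operator, a float constant.
_EXPR = re.compile(r"\s*([tc])\s*([-+*/])\s*(\d+(?:\.\d*)?|\.\d+)\s*")

def eval_auto_load(expr, total_assigned_cpus, existing_cpus):
    """Evaluate an expression such as 't-0.2' or 'c*0.9'."""
    match = _EXPR.fullmatch(expr)
    if match is None:
        raise ValueError("unsupported expression: %r" % (expr,))
    variable, operator = match.group(1), match.group(2)
    constant = float(match.group(3))
    # 't' is the number of CPUs assigned to jobs, 'c' the CPUs present.
    base = float(total_assigned_cpus if variable == "t" else existing_cpus)
    if operator == "+":
        return base + constant
    if operator == "-":
        return base - constant
    if operator == "*":
        return base * constant
    return base / constant  # a zero constant raises ZeroDivisionError

print(eval_auto_load("t-0.2", total_assigned_cpus=4, existing_cpus=8))
```

                     With 4 assigned CPUs, "t-0.2" yields a load threshold
                     slightly below one full unit of load per assigned CPU.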

	      cputmult
		     which  sets  a  factor  used to adjust cpu	time used by a
		     job.  This	 is  provided  to  allow  adjustment  of  time
		     charged  and  limits  enforced where the job might	run on
		     systems with different cpu	performance.  If Mom's	system
		     is	 faster	 than  the reference system, set cputmult to a
		     decimal value greater than	 1.0.	 If  Mom's  system  is
		     slower, set cputmult to a value between 1.0 and 0.0.  For
		     example:

		     $cputmult 1.5
		     $cputmult 0.75
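
                     The arithmetic can be sketched in Python as follows.
                     This assumes the factor is applied by simple
                     multiplication of the measured cpu time, which is what
                     the description suggests; the function names are
                     hypothetical.

```python
def adjusted_cpu_seconds(raw_seconds, cputmult):
    """Scale cpu time measured on this host to the reference system.

    Assumption: the factor is applied by plain multiplication.
    """
    return raw_seconds * cputmult

def over_cput_limit(raw_seconds, cputmult, limit_seconds):
    """True when the adjusted cpu time exceeds the job's cpu time limit."""
    return adjusted_cpu_seconds(raw_seconds, cputmult) > limit_seconds

# On a host 1.5 times faster than the reference, 2400 measured seconds
# are charged as 3600 reference seconds.
print(adjusted_cpu_seconds(2400, 1.5))
print(over_cput_limit(2500, 1.5, 3600))
```

                     Under this assumption, a job with a one hour cpu time
                     limit reaches it after 40 minutes of cpu time on the
                     faster host.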

	      configversion
		     specifies the version of the config file data, a string.

	      check_poll_time
                     specifies the MOM interval in seconds.  MOM checks each
                     job for updated resource usage, exited processes,
                     over-limit conditions, etc. once per interval.  This
                     value should be equal to or lower than pbs_server's
                     job_stat_rate.  High values result in stale information
                     reported to pbs_server.  Low values result in increased
                     system usage by MOM.  Default is 45 seconds.

	      down_on_error
		     causes MOM	to report itself as state "down" to pbs_server
		     in	 the  event of a failed	health check.  This feature is
		     EXPERIMENTAL and likely to	be removed in the future.  See
		     HEALTH CHECK below.

	      enablemomrestart
		     enable  automatic	restarts of MOM.  If enabled, MOM will
		     check if its binary has been updated and  restart	itself
		     at	a safe point when no jobs are running; thus making up-
		     grades easier.  The check is made by comparing the	 mtime
		     of	  the  pbs_mom	executable.   Command-line  args,  the
		     process name, and the PATH	 env  variable	are  preserved
		     across  restarts.	It is recommended that this not	be en-
		     abled in the config file, but enabled when	 desired  with
		     momctl (see RESOURCES for more information.)

	      ideal_load
                     ideal processor load.  Represents a low water mark for
                     the load average.  A node that is currently busy will
                     consider itself free after its load average falls below
                     ideal_load.

	      igncput
		     Ignore cpu	time violations	on this	mom, meaning jobs will
		     not be cancelled due to exceeding their  limits  for  cpu
		     time.

	      ignmem Ignore  memory  violations	on this	mom, meaning jobs will
		     not be cancelled due to exceeding their memory limits.

	      ignvmem
		     If	set to true, then pbs_mom will ignore vmem/pvmem limit
		     enforcement.

	      ignwalltime
		     If	 set  to true, then pbs_mom will ignore	walltime limit
		     enforcement.

	      job_output_file_mask
                     Specifies a mask for creating job output and error files.
                     Values can be specified in base 8, 10, or 16; a leading 0
                     implies octal and a leading 0x or 0X hexadecimal.  A
                     value of "userdefault" will use the user's default umask.
                     For example:

                     $job_output_file_mask 027
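
                     The base rules for the mask value can be sketched in
                     Python as below.  The helper names are hypothetical, the
                     "userdefault" keyword is not handled, and applying the
                     mask umask-style to a requested mode of 0666 is an
                     assumption made for illustration.

```python
def parse_mask(text):
    """Parse a mask using the base rules described in the text.

    A leading 0x or 0X means hexadecimal, a leading 0 means octal, and
    anything else is decimal.
    """
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)

def file_mode(mask, requested=0o666):
    """Apply the mask umask-style: clear every bit that the mask sets."""
    return requested & ~mask

mask = parse_mask("027")
print(mask, oct(file_mode(mask)))
```

                     Under these assumptions a mask of 027 leaves read and
                     write access for the owner, read access for the group,
                     and no access for others.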

	      log_directory
                     Changes the log directory.  Default is
                     $TORQUEHOME/mom_logs/.  The $TORQUEHOME default is
                     /var/spool/torque/ but can be changed in the ./configure
                     script.  The value is a string and should be the full
                     path to the desired MOM log directory.  For example:

                     $log_directory /opt/torque/mom_logs/

	      logevent
		     which sets	the mask that determines which event types are
		     logged by pbs_mom.	 For example:

                     $logevent 0x1ff
		     $logevent 255

		     The  first	 example would set the log event mask to 0x1ff
		     (511) which enables logging of all	events including debug
		     events.   The  second example would set the mask to 0x0ff
		     (255) which enables all events except debug events.

	      log_file_suffix
                     Optional suffix to append to log file names.  If %h is
                     the suffix, pbs_mom appends the hostname for where the
                     log files are stored if it knows it; otherwise it
                     appends the hostname where the MOM is running.  For
                     example, with the following line a log file would be
                     named 20100223.tom:

                     $log_file_suffix tom

	      log_keep_days
		     Specifies how  many  days	to  keep  log  files.  pbs_mom
		     deletes  log  files  older	 than  the specified number of
		     days. If not specified, pbs_mom won't  delete  log	 files
		     based on their age.

	      loglevel
		     specifies	the  verbosity	of logging with	higher numbers
		     specifying	more verbose logging.  Values  may  range  be-
		     tween 0 and 7.

	      log_file_max_size
		     If	  this	 is  set to a value > 0	then pbs_mom will roll
		     the current log file to log-file-name.1 when its size  is
		     greater	 than	or    equal    to    the    value   of
		     log_file_max_size.	This value  is	interpreted  as	 kilo-
		     bytes.

	      log_file_roll_depth
		     If	 this is set to	a value	>=1 and	 log_file_max_size  is
		     set then  pbs_mom	will continue rolling the log files to
		     log-file-name.log_file_roll_depth.

	      max_load
		     maximum processor load.  Nodes over this load average are
		     considered	busy (see ideal_load above).
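
                     Taken together, max_load and ideal_load act as a pair
                     of thresholds with hysteresis.  The Python sketch below
                     illustrates that reading of the two descriptions; the
                     state names and the function next_state are assumptions,
                     not pbs_mom internals.

```python
def next_state(state, load_avg, ideal_load, max_load):
    """Return the node state after observing one load average sample."""
    if load_avg > max_load:
        return "busy"   # over the high water mark
    if state == "busy" and load_avg < ideal_load:
        return "free"   # fell below the low water mark
    return state        # between the marks: keep the previous state

state = "free"
for load in [1.0, 4.5, 3.5, 2.5, 1.5]:
    state = next_state(state, load, ideal_load=2.0, max_load=4.0)
    print(load, state)
```

                     The node stays busy while the load sits between the two
                     marks, which avoids flapping when the load hovers near
                     max_load.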

	      node_check_script
		     specifies the fully  qualified  pathname  of  the	health
		     check  script  to run (see	HEALTH CHECK for more informa-
		     tion).

	      node_check_interval
                     specifies when to run the MOM health check.  The check
                     can be periodic, event-driven, or both.  The value
                     starts with an integer specifying the number of MOM
                     intervals between subsequent executions of the specified
                     health check.  After the integer is an optional
                     comma-separated list of event names.  Currently supported
                     are "jobstart" and "jobend".  This value defaults to 1
                     with no events, indicating the check is run every MOM
                     interval (see HEALTH CHECK for more information).  For
                     example:

		     $node_check_interval 0  #Disabled.
		     $node_check_interval 0,jobstart  #Only runs at job	starts
		     $node_check_interval 10,jobstart,jobend
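
                     How such a value breaks down can be shown with a small
                     Python sketch.  Both helpers are hypothetical and only
                     follow the description: an integer count of MOM
                     intervals, then optional event names, with 0 disabling
                     the periodic check.

```python
SUPPORTED_EVENTS = {"jobstart", "jobend"}

def parse_node_check_interval(value):
    """Split a value such as '10,jobstart,jobend' into (interval, events)."""
    fields = [field.strip() for field in value.split(",")]
    interval = int(fields[0])
    events = set(fields[1:])
    unknown = events - SUPPORTED_EVENTS
    if interval < 0 or unknown:
        raise ValueError("bad node_check_interval value: %r" % (value,))
    return interval, events

def should_run(interval, events, cycle, event=None):
    """Decide whether the health check runs on this cycle or job event."""
    if event is not None:
        return event in events
    # An interval of 0 disables the periodic check.
    return interval > 0 and cycle % interval == 0

interval, events = parse_node_check_interval("10,jobstart,jobend")
print(interval, sorted(events))
print(should_run(interval, events, cycle=20))
print(should_run(0, {"jobstart"}, cycle=20))
```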

	      nodefile_suffix
                     Specifies the suffix to append to a host name to denote
                     the data channel network adapter in a multihomed compute
                     node.  For example, with the following line and a control
                     channel adapter named node01, the data channel would have
                     a hostname of node01i:

                     $nodefile_suffix i

	      nospool_dir_list
                     If the job's output file should be in one of the paths
                     specified here, then it will be spooled directly in that
                     directory instead of the normal spool directory.
                     Specified in the format path1, path2, etc.  For example:

                     $nospool_dir_list /home/mike/*,/var/tmp/spool/

	      pbsclient
                     which causes a host name to be added to the list of hosts
                     which will be allowed to connect to MOM as long as they
                     are using a privileged port for the purposes of resource
                     monitor requests.  For example, here are two
                     configuration file lines which will allow the hosts
                     "fred" and "wilma" to connect:

		     $pbsclient	     fred
		     $pbsclient	     wilma

                     Two host names are always allowed to connect to pbs_mom:
                     "localhost" and the name returned to pbs_mom by the
                     system call gethostname().  These names need not be
		     specified in the configuration file.  The hosts listed as
		     "clients"	can  issue  Resource  Monitor  (RM)  requests.
		     Other  MOM	 nodes and servers do not need to be listed as
		     clients.

	      pbsserver
		     which defines hostnames running pbs_server	that  will  be
		     allowed  to  submit jobs, issue Resource Monitor (RM) re-
		     quests, and get status updates.  MOM will continually at-
		     tempt  to	contact	 all  server hosts for node status and
		     state updates.   Like  $PBS_SERVER_HOME/server_name,  the
		     hostname  may  be	followed by a colon and	a port number.
		     This parameter replaces the oft-confused $clienthost  pa-
		     rameter  from  TORQUE 2.0.0p0 and earlier.	 Note that the
                     hostname in $PBS_SERVER_HOME/server_name is used if no
                     $pbsserver parameters are found.

	      prologalarm
                     Specifies the maximum duration (in seconds) which the MOM
                     will wait for the job prolog or job epilog to complete.
                     This parameter defaults to 300 seconds (5 minutes).

              rcpcmd Specifies the full path and arguments to be used for
                     remote file copies.  This overrides the compile-time
                     default found in configure.  This must contain 2 words:
                     the full path to the command and the switches.  The copy
                     command must be able to recursively copy files to the
                     remote host and accept arguments of the form
                     "user@host:files".  For example:

		     $rcpcmd /usr/bin/rcp -rp
		     $rcpcmd /usr/bin/scp -rpB

	      restricted
                     which causes a host name to be added to the list of hosts
                     which will be allowed to connect to MOM without needing
                     to use a privileged port.  These names allow for wildcard
                     matching.  For example, here is a configuration file line
                     which will allow queries from any host from the domain
                     "ibm.com".

		     $restricted      *.ibm.com

		     The restriction which applies  to	these  connections  is
		     that  only	 internal  queries  may	be made.  No resources
		     from a config file	will be	found.	This is	to prevent any
		     shell commands from being run by a	non-root process.
		     This  parameter is	generally not required except for some
		     versions of OSX.

	      remote_checkpoint_dirs
		     Specifies what server checkpoint directories are remotely
		     mounted.	This  directive	 is used to tell the MOM which
		     directories are shared with  the  server.	 Using	remote
		     checkpoint	 directories  eliminates  the need to copy the
		     checkpoint	files back and forth between the MOM  and  the
		     server. This parameter is available in 2.4.1 and later.

		     $remote_checkpoint_dirs /var/spool/torque/checkpoint

	      remote_reconfig
		     Enables  the ability to remotely reconfigure pbs_mom with
		     a new config file.	 Default is disabled.  This  parameter
		     accepts various forms of true, yes, and 1.

	      source_login_batch
                     Specifies whether or not MOM will source /etc/profile and
                     similar files for batch jobs.  The parameter accepts
                     various forms of true, false, yes, no, 1 and 0.  Default
                     is True.

	      source_login_interactive
                     Specifies whether or not MOM will source /etc/profile and
                     similar files for interactive jobs.  The parameter
                     accepts various forms of true, false, yes, no, 1 and 0.
                     Default is True.

	      spool_as_final_name
		     If	 set to	true, jobs will	spool directly as their	output
		     files, with no intermediate locations or steps.  This  is
		     mostly  useful  for  shared filesystems with fast writing
		     capability.

	      status_update_time
                     Specifies (in seconds) how often MOM updates its status
                     information to pbs_server.  This value should correlate
                     with the server's scheduling interval.  Low values
                     increase the load on pbs_server and the network.  High
                     values cause pbs_server to report stale information.
                     Default is 45 seconds.

              tmpdir Sets the directory basename for a per-job temporary
                     directory.  Before job launch, MOM will append the jobid
                     to the tmpdir basename and create the directory.  After
                     the job exits, MOM will recursively delete it.  The
                     environment variable TMPDIR will be set for all
                     pro/epilog scripts, the job script, and TM tasks.
		     Directory creation	and removal is done as the  job	 owner
		     and  group,  so  the  owner must have write permission to
		     create the	directory.  If the  directory  already	exists
		     and is owned by the job owner, it will not	be deleted af-
		     ter the job.  If the directory already exists and is  NOT
		     owned by the job owner, the job start will	be rejected.

	      timeout
		     Specifies	the number of seconds before TCP messages will
		     time out.	TCP messages include job  obituaries,  and  TM
		     requests if RPP is	disabled.  Default is 60 seconds.

	      usecp  specifies	which directories should be staged with	cp in-
		     stead of rcp/scp.	If a shared filesystem is available on
		     all  hosts	 in  a cluster,	this directive is used to make
		     these filesystems known to	MOM.  For example, if /home is
		     NFS mounted on all	nodes in a cluster:

		     $usecp *:/home  /home
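
                     One way to picture how a $usecp rule is applied is the
                     Python sketch below.  It assumes a wildcard match on the
                     host part and a plain prefix match on the path; the rule
                     tuple layout and the function local_path_for are
                     hypothetical, and pbs_mom's real matching may differ.

```python
from fnmatch import fnmatch

def local_path_for(destination, rules):
    """Map 'host:/path' to a local path when a usecp-style rule matches.

    `rules` holds (host_pattern, remote_prefix, local_prefix) tuples.
    Returns None when no rule applies and a remote copy is still needed.
    """
    host, _, path = destination.partition(":")
    for host_pattern, remote_prefix, local_prefix in rules:
        if fnmatch(host, host_pattern) and path.startswith(remote_prefix):
            return local_prefix + path[len(remote_prefix):]
    return None

rules = [("*", "/home", "/home")]  # corresponds to: $usecp *:/home /home
print(local_path_for("node01:/home/alice/job.out", rules))
print(local_path_for("node01:/scratch/job.out", rules))
```

                     A destination that matches no rule returns None here,
                     standing in for the fall back to rcp/scp.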

	      varattr
                     This is similar to a shell escape above, but includes a
                     TTL.  The command will only be run every TTL seconds.  A
                     TTL of -1 will cause the command to be executed only
                     once.  A TTL of 0 will cause the command to be run every
                     time varattr is requested.  This parameter may be used
                     multiple times, but all output will be grouped into a
                     single "varattr" attribute in the request and status
                     output.  The command should output data in the form of
                     varattrname=value1[+value2]...

		     $varattr 3600 /path/to/script [<ARGS>]...
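
                     The TTL rules and the output format can be sketched in
                     Python as follows.  Both helpers are hypothetical and
                     follow only the description of varattr; timestamps are
                     plain seconds and last_run is None before the first run.

```python
def needs_run(ttl, last_run, now):
    """Decide whether the varattr command should be executed again."""
    if ttl == 0:
        return True      # run on every request
    if last_run is None:
        return True      # never run so far
    if ttl < 0:
        return False     # -1: run only once
    return now - last_run >= ttl

def parse_varattr_line(line):
    """Split 'name=value1+value2' into (name, [values])."""
    name, separator, values = line.strip().partition("=")
    if not separator or not name:
        raise ValueError("expected name=value, got %r" % (line,))
    return name, values.split("+")

print(needs_run(3600, last_run=100, now=200))
print(parse_varattr_line("dataset13=20070104"))
```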

	      wallmult
                     which sets a factor used to adjust wall time usage by a
                     job to a common reference system.  The factor is used for
                     walltime calculations and limits in the same way that
                     cputmult is used for cpu time.

       The configuration file must be executable and  "secure".	  It  must  be
       owned by	a user id and group id less than 10 and	not be world writable.
       Output from this	file must be in	the format $VAR=$VAL, i.e.,

	      dataset13=20070104
	      dataset22=20070202
	      viraltest=abdd3

       xauthpath
              Specifies the path to the xauth binary to enable X11 forwarding.

       mom_host
	      Sets the local hostname as used by pbs_mom.

RESOURCES
       Resource	Monitor	queries	can be made with momctl's  -q  option  to  re-
       trieve  and set pbs_mom options.	 Any configured	static resource	may be
       retrieved with a	request	of the same name.  These are resource requests
       not otherwise documented	in the PBS ERS.

       cycle  forces an	immediate MOM cycle

       status_update_time
	      retrieve or set the $status_update_time parameter

       check_poll_time
	      retrieve or set the $check_poll_time parameter

       configversion
	      retrieve the config version

       jobstartblocktime
	      retrieve or set the $jobstartblocktime parameter

       enablemomrestart
	      retrieve or set the $enablemomrestart parameter

       loglevel
	      retrieve or set the $loglevel parameter

       down_on_error
	      retrieve or set the EXPERIMENTAL $down_on_error parameter

       diag0 - diag4
	      retrieves	various	diagnostic information

       rcpcmd retrieve or set the $rcpcmd parameter

       version
	      retrieves	the pbs_mom version

HEALTH CHECK
       The  health check script	is executed directly by	the pbs_mom daemon un-
       der the root user id. It	must be	accessible from	the compute  node  and
       may be a	script or compiled executable program.	It may make any	needed
       system calls and	execute	any combination	of system utilities but	should
       not  execute  resource  manager	client	commands.   Also, as of	TORQUE
       1.0.1, the pbs_mom daemon blocks	until the health  check	 is  completed
       and does	not possess a built-in timeout.	 Consequently, it is advisable
       to keep the launch script execution time	 short	and  verify  that  the
       script will not block even under	failure	conditions.

       If  the	script detects a failure, it should return the keyword 'ERROR'
       to stdout followed by an	error message.	The message (up	to 256 charac-
       ters)  immediately  following  the ERROR	string will be assigned	to the
       node attribute 'message'	of the associated node.
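
       A health check along these lines might look like the Python sketch
       below.  The free disk space test, the 1 GiB threshold, and the names
       health_report and main are assumptions chosen for illustration; only
       the ERROR keyword and the 256 character message limit come from the
       description above.

```python
import shutil

MAX_MESSAGE_LEN = 256  # message length limit mentioned in the text

def health_report(free_bytes, min_free_bytes, path="/"):
    """Return an ERROR line when free space is too low, else ''."""
    if free_bytes < min_free_bytes:
        message = "low disk space on %s: %d bytes free" % (path, free_bytes)
        return "ERROR " + message[:MAX_MESSAGE_LEN]
    return ""

def main():
    # Keep the check short and non-blocking: one local filesystem query.
    usage = shutil.disk_usage("/")
    report = health_report(usage.free, 1 << 30)
    if report:
        print(report)

if __name__ == "__main__":
    main()
```

       The script prints nothing when the node is healthy and a single
       ERROR line otherwise.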

       If the script detects a failure when run	from "jobstart", then the  job
       will  be	 rejected.   This  should  probably only be used with advanced
       schedulers like Moab so that the	job can	be routed to another node.

       TORQUE currently ignores ERROR messages by default, but advanced
       schedulers like Moab can be configured to react appropriately.

       If the experimental $down_on_error MOM setting is enabled, MOM will set
       itself to state down and	report to pbs_server; and pbs_server will  re-
       port  the  node as "down".  Additionally, the experimental "down_on_er-
       ror" server attribute can be enabled which  has	the  same  effect  but
       moves  the  decision  to	 pbs_server.   It  is  redundant to have MOM's
       $down_on_error and pbs_server's down_on_error  features	enabled.   See
       "down_on_error" in pbs_server_attributes(7B).

FILES
       $PBS_SERVER_HOME/server_name
	      contains the hostname running pbs_server.

       $PBS_SERVER_HOME/mom_priv
		 the  default  directory  for  configuration  files, typically
		 (/usr/spool/pbs)/mom_priv.

       $PBS_SERVER_HOME/mom_logs
		 directory for log files recorded by the server.

       $PBS_SERVER_HOME/mom_priv/prologue
		 the administrative script to be run before job	execution.

       $PBS_SERVER_HOME/mom_priv/epilogue
		 the administrative script to be run after job execution.

SIGNAL HANDLING
       pbs_mom handles the following signals:

       SIGHUP causes pbs_mom to	re-read	its configuration file,	close and  re-
	      open the log file, and reinitialize resource structures.

       SIGALRM
              results in a log file entry.  The signal is used to limit the
              time taken by certain child processes, such as the prologue
              and epilogue.

       SIGINT and SIGTERM
              result in pbs_mom exiting without terminating any running jobs.
              This is the action for the following signals as well: SIGXCPU,
              SIGXFSZ, SIGCPULIM, and SIGSHUTDN.

       SIGUSR1,	SIGUSR2
              cause MOM to increase and decrease logging levels,
              respectively.

       SIGPIPE,	SIGINFO
	       are ignored.

       SIGBUS, SIGFPE, SIGILL, SIGTRAP,	and SIGSYS
              cause a core dump if the PBSCOREDUMP environment variable is
              defined.

       All other signals have their default behavior installed.

EXIT STATUS
       If  the	mini-server command fails to begin operation, the server exits
       with a value greater than zero.

SEE ALSO
       pbs_server(8B), pbs_scheduler_basl(8B), pbs_scheduler_tcl(8B), the  PBS
       External	Reference Specification, and the PBS Administrator's Guide.

Local								   pbs_mom(8B)
