Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
gres.conf(5)		   Slurm Configuration File		  gres.conf(5)

NAME
       gres.conf  -  Slurm configuration file for Generic RESource (GRES) man-
       agement.

DESCRIPTION
       gres.conf is an ASCII file which	describes the configuration of Generic
       RESource(s) (GRES) on each compute node.	 If the	 GRES  information  in
       the  slurm.conf	file  does  not	fully describe those resources,	then a
       gres.conf file should be	included  on  each  compute  node.  For	 cloud
       nodes,  a  gres.conf  file that includes	all the	cloud nodes must be on
       all cloud nodes and the controller. The file will always	be located  in
       the same	directory as slurm.conf.

       If  the	GRES  information in the slurm.conf file fully describes those
       resources (i.e. no "Cores", "File" or "Links" specification is required
       for that	GRES type or that information is automatically detected), that
       information may be omitted from the gres.conf file and only the config-
       uration information in the slurm.conf file will be used.	 The gres.conf
       file may	be omitted completely if the configuration information in  the
       slurm.conf file fully describes all GRES.

       If  using  the  gres.conf  file	to describe the	resources available to
       nodes, the first	parameter on the line should be	NodeName. If configur-
       ing Generic Resources without specifying	nodes, the first parameter  on
       the line	should be Name.

       Parameter  names	are case insensitive.  Any text	following a "#"	in the
       configuration file is treated as	a comment  through  the	 end  of  that
       line.   Changes	to  the	configuration file take	effect upon restart of
       Slurm daemons, daemon receipt of	the SIGHUP signal, or execution	of the
       command "scontrol reconfigure" unless otherwise noted.

       NOTE: Slurm support for gres/[mps|shard]	requires the use  of  the  se-
       lect/cons_tres  plugin.	For  more information on how to	configure MPS,
       see https://slurm.schedmd.com/gres.html#MPS_Management.	For  more  in-
       formation      on      how      to      configure     Sharding,	   see
       https://slurm.schedmd.com/gres.html#Sharding.

       For   more   information	  on   GRES   scheduling   in	general,   see
       https://slurm.schedmd.com/gres.html.

       The overall configuration parameters available include:

       AutoDetect
	      The  hardware  detection mechanisms to enable for	automatic GRES
	      configuration.  Currently, the options are:

	      nrt    Automatically detect AWS Trainium/Inferentia devices.

	      nvidia Automatically detect NVIDIA GPUs.	No  library  required,
		     but doesn't detect	MIGs or	NVlinks. Added in Slurm	24.11.

	      nvml   Automatically  detect  NVIDIA  GPUs.  Requires the	NVIDIA
		     Management	Library	(NVML).

	      off    Do	not automatically detect any GPUs.  Used  to  override
		     other options.

	      oneapi Automatically  detect  Intel  GPUs.  Requires  the	 Intel
		     Graphics Compute Runtime for oneAPI Level Zero and	OpenCL
		     Driver (oneapi).

	      rsmi   Automatically detect AMD GPUs. Requires the  ROCm	System
		     Management	Interface (ROCm	SMI) Library.

	      AutoDetect  can  be  on  a line by itself, in which case it will
	      globally apply to	all lines in gres.conf by  default.  In	 addi-
	      tion,  AutoDetect	can be combined	with NodeName to only apply to
	      certain nodes. Node-specific AutoDetects will trump  the	global
	      AutoDetect.  A  node-specific AutoDetect only needs to be	speci-
	      fied once	per node. If specified multiple	 times	for  the  same
	      nodes,  they must	all be the same	value. To unset	AutoDetect for
	      a	node when a global AutoDetect is set, simply set it  to	 "off"
	      in  a  node-specific  GRES  line.	  E.g.:	 NodeName=tux3 AutoDe-
	      tect=off Name=gpu	File=/dev/nvidia[0-3].	AutoDetect  cannot  be
	      used with	cloud nodes.

	      AutoDetect  will	automatically  detect files, cores, links, and
	      any other	hardware. If a parameter such as File, Cores, or Links
	      are specified when AutoDetect is used, then the specified	values
	      are used to sanity check the auto	detected values. If there is a
	      mismatch,	then the node's	state is set to	invalid	and  the  node
	      is drained.

       Count  Number  of  resources  of	this name/type available on this node.
	      The default value	is set to the number of	File values  specified
	      (if  any),  otherwise the	default	value is one. A	suffix of "K",
	      "M", "G",	"T" or "P" may be used to multiply the number by 1024,
	      1048576,	 1073741824,   etc.   respectively.    For    example:
	      "Count=10G".

       Cores  Optionally  specify the core index numbers matching the specific
	      sockets* which can use this resource.

	      If this option is	used, all the cores in a socket* must be spec-
	      ified together.  While Slurm can track and assign	 resources  at
	      the  CPU	or  thread  level,  its	 scheduling algorithms used to
	      co-allocate GRES devices with CPUs operates at a	socket*	 level
	      for job allocations.  Therefore, it is not possible to preferen-
	      tially  assign  GRES  with  different  specific CPUs on the same
	      socket*.

	      *Sockets may be substituted for NUMA  nodes  with	 SlurmdParame-
	      ters=numa_node_as_socket	  or	l3cache	  with	 SlurmdParame-
	      ters=l3cache_as_socket.

	      Multiple cores may be specified using a comma-delimited list  or
	      a	 range	may be specified using a "-" separator (e.g. "0,1,2,3"
	      or "0-3").  If  a	 job  specifies	 --gres-flags=enforce-binding,
	      then  only  the  identified  cores  can  be  allocated with each
	      generic resource.	This will tend to improve performance of jobs,
	      but delay	the allocation of resources to them.  If specified and
	      a	job is not submitted with the --gres-flags=enforce-binding op-
	      tion the identified cores	will be	preferred for scheduling  with
	      each generic resource.

	      If  --gres-flags=disable-binding is specified, then any core can
	      be used with the resources, which	also increases	the  speed  of
	      Slurm's  scheduling  algorithm  but  can degrade the application
	      performance.  The	--gres-flags=disable-binding  option  is  cur-
	      rently  required to use more CPUs	than are bound to a GRES (e.g.
	      if a GPU is bound	to the CPUs on one socket,  but	 resources  on
	      more  than one socket are	required to run	the job).  If any core
	      can be effectively used with the resources, then do not  specify
	      the  cores  option  for  improved	 speed in the Slurm scheduling
	      logic.  A	restart	of the slurmctld is needed for changes to  the
	      Cores option to take effect.

	      NOTE: Since Slurm	must be	able to	perform	resource management on
	      heterogeneous  clusters having various processing	unit numbering
	      schemes, a logical core index must be specified instead  of  the
	      physical	core  index.  That logical core	index might not	corre-
	      spond to your physical core index	number.	 Core 0	 will  be  the
	      first  core on the first socket, while core 1 will be the	second
	      core on the first	socket.	 This  numbering  coincides  with  the
	      logical  core  number (Core L#) seen in "lstopo -l" command out-
	      put.

       File   Fully qualified pathname of the device files associated  with  a
	      resource.	 The name can include a	numeric	range suffix to	be in-
	      terpreted	by Slurm (e.g. File=/dev/nvidia[0-3]).

	      This  field  is generally	required if enforcement	of generic re-
	      source allocations is to be supported (i.e. prevents users  from
	      making  use  of  resources  allocated to a different user).  En-
	      forcement	of the	file  allocation  relies  upon	Linux  Control
	      Groups  (cgroups)	 and  Slurm's  task/cgroup  plugin, which will
	      place the	allocated files	into the job's cgroup and prevent  use
	      of  other	 files.	 Please	see Slurm's Cgroups Guide for more in-
	      formation: https://slurm.schedmd.com/cgroups.html.

	      If File is specified then	Count must be either set to the	number
	      of file names specified or not set (the  default	value  is  the
	      number of	files specified).  The exception to this is MPS/Shard-
	      ing.  For	 either	of these GRES, each GPU	would be identified by
	      device file using	the File parameter and Count would specify the
	      number of	entries	that would correspond to that  GPU.  For  MPS,
	      typically	 100  or  some multiple	of 100.	For Sharding typically
	      the maximum number of jobs that could simultaneously share  that
	      GPU.

	      If  using	a card with Multi-Instance GPU functionality, use Mul-
	      tipleFiles instead. File and MultipleFiles are  mutually	exclu-
	      sive.

	      NOTE: File is required for all gpu typed GRES.

	      NOTE:  If	 you specify the File parameter	for a resource on some
	      node, the	option must be specified on all	nodes and  Slurm  will
	      track  the  assignment  of  each specific	resource on each node.
	      Otherwise	Slurm will only	track a	count of  allocated  resources
	      rather than the state of each individual device file.

	      NOTE:  Drain  a  node  before changing the count of records with
	      File parameters (e.g. if you want	to add or remove GPUs  from  a
	      node's  configuration).  Failure to do so	will result in any job
	      using those GRES being aborted.

	      NOTE: When specifying File, Count	is limited in size  (currently
	      1024) for	each node.

       Flags  Optional flags that can be specified to change configured	behav-
	      ior of the GRES.

	      Allowed values at	present	are:

	      CountOnly		  Do  not attempt to load a plugin of the GRES
				  type as this GRES will only be used to track
				  counts of GRES used. This avoids  attempting
				  to load non-existent plugin which can	affect
				  filesystems with high	latency	metadata oper-
				  ations for non-existent files.

				  NOTE:	 If a gres has this flag configured it
				  is global, so	all other nodes	with that gres
				  will have this flag implied.

	      explicit		  If the flag is set, GRES is not allocated to
				  the job as part  of  whole  node  allocation
				  (--exclusive	or OverSubscribe=EXCLUSIVE set
				  on partition)	unless it was  explicitly  re-
				  quested by the job.

				  NOTE:	 If a gres has this flag configured it
				  is global, so	all other nodes	with that gres
				  will have this flag implied.

	      one_sharing	  To be	used on	a  shared  gres.  If  using  a
				  shared  gres	(mps) on top of	a sharing gres
				  (gpu)	only allow one of the sharing gres  to
				  be used by the shared	gres.  This is the de-
				  fault	for MPS.

				  NOTE:	 If a gres has this flag configured it
				  is global, so	all other nodes	with that gres
				  will have this flag implied.	This  flag  is
				  not  compatible  with	all_sharing for	a spe-
				  cific	gres.

	      all_sharing	  To be	used on	a shared gres. This is the op-
				  posite of one_sharing	and can	be used	to al-
				  low all sharing gres (gpu) on	a node	to  be
				  used for shared gres (mps).

				  NOTE:	 If a gres has this flag configured it
				  is global, so	all other nodes	with that gres
				  will have this flag implied.	This  flag  is
				  not  compatible  with	one_sharing for	a spe-
				  cific	gres.

	      nvidia_gpu_env	  Set  environment  variable  CUDA_VISIBLE_DE-
				  VICES	for all	GPUs on	the specified node(s).

	      amd_gpu_env	  Set  environment  variable  ROCR_VISIBLE_DE-
				  VICES	for all	GPUs on	the specified node(s).

	      intel_gpu_env	  Set  environment  variable  ZE_AFFINITY_MASK
				  for all GPUs on the specified	node(s).

	      opencl_env	  Set  environment variable GPU_DEVICE_ORDINAL
				  for all GPUs on the specified	node(s).

	      no_gpu_env	  Set no GPU-specific  environment  variables.
				  This	is mutually exclusive to all other en-
				  vironment-related flags.

	      If   no	environment-related   flags   are   specified,	  then
	      nvidia_gpu_env,  amd_gpu_env, intel_gpu_env, and opencl_env will
	      be implicitly set	by default.  If	AutoDetect is used  and	 envi-
	      ronment-related flags are	not specified, then AutoDetect=nvml or
	      AutoDetect=nvidia	 will set nvidia_gpu_env, AutoDetect=rsmi will
	      set amd_gpu_env, and AutoDetect=oneapi will  set	intel_gpu_env.
	      Conversely,  specified  environment-related  flags  will	always
	      override AutoDetect.

	      Environment-related flags	set on one GRES	line will be inherited
	      by the GRES line directly	below  it  if  no  environment-related
	      flags  are specified on that line	and if it is of	the same node,
	      name, and	type. Environment-related flags	must be	the  same  for
	      GRES of the same node, name, and type.

	      Note that	there is a known issue with the	AMD ROCm runtime where
	      ROCR_VISIBLE_DEVICES  is	processed  first,  and then CUDA_VISI-
	      BLE_DEVICES is processed.	To avoid the issues  caused  by	 this,
	      set  Flags=amd_gpu_env for AMD GPUs so only ROCR_VISIBLE_DEVICES
	      is set.

       Links  A	comma-delimited	list of	numbers	identifying the	number of con-
	      nections between this device and other devices to	allow cosched-
	      uling of better connected	devices.  This is an ordered  list  in
	      which  the number	of connections this specific device has	to de-
	      vice number 0 would be in	the first position, the	number of con-
	      nections it has to device	number 1 in the	second position,  etc.
	      A	 -1  indicates	the device itself and a	0 indicates no connec-
	      tion.  If	specified, then	this line can only  contain  a	single
	      GRES device (i.e.	can only contain a single file via File).

	      This  is	an  optional value and is usually automatically	deter-
	      mined if AutoDetect is enabled.  A typical use case would	be  to
	      identify	GPUs  having NVLink connectivity.  Note	that for GPUs,
	      the minor	number assigned	by the OS and used in the device  file
	      (i.e.  the X in /dev/nvidiaX) is not necessarily the same	as the
	      device number/index. The device number is	created	by sorting the
	      GPUs by PCI bus ID and then numbering  them  starting  from  the
	      smallest		      bus		ID.		   See
	      https://slurm.schedmd.com/gres.html#GPU_Management

       MultipleFiles
	      Fully qualified pathname of the device files associated  with  a
	      resource.	  Graphics  cards using	Multi-Instance GPU (MIG) tech-
	      nology will present multiple device files	that should be managed
	      as a single generic resource. The	file names can be a comma sep-
	      arated list or it	can include a numeric range suffix (e.g.  Mul-
	      tipleFiles=/dev/nvidia[0-3]).

	      Drain  a node before changing the	count of records with the Mul-
	      tipleFiles parameter, such as when adding	or removing GPUs  from
	      a	node's configuration.  Failure to do so	will result in any job
	      using those GRES being aborted.

	      When  not	 using	GPUs with MIG functionality, use File instead.
	      MultipleFiles and	File are mutually exclusive.

       Name   Name of the generic resource. Any	desired	name may be used.  The
	      name must	match  a  value	 in  GresTypes	in  slurm.conf.	  Each
	      generic  resource	 has  an optional plugin which can provide re-
	      source-specific functionality.  Generic resources	that currently
	      include an optional plugin are:

	      gpu    Graphics Processing Unit

	      mps    CUDA Multi-Process	Service	(MPS)

	      nic    Network Interface Card

	      shard  Shards of a gpu

       NodeName
	      An optional NodeName specification can be	 used  to  permit  one
	      gres.conf	 file to be used for all compute nodes in a cluster by
	      specifying the node(s) that each	line  should  apply  to.   The
	      NodeName specification can use a Slurm hostlist specification as
	      shown in the example below.

       Type   An optional arbitrary string identifying the type	of generic re-
	      source.	For example, this might	be used	to identify a specific
	      model of GPU, which users	can then specify  in  a	 job  request.
	      For  changes  to	the Type option	to take	effect with a scontrol
	      reconfig all affected slurmd daemons must	be responding  to  the
	      slurmctld.  Otherwise a restart of the slurmctld and slurmd dae-
	      mons is required.

	      NOTE: If using autodetect	functionality and defining the Type in
	      your  gres.conf  file,  the  Type	specified should match or be a
	      substring	of the value that is detected, using an	underscore  in
	      lieu of any spaces.

EXAMPLES
       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define	GPU devices with MPS support, with AutoDetect sanity checking
       ##################################################################
       AutoDetect=nvml
       Name=gpu	Type=gtx560 File=/dev/nvidia0 COREs=0,1
       Name=gpu	Type=tesla  File=/dev/nvidia1 COREs=2,3
       Name=mps	Count=100 File=/dev/nvidia0 COREs=0,1
       Name=mps	Count=100  File=/dev/nvidia1 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Overwrite system defaults and explicitly configure three GPUs
       ##################################################################
       Name=gpu	Type=tesla File=/dev/nvidia[0-1] COREs=0,1
       # Name=gpu Type=tesla  File=/dev/nvidia[2-3] COREs=2,3
       # NOTE: nvidia2 device is out of	service
       Name=gpu	Type=tesla  File=/dev/nvidia3 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use a single gres.conf	file for all compute nodes - positive method
       ##################################################################
       ## Explicitly specify devices on	nodes tux0-tux15
       # NodeName=tux[0-15]  Name=gpu File=/dev/nvidia[0-3]
       # NOTE: tux3 nvidia1 device is out of service
       NodeName=tux[0-2]  Name=gpu File=/dev/nvidia[0-3]
       NodeName=tux3  Name=gpu File=/dev/nvidia[0,2-3]
       NodeName=tux[4-15]  Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use NVML to gather GPU	configuration information
       # for all nodes except one
       ##################################################################
       AutoDetect=nvml
       NodeName=tux3 AutoDetect=off Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Specify some nodes with NVML, some with RSMI, and some	with no	AutoDetect
       ##################################################################
       NodeName=tux[0-7] AutoDetect=nvml
       NodeName=tux[8-11] AutoDetect=rsmi
       NodeName=tux[12-15] Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define	'bandwidth' GRES to use	as a way to limit the
       # resource use on these nodes for workflow purposes
       ##################################################################
       NodeName=tux[0-7] Name=bandwidth	Type=lustre Count=4G Flags=CountOnly

COPYING
       Copyright  (C)  2010 The	Regents	of the University of California.  Pro-
       duced at	Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2022 SchedMD LLC.

       This file is part of Slurm, a resource  management  program.   For  de-
       tails, see <https://slurm.schedmd.com/>.

       Slurm  is free software;	you can	redistribute it	and/or modify it under
       the terms of the	GNU General Public License as published	 by  the  Free
       Software	 Foundation;  either version 2 of the License, or (at your op-
       tion) any later version.

       Slurm is	distributed in the hope	that it	will be	 useful,  but  WITHOUT
       ANY  WARRANTY;  without even the	implied	warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR	PURPOSE. See the GNU  General  Public  License
       for more	details.

SEE ALSO
       slurm.conf(5)

Slurm 25.11		   Slurm Configuration File		  gres.conf(5)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=gres.conf&sektion=5&manpath=FreeBSD+Ports+15.0.quarterly>

home | help