HWLOC-CALC(1)			     hwloc			 HWLOC-CALC(1)

NAME
       hwloc-calc - Operate on cpu mask	strings	and objects

SYNOPSIS
       hwloc-calc  [topology options] [options]	<location1> [<location2> [...]
       ]

       Note that hwloc(7) provides a detailed explanation of the hwloc	system
       and  of valid <location>	formats; it should be read before reading this
       man page.

TOPOLOGY OPTIONS
       All topology options must be given before all other options.

       --no-smt, --no-smt=<N>
		 Only keep the first PU	per core in the	input  locations.   If
		 <N>  is  specified, keep the <N>-th instead, if any.  PUs are
		 ordered by physical index during this filtering.

		 Note that this option is applied after searching locations.
		 Hence --no-smt pu:2-5 will first select PUs #2 to #5 in
		 the machine before keeping one of them per core.  To instead
		 get PUs #2 to #5 after filtering one per core, combine
		 invocations:

		   hwloc-calc --restrict $(hwloc-calc --no-smt all) pu:2-5

       --cpukind <n>, --cpukind	<infoname>=<infovalue>
		 Only keep PUs whose CPU kind matches.  Either a single CPU
		 kind is specified as an index, or an info attribute
		 name-value pair selects the matching kinds.

		 When  specified  by index, it corresponds to hwloc ranking of
		 CPU kinds which returns  energy-efficient  cores  first,  and
		 high-performance  power-hungry	 cores last.  The full list of
		 CPU kinds may be seen with lstopo --cpukinds.

		 Note that this option is applied after searching locations.
		 Hence --cpukind 0 core:1 will return the second core of the
		 machine if it is of kind 0, and nothing otherwise.  To instead
		 get the second core among those of kind 0, combine
		 invocations:

		   hwloc-calc --restrict $(hwloc-calc --cpukind	0 all) core:1

       --restrict <cpuset>
		 Restrict the topology to the given cpuset.  This removes some
		 PUs and their now-child-less parents.

		 This is useful	when combining invocations to filter some  ob-
		 jects before selecting	among them.

		 Beware	 that restricting the PUs in a topology	may change the
		 logical indexes of many objects, including NUMA nodes.

       --restrict nodeset=<nodeset>
		 Restrict the topology to  the	given  nodeset	(unless	 --re-
		 strict-flags  specifies  something  different).  This removes
		 some NUMA nodes and their now-child-less parents.

		 Beware	that restricting the NUMA  nodes  in  a	 topology  may
		 change	the logical indexes of many objects, including PUs.

       --restrict-flags	<flags>
		 Enforce  flags	 when  restricting the topology.  Flags	may be
		 given as numeric values or as a comma-separated list of  flag
		 names	that  are  passed to hwloc_topology_restrict().	 Those
		 names may be substrings of actual flag	names  as  long	 as  a
		 single	 one matches, for instance bynodeset,memless.  The de-
		 fault is 0 (or	none).

       --disallowed
		 Include objects disallowed by administrative limitations.

       -i <path>, --input <path>
		 Read the topology from	 <path>	 instead  of  discovering  the
		 topology of the local machine.

		 If <path> is a file, it may be an XML file exported by a
		 previous hwloc program.  If <path> is "-", the standard input
		 is read as an XML file.

		 On  Linux,  <path> may	be a directory containing the topology
		 files gathered	from  another  machine	topology  with	hwloc-
		 gather-topology.

		 On  x86,  <path>  may	be a directory containing a cpuid dump
		 gathered with hwloc-gather-cpuid.

		 When the archivemount program is available, <path>  may  also
		 be a tarball containing such Linux or x86 topology files.

       -i <specification>, --input <specification>
		 Simulate  a fake hierarchy (instead of	discovering the	topol-
		 ogy on	the local  machine).  If  <specification>  is  "node:2
		 pu:3",	 the  topology will contain two	NUMA nodes with	3 pro-
		 cessing units in each of them.	  The  <specification>	string
		 must end with a number	of PUs.

       --if <format>, --input-format <format>
		 Enforce  the  input  in  the given format, among xml, fsroot,
		 cpuid and synthetic.
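
       Both --no-smt and --cpukind are applied after locations are searched,
       and the text above suggests combining two invocations through
       --restrict to filter first.  The helper below is a minimal sketch of
       that pattern.  The function name calc_within and the HWLOC_CALC
       override variable are hypothetical and only serve the illustration.

```shell
# HWLOC_CALC is a hypothetical override so the helper can be pointed at a
# stub; it defaults to the real program name.
HWLOC_CALC=${HWLOC_CALC:-hwloc-calc}

# usage: calc_within "<filter options>" <location>...
# Step 1: compute the cpuset left by the filter, e.g. "--no-smt".
# Step 2: restrict the topology to that cpuset, then search the locations.
calc_within() {
    filter=$1
    shift
    # $filter is intentionally unquoted so "--cpukind 0" splits into two words.
    "$HWLOC_CALC" --restrict "$("$HWLOC_CALC" $filter all)" "$@"
}
```

       For instance, calc_within --no-smt pu:2-5 runs the same two commands
       as the --no-smt example above, and calc_within "--cpukind 0" core:1
       matches the --cpukind one.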

OPTIONS
       All these options must be given after all topology options above.

       -p --physical
		 Use OS/physical indexes instead of logical indexes  for  both
		 input and output.

       -l --logical
		 Use  logical  indexes instead of physical/OS indexes for both
		 input and output (default).

       --pi --physical-input
		 Use OS/physical indexes instead of logical indexes for	input.

       --li --logical-input
		 Use logical indexes instead of	physical/OS indexes for	 input
		 (default).

       --po --physical-output
		 Use  OS/physical  indexes instead of logical indexes for out-
		 put.

       --lo --logical-output
		 Use logical indexes instead of	physical/OS indexes for	output
		 (default, except for cpusets which are	always physical).

       -n --nodeset
		 Interpret both	input and output sets as nodesets  instead  of
		 CPU sets.  See	--nodeset-output and --nodeset-input below for
		 details.

       --no --nodeset-output
		 Report	 nodesets  instead  of	CPU sets.  This	output is more
		 precise than the default CPU set output when memory  locality
		 matters because it properly describes CPU-less	NUMA nodes, as
		 well as NUMA-nodes that are local to multiple CPUs.

       --ni --nodeset-input
		 Interpret input sets as nodesets instead of CPU sets.

       --oo --object-output
		 When  reporting  object indexes (e.g. with -I or --local-mem-
		 ory), this option prefixes these  indexes  with  types	 (e.g.
		 Core:0	instead	of 0).

       -N --number-of <type|depth>
		 Report	 the number of objects of the given type or depth that
		 intersect the CPU set.	 This is convenient  for  finding  how
		 many cores, NUMA nodes	or PUs are available in	a machine.

		 When combined with --nodeset or --nodeset-output, the nodeset
		 is considered instead of the CPU set for finding matching ob-
		 jects.	  This is useful when reporting	the output as a	number
		 or set	of NUMA	nodes.

		 <type> may contain a filter to select specific objects of
		 that type.  For instance -N "numa[hbm]" counts NUMA nodes
		 marked with subtype "HBM", while -N "numa[mcdram]" only
		 counts MCDRAM NUMA nodes on KNL.

		 If  an	OS device subtype such as gpu  is given	instead	of os-
		 dev, only the os devices of that subtype will be counted.

       -I --intersect <type|depth>
		 Find the list of objects of the given type or depth that  in-
		 tersect  the  CPU  set	and report the comma-separated list of
		 their indexes instead of the cpu mask string.	 This  may  be
		 used  for  determining	the list of objects above or below the
		 input objects.

		 When combined with --physical,	the list is convenient to pass
		 to external tools such	as taskset or numactl --physcpubind or
		 --membind.  This is different from --largest since the	latter
		 requires that all reported objects are	strictly included  in-
		 side the input	objects.

		 When combined with --nodeset or --nodeset-output, the nodeset
		 is considered instead of the CPU set for finding matching ob-
		 jects.	  This is useful when reporting	the output as a	number
		 or set	of NUMA	nodes.

		 <type> may contain a filter to select specific objects of
		 that type.  For instance -I "numa[hbm]" lists NUMA nodes marked
		 with subtype "HBM", while -I "numa[mcdram]" only lists MCDRAM
		 NUMA nodes on KNL.

		 If  an	 OS device subtype such	as gpu is given	instead	of os-
		 dev, only the os devices of that subtype will be returned.

		 If combined with --object-output, object indexes are prefixed
		 with types (e.g. Core:0 instead of 0).

       -H --hierarchical <type1>.<type2>...
		 Find the list of objects of type <type2> that	intersect  the
		 CPU  set and report the space-separated list of their hierar-
		 chical	indexes	with respect to	<type1>,  <type2>,  etc.   For
		 instance, if package.core is given, the output	would be Pack-
		 age:1.Core:2 Package:2.Core:3 if the input contains the third
		 core  of  the second package and the fourth core of the third
		 package.

		 Only normal CPU-side object types should be used.

		 NUMA nodes may	be used	but they may cause redundancy  in  the
		 output	 on  heterogeneous memory platform. For	instance, on a
		 platform with both DRAM and HBM  memory  on  a	 package,  the
		 first	core  will  be	considered both	as first core of first
		 NUMA node (DRAM) and as first core of second NUMA node	(HBM).

       --largest Report	(in a human readable format) the list of  largest  ob-
		 jects	which exactly include all input	objects	(by looking at
		 their CPU sets).  None	of these output	objects	intersect each
		 other,	and the	sum of them is exactly equivalent to  the  in-
		 put. No larger	object is included in the input.

		 This is different from	--intersect where reported objects may
		 not be	strictly included in the input.

       --local-memory
		 Report	the list of NUMA nodes that are	local to the input ob-
		 jects.

		 This  option  is similar to -I	numa but the way nodes are se-
		 lected	is different: The selection performed by  --local-mem-
		 ory  may  be  precisely configured with --local-memory-flags,
		 while -I numa just selects all	nodes that are	somehow	 local
		 to any	of the input objects.

		 If combined with --object-output, object indexes are prefixed
		 with types (e.g. NUMANode:0 instead of	0).

       --local-memory-flags
		 Change	 the flags used	to select local	NUMA nodes.  Flags may
		 be given as numeric values or as a  comma-separated  list  of
		 flag	names	that   are  passed  to	hwloc_get_local_numan-
		 ode_objs().  Those names may be  substrings  of  actual  flag
		 names	as long	as a single one	matches.  The default is 3 (or
		 smaller,larger) which means NUMA nodes	are displayed if their
		 locality either contains or is	contained in the  locality  of
		 the given object.

		 This option enables --local-memory.

       --best-memattr <name>
		 Enable	the listing of local memory nodes with --local-memory,
		 but only display the local nodes that have the	best value for
		 the memory attribute given by <name> (or as an	index).

		 If  the  memory attribute values depend on the	initiator, the
		 hwloc-calc input objects are used as the initiator.

		 Standard attribute names are Capacity,	 Locality,  Bandwidth,
		 and Latency.  All existing attributes in the current topology
		 may be	listed with

		     $ lstopo --memattrs

		 If  combined  with  --object-output, the object index is pre-
		 fixed with its	type (e.g. NUMANode:0 instead of 0).

		 <name>	may be suffixed	with flags to tune  the	 selection  of
		 best  nodes,  for  instance as	bandwidth,strict,default.  de-
		 fault means that all local nodes  are	reported  if  no  best
		 could be found.  strict means that nodes are selected only if
		 their	performance  is	 the best for all the input CPUs. On a
		 dual-socket machine with HBM in each socket,  both  HBMs  are
		 the  best  for	 their	local  socket,	but not	for the	remote
		 socket.  Hence	both HBM are also considered best for the  en-
		 tire machine by default, but none if strict.

       --sep <sep>
		 Change	 the  field  separator	in  the	output.	 By default, a
		 space is used to separate output objects (for	instance  when
		 --hierarchical	 or  --largest is given) while a comma is used
		 to separate indexes (for instance when	--intersect is given).

       --single	 Singlify the output to	a single CPU.

       --cpuset-output-format <hwloc|list|taskset|systemd-dbus-api> --cof
       <hwloc|list|taskset|systemd-dbus-api>
		 Change	the format of displayed	CPU set	strings.  By  default,
		 the hwloc-specific format is used.  If list is given, the
		 output is a comma-separated list of numbers or ranges, e.g.
		 2,4-5,8.  If taskset is given, the output is compatible with
		 the  taskset  program (replaces the former --taskset option).
		 If systemd-dbus-api is	given, the output is  compatible  with
		 systemd's  D-Bus  API,	e.g. "AllowedCPUs ay 0x0002 0x78 0x04"
		 for the CPU set list "3-6,10".

		 This option has no impact on the  format  of  input  CPU  set
		 strings, see --cpuset-input-format.

       --cpuset-input-format <hwloc|list|taskset> --cif	<hwloc|list|taskset>
		 Change	 the format of input CPU set strings.  By default, the
		 tool tries to guess the  type	automatically  between	hwloc,
		 list or taskset formats.  This	option forces the parsing for-
		 mat  to  avoid	 ambiguity  for	 instance  when	"1,3,5"	may be
		 parsed	as a hwloc cpuset  "0x1,0x00000003,0x00000005"	or  as
		 list "1-1,3-3,5-5".

		 This  option  has  no	impact on the format of	output CPU set
		 strings, see --cpuset-output-format.

       -q --quiet
		 Hide non-fatal	error messages.	 It mostly includes  locations
		 pointing to non-existing objects.

       -v --verbose
		 Verbose output.

       --version Report	version	and exit.

       -h --help Display help message and exit.

DESCRIPTION
       hwloc-calc generates and	manipulates CPU	mask strings or	objects.  Both
       input  and  output  may be either objects (with physical	or logical in-
       dexes), CPU lists (with physical	 or  logical  indexes),	 or  CPU  mask
       strings	(always	 physically indexed).  Input location specification is
       described in hwloc(7).
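
       To make the difference between a CPU mask string and a CPU list
       concrete, the sketch below decodes a finite hwloc-style mask into a
       list.  It assumes the mask is a comma-separated sequence of 32-bit
       hexadecimal words, most significant word first, with an empty word
       standing for zero.  The function name hwloc_mask_to_list is
       hypothetical, and hwloc-calc itself should be preferred for real
       conversions.

```shell
# Decode a finite hwloc-style mask into a comma-separated CPU list.
# Assumption: 32-bit hex words, most significant first, empty word = 0.
hwloc_mask_to_list() {
    old_ifs=$IFS
    IFS=,
    set -- $1                        # split the mask into words on commas
    IFS=$old_ifs
    n=$#
    k=0
    for word in "$@"; do
        k=$((k + 1))
        base=$(( (n - k) * 32 ))     # bit offset of this word
        hex=${word#0x}               # tolerate words without a 0x prefix
        val=$(( 0x${hex:-0} ))
        b=0
        while [ "$b" -lt 32 ]; do
            if [ $(( (val >> b) & 1 )) -eq 1 ]; then
                echo $((base + b))
            fi
            b=$((b + 1))
        done
    done | sort -n | paste -sd, -
}
```

       Under these assumptions, hwloc_mask_to_list 0x000000f0 prints 4,5,6,7,
       while hwloc_mask_to_list 1,3,5 prints 0,2,32,33,64.  The second case
       shows why an input such as "1,3,5" is ambiguous without
       --cpuset-input-format: read as a list it would mean CPUs 1, 3 and 5.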

       If objects or CPU mask strings are given on the command line, they are
       combined and a single output is printed.  If no objects or CPU mask
       strings are given on the command line, the program reads the standard
       input.  Objects or CPU mask strings that appear on the same input line,
       separated by spaces, are combined.  Different input lines are
       processed separately.

       Command-line arguments and options are processed in order.  Topology
       configuration options should be given first.  Then, for instance,
       changing the type of input indexes with --li or changing the input
       topology with -i only affects the processing of the following
       arguments.

       NOTE: It	is highly recommended that you read the	hwloc(7) overview page
       before reading this man	page.	Most  of  the  concepts	 described  in
       hwloc(7)	directly apply to the hwloc-calc utility.

EXAMPLES
       hwloc-calc's operation is best described	through	several	examples.

       To display the (physical) CPU mask corresponding	to the second package:

	   $ hwloc-calc	package:1
	   0x000000f0

       To display the (physical) CPU mask corresponding to the third package,
       excluding its even-numbered logical processors:

	   $ hwloc-calc	package:2 ~PU:even
	   0x00000c00

       To display the (physical) CPU mask of the entire	 topology  except  the
       third package:

	   $ hwloc-calc all ~package:2
	   0x0000f0ff

       To combine two (physical) CPU masks:

	   $ hwloc-calc	0x0000ffff 0xff000000
	   0xff00ffff

Examples of listing or counting	objects
       To  display  the	 list of logical numbers of processors included	in the
       second package:

	   $ hwloc-calc	--intersect PU package:1
	   4,5,6,7

       To bind GNU OpenMP threads logically over the whole machine, we need to
       use physical number output instead:

	   $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU all`
	   $ echo $GOMP_CPU_AFFINITY
	   0,4,1,5,2,6,3,7

       To  display the list of NUMA nodes, by physical indexes,	that intersect
       a given (physical) CPU mask:

	   $ hwloc-calc	--physical --intersect NUMAnode	0xf0f0f0f0
	   0,2

       To find how many	cores are in the second	 CPU  kind  (those  cores  are
       likely higher-performance and more power-hungry than cores of the first
       kind):

	   $ hwloc-calc	--cpukind 1 -N core all
	   4

       To  convert  a  cpu mask	to human-readable output, the -H option	can be
       used to emit a space-delimited list of locations:

	   $ echo 0x000000f0 | hwloc-calc -q -H	package.core
	   Package:1.Core:0 Package:1.Core:1 Package:1.Core:2 Package:1.Core:3

       To use some other character (e.g., a comma) instead of spaces  in  out-
       put, use	the --sep option:

	   $ echo 0x000000f0 | hwloc-calc -q -H	package.core --sep ,
	   Package:1.Core:0,Package:1.Core:1,Package:1.Core:2,Package:1.Core:3

       To synthesize a set of cores into largest objects on a 2-node 2-package
       2-core machine:

	   $ hwloc-calc	core:0 --largest
	   Core:0
	   $ hwloc-calc	core:0-1 --largest
	   Package:0
	   $ hwloc-calc	core:4-7 --largest
	   L3Cache:1
	   $ hwloc-calc	core:2-6 --largest
	   Package:1 Package:2 Core:6
	   $ hwloc-calc	pack:2 --largest
	   Package:2
	   $ hwloc-calc	package:2-3 --largest
	   L3Cache:1

       To get the set of first threads of all cores:

	   $ hwloc-calc	core:all.pu:0
	   0xffff0000
	   $ hwloc-calc	--no-smt all -I	pu
	   0,2,4,6,8,10,12,14

Examples of listing or counting	NUMA nodes
       To  display the list of NUMA nodes, by physical indexes,	whose locality
       is exactly equal	to a Package:

	   $ hwloc-calc	--local-memory-flags 0 --physical-output pack:1
	   4,7

       To display the best-capacity NUMA node(s), by physical  indexes,	 whose
       locality	is exactly equal to a Package:

	   $ hwloc-calc --local-memory-flags 0 --best-memattr capacity --physical-output pack:1
	   4

       To find the number of NUMA nodes	with subtype "HBM":

	   $ hwloc-calc	-N "numa[hbm]" all
	   4

       To  find	 the  number  of  NUMA nodes in	memory tier 1 (DRAM nodes on a
       server with HBM and DRAM):

	   $ hwloc-calc	-N "numa[tier=1]" all
	   4

       To find the NUMA	node of	subtype	MCDRAM (on KNL)	near a PU:

	   $ hwloc-calc	-I "numa[mcdram]" pu:157
	   1

Examples with physical and logical indexes
       Converting object logical indexes (default) from/to physical/OS indexes
       may be performed	with --intersect combined with either  --physical-out-
       put  (logical  to physical conversion) or --physical-input (physical to
       logical):

	   $ hwloc-calc	--physical-output PU:2 --intersect PU
	   3
	   $ hwloc-calc	--physical-input PU:3 --intersect PU
	   2
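
       The logical-to-physical conversion above can be repeated for every PU
       to print a full mapping table.  This is a sketch only: the function
       name pu_index_table and the HWLOC_CALC override variable are
       hypothetical, and it relies on logical PU indexes running from 0 to
       the count reported by --number-of pu all, minus one.

```shell
# HWLOC_CALC is a hypothetical override so the helper can be pointed at a
# stub; it defaults to the real program name.
HWLOC_CALC=${HWLOC_CALC:-hwloc-calc}

# Print one "<logical> <physical>" pair per PU of the topology.
pu_index_table() {
    count=$("$HWLOC_CALC" --number-of pu all)
    i=0
    while [ "$i" -lt "$count" ]; do
        phys=$("$HWLOC_CALC" --physical-output "PU:$i" --intersect PU)
        printf '%s %s\n' "$i" "$phys"
        i=$((i + 1))
    done
}
```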

       One should add --nodeset	when converting	indexes	of memory  objects  to
       make  sure  a single NUMA node index is returned	on platforms with het-
       erogeneous memory:

	   $ hwloc-calc	--nodeset --physical-output node:2 --intersect node
	   3
	   $ hwloc-calc	--nodeset --physical-input node:3 --intersect node
	   2

       To combine both physical	and logical indexes as input:

	   $ hwloc-calc	PU:2 --physical-input PU:3
	   0x0000000c

Examples with I/O devices
       To display the set of CPUs near network interface eth0:

	   $ hwloc-calc	os=eth0
	   0x00005555

       To display the indexes of packages near PCI  device  whose  bus	ID  is
       0000:01:02.0:

	   $ hwloc-calc	pci=0000:01:02.0 --intersect Package
	   1

       OS devices may also be filtered by subtype.  In this example, there are
       9 OS devices in the system, 4 of them are near NUMA node #1, and only 2
       of these are CoProcessors:

	   $ utils/hwloc/hwloc-calc -I osdev all
	   0,1,2,3,4,5,6,7,8
	   $ utils/hwloc/hwloc-calc -I osdev node:1
	   5,6,7,8
	   $ utils/hwloc/hwloc-calc -I coproc node:1
	   7,8

Examples with other tools
       To make GNU OpenMP use exactly one thread per core, and in logical core
       order:

	   $ export OMP_NUM_THREADS=`hwloc-calc	--number-of core all`
	   $ echo $OMP_NUM_THREADS
	   4
	   $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU --no-smt all`
	   $ echo $GOMP_CPU_AFFINITY
	   0,2,1,3
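
       The --intersect option is described above as convenient for feeding
       numactl --physcpubind or --membind.  The sketch below builds both
       options for one location.  The function name numactl_args_for and the
       HWLOC_CALC override variable are hypothetical.  --physical-output is
       used so that the location itself is still given by logical index.

```shell
# HWLOC_CALC is a hypothetical override so the helper can be pointed at a
# stub; it defaults to the real program name.
HWLOC_CALC=${HWLOC_CALC:-hwloc-calc}

# usage: numactl_args_for <location>
# Prints numactl binding options covering the PUs and NUMA nodes that
# intersect the location, both as physical (OS) indexes.
numactl_args_for() {
    cpus=$("$HWLOC_CALC" --physical-output --intersect PU "$1")
    nodes=$("$HWLOC_CALC" --physical-output --intersect NUMAnode "$1")
    printf '%s\n' "--physcpubind=$cpus --membind=$nodes"
}
```

       A possible use is numactl $(numactl_args_for package:1) ./my_app,
       where ./my_app stands for any program to bind.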

       To export a bitmask in a format accepted by the resctrl Linux
       subsystem (for configuring cache partitioning, etc.), apply a sed
       regexp to the output of hwloc-calc:

	   $ hwloc-calc	pack:all.core:7-9.pu:0
	   0x00000380,,0x00000380   <this format cannot	be given to resctrl>
	   $ hwloc-calc pack:all.core:7-9.pu:0 | sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
	   00000380,0,00000380
	   # echo 00000380,0,00000380 >	/sys/fs/resctrl/test/cpus
	   # cat /sys/fs/resctrl/test/cpus
	   00000000,00000380,00000000,00000380   <the modified bitmask was correctly parsed by resctrl>
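
       The sed expression above can be wrapped in a small filter.  The
       function name to_resctrl_mask is hypothetical.  The sketch also shows
       why the ",," substitution appears twice: a single pass cannot rewrite
       overlapping matches, which occur when two or more empty words are
       adjacent.

```shell
# Read an hwloc mask on stdin and print it without "0x" prefixes and with
# an explicit 0 in place of each empty word.
to_resctrl_mask() {
    # The second ",," pass handles runs such as ",,," where the first pass
    # leaves one empty word behind.
    sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
}
```

       For example, hwloc-calc pack:all.core:7-9.pu:0 | to_resctrl_mask
       would produce the string written to /sys/fs/resctrl/test/cpus above.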

Example	of use of the systemd-dbus-api cpuset output format
       hwloc-calc can generate the rather cryptic AllowedCPUs string that the
       D-Bus API of systemd expects from any other supported CPU set
       representation.  This is especially useful when the systemd-run
       command, which understands CPU sets given as lists, cannot be used.

       First, create a systemd slice:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss my_slice.slice fail

       Then, configure the CPU set of the slice, using hwloc-calc to translate
       the syntax:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties 'sba(sv)' my_slice.slice	1 1 $(hwloc-calc pu:0 pu:31 pu:32 pu:63	pu:64 pu:77 --cpuset-output-format systemd-dbus-api)

       Finally,	add the	current	process	to the slice:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartTransientUnit	'ssa(sv)a(sa(sv))' my_scope.scope fail 3 Delegate b 1 PIDs au 1	$$ Slice s my_slice.slice 0

       More info in the	org.freedesktop.systemd1(5) manual page.
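
       To show what the AllowedCPUs string encodes, the sketch below rebuilds
       it from a CPU list in plain shell.  It reproduces the documented
       example ("3-6,10" gives "AllowedCPUs ay 0x0002 0x78 0x04") by packing
       CPU numbers into bytes, least significant byte first.  The function
       name cpulist_to_dbus is hypothetical, and the field widths are copied
       from that single example, so output for other inputs is an assumption
       rather than a guarantee of what hwloc-calc prints.

```shell
# Encode a CPU list such as "3-6,10" as a D-Bus byte array (type "ay").
cpulist_to_dbus() {
    max=-1
    bits=" "
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        lo=${range%-*}               # "3-6" -> 3, "10" -> 10
        hi=${range#*-}               # "3-6" -> 6, "10" -> 10
        i=$lo
        while [ "$i" -le "$hi" ]; do
            bits="$bits$i "
            if [ "$i" -gt "$max" ]; then max=$i; fi
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    nbytes=$((max / 8 + 1))
    out=$(printf 'AllowedCPUs ay 0x%04x' "$nbytes")
    b=0
    while [ "$b" -lt "$nbytes" ]; do
        val=0
        for bit in $bits; do
            # CPU n lives in byte n/8, at bit position n%8.
            if [ $((bit / 8)) -eq "$b" ]; then
                val=$((val | (1 << (bit % 8))))
            fi
        done
        out="$out $(printf '0x%02x' "$val")"
        b=$((b + 1))
    done
    printf '%s\n' "$out"
}
```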

RETURN VALUE
       Upon  successful	execution, hwloc-calc displays the (physical) CPU mask
       string, (physical or logical) object list, or (physical or logical) ob-
       ject number list.  The return value is 0.

       hwloc-calc will return nonzero if any kind of  error  occurs,  such  as
       (but not	limited	to): failure to	parse the command line.

SEE ALSO
       hwloc(7), lstopo(1), hwloc-info(1)

2.11.2				 Sep 26, 2024			 HWLOC-CALC(1)
