HWLOC-CALC(1)			     hwloc			 HWLOC-CALC(1)

NAME
       hwloc-calc - Operate on cpu mask	strings	and objects

SYNOPSIS
       hwloc-calc  [topology options] [options]	<location1> [<location2> [...]
       ]

       Note that hwloc(7) provides a detailed explanation of the hwloc	system
       and  of valid <location>	formats; it should be read before reading this
       man page.

TOPOLOGY OPTIONS
       All topology options must be given before all other options.

       --no-smt, --no-smt=<N>
		 Only keep the first PU	per core in the	input  locations.   If
		 <N>  is  specified, keep the <N>-th instead, if any.  PUs are
		 ordered by physical index during this filtering.

		 Note that this	option is applied after	 searching  locations.
		 Hence	--no-smt  pu:2-5 will first select the PUs #2 to #5 in
		 the machine before keeping one of them per core.  To instead
		 get PUs #2 to #5 after filtering one per core, combine
		 invocations:

		   hwloc-calc --restrict $(hwloc-calc --no-smt all) pu:2-5

       --cpukind <n>, --cpukind	<infoname>=<infovalue>
		 Only keep PUs whose CPU kind matches.  Either a single CPU
		 kind is specified by index, or an info attribute name-value
		 pair selects all matching kinds.

		 When  specified  by index, it corresponds to hwloc ranking of
		 CPU kinds which returns  energy-efficient  cores  first,  and
		 high-performance  power-hungry	 cores last.  The full list of
		 CPU kinds may be seen with lstopo --cpukinds.

		 Note that this	option is applied after	 searching  locations.
		 Hence	--cpukind  0 core:1 will return	the second core	of the
		 machine if it is of kind 0, and nothing otherwise.  To instead
		 get the second core among those of kind 0, combine
		 invocations:

		   hwloc-calc --restrict $(hwloc-calc --cpukind	0 all) core:1

       --default-nodes
		 Only keep NUMA	nodes that are	considered  default  nodes  on
		 heterogeneous	memory	platforms.  This usually includes DRAM
		 memory	nodes (or nodes	of the same memory tier)  rather  than
		 nodes with specific characteristics (HBM, NVM,	CXL, etc).

		 This  option is useful	for splitting the topology by NUMA do-
		 main when binding one task per	domain even if some  NUMA  do-
		 mains	have the same locality (e.g. one DRAM and one HBM node
		 per socket).

		 See hwloc_topology_get_default_nodeset() for details.

       --restrict <cpuset>
		 Restrict the topology to the given cpuset.  This removes some
		 PUs and their now-child-less parents.

		 This is useful	when combining invocations to filter some  ob-
		 jects before selecting	among them.

		 Beware	 that restricting the PUs in a topology	may change the
		 logical indexes of many objects, including NUMA nodes.

       --restrict nodeset=<nodeset>
		 Restrict the topology to  the	given  nodeset	(unless	 --re-
		 strict-flags  specifies  something  different).  This removes
		 some NUMA nodes and their now-child-less parents.

		 Beware	that restricting the NUMA  nodes  in  a	 topology  may
		 change	the logical indexes of many objects, including PUs.

       --restrict-flags	<flags>
		 Enforce  flags	 when  restricting the topology.  Flags	may be
		 given as numeric values or as a comma-separated list of  flag
		 names	that  are  passed to hwloc_topology_restrict().	 Those
		 names may be substrings of actual flag	names  as  long	 as  a
		 single	 one matches, for instance bynodeset,memless.  The de-
		 fault is 0 (or	none).

       --disallowed
		 Include objects disallowed by administrative limitations.

       -i <path>, --input <path>
		 Read the topology from	 <path>	 instead  of  discovering  the
		 topology of the local machine.

		 If <path> is a file, it may be an XML file exported by a
		 previous hwloc program.  If <path> is "-", the standard input
		 may be used as an XML file.

		 On Linux, <path> may be a directory containing the topology
		 files gathered from another machine with
		 hwloc-gather-topology.

		 On  x86,  <path>  may	be a directory containing a cpuid dump
		 gathered with hwloc-gather-cpuid.

		 When the archivemount program is available, <path>  may  also
		 be a tarball containing such Linux or x86 topology files.

       -i <specification>, --input <specification>
		 Simulate  a fake hierarchy (instead of	discovering the	topol-
		 ogy on	the local  machine).  If  <specification>  is  "node:2
		 pu:3",	 the  topology will contain two	NUMA nodes with	3 pro-
		 cessing units in each of them.	  The  <specification>	string
		 must end with a number	of PUs.

       --if <format>, --input-format <format>
		 Enforce  the  input  in  the given format, among xml, fsroot,
		 cpuid and synthetic.

OUTPUT CONVERSION OPTIONS
       By default, the output is a CPU set (or nodeset).  These	 options  con-
       vert this set into objects, count them, etc.

       All these options must be given after all topology options above.

       -N --number-of <type|depth>
	      Report the number	of objects of the given	type or	depth that in-
	      tersect  the  CPU	 set.  This is convenient for finding how many
	      cores, NUMA nodes	or PUs are available in	a machine.

	      <type> may contain a filter to select specific objects among the
	      type.  For instance -N "numa[hbm]" counts NUMA nodes marked with
	      subtype "HBM", while -N "numa[mcdram]" only counts MCDRAM NUMA
	      nodes on KNL.

	      If an OS device subtype such as gpu is given instead of osdev,
	      only the OS devices of that subtype will be counted.

	      Special values such as cpukind and memorytier may	 be  given  to
	      return the number	of cpukinds or memory tiers matching the input
	      location.

       -I --intersect <type|depth>
	      Find  the	list of	objects	of the given type or depth that	inter-
	      sect the CPU set and report the comma-separated  list  of	 their
	      indexes  instead	of  the	cpu mask string.  This may be used for
	      determining the list of objects above or	below  the  input  ob-
	      jects.

	      When combined with --physical, the list is convenient to pass to
	      external	tools  such  as	 taskset  or  numactl --physcpubind or
	      --membind.  This is different from --largest  since  the	latter
	      requires	that all reported objects are strictly included	inside
	      the input	objects.

	      <type> may contain a filter to select specific objects among the
	      type.  For instance -I "numa[hbm]" lists NUMA nodes marked with
	      subtype "HBM", while -I "numa[mcdram]" only lists MCDRAM NUMA
	      nodes on KNL.  Note that this filter applies when selecting
	      objects, but not when outputting them, e.g. MCDRAM NUMA node #3 is
	      output as 7 (NUMA node #7) instead of 3.

	      If an OS device subtype such as gpu is given instead of osdev,
	      only the OS devices of that subtype will be returned.

	      Special  values  such  as	cpukind	and memorytier may be given to
	      return the list of cpukind or memory tier	indexes	 matching  the
	      input location.

	      If  combined  with  --object-output, object indexes are prefixed
	      with types (e.g. Core:0 instead of 0).

       -H --hierarchical <type1>.<type2>...
	      Find the list of objects of type <type2> that intersect the  CPU
	      set  and	report	the space-separated list of their hierarchical
	      indexes with respect to <type1>, <type2>,	etc.  For instance, if
	      package.core is given,  the  output  would  be  Package:1.Core:2
	      Package:2.Core:3	if  the	 input	contains the third core	of the
	      second package and the fourth core of the	third package.

	      Only normal CPU-side object types	should be used.

	      NUMA nodes may be used but they may cause redundancy in the
	      output on heterogeneous memory platforms.  For instance, on a
	      platform with both DRAM and HBM memory on a package, the first core
	      will be considered both as the first core of the first NUMA node
	      (DRAM) and as the first core of the second NUMA node (HBM).

       --largest
	      Report  (in a human readable format) the list of largest objects
	      which exactly include all	input objects (by looking at their CPU
	      sets).  None of these output objects intersect each  other,  and
	      the  sum	of  them is exactly equivalent to the input. No	larger
	      object is	included in the	input.

	      This is different	from --intersect where	reported  objects  may
	      not be strictly included in the input.

       --local-memory
	      Report  the  list	 of NUMA nodes that are	local to the input ob-
	      jects.

	      This option is similar to	-I numa	but the	way nodes are selected
	      is different: The	selection performed by --local-memory  may  be
	      precisely	 configured  with  --local-memory-flags, while -I numa
	      just selects all nodes that are somehow local to any of the  in-
	      put objects.

	      If  combined  with  --object-output, object indexes are prefixed
	      with types (e.g. NUMANode:0 instead of 0).

       --local-memory-flags
	      Change the flags used to select local NUMA nodes.	 Flags may  be
	      given  as	 numeric  values  or as	a comma-separated list of flag
	      names that are passed to hwloc_get_local_numanode_objs().	 Those
	      names may	be substrings of actual	flag names as long as a	single
	      one matches.  The default is 0xb (or smaller,larger,intersects)
	      which means NUMA nodes are displayed if their locality either
	      contains, is contained in, or intersects the locality of the given
	      object.

	      This option enables --local-memory.

       --best-memattr <name>
	      Enable  the  listing  of local memory nodes with --local-memory,
	      but only display the local nodes that have the  best  value  for
	      the memory attribute given by <name> (or as an index).

	      If  the  memory  attribute  values  depend on the	initiator, the
	      hwloc-calc input objects are used	as the initiator.

	      Standard attribute names are Capacity, Locality, Bandwidth,  and
	      Latency.	All existing attributes	in the current topology	may be
	      listed with

		  $ lstopo --memattrs

	      If  combined  with --object-output, the object index is prefixed
	      with its type (e.g. NUMANode:0 instead of	0).

	      <name> may be suffixed with flags	to tune	the selection of  best
	      nodes, for instance as bandwidth,strict,default.

	      default  means  that default nodes are reported if no best could
	      be found (see --default-nodes).  If  neither  best  nor  default
	      nodes could be found, all	local nodes are	reported.

	      strict  means  that nodes	are selected only if their performance
	      is the best for all the input CPUs.  On  a  dual-socket  machine
	      with  HBM	in each	socket,	both HBMs are the best for their local
	      socket, but not for the remote socket.  Hence both HBMs are
	      considered best for the entire machine by default, but neither
	      is with strict.

INPUT /	OUTPUT SET AND OBJECT OPTIONS
       These options configure how objects and CPU/node	sets are parsed	on in-
       put and formatted on output.

       All these options must be given after all topology options above.

       -p --physical
		 Use OS/physical indexes instead of logical indexes  for  both
		 input and output.

       -l --logical
		 Use  logical  indexes instead of physical/OS indexes for both
		 input and output (default).

       --pi --physical-input
		 Use OS/physical indexes instead of logical indexes for	input.

       --li --logical-input
		 Use logical indexes instead of	physical/OS indexes for	 input
		 (default).

       --po --physical-output
		 Use  OS/physical  indexes instead of logical indexes for out-
		 put.

       --lo --logical-output
		 Use logical indexes instead of	physical/OS indexes for	output
		 (default, except for cpusets which are	always physical).

       -n --nodeset
		 Interpret both	input and output sets as nodesets  instead  of
		 CPU sets.  See	--nodeset-output and --nodeset-input below for
		 details.

       --no --nodeset-output
		 Report	 nodesets  instead  of	CPU sets.  This	output is more
		 precise than the default CPU set output when memory  locality
		 matters because it properly describes CPU-less	NUMA nodes, as
		 well as NUMA-nodes that are local to multiple CPUs.

       --ni --nodeset-input
		 Interpret input sets as nodesets instead of CPU sets.

FORMATTING OPTIONS
       All these options must be given after all topology options above.

       --oo --object-output
	      When  reporting object indexes (e.g. with	-I or --local-memory),
	      this option prefixes these indexes with types (e.g.  Core:0  in-
	      stead of 0).

       --sep <sep>
	      Change  the  field separator in the output.  By default, a space
	      is used to separate output objects (for instance when  --hierar-
	      chical  or --largest is given) while a comma is used to separate
	      indexes (for instance when --intersect is	given).

       --single
	      Singlify the output to a single CPU.

       --cpuset-output-format <hwloc|list|taskset|systemd-dbus-api> --cof
       <hwloc|list|taskset|systemd-dbus-api>
	      Change the format	of displayed bitmap strings (CPU set or	 node-
	      set).   By  default, the hwloc-specific format is	used.  If list
	      is given, the output is a comma-separated list of numbers or
	      ranges, e.g. 2,4-5,8.  If taskset is given, the output is compatible
	      with the taskset program (replaces the former --taskset option).
	      If systemd-dbus-api is given, the	output is compatible with sys-
	      temd's D-Bus API,	e.g. "ay 0x0002	0x78 0x04"  for	 the  CPU  set
	      list "3-6,10".

	      For  convenience,	--nodeset-output-format	(or --nof) behaves the
	      same but also implies --nodeset-output.

	      This option has no  impact  on  the  format  of  input  CPU  set
	      strings, see --cpuset-input-format.
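
	      The list and systemd-dbus-api formats can be illustrated
	      without hwloc itself.  The following is a minimal POSIX shell
	      sketch, not hwloc's implementation: the helper names
	      mask_to_list and mask_to_dbus are hypothetical, the mask must
	      fit in a shell integer (fewer than 63 bits), and the D-Bus
	      layout (a byte count followed by the bytes, lowest first) is
	      only inferred from the "3-6,10" example above.

```shell
# Hypothetical helpers; real masks of any length should go through hwloc-calc.

# Print the set bits of a small mask as a list of numbers and ranges.
mask_to_list() {
    mask=$(( $1 ))
    i=0
    out=
    start=
    while [ "$mask" -gt 0 ] || [ -n "$start" ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Open a new range if none is in progress.
            [ -n "$start" ] || start=$i
        elif [ -n "$start" ]; then
            # A zero bit closes the range that ended at the previous index.
            end=$(( i - 1 ))
            if [ "$start" -eq "$end" ]; then item=$start; else item=$start-$end; fi
            out=${out:+$out,}$item
            start=
        fi
        mask=$(( mask >> 1 ))
        i=$(( i + 1 ))
    done
    printf '%s\n' "$out"
}

# Print a small mask as "ay <byte count> <byte> ...", lowest byte first.
mask_to_dbus() {
    mask=$(( $1 ))
    count=0
    bytes=
    while [ "$mask" -gt 0 ]; do
        bytes="$bytes $(printf '0x%02x' $(( mask & 0xff )))"
        mask=$(( mask >> 8 ))
        count=$(( count + 1 ))
    done
    printf 'ay 0x%04x%s\n' "$count" "$bytes"
}

mask_to_list 0x134    # prints 2,4-5,8
mask_to_dbus 0x478    # prints ay 0x0002 0x78 0x04
```

	      Here 0x478 is the mask with bits 3 to 6 and 10 set, i.e. the
	      CPU set list "3-6,10".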

       --cpuset-input-format <hwloc|list|taskset> --cif	<hwloc|list|taskset>
	      Change  the format of input bitmap strings (CPU set or nodeset).
	      By default, the tool tries to guess the format automatically
	      among the hwloc, list and taskset formats.  This option forces the
	      parsing format to avoid ambiguity, for instance when "1,3,5" may
	      be parsed as the hwloc cpuset "0x1,0x00000003,0x00000005" or as
	      the list "1-1,3-3,5-5".

	      This option has no impact	 on  the  format  of  output  CPU  set
	      strings, see --cpuset-output-format.
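
	      To see how much the two readings of "1,3,5" differ, the list
	      interpretation can be computed by hand.  This is a minimal
	      POSIX shell sketch with a hypothetical helper name
	      (list_to_mask); it only handles indexes below 63 and is not
	      how hwloc parses its input.

```shell
# Hypothetical helper: turn a list such as "3-6,10" into a single-word mask.
list_to_mask() {
    mask=0
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        # "3-6" gives first=3 last=6; a plain "10" gives first=last=10.
        first=${range%-*}
        last=${range#*-}
        i=$first
        while [ "$i" -le "$last" ]; do
            mask=$(( mask | (1 << i) ))
            i=$(( i + 1 ))
        done
    done
    IFS=$old_ifs
    printf '0x%08x\n' "$mask"
}

# Read as a list, "1,3,5" means PUs 1, 3 and 5.
list_to_mask 1,3,5     # prints 0x0000002a
list_to_mask 3-6,10    # prints 0x00000478
```

	      Read as a hwloc cpuset, the same string is instead three
	      32-bit words, most significant first, which is a much larger
	      and entirely different set.  Passing --cif list or --cif hwloc
	      removes the guesswork.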

       -q --quiet
	      Hide non-fatal error messages.  These mostly concern locations
	      pointing to nonexistent objects.

       -v --verbose
	      Verbose output.

       --version
	      Report version and exit.

       -h --help
	      Display help message and exit.

DESCRIPTION
       hwloc-calc generates and	manipulates CPU	mask strings or	objects.  Both
       input and output	may be either objects (with physical  or  logical  in-
       dexes),	CPU  lists  (with  physical  or	 logical indexes), or CPU mask
       strings (always physically indexed).  Input location  specification  is
       described in hwloc(7).

       If objects or CPU mask strings are given on the command line, they are
       combined and a single output is printed.  If no object or CPU mask
       string is given on the command line, the program reads the standard
       input.  Multiple objects or CPU mask strings given on the same input
       line, separated by spaces, are combined.  Different input lines are
       processed separately.
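
       The per-line behavior on standard input can be pictured with a small
       stand-in.  This is only a POSIX shell sketch under stated assumptions:
       combine_lines is a hypothetical name, it accepts only plain
       hexadecimal masks that fit in a shell integer (no objects), and it
       mimics the combination by OR-ing the masks found on each line.

```shell
# Hypothetical stand-in for the stdin mode: one combined mask per input line.
combine_lines() {
    while read -r line; do
        acc=0
        # Masks on the same line are separated by spaces and OR-ed together.
        for m in $line; do
            acc=$(( acc | $m ))
        done
        printf '0x%08x\n' "$acc"
    done
}

printf '0x0000ffff 0xff000000\n0x0f 0xf0\n' | combine_lines
# prints:
# 0xff00ffff
# 0x000000ff
```

       Each input line yields its own output line; nothing is carried over
       from one line to the next.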

       Command-line arguments and options are processed in order.  Topology
       configuration options should be given first.  Then, for instance,
       changing the type of input indexes with --li or changing the input
       topology with -i only affects the processing of the following arguments.

       NOTE: It	is highly recommended that you read the	hwloc(7) overview page
       before  reading	this  man  page.   Most	 of  the concepts described in
       hwloc(7)	directly apply to the hwloc-calc utility.

EXAMPLES
       hwloc-calc's operation is best described	through	several	examples.

       To display the (physical) CPU mask corresponding	to the second package:

	   $ hwloc-calc	package:1
	   0x000000f0

       To display the (physical) CPU mask corresponding to the third package,
       excluding its even numbered logical processors:

	   $ hwloc-calc	package:2 ~PU:even
	   0x00000c00

       To  display  the	 (physical) CPU	mask of	the entire topology except the
       third package:

	   $ hwloc-calc all ~package:2
	   0x0000f0ff

       To combine two (physical) CPU masks:

	   $ hwloc-calc	0x0000ffff 0xff000000
	   0xff00ffff
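
       The mask arithmetic behind the last two examples can be checked with
       plain shell arithmetic.  This is a minimal sketch, not part of hwloc:
       mask_or and mask_clear are hypothetical helper names, the masks must
       fit in a shell integer, and 0x00000f00 is assumed to be the mask of
       the third package (four PUs per package, as the package:1 output
       above suggests).

```shell
# Hypothetical helpers mirroring how masks are added and cleared.

# Adding a location ORs its mask into the result.
mask_or() {
    printf '0x%08x\n' "$(( $1 | $2 ))"
}

# A location prefixed with "~" clears its bits from the result.
mask_clear() {
    printf '0x%08x\n' "$(( $1 & ~$2 ))"
}

mask_or 0x0000ffff 0xff000000       # prints 0xff00ffff
mask_clear 0x0000ffff 0x00000f00    # prints 0x0000f0ff
```

       hwloc-calc itself is not limited to one machine word, so the real
       tool should be preferred for anything beyond a quick sanity check.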

Examples of listing or counting	objects
       To display the list of logical numbers of processors  included  in  the
       second package:

	   $ hwloc-calc	--intersect PU package:1
	   4,5,6,7

       To bind GNU OpenMP threads logically over the whole machine, we need to
       use physical number output instead:

	   $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU all`
	   $ echo $GOMP_CPU_AFFINITY
	   0,4,1,5,2,6,3,7

       To display the list of NUMA nodes, by physical indexes, that  intersect
       a given (physical) CPU mask:

	   $ hwloc-calc	--physical --intersect NUMAnode	0xf0f0f0f0
	   0,2

       To  find	 how  many  cores  are in the second CPU kind (those cores are
       likely higher-performance and more power-hungry than cores of the first
       kind):

	   $ hwloc-calc	--cpukind 1 -N core all
	   4

       To convert a cpu	mask to	human-readable output, the -H  option  can  be
       used to emit a space-delimited list of locations:

	   $ echo 0x000000f0 | hwloc-calc -q -H	package.core
	   Package:1.Core:0 Package:1.Core:1 Package:1.Core:2 Package:1.Core:3

       To  use	some other character (e.g., a comma) instead of	spaces in out-
       put, use	the --sep option:

	   $ echo 0x000000f0 | hwloc-calc -q -H	package.core --sep ,
	   Package:1.Core:0,Package:1.Core:1,Package:1.Core:2,Package:1.Core:3

       To synthesize a set of cores into the largest objects on a 2-node
       2-package 2-core machine:

	   $ hwloc-calc	core:0 --largest
	   Core:0
	   $ hwloc-calc	core:0-1 --largest
	   Package:0
	   $ hwloc-calc	core:4-7 --largest
	   L3Cache:1
	   $ hwloc-calc	core:2-6 --largest
	   Package:1 Package:2 Core:6
	   $ hwloc-calc	pack:2 --largest
	   Package:2
	   $ hwloc-calc	package:2-3 --largest
	   L3Cache:1

       To get the set of first threads of all cores:

	   $ hwloc-calc	core:all.pu:0
	   0xffff0000
	   $ hwloc-calc	--no-smt all -I	pu
	   0,2,4,6,8,10,12,14

       To get the number of cpukinds inside a package:

	   $ hwloc-calc	-N cpukind package:0
	   2

Examples of listing or counting	NUMA nodes
       To display the list of NUMA nodes, by physical indexes, whose  locality
       is exactly equal	to a Package:

	   $ hwloc-calc	--local-memory-flags 0 --physical-output pack:1
	   4,7

       To  display  the	list of	default	NUMA nodes, by logical indexes,	in the
       entire machine:

	   $ hwloc-calc	--default-nodes	-I numa	all
	   0,2,4,6

       To display the best-capacity NUMA node(s), by physical  indexes,	 whose
       locality	is exactly equal to a Package:

	   $ hwloc-calc --local-memory-flags 0 --best-memattr capacity --physical-output pack:1
	   4

       To find the number of NUMA nodes	with subtype "HBM":

	   $ hwloc-calc	-N "numa[hbm]" all
	   4

       To  find	 the  number  of  NUMA nodes in	memory tier 1 (DRAM nodes on a
       server with HBM and DRAM):

	   $ hwloc-calc	-N "numa[tier=1]" all
	   4

       To find the NUMA	node of	subtype	MCDRAM (on KNL)	near a PU:

	   $ hwloc-calc	-I "numa[mcdram]" --oo pu:157
	   NUMANode:1

       To find the memory tier of a NUMA node:

	   $ hwloc-calc	-I memorytier node:2
	   1

Examples with physical and logical indexes
       Converting object logical indexes (default) from/to physical/OS indexes
       may be performed	with --intersect combined with either  --physical-out-
       put  (logical  to physical conversion) or --physical-input (physical to
       logical):

	   $ hwloc-calc	--physical-output PU:2 --intersect PU
	   3
	   $ hwloc-calc	--physical-input PU:3 --intersect PU
	   2

       This may	also be	used for converting indexes of	memory	objects,  even
       with heterogeneous memory:

	   $ hwloc-calc	--physical-output node:2 --intersect node
	   3
	   $ hwloc-calc	--physical-input node:3	--intersect node
	   2

       To combine both physical	and logical indexes as input:

	   $ hwloc-calc	PU:2 --physical-input PU:3
	   0x0000000c

Examples with I/O devices
       To display the set of CPUs near network interface eth0:

	   $ hwloc-calc	os=eth0
	   0x00005555

       To  display  the	 indexes  of  packages near PCI	device whose bus ID is
       0000:01:02.0:

	   $ hwloc-calc	pci=0000:01:02.0 --intersect Package
	   1

       OS devices may also be filtered by subtype.  In this example, there are
       9 OS devices in the system, 4 of them are near NUMA node #1, and only 2
       of these are CoProcessors:

	   $ utils/hwloc/hwloc-calc -I osdev all
	   0,1,2,3,4,5,6,7,8
	   $ utils/hwloc/hwloc-calc -I osdev node:1
	   5,6,7,8
	   $ utils/hwloc/hwloc-calc -I coproc node:1
	   7,8

Examples with other tools
       To make GNU OpenMP use exactly one thread per core, and in logical core
       order:

	   $ export OMP_NUM_THREADS=`hwloc-calc	--number-of core all`
	   $ echo $OMP_NUM_THREADS
	   4
	   $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU --no-smt all`
	   $ echo $GOMP_CPU_AFFINITY
	   0,2,1,3

       To export a bitmask in a format accepted by the resctrl Linux
       subsystem (for configuring cache partitioning, etc.), apply a sed regexp
       to the output of hwloc-calc:

	   $ hwloc-calc	pack:all.core:7-9.pu:0
	   0x00000380,,0x00000380   <this format cannot	be given to resctrl>
	   $ hwloc-calc pack:all.core:7-9.pu:0 | sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
	   00000380,0,00000380
	   # echo 00000380,0,00000380 >	/sys/fs/resctrl/test/cpus
	   # cat /sys/fs/resctrl/test/cpus
	   00000000,00000380,00000000,00000380   <the modified bitmask was correctly parsed by resctrl>
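
       The sed expression applies the same ",," substitution twice.  A
       plausible reason is that a single pass cannot fix runs of three or
       more commas, because sed does not rescan text it has just replaced.
       The sketch below wraps the pipeline in a hypothetical helper
       (to_resctrl) and feeds it literal strings, so it can be tried without
       hwloc or resctrl.

```shell
# Hypothetical wrapper around the sed expression shown above.
to_resctrl() {
    sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
}

echo '0x00000380,,0x00000380' | to_resctrl   # prints 00000380,0,00000380

# With two consecutive empty words, one pass would leave "1,0,,2".
echo '0x1,,,0x2' | to_resctrl                # prints 1,0,0,2
```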

Example of use of the systemd-dbus-api cpuset and nodeset output formats
       hwloc-calc  allows one to generate the very cryptic AllowedCPUs and Al-
       lowedMemoryNodes	strings, which the D-Bus API of	systemd	expects,  from
       other  hwloc  representations.  This is especially useful when the sys-
       temd-run	command, which understands numeric lists, cannot be used.

       First, create a systemd slice:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss my_slice.slice fail

       Then, configure the CPU and Node	sets of	the slice, using hwloc-calc to
       translate the syntax:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties 'sba(sv)' my_slice.slice	1 1 AllowedCPUs	$(hwloc-calc pu:0 pu:31	pu:32 pu:63 pu:64 pu:77	--cpuset-output-format systemd-dbus-api)
	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties 'sba(sv)' my_slice.slice	1 1 AllowedMemoryNodes $(hwloc-calc pu:0 pu:31 pu:32 pu:63 pu:64 pu:77 --nodeset-output-format systemd-dbus-api)

       Finally,	add the	current	process	to the slice:

	   $ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartTransientUnit	'ssa(sv)a(sa(sv))' my_scope.scope fail 3 Delegate b 1 PIDs au 1	$$ Slice s my_slice.slice 0

       More info in the	org.freedesktop.systemd1(5) manual page.

RETURN VALUE
       Upon successful execution, hwloc-calc displays the (physical) CPU  mask
       string, (physical or logical) object list, or (physical or logical) ob-
       ject number list.  The return value is 0.

       hwloc-calc  will	 return	 nonzero  if any kind of error occurs, such as
       (but not	limited	to): failure to	parse the command line.

SEE ALSO
       hwloc(7), lstopo(1), hwloc-info(1)

2.12.1				 May 12, 2025			 HWLOC-CALC(1)
