fi_domain(3)		       Libfabric v1.15.1		  fi_domain(3)

NAME
       fi_domain - Open	a fabric access	domain

SYNOPSIS
	      #include <rdma/fabric.h>

	      #include <rdma/fi_domain.h>

	      int fi_domain(struct fid_fabric *fabric, struct fi_info *info,
		  struct fid_domain **domain, void *context);

	      int fi_close(struct fid *domain);

	      int fi_domain_bind(struct	fid_domain *domain, struct fid *eq,
		  uint64_t flags);

	      int fi_open_ops(struct fid *domain, const	char *name, uint64_t flags,
		  void **ops, void *context);

	      int fi_set_ops(struct fid	*domain, const char *name, uint64_t flags,
		  void *ops, void *context);

ARGUMENTS
       fabric Fabric domain

       info   Fabric  information, including domain capabilities and attribut-
	      es.

       domain An opened	access domain.

       context
	      User specified context associated	with the domain.  This context
	      is returned as part of any asynchronous  event  associated  with
	      the domain.

       eq     Event queue for asynchronous operations initiated	on the domain.

       name   Name associated with an interface.

       ops    Fabric interface operations.

DESCRIPTION
       An access domain	typically refers to a physical or virtual NIC or hard-
       ware  port;  however, a domain may span across multiple hardware	compo-
       nents for fail-over or data striping purposes.  A  domain  defines  the
       boundary	 for  associating  different  resources	 together.  Fabric re-
       sources belonging to the	same domain may	share resources.

   fi_domain
       Opens a fabric access domain, also referred to as  a  resource  domain.
       Fabric  domains are identified by a name.  The properties of the	opened
       domain are specified using the info parameter.

   fi_open_ops
       fi_open_ops is used to open provider specific interfaces.  Provider in-
       terfaces	may be used to access low-level	resources and operations  that
       are  specific to	the opened resource domain.  The details of domain in-
       terfaces	are outside the	scope of this documentation.

   fi_set_ops
       fi_set_ops assigns callbacks that a provider should invoke in place  of
       performing  selected  tasks.   This allows users	to modify or control a
       provider's default behavior.  Conceptually, it allows the user to  hook
       specific functions used by a provider and replace them with their own.

       The operations being modified are identified using a well-known charac-
       ter  string, passed as the name parameter.  The format of the ops para-
       meter is	dependent upon the name	value.	The ops	parameter will	refer-
       ence  a	structure  containing the callbacks and	other fields needed by
       the provider to invoke the user's functions.

       If a provider accepts the override, it will return FI_SUCCESS.  If  the
       override	 is  unknown  or  not  supported,  the	provider  will	return
       -FI_ENOSYS.  Overrides should be	set prior to allocating	 resources  on
       the domain.

       The  following  fi_set_ops operations and corresponding callback	struc-
       tures are defined.

       FI_SET_OPS_HMEM_OVERRIDE	- Heterogeneous	Memory Overrides

       HMEM override allows  users  to	override  HMEM	related	 operations  a
       provider	 may perform.  Currently, the scope of the HMEM	override is to
       allow a user to define the memory movement functions a provider	should
       use  when  accessing  a	user buffer.  The user-defined memory movement
       functions need to account for all the  different	 HMEM  iface  types  a
       provider	may encounter.

       All objects allocated against a domain will inherit this	override.

       The following is	the HMEM override operation name and structure.

	      #define FI_SET_OPS_HMEM_OVERRIDE "hmem_override_ops"

	      struct fi_hmem_override_ops {
		  size_t  size;

		  ssize_t (*copy_from_hmem_iov)(void *dest, size_t size,
		      enum fi_hmem_iface iface,	uint64_t device, const struct iovec *hmem_iov,
		      size_t hmem_iov_count, uint64_t hmem_iov_offset);

		  ssize_t (*copy_to_hmem_iov)(enum fi_hmem_iface iface,	uint64_t device,
		      const struct iovec *hmem_iov, size_t hmem_iov_count,
		      uint64_t hmem_iov_offset,	const void *src, size_t	size);
	      };

       All  fields  in struct fi_hmem_override_ops must	be set (non-null) to a
       valid value.

       size   This should be set to the	 sizeof(struct	fi_hmem_override_ops).
	      The  size	 field	is used	for forward and	backward compatibility
	      purposes.

       copy_from_hmem_iov
	      Copy data	from the device/hmem to	host  memory.	This  function
	      should  return  a	 negative  fi_errno on error, or the number of
	      bytes copied on success.

       copy_to_hmem_iov
	      Copy data	from host memory to the	 device/hmem.	This  function
	      should  return  a	 negative  fi_errno on error, or the number of
	      bytes copied on success.
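
       For plain host memory, the two callbacks largely reduce to gather and
       scatter copies over an iovec.  The following is a minimal sketch of
       that core logic only.  The names ex_copy_from_iov and ex_copy_to_iov
       are hypothetical helpers, not part of libfabric, and they omit the
       iface and device parameters that a real override would use to select
       the matching device copy routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t */
#include <sys/uio.h>    /* struct iovec */

/* Hypothetical helper: gather up to `size` bytes from the iovec into `dest`,
 * skipping the first `iov_offset` bytes.  Returns the bytes copied. */
ssize_t ex_copy_from_iov(void *dest, size_t size, const struct iovec *iov,
                         size_t iov_count, uint64_t iov_offset)
{
    size_t done = 0;

    assert(dest != NULL && iov != NULL);
    for (size_t i = 0; i < iov_count && done < size; i++) {
        if (iov_offset >= iov[i].iov_len) {
            iov_offset -= iov[i].iov_len;   /* offset lies past this segment */
            continue;
        }
        size_t avail = iov[i].iov_len - (size_t)iov_offset;
        size_t n = avail < size - done ? avail : size - done;
        memcpy((char *)dest + done,
               (const char *)iov[i].iov_base + iov_offset, n);
        done += n;
        iov_offset = 0;                     /* later segments start at 0 */
    }
    return (ssize_t)done;
}

/* Hypothetical helper: scatter up to `size` bytes from `src` into the iovec,
 * starting `iov_offset` bytes in.  Returns the bytes copied. */
ssize_t ex_copy_to_iov(const struct iovec *iov, size_t iov_count,
                       uint64_t iov_offset, const void *src, size_t size)
{
    size_t done = 0;

    assert(src != NULL && iov != NULL);
    for (size_t i = 0; i < iov_count && done < size; i++) {
        if (iov_offset >= iov[i].iov_len) {
            iov_offset -= iov[i].iov_len;
            continue;
        }
        size_t avail = iov[i].iov_len - (size_t)iov_offset;
        size_t n = avail < size - done ? avail : size - done;
        memcpy((char *)iov[i].iov_base + iov_offset,
               (const char *)src + done, n);
        done += n;
        iov_offset = 0;
    }
    return (ssize_t)done;
}
```

       Both helpers return fewer bytes than requested when the iovec is
       exhausted.  A real callback would return a negative fi_errno value on
       failure, as described above.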

   fi_domain_bind
       Associates an event queue with the domain.  An event queue bound	 to  a
       domain  will  be	 the  default  EQ associated with asynchronous control
       events that occur on the	domain or active endpoints allocated on	a  do-
       main.   This  includes  CM  events.  Endpoints may direct their control
       events to alternate EQs by binding directly with	the EQ.

       Binding an event	queue to a domain with the  FI_REG_MR  flag  indicates
       that  the  provider  should  perform all	memory registration operations
       asynchronously, with the	completion reported through the	 event	queue.
       If  an  event queue is not bound	to the domain with the FI_REG_MR flag,
       then memory registration	requests complete synchronously.

       See fi_av_bind(3), fi_ep_bind(3),  fi_mr_bind(3),  fi_pep_bind(3),  and
       fi_scalable_ep_bind(3) for more information.

   fi_close
       The  fi_close  call  is used to release all resources associated	with a
       domain or interface.  All objects associated  with  the	opened	domain
       must be released	prior to calling fi_close, otherwise the call will re-
       turn -FI_EBUSY.

DOMAIN ATTRIBUTES
       The  fi_domain_attr  structure defines the set of attributes associated
       with a domain.

	      struct fi_domain_attr {
		  struct fid_domain	*domain;
		  char			*name;
		  enum fi_threading	threading;
		  enum fi_progress	control_progress;
		  enum fi_progress	data_progress;
		  enum fi_resource_mgmt	resource_mgmt;
		  enum fi_av_type	av_type;
		  int			mr_mode;
		  size_t		mr_key_size;
		  size_t		cq_data_size;
		  size_t		cq_cnt;
		  size_t		ep_cnt;
		  size_t		tx_ctx_cnt;
		  size_t		rx_ctx_cnt;
		  size_t		max_ep_tx_ctx;
		  size_t		max_ep_rx_ctx;
		  size_t		max_ep_stx_ctx;
		  size_t		max_ep_srx_ctx;
		  size_t		cntr_cnt;
		  size_t		mr_iov_limit;
		  uint64_t		caps;
		  uint64_t		mode;
		  uint8_t		*auth_key;
		  size_t		auth_key_size;
		  size_t		max_err_data;
		  size_t		mr_cnt;
		  uint32_t		tclass;
	      };

   domain
       On input	to fi_getinfo, a user may set this to  an  opened  domain  in-
       stance  to restrict output to the given domain.	On output from fi_get-
       info, if	no domain was specified, but the user has an  opened  instance
       of the named domain, this will reference	the first opened instance.  If
       no instance has been opened, this field will be NULL.

       The  domain  instance  returned by fi_getinfo should only be considered
       valid if	the application	does not close any domain instances  from  an-
       other thread while fi_getinfo is	being processed.

   Name
       The name	of the access domain.

   Multi-threading Support (threading)
       The threading model specifies the level of serialization	required of an
       application when	using the libfabric data transfer interfaces.  Control
       interfaces  are	always	considered thread safe,	and may	be accessed by
       multiple threads.  Applications that can guarantee serialization in
       their access to provider-allocated resources and interfaces enable a
       provider to eliminate lower-level locks.

       FI_THREAD_COMPLETION
	      The completion threading model is	intended  for  providers  that
	      make use of manual progress.  Applications must serialize	access
	      to  all  objects that are	associated through the use of having a
	      shared completion	structure.  This includes  endpoint,  transmit
	      context,	receive	 context, completion queue, counter, wait set,
	      and poll set objects.

       For example, threads must serialize access to an	endpoint and its bound
       completion queue(s) and/or counters.  Access to	endpoints  that	 share
       the same	completion queue must also be serialized.

       The   use   of	FI_THREAD_COMPLETION  can  increase  parallelism  over
       FI_THREAD_SAFE, but requires the	use of isolated	resources.

       FI_THREAD_DOMAIN
	      A	domain serialization model requires applications to  serialize
	      access to	all objects belonging to a domain.

       FI_THREAD_ENDPOINT
	      The  endpoint  threading	model is similar to FI_THREAD_FID, but
	      with the added restriction that serialization is	required  when
	      accessing	 the  same endpoint, even if multiple transmit and re-
	      ceive contexts are used.	Conceptually, FI_THREAD_ENDPOINT  maps
	      well to providers	that implement fabric services in hardware but
	      use a single command queue to access different data flows.

       FI_THREAD_FID
	      A	 fabric	descriptor (FID) serialization model requires applica-
	      tions to serialize access	to individual fabric resources associ-
	      ated with	data transfer operations  and  completions.   Multiple
	      threads  must  be	 serialized  when accessing the	same endpoint,
	      transmit context,	receive	context,  completion  queue,  counter,
	      wait  set,  or  poll  set.   Serialization  is  required only by
	      threads accessing	the same object.

       For example, one	thread may be initiating a data	transfer  on  an  end-
       point,  while  another  thread reads from a completion queue associated
       with the	endpoint.

       Serialization to	endpoint access	is only	required  when	accessing  the
       same  endpoint  data  flow.  Multiple threads may initiate transfers on
       different transmit contexts of the same endpoint	 without  serializing,
       and  no serialization is	required between the submission	of data	trans-
       mit requests and	data receive operations.

       In general, FI_THREAD_FID allows	the provider to	be implemented without
       needing internal	locking	when handling data  transfers.	 Conceptually,
       FI_THREAD_FID  maps well	to providers that implement fabric services in
       hardware	and provide separate command queues to different data flows.

       FI_THREAD_SAFE
	      A	thread safe serialization model	allows a multi-threaded	appli-
	      cation to	access any allocated resources through	any  interface
	      without  restriction.   All  providers  are  required to support
	      FI_THREAD_SAFE.

       FI_THREAD_UNSPEC
	      This value indicates that	no threading model has	been  defined.
	      It  may  be  used	 on  input hints to the	fi_getinfo call.  When
	      specified, providers will	return a threading model  that	allows
	      for the greatest level of	parallelism.

   Progress Models (control_progress / data_progress)
       Progress	 is  the  ability of the underlying implementation to complete
       processing of an	asynchronous request.  In many cases,  the  processing
       of an asynchronous request requires the use of the host processor.  For
       example,	 a  received  message  may need	to be matched with the correct
       buffer, or a timed out request may need to be retransmitted.  For  per-
       formance	 reasons, it may be undesirable	for the	provider to allocate a
       thread for this	purpose,  which	 will  compete	with  the  application
       threads.

       Control	progress  indicates  the method	that the provider uses to make
       progress	on asynchronous	control	operations.   Control  operations  are
       functions which do not directly involve the transfer of application da-
       ta  between  endpoints.	 They include address vector, memory registra-
       tion, and connection management routines.

       Data progress indicates the method  that	 the  provider	uses  to  make
       progress	 on  data  transfer  operations.  This includes	message	queue,
       RMA, tagged messaging, and atomic operations, along with	their  comple-
       tion processing.

       Progress	 frequently  requires action being taken at both the transmit-
       ting and	receiving sides	of an operation.  This is often	a  requirement
       for  reliable  transfers, as a result of	retry and acknowledgement pro-
       cessing.

       To balance between performance and ease of use, two progress models are
       defined.

       FI_PROGRESS_AUTO
	      This progress model indicates that the provider will  make  for-
	      ward  progress  on an asynchronous operation without further in-
	      tervention by the	application.  When FI_PROGRESS_AUTO is provid-
	      ed as output to fi_getinfo in the	absence	of any progress	hints,
	      it often indicates that the desired functionality	is implemented
	      by the provider hardware or is a standard	service	of the operat-
	      ing system.

       It is recommended that providers	support	FI_PROGRESS_AUTO.  However, if
       a provider does not natively support automatic  progress,  forcing  the
       use of FI_PROGRESS_AUTO may result in threads being allocated below the
       fabric interfaces.

       Note  that  prior versions of the library required providers to support
       FI_PROGRESS_AUTO.  However, in some cases progress  threads  cannot  be
       blocked	when  communication is idle, which results in threads spinning
       in progress functions.  As a result,  those  providers  only  supported
       FI_PROGRESS_MANUAL.

       FI_PROGRESS_MANUAL
	      This progress model indicates that the provider requires the use
	      of  an  application  thread to complete an asynchronous request.
	      When manual progress is set, the provider	will  attempt  to  ad-
	      vance an asynchronous operation forward when the application at-
	      tempts  to  wait on or read an event queue, completion queue, or
	      counter  where  the  completed  operation	 will	be   reported.
	      Progress	also  occurs  when the application processes a poll or
	      wait set that has	been associated	with the event	or  completion
	      queue.

       Only  wait operations defined by	the fabric interface will result in an
       operation progressing.  Operating system	or  external  wait  functions,
       such as select, poll, or	pthread	routines, cannot.

       Manual  progress	requirements not only apply to endpoints that initiate
       transmit	operations, but	also to	endpoints that may be  the  target  of
       such  operations.  This holds true even if the target endpoint will not
       generate	completion events for the operations.  For  example,  an  end-
       point  that  acts purely	as the target of RMA or	atomic operations that
       uses manual progress may	still need application assistance  to  process
       received	operations.

       FI_PROGRESS_UNSPEC
	      This  value  indicates  that no progress model has been defined.
	      It may be	used on	input hints to the fi_getinfo call.

   Resource Management (resource_mgmt)
       Resource	management (RM)	is provider and	protocol  support  to  protect
       against	overrunning  local  and	remote resources.  This	includes local
       and remote transmit contexts, receive contexts, completion queues,  and
       source and target data buffers.

       When  enabled,  applications are	given some level of protection against
       overrunning provider queues and local and remote	 data  buffers.	  Such
       support	may  be	built directly into the	hardware and/or	network	proto-
       col, but	may also require that checks be	enabled	in the provider	 soft-
       ware.  By disabling resource management,	an application assumes all re-
       sponsibility for	preventing queue and buffer overruns, but doing	so may
       allow  a	 provider to eliminate internal	synchronization	calls, such as
       atomic variables	or locks.

       It should be noted that even if resource	management  is	disabled,  the
       provider	 implementation	 and  protocol may still provide some level of
       protection against overruns.  However, such protection is  not  guaran-
       teed.  The following values for resource	management are defined.

       FI_RM_DISABLED
	      The  provider  is	 free to select	an implementation and protocol
	      that does	not protect against resource overruns.	 The  applica-
	      tion is responsible for resource protection.

       FI_RM_ENABLED
	      Resource management is enabled for this provider domain.

       FI_RM_UNSPEC
	      This  value indicates that no resource management	model has been
	      defined.	It may be used on input	hints to the fi_getinfo	call.

       The behavior of the various  resource  management  options  depends  on
       whether the endpoint is reliable	or unreliable, as well as provider and
       protocol	specific implementation	details, as shown in the following ta-
       ble.  The table assumes that all	peers enable or	disable	RM the same.

       Resource          DGRAM EP-no RM      DGRAM EP-with RM    RDM/MSG EP-no RM    RDM/MSG EP-with RM
       ------------------------------------------------------------------------------------------------
       Tx Ctx            undefined error     EAGAIN              undefined error     EAGAIN
       Rx Ctx            undefined error     EAGAIN              undefined error     EAGAIN
       Tx CQ             undefined error     EAGAIN              undefined error     EAGAIN
       Rx CQ             undefined error     EAGAIN              undefined error     EAGAIN
       Target EP         dropped             dropped             transmit error      retried
       No Rx Buffer      dropped             dropped             transmit error      retried
       Rx Buf Overrun    truncate or drop    truncate or drop    truncate or error   truncate or error
       Unmatched RMA     not applicable      not applicable      transmit error      transmit error
       RMA Overrun       not applicable      not applicable      transmit error      transmit error

       The resource column indicates the resource being	 accessed  by  a  data
       transfer	operation.

       Tx Ctx /	Rx Ctx
	      Refers to	the transmit/receive contexts when a data transfer op-
	      eration  is submitted.  When RM is enabled, attempting to	submit
	      a	request	will fail if the context is full.  If RM is  disabled,
	      an  undefined error (provider specific) will occur.  Such	errors
	      should be	considered fatal to the	context, and applications must
	      take steps to avoid queue	overruns.

       Tx CQ / Rx CQ
	      Refers to	the completion queue associated	with the Tx or Rx con-
	      text when	a local	operation completes.  When RM is disabled, ap-
	      plications must take care	to ensure that	completion  queues  do
	      not  get overrun.	 When an overrun occurs, an undefined, but fa-
	      tal, error will occur affecting all  endpoints  associated  with
	      the CQ.  Overruns	can be avoided by sizing the CQs appropriately
	      or  by deferring the posting of a	data transfer operation	unless
	      CQ space is available to store its completion.  When RM  is  en-
	      abled,  providers	 may  use  different  mechanisms to prevent CQ
	      overruns.	 This  includes	 failing  (returning  -FI_EAGAIN)  the
	      posting  of  operations that could result	in CQ overruns,	or in-
	      ternally retrying	requests (which	will be	hidden from the	appli-
	      cation).	See notes at the end of	this section regarding CQ  re-
	      source management	restrictions.

       Target EP / No Rx Buffer
	      Target  EP refers	to resources associated	with the endpoint that
	      is the target of a transmit operation.  This includes the	target
	      endpoint's  receive  queue,  posted  receive  buffers   (no   Rx
	      buffers),	 the  receive side completion queue, and other related
	      packet processing	queues.	 The defined behavior is that seen  by
	      the  initiator  of a request.  For FI_EP_DGRAM endpoints,	if the
	      target EP	queues are unable to  accept  incoming	messages,  re-
	      ceived  messages will be dropped.	 For reliable endpoints, if RM
	      is disabled, the transmit	operation will complete	in  error.   A
	      provider may choose to return an error completion	with the error
	      code  FI_ENORX for that transmit operation so that it can	be re-
	      tried.  If RM is enabled,	the provider will internally retry the
	      operation.

       Rx Buffer Overrun
	      This refers to buffers posted to receive incoming	tagged or  un-
	      tagged messages, with the	behavior defined from the viewpoint of
	      the  sender.   The  behavior for handling	received messages that
	      are larger than the  buffers  provided  by  the  application  is
	      provider	specific.   Providers  may either truncate the message
	      and report a successful completion, or fail the operation.   For
	      datagram	endpoints, failed sends	will result in the message be-
	      ing dropped.  For	reliable endpoints, send operations  may  com-
	      plete  successfully, yet be truncated at the receive side.  This
	      can occur	when the target	side buffers received  data  until  an
	      application buffer is made available.  The completion status may
	      also be dependent upon the completion model selected by the
	      application (e.g. FI_DELIVERY_COMPLETE versus
	      FI_TRANSMIT_COMPLETE).

       Unmatched RMA / RMA Overrun
	      Unmatched	RMA and	RMA overruns deal with the processing  of  RMA
	      and  atomic  operations.	Unlike send operations,	RMA operations
	      that attempt to access a memory address that is either not  reg-
	      istered for such operations, or attempt to access	outside	of the
	      target memory region will	fail, resulting	in a transmit error.
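
       With RM enabled, a full transmit or receive context rejects a new
       request with EAGAIN rather than failing fatally, so an application
       can reap completions and try again.  The following is a minimal
       sketch of that retry pattern against a simulated queue.  Every ex_
       name is a hypothetical stand-in.  In a real application the post
       would be a data transfer call returning -FI_EAGAIN, and the reap step
       would be a read of the bound completion queue, for example with
       fi_cq_read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a transmit context with a bounded queue. */
struct ex_tx_queue {
    size_t capacity;    /* maximum outstanding operations */
    size_t in_flight;   /* operations posted but not yet completed */
};

/* Stand-in for posting a transfer: reports -EAGAIN when the queue is full. */
int ex_post(struct ex_tx_queue *q)
{
    assert(q != NULL);
    if (q->in_flight >= q->capacity)
        return -EAGAIN;
    q->in_flight++;
    return 0;
}

/* Stand-in for reading one completion: retires one outstanding operation.
 * Returns the number of completions reaped (0 or 1). */
int ex_reap_one(struct ex_tx_queue *q)
{
    assert(q != NULL);
    if (q->in_flight == 0)
        return 0;
    q->in_flight--;
    return 1;
}

/* Try to post up to max_attempts times, reaping a completion after each
 * -EAGAIN so that queue space can free up. */
int ex_post_with_retry(struct ex_tx_queue *q, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int ret = ex_post(q);
        if (ret != -EAGAIN)
            return ret;         /* success, or an error retrying cannot fix */
        ex_reap_one(q);
    }
    return -EAGAIN;
}
```

       Reaping between attempts also matters under FI_PROGRESS_MANUAL, where
       reading the completion queue is what drives outstanding operations
       forward.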

       When a resource management error	occurs on an endpoint, the endpoint is
       transitioned  into a disabled state.  Any operations which have not al-
       ready completed will fail and be	discarded.   For  connectionless  end-
       points,	the endpoint must be re-enabled	before it will accept new data
       transfer	operations.  For connected endpoints, the connection  is  torn
       down and	must be	re-established.

       There is	one notable restriction	on the protections offered by resource
       management.  This occurs when resource management is enabled on an
       endpoint that has been bound to completion queue(s) using the
       FI_SELECTIVE_COMPLETION flag.  Operations posted to such an endpoint
       may specify that a successful completion should not generate an entry
       on the corresponding completion queue (i.e. the operation leaves the
       FI_COMPLETION flag unset).  In such situations, the provider is not required
       to reserve an entry in the completion queue to handle  the  case	 where
       the  operation  fails  and does generate	a CQ entry, which would	effec-
       tively require tracking the operation to	completion.  Applications con-
       cerned with avoiding CQ overruns	in the occurrence of errors  must  en-
       sure  that  there is sufficient space in	the CQ to report failed	opera-
       tions.  This can	typically be achieved by sizing	the CQ to at least the
       same size as the	endpoint queue(s) that are bound to it.
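
       The sizing rule above can be sketched as a small helper.  This
       assumes that "the same size as the endpoint queue(s)" means the sum
       of the depths of all transmit and receive queues bound to the CQ.
       That is one conservative reading, and ex_min_cq_size is a
       hypothetical name, not a libfabric call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest CQ size that can hold one entry for every
 * operation outstanding on the queues bound to it.  `queue_depths` holds
 * the depth of each bound transmit or receive queue. */
size_t ex_min_cq_size(const size_t *queue_depths, size_t count)
{
    size_t total = 0;

    assert(queue_depths != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += queue_depths[i];
    return total;
}
```

       For example, an endpoint with a 128-entry transmit queue and a
       128-entry receive queue sharing one CQ with a second endpoint's
       64-entry transmit queue would need a CQ of at least 320 entries.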

   AV Type (av_type)
       Specifies the type of address vectors that are usable with this domain.
       For additional details on AV type, see fi_av(3).	 The following	values
       may be specified.

       FI_AV_MAP
	      Only address vectors of type AV map are requested	or supported.

       FI_AV_TABLE
	      Only  address vectors of type AV index are requested or support-
	      ed.

       FI_AV_UNSPEC
	      Any address vector format	is requested and supported.

       Address vectors are only	used by	 connectionless	 endpoints.   Applica-
       tions  that require the use of a	specific type of address vector	should
       set the domain attribute	av_type	to the necessary  value	 when  calling
       fi_getinfo.   The  value	 FI_AV_UNSPEC may be used to indicate that the
       provider	can support either address vector format.   In	this  case,  a
       provider	may return FI_AV_UNSPEC	to indicate that either	format is sup-
       portable, or may	return another AV type to indicate the optimal AV type
       supported by this domain.

   Memory Registration Mode (mr_mode)
       Defines	memory	registration specific mode bits	used with this domain.
       Full details on MR mode options are available in	fi_mr(3).  The follow-
       ing values may be specified.

       FI_MR_ALLOCATED
	      Indicates	that memory  registration  occurs  on  allocated  data
	      buffers,	and physical pages must	back all virtual addresses be-
	      ing registered.

       FI_MR_COLLECTIVE
	      Requires data buffers passed to collective operations be explic-
	      itly registered for collective operations	using  the  FI_COLLEC-
	      TIVE flag.

       FI_MR_ENDPOINT
	      Memory  registration  occurs  at the endpoint level, rather than
	      domain.

       FI_MR_LOCAL
	      The provider is optimized	around	having	applications  register
	      memory  for locally accessed data	buffers.  Data buffers used in
	      send and receive operations and as the source buffer for RMA and
	      atomic operations	must be	registered by the application for  ac-
	      cess domains opened with this capability.

       FI_MR_MMU_NOTIFY
	      Indicates	 that the application is responsible for notifying the
	      provider when the	page tables referencing	 a  registered	memory
	      region may have been updated.

       FI_MR_PROV_KEY
	      Memory  registration  keys  are  selected	 and  returned	by the
	      provider.

       FI_MR_RAW
	      The provider requires additional setup as	part of	 their	memory
	      registration  process.   This mode is required by	providers that
	      use a memory key that is larger than 64-bits.

       FI_MR_RMA_EVENT
	      Indicates	that the memory	 regions  associated  with  completion
	      counters	must  be  explicitly  enabled after being bound	to any
	      counter.

       FI_MR_UNSPEC
	      Defined for compatibility	- library versions  1.4	 and  earlier.
	      Setting  mr_mode	to 0 indicates that FI_MR_BASIC	or FI_MR_SCAL-
	      ABLE are requested and supported.

       FI_MR_VIRT_ADDR
	      Registered memory	regions	are referenced by peers	using the vir-
	      tual address of the registered  memory  region,  rather  than  a
	      0-based offset.

       FI_MR_BASIC
	      Defined  for  compatibility  - library versions 1.4 and earlier.
	      Only basic memory	registration operations	are requested or  sup-
	      ported.	 This  mode  is	 equivalent  to	 the  FI_MR_VIRT_ADDR,
	      FI_MR_ALLOCATED, and FI_MR_PROV_KEY flags	being set in later li-
	      brary versions.  This flag may not be used in  conjunction  with
	      other mr_mode bits.

       FI_MR_SCALABLE
	      Defined  for  compatibility  - library versions 1.4 and earlier.
	      Only scalable memory registration	operations  are	 requested  or
	      supported.   Scalable registration uses offset based addressing,
	      with application selectable memory keys.	For  library  versions
	      1.5  and	later, this is the default if no mr_mode bits are set.
	      This flag	may not	be used	 in  conjunction  with	other  mr_mode
	      bits.

       Buffers	used  in  data	transfer  operations may require notifying the
       provider	of their use before a data transfer can	 occur.	  The  mr_mode
       field  indicates	 the type of memory registration that is required, and
       when registration is necessary.	Applications that require the use of a
       specific	registration mode should set the domain	attribute  mr_mode  to
       the  necessary  value  when calling fi_getinfo.	The value FI_MR_UNSPEC
       may be used to indicate support for any registration mode.
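
       The relationship between the legacy modes and the individual mode
       bits can be sketched as follows.  The EX_MR_ constants use
       placeholder values for illustration only; real code must use the
       FI_MR_ constants from the libfabric headers.  The subset check in
       ex_mr_mode_ok is an assumption about how an application might
       validate the bits a provider reports, not a libfabric interface.

```c
#include <assert.h>

/* Placeholder bit values, for illustration only. */
#define EX_MR_BASIC     (1u << 0)
#define EX_MR_SCALABLE  (1u << 1)
#define EX_MR_VIRT_ADDR (1u << 4)
#define EX_MR_ALLOCATED (1u << 5)
#define EX_MR_PROV_KEY  (1u << 6)

/* Translate a legacy mode into the equivalent individual mode bits. */
unsigned ex_expand_mr_mode(unsigned mode)
{
    if (mode == EX_MR_BASIC)
        return EX_MR_VIRT_ADDR | EX_MR_ALLOCATED | EX_MR_PROV_KEY;
    if (mode == EX_MR_SCALABLE)
        return 0;   /* offset-based addressing, application-selected keys */
    return mode;
}

/* Returns 1 if every bit the provider requires is one the application
 * has said it can handle, 0 otherwise. */
int ex_mr_mode_ok(unsigned required, unsigned supported)
{
    unsigned need = ex_expand_mr_mode(required);
    unsigned have = ex_expand_mr_mode(supported);

    return (need & ~have) == 0;
}
```

       Under this sketch, an application prepared for basic registration can
       use a provider that requires only virtual addressing, but not the
       reverse.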

   MR Key Size (mr_key_size)
       Size of the memory region remote	access key,  in	 bytes.	  Applications
       that  request  their  own  MR  key must select a	value within the range
       specified by this value.  Key sizes larger than 8 bytes require using
       the FI_MR_RAW mode bit.
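
       An application that selects its own keys can check a candidate key
       against the reported size before registering memory.  The helper
       below is a hypothetical sketch.  It assumes keys are unsigned 64-bit
       values and treats a key size of 0 as meaning that no
       application-selected key is accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `key` is representable in `key_size`
 * bytes, 0 otherwise. */
int ex_key_fits(uint64_t key, size_t key_size)
{
    if (key_size == 0)
        return 0;                       /* assumption: no usable key range */
    if (key_size >= sizeof(uint64_t))
        return 1;                       /* any 64-bit key fits */
    return key < ((uint64_t)1 << (8 * key_size));
}
```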

   CQ Data Size	(cq_data_size)
       Applications  may  include a small message with a data transfer that is
       placed directly into a remote completion	queue as part of a  completion
       event.  This is referred	to as remote CQ	data (sometimes	referred to as
       immediate  data).   This	 field	indicates the number of	bytes that the
       provider	supports for remote CQ data.  If supported (non-zero value  is
       returned), the minimum size of remote CQ data must be at least 4 bytes.

   Completion Queue Count (cq_cnt)
       The  optimal number of completion queues	supported by the domain, rela-
       tive to any specified or	default	CQ attributes.	The cq_cnt  value  may
       be a fixed value	of the maximum number of CQs supported by the underly-
       ing hardware, or	may be a dynamic value,	based on the default attribut-
       es of an	allocated CQ, such as the CQ size and data format.

   Endpoint Count (ep_cnt)
       The  total number of endpoints supported	by the domain, relative	to any
       specified or default endpoint attributes.  The ep_cnt value  may	 be  a
       fixed  value of the maximum number of endpoints supported by the	under-
       lying hardware, or may be a dynamic value, based	on the default attrib-
       utes of an allocated endpoint, such as the  endpoint  capabilities  and
       size.   The  endpoint count is the number of addressable	endpoints sup-
       ported by the provider.	Providers return capability  limits  based  on
       configured hardware maximum capabilities.  Providers cannot predict all
       possible system limitations without a posteriori knowledge acquired
       during runtime that will further limit these hardware maximums (e.g.
       application memory consumption, FD usage, etc.).

   Transmit Context Count (tx_ctx_cnt)
       The  number  of	outbound  command  queues  optimally  supported	by the
       provider.  For a	low-level provider, this represents the	number of com-
       mand queues to the hardware and/or the number of	parallel transmit  en-
       gines  effectively  supported by	the hardware and caches.  Applications
       which allocate more transmit contexts than this value will end up shar-
       ing underlying resources.  By default, there is a single	transmit  con-
       text  associated	with each endpoint, but	in an advanced usage model, an
       endpoint	may be configured with multiple	transmit contexts.

   Receive Context Count (rx_ctx_cnt)
       The number of inbound processing	 queues	 optimally  supported  by  the
       provider.  For a low-level provider, this represents the number of
       hardware queues that can be effectively utilized for processing incoming
       packets.	  Applications	which allocate more receive contexts than this
       value will end up sharing underlying resources.	By default,  a	single
       receive	context	 is  associated	with each endpoint, but	in an advanced
       usage model, an endpoint	may be configured with multiple	 receive  con-
       texts.

   Maximum Endpoint Transmit Context (max_ep_tx_ctx)
       The  maximum number of transmit contexts	that may be associated with an
       endpoint.

   Maximum Endpoint Receive Context (max_ep_rx_ctx)
       The maximum number of receive contexts that may be associated  with  an
       endpoint.
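
       The context counts and per-endpoint maximums above can be combined when
       deciding how many contexts to request for an endpoint.  The following
       is a minimal sketch, not part of libfabric: pick_ctx_count is a
       hypothetical helper, and its inputs are assumed to be copied from the
       tx_ctx_cnt (or rx_ctx_cnt) and max_ep_tx_ctx (or max_ep_rx_ctx) domain
       attributes.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a libfabric call): choose how many transmit or
 * receive contexts to request for a single endpoint.
 *
 *   wanted      - number of contexts the application would like
 *   optimal_cnt - e.g. domain_attr->tx_ctx_cnt; 0 is treated as "unknown"
 *   max_per_ep  - e.g. domain_attr->max_ep_tx_ctx
 */
size_t pick_ctx_count(size_t wanted, size_t optimal_cnt, size_t max_per_ep)
{
    size_t n = wanted;

    /* Never ask for more than one endpoint may hold. */
    if (n > max_per_ep)
        n = max_per_ep;

    /* Contexts beyond the optimal count end up sharing resources. */
    if (optimal_cnt > 0 && n > optimal_cnt)
        n = optimal_cnt;

    /* An endpoint has a single context by default. */
    if (n == 0)
        n = 1;

    return n;
}
```

       Clamping to the optimal count is a design choice of this sketch; an
       application may still allocate more contexts and accept that they
       share underlying resources.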

   Maximum Sharing of Transmit Context (max_ep_stx_ctx)
       The  maximum  number  of	endpoints that may be associated with a	shared
       transmit	context.

   Maximum Sharing of Receive Context (max_ep_srx_ctx)
       The maximum number of endpoints that may	be associated  with  a	shared
       receive context.

   Counter Count (cntr_cnt)
       The optimal number of completion counters supported by the domain.  The
       cntr_cnt value may be a fixed value of the maximum number of counters
       supported by the underlying hardware, or may be a dynamic value, based
       on the default attributes of the domain.

   MR IOV Limit	(mr_iov_limit)
       This is the maximum number of IO	vectors	(scatter-gather	elements) that
       a single	memory registration operation may reference.
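
       When a buffer list has more segments than mr_iov_limit allows, it has
       to be split across several registrations.  A minimal sketch of that
       arithmetic follows; mr_regs_needed is a hypothetical helper, and the
       limit is assumed to be read from the mr_iov_limit domain attribute.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a libfabric call): number of memory registration
 * operations needed to cover iov_count scatter-gather elements when one
 * operation may reference at most mr_iov_limit of them.
 *
 * Returns 0 when there is nothing to register or when the limit is 0
 * (in which case no registration can reference any element).
 */
size_t mr_regs_needed(size_t iov_count, size_t mr_iov_limit)
{
    if (iov_count == 0 || mr_iov_limit == 0)
        return 0;

    /* Ceiling division written to avoid overflow in the addition form. */
    return iov_count / mr_iov_limit + (iov_count % mr_iov_limit != 0);
}
```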

   Capabilities	(caps)
       Domain  level  capabilities.  Domain capabilities indicate domain level
       features	that are supported by the provider.

       FI_LOCAL_COMM
	      At a conceptual level, this field	indicates that the  underlying
	      device supports loopback communication.  More specifically, this
	      field indicates that an endpoint may communicate with other end-
	      points that are allocated	from the same underlying named domain.
	      If  this field is	not set, an application	may need to use	an al-
	      ternate domain or	mechanism (e.g.	shared memory) to  communicate
	      with peers that execute on the same node.

       FI_REMOTE_COMM
	      This  field indicates that the underlying	provider supports com-
	      munication with nodes that are reachable over the	 network.   If
	      this  field is not set, then the provider	only supports communi-
	      cation between processes that execute  on	 the  same  node  -  a
	      shared memory provider, for example.

       FI_SHARED_AV
	      Indicates	 that the domain supports the ability to share address
	      vectors among multiple processes using the named address	vector
	      feature.

       See fi_getinfo(3) for a discussion on primary versus secondary capabil-
       ities.  All domain capabilities are considered secondary	capabilities.
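
       Checking domain capabilities is a bit-mask test on the caps field.  The
       sketch below shows one way to write it; both functions are
       hypothetical helpers, and the bit values are passed in by the caller
       (for example FI_LOCAL_COMM or FI_REMOTE_COMM from rdma/fabric.h) rather
       than defined here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when every bit in `required` is set in `caps`.
 * Intended use (assumed): domain_has_caps(info->domain_attr->caps,
 *                                         FI_LOCAL_COMM | FI_REMOTE_COMM)
 */
bool domain_has_caps(uint64_t caps, uint64_t required)
{
    return (caps & required) == required;
}

/*
 * Hypothetical helper: true when peers on the same node cannot be reached
 * through this domain, so an alternate domain or mechanism (e.g. shared
 * memory) may be needed.  `local_comm_bit` is expected to be FI_LOCAL_COMM.
 */
bool needs_alt_local_path(uint64_t caps, uint64_t local_comm_bit)
{
    return !domain_has_caps(caps, local_comm_bit);
}
```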

   mode
       The operational mode bits related to using the domain.

       FI_RESTRICTED_COMP
	      This  bit	indicates that the domain limits completion queues and
	      counters to only be used with endpoints, transmit	contexts,  and
	      receive contexts that have the same set of capability flags.

   Default authorization key (auth_key)
       The default authorization key to associate with endpoints and memory
       registrations created within the domain.  This field is ignored unless
       the fabric is opened with API version 1.5 or greater.

   Default authorization key length (auth_key_size)
       The  length  in	bytes of the default authorization key for the domain.
       If set to 0, then no authorization key will  be	associated  with  end-
       points and memory registrations created within the domain unless	speci-
       fied  in	the endpoint or	memory registration attributes.	 This field is
       ignored unless the fabric is opened with	API version 1.5	or greater.
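
       The relationship between the domain default key and a key given in the
       endpoint or memory registration attributes can be sketched as a simple
       selection.  This is an illustration of the rule described above, not
       provider code: struct key_ref and effective_auth_key are hypothetical
       names, and treating a zero-length per-object key as "use the domain
       default" is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a key: pointer plus length in bytes. */
struct key_ref {
    const uint8_t *key;
    size_t size;
};

/*
 * Hypothetical helper: pick the key that applies to an endpoint or memory
 * registration.  A key supplied in the object's own attributes wins;
 * otherwise the domain default applies; if both sizes are 0 there is no key.
 */
struct key_ref effective_auth_key(const uint8_t *obj_key, size_t obj_size,
                                  const uint8_t *dom_key, size_t dom_size)
{
    struct key_ref none = { NULL, 0 };
    struct key_ref out;

    if (obj_key != NULL && obj_size > 0) {
        out.key = obj_key;          /* per-object key overrides the default */
        out.size = obj_size;
        return out;
    }
    if (dom_key != NULL && dom_size > 0) {
        out.key = dom_key;          /* fall back to the domain default */
        out.size = dom_size;
        return out;
    }
    return none;                    /* auth_key_size == 0 everywhere */
}
```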

   Max Error Data Size (max_err_data)
       The maximum amount of error data, in bytes, that may be returned as
       part  of	 a completion or event queue error.  This value	corresponds to
       the  err_data_size  field  in   struct	fi_cq_err_entry	  and	struct
       fi_eq_err_entry.

   Memory Regions Count	(mr_cnt)
       The  optimal  number of memory regions supported	by the domain, or end-
       point if	the mr_mode FI_MR_ENDPOINT bit has been	set.  The mr_cnt value
       may be a	fixed value of the maximum number of MRs supported by the  un-
       derlying	 hardware, or may be a dynamic value, based on the default at-
       tributes	of the domain,	such  as  the  supported  memory  registration
       modes.	Applications can set the mr_cnt	on input to fi_getinfo,	in or-
       der to indicate their memory registration requirements.	Doing  so  may
       allow  the provider to optimize any memory registration cache or	lookup
       tables.

   Traffic Class (tclass)
       This specifies the default traffic class that will be associated with
       any endpoints created within the domain.  See fi_endpoint(3) for
       additional information.

RETURN VALUE
       Returns 0 on success.  On error, a negative value corresponding to
       fabric errno is returned.  Fabric errno values are defined in
       rdma/fi_errno.h.
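
       Because errors are reported as negated fabric errno values, callers
       typically test for a negative return and flip the sign before looking
       up the code.  A minimal sketch follows; fabric_errcode is a
       hypothetical helper, and the call-site name passed to it is only used
       for the log message.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a libfabric call): normalize the return value of
 * a call such as fi_domain().  Returns 0 when the call did not fail,
 * otherwise logs the failure and returns the positive fabric errno value,
 * which is the form expected by lookups such as fi_strerror().
 */
int fabric_errcode(const char *what, int ret)
{
    if (ret < 0) {
        fprintf(stderr, "%s failed: fabric errno %d\n", what, -ret);
        return -ret;
    }
    return 0;
}
```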

NOTES
       Users should call fi_close to release all resources  allocated  to  the
       fabric domain.

       The following fabric resources are associated with domains: active end-
       points, memory regions, completion event	queues,	and address vectors.

       Domain  attributes  reflect the limitations and capabilities of the un-
       derlying	hardware and/or	software provider.  They do not	reflect	system
       limitations, such as the	number of physical pages that  an  application
       may  pin	 or  number of file descriptors	that the application may open.
       As a result, the reported maximums may not be achievable, even on a
       lightly loaded system, without an administrator configuring system
       resources appropriately for the installed provider(s).
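
       An application can compare a reported maximum against a system limit
       before relying on it.  The sketch below uses the POSIX getrlimit(2)
       call with RLIMIT_NOFILE; both functions are hypothetical helpers, and
       the assumption that each resource consumes one file descriptor is
       illustrative only and depends on the provider.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap a provider-reported maximum by a resource limit.
 * RLIM_INFINITY means the system imposes no limit of its own.
 */
size_t cap_to_rlimit(size_t reported_max, rlim_t limit)
{
    if (limit == RLIM_INFINITY)
        return reported_max;
    if ((uintmax_t)limit < (uintmax_t)reported_max)
        return (size_t)limit;
    return reported_max;
}

/*
 * Hypothetical helper: cap a reported maximum (e.g. domain_attr->ep_cnt) by
 * the soft limit on open file descriptors.  Assumes one descriptor per
 * resource.  If the limit cannot be read, the value is returned unchanged.
 */
size_t cap_by_fd_limit(size_t reported_max)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return reported_max;
    return cap_to_rlimit(reported_max, rl.rlim_cur);
}
```

       The same pattern applies to other limits, such as the amount of memory
       a process may lock, where the platform exposes them.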

SEE ALSO
       fi_getinfo(3), fi_endpoint(3), fi_av(3),	fi_eq(3), fi_mr(3)

AUTHORS
       OpenFabrics.

Libfabric Programmer's Manual	  2022-03-30			  fi_domain(3)
