Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages


home | help
fi_poll(3)		       Libfabric v1.15.1		    fi_poll(3)

       fi_poll - Polling and wait set operations

       fi_poll_open / fi_close
	      Open/close a polling set

       fi_poll_add / fi_poll_del
	      Add/remove a completion queue or counter to/from a poll set.

	      Poll  for	 progress and events across multiple completion	queues
	      and counters.

       fi_wait_open / fi_close
	      Open/close a wait	set

	      Waits for	one or more wait objects in a set to be	signaled.

	      Indicate when it is safe to block	on wait	objects	 using	native
	      OS calls.

	      Control wait set operation or attributes.

	      #include <rdma/fi_domain.h>

	      int fi_poll_open(struct fid_domain *domain, struct fi_poll_attr *attr,
		  struct fid_poll **pollset);

	      int fi_close(struct fid *pollset);

	      int fi_poll_add(struct fid_poll *pollset,	struct fid *event_fid,
		  uint64_t flags);

	      int fi_poll_del(struct fid_poll *pollset,	struct fid *event_fid,
		  uint64_t flags);

	      int fi_poll(struct fid_poll *pollset, void **context, int	count);

	      int fi_wait_open(struct fid_fabric *fabric, struct fi_wait_attr *attr,
		  struct fid_wait **waitset);

	      int fi_close(struct fid *waitset);

	      int fi_wait(struct fid_wait *waitset, int	timeout);

	      int fi_trywait(struct fid_fabric *fabric,	struct fid **fids, size_t count);

	      int fi_control(struct fid	*waitset, int command, void *arg);

       fabric Fabric provider

       domain Resource domain

	      Event poll set

	      Wait object set

       attr   Poll or wait set attributes

	      On success, an array of user context values associated with com-
	      pletion queues or	counters.

       fids   An array of fabric descriptors, each one associated with	a  na-
	      tive wait	object.

       count  Number of	entries	in context or fids array.

	      Time to wait for a signal, in milliseconds.

	      Command of control operation to perform on the wait set.

       arg    Optional control argument.

       fi_poll_open  creates  a	 new polling set.  A poll set enables an opti-
       mized method for	progressing asynchronous  operations  across  multiple
       completion queues and counters and checking for their completions.

       A poll set is defined with the following	attributes.

	      struct fi_poll_attr {
		  uint64_t	       flags;	  /* operation flags */

       flags  Flags  that  set the default operation of	the poll set.  The use
	      of this field is reserved	and must be set	to 0 by	the caller.

       The fi_close call releases all resources	associated with	 a  poll  set.
       The  poll  set must not be associated with any other resources prior to
       being closed, otherwise the call	will return -FI_EBUSY.

       Associates a completion queue or	counter	with a poll set.

       Removes a completion queue or counter from a poll set.

       Progresses all completion queues	and counters associated	 with  a  poll
       set and checks for events.  If events might have	occurred, contexts as-
       sociated	with the completion queues and/or counters are returned.  Com-
       pletion	queues	will  return their context if they are not empty.  The
       context associated with a counter will be  returned  if	the  counter's
       success	value or error value have changed since	the last time fi_poll,
       fi_cntr_set, or fi_cntr_add were	called.	 The  number  of  contexts  is
       limited to the size of the context array, indicated by the count	param-

       Note that fi_poll only indicates	that events might  be  available.   In
       some  cases,  providers	may  consume  such events internally, to drive
       progress, for example.  This can	result in fi_poll returning false pos-
       itives.	 Applications should drive their progress based	on the results
       of reading events from a	completion queue or  reading  counter  values.
       The fi_poll function will always	return all completion queues and coun-
       ters that do have new events.

       fi_wait_open allocates a	new wait set.  A wait set enables an optimized
       method  of  waiting  for	 events	 across	multiple completion queues and
       counters.  Where	possible, a wait set uses a single underlying wait ob-
       ject that is signaled when a specified condition	occurs on an associat-
       ed completion queue or counter.

       The properties and behavior  of	a  wait	 set  are  defined  by	struct

	      struct fi_wait_attr {
		  enum fi_wait_obj     wait_obj;  /* requested wait object */
		  uint64_t	       flags;	  /* operation flags */

	      Wait sets	are associated with specific wait object(s).  Wait ob-
	      jects allow applications to block	until the wait object is  sig-
	      naled,  indicating  that	an event is available to be read.  The
	      following	values may be used to specify the type of wait	object
	      associated   with	  a   wait  set:  FI_WAIT_UNSPEC,  FI_WAIT_FD,

	      Specifies	that the user will only	wait on	 the  wait  set	 using
	      fabric  interface	calls, such as fi_wait.	 In this case, the un-
	      derlying provider	may select the	most  appropriate  or  highest
	      performing  wait	object available, including custom wait	mecha-
	      nisms.  Applications that	select FI_WAIT_UNSPEC are not  guaran-
	      teed to retrieve the underlying wait object.

       - FI_WAIT_FD
	      Indicates	 that the wait set should use a	single file descriptor
	      as its wait mechanism, as	exposed	to the application.  Internal-
	      ly,  this	may require the	use of epoll in	order to support wait-
	      ing on a single file descriptor.	File descriptor	 wait  objects
	      must  be	usable	in  the	POSIX select(2)	and poll(2), and Linux
	      epoll(7) routines	(if available).	 Provider signal  an  FD  wait
	      object by	marking	it as readable or with an error.

	      Specifies	 that the wait set should use a	pthread	mutex and cond
	      variable as a wait object.

	      This option is similar to	FI_WAIT_FD, but	allows the wait	mecha-
	      nism  to use multiple file descriptors as	its wait mechanism, as
	      viewed by	the application.  The use of FI_WAIT_POLLFD can	elimi-
	      nate  the	 need  to  use epoll to	abstract away needing to check
	      multiple file descriptors	when waiting for events.  The file de-
	      scriptors	must be	usable in the POSIX select(2) and poll(2) rou-
	      tines, and match directly	to being  used	with  poll.   See  the
	      NOTES section below for details on using pollfd.

       - FI_WAIT_YIELD
	      Indicates	 that the wait set will	wait without a wait object but
	      instead yield on every wait.

       flags  Flags that set the default operation of the wait set.   The  use
	      of this field is reserved	and must be set	to 0 by	the caller.

       The  fi_close  call  releases all resources associated with a wait set.
       The wait	set must not be	bound to any other opened resources  prior  to
       being closed, otherwise the call	will return -FI_EBUSY.

       Waits on	a wait set until one or	more of	its underlying wait objects is

       The fi_trywait call was introduced in libfabric version 1.3.   The  be-
       havior  of  using  native wait objects without the use of fi_trywait is
       provider	specific and should be considered non-deterministic.

       The fi_trywait()	call is	used in	conjunction with native	operating sys-
       tem  calls to block on wait objects, such as file descriptors.  The ap-
       plication must call fi_trywait and obtain a return value	of  FI_SUCCESS
       prior to	blocking on a native wait object.  Failure to do so may	result
       in the wait object not being signaled, and the application not  observ-
       ing the desired events.	The following pseudo-code demonstrates the use
       of fi_trywait in	conjunction with the OS	select(2) call.

	      fi_control(&cq->fid, FI_GETWAIT, (void *)	&fd);
	      FD_SET(fd, &fds);

	      while (1)	{
		  if (fi_trywait(&cq, 1) == FI_SUCCESS)
		      select(fd	+ 1, &fds, NULL, &fds, &timeout);

		  do {
		      ret = fi_cq_read(cq, &comp, 1);
		  } while (ret > 0);

       fi_trywait() will return	FI_SUCCESS if it is safe to block on the  wait
       object(s)  corresponding	 to the	fabric descriptor(s), or -FI_EAGAIN if
       there are events	queued on the fabric descriptor	or if  blocking	 could
       hang the	application.

       The  call  takes	 an array of fabric descriptors.  For each wait	object
       that will be passed to the native wait routine, the corresponding  fab-
       ric  descriptor	should	first be passed	to fi_trywait.	All fabric de-
       scriptors passed	into a single fi_trywait call must  make  use  of  the
       same underlying wait object type.

       The  following  types  of fabric	descriptors may	be passed into fi_try-
       wait: event queues, completion queues, counters,	and wait sets.	Appli-
       cations	that wish to use native	wait calls should select specific wait
       objects when allocating such resources.	For example,  by  setting  the
       item's creation attribute wait_obj value	to FI_WAIT_FD.

       In  the	case  the wait object to check belongs to a wait set, only the
       wait set	itself needs to	be passed into	fi_trywait.   The  fabric  re-
       sources associated with the wait	set do not.

       On  receiving a return value of -FI_EAGAIN from fi_trywait, an applica-
       tion should read	all queued completions and events, and call fi_trywait
       again  before attempting	to block.  Applications	can make use of	a fab-
       ric poll	set to identify	completion queues and counters	that  may  re-
       quire processing.

       The  fi_control	call is	used to	access provider	or implementation spe-
       cific details of	a fids that support blocking calls, such as wait sets,
       completion  queues, counters, and event queues.	Access to the wait set
       or fid should be	serialized across all calls  when  fi_control  is  in-
       voked,  as  it  may redirect the	implementation of wait set operations.
       The following control commands are usable with a	wait set or fid.

       FI_GETWAIT (void	**)
	      This command allows the user to retrieve the low-level wait  ob-
	      ject  associated with a wait set or fid.	The format of the wait
	      set is specified during wait set creation, through the wait  set
	      attributes.   The	 fi_control arg	parameter should be an address
	      where a pointer to the returned wait  object  will  be  written.
	      This should be an	'int *'	for FI_WAIT_FD,	`struct	fi_mutex_cond'
	      for   FI_WAIT_MUTEX_COND,	  or   `struct	 fi_wait_pollfd'   for
	      FI_WAIT_POLLFD.  Support for FI_GETWAIT is provider specific.

       FI_GETWAITOBJ (enum fi_wait_obj *)
	      This  command  returns the type of wait object associated	with a
	      wait set or fid.

       Returns FI_SUCCESS on success.  On error, a negative value  correspond-
       ing to fabric errno is returned.

       Fabric errno values are defined in rdma/fi_errno.h.

	      On  success,  if events are available, returns the number	of en-
	      tries written to the context array.

       In many situations, blocking calls may need to wait on signals sent  to
       a number	of file	descriptors.  For example, this	is the case for	socket
       based providers,	such as	tcp and	udp, as	well as	utility	providers such
       as multi-rail.  For simplicity, when epoll is available,	it can be used
       to limit	the number of file descriptors that an application must	 moni-
       tor.   The  use	of  epoll  may	also  be  required in order to support

       However,	in order to support waiting on multiple	 file  descriptors  on
       systems	where  epoll  support is not available,	or where epoll perfor-
       mance may negatively impact performance,	FI_WAIT_POLLFD	provides  this
       mechanism.  A significant different between using POLLFD	versus FD wait
       objects is that with FI_WAIT_POLLFD, the	file  descriptors  may	change
       dynamically.   As  an  example,	the file descriptors associated	with a
       completion queues' wait set may change as  endpoint  associations  with
       the CQ are added	and removed.

       Struct fi_wait_pollfd is	used to	retrieve all file descriptors for fids
       using FI_WAIT_POLLFD to support blocking	calls.

	      struct fi_wait_pollfd {
		  uint64_t	change_index;
		  size_t	nfds;
		  struct pollfd	*fd;

	      The change_index may be used to determine	if there have been any
	      changes  to the file descriptor list.  Anytime a file descriptor
	      is added,	removed, or its	events are updated, this field is  in-
	      cremented	by the provider.  Applications wishing to wait on file
	      descriptors directly should cache	the change_index  value.   Be-
	      fore  blocking  on  file	descriptor  events, the	app should use
	      fi_control() to retrieve the current  change_index  and  compare
	      that  against  its cached	value.	If the values differ, then the
	      app should update	its file descriptor list prior to blocking.

       nfds   On input to fi_control(),	this indicates the number  of  entries
	      in  the  struct  pollfd *	array.	On output, this	will be	set to
	      the number of entries needed to store the	current	number of file
	      descriptors.  If the input value is smaller than the output val-
	      ue, fi_control() will return the error -FI_ETOOSMALL.  Note that
	      setting  nfds  =	0  allows  an  efficient  way  of checking the

       fd     This points to an	array of struct	pollfd entries.	 The number of
	      entries  is  specified through the nfds field.  If the number of
	      needed entries is	less than or equal to the  number  of  entries
	      available,  the  struct  pollfd  array will be filled out	with a
	      list of file descriptors and corresponding events	 that  can  be
	      used in the select(2) and	poll(2)	calls.

       The  change_index  is updated only when the file	descriptors associated
       with the	pollfd file set	has changed.  Checking the change_index	is  an
       additional  step	 needed	 when working with FI_WAIT_POLLFD wait objects
       directly.  The use of the fi_trywait() function is  still  required  if
       accessing wait objects directly.

       fi_getinfo(3), fi_domain(3), fi_cntr(3),	fi_eq(3)


Libfabric Programmer's Manual	  2021-03-22			    fi_poll(3)


Want to link to this manual page? Use this URL:

home | help