Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
fi_msg(3)		       Libfabric v1.15.1		     fi_msg(3)

NAME
       fi_msg -	Message	data transfer operations

       fi_recv / fi_recvv / fi_recvmsg
	      Post a buffer to receive an incoming message

       fi_send	/  fi_sendv / fi_sendmsg fi_inject / fi_senddata : Initiate an
       operation to send a message

SYNOPSIS
	      #include <rdma/fi_endpoint.h>

	      ssize_t fi_recv(struct fid_ep *ep, void *	buf, size_t len,
		  void *desc, fi_addr_t	src_addr, void *context);

	      ssize_t fi_recvv(struct fid_ep *ep, const	struct iovec *iov, void	**desc,
		  size_t count,	fi_addr_t src_addr, void *context);

	      ssize_t fi_recvmsg(struct	fid_ep *ep, const struct fi_msg	*msg,
		  uint64_t flags);

	      ssize_t fi_send(struct fid_ep *ep, const void *buf, size_t len,
		  void *desc, fi_addr_t	dest_addr, void	*context);

	      ssize_t fi_sendv(struct fid_ep *ep, const	struct iovec *iov,
		  void **desc, size_t count, fi_addr_t dest_addr, void *context);

	      ssize_t fi_sendmsg(struct	fid_ep *ep, const struct fi_msg	*msg,
		  uint64_t flags);

	      ssize_t fi_inject(struct fid_ep *ep, const void *buf, size_t len,
		  fi_addr_t dest_addr);

	      ssize_t fi_senddata(struct fid_ep	*ep, const void	*buf, size_t len,
		  void *desc, uint64_t data, fi_addr_t dest_addr, void *context);

	      ssize_t fi_injectdata(struct fid_ep *ep, const void *buf,	size_t len,
		  uint64_t data, fi_addr_t dest_addr);

ARGUMENTS
       ep     Fabric endpoint on  which	 to  initiate  send  or	 post  receive
	      buffer.

       buf    Data buffer to send or receive.

       len    Length  of  data	buffer to send or receive, specified in	bytes.
	      Valid  transfers	are  from  0  bytes  up	 to   the   endpoint's
	      max_msg_size.

       iov    Vectored data buffer.

       count  Count of vectored	data entries.

       desc   Descriptor associated with the data buffer.  See fi_mr(3).

       data   Remote CQ	data to	transfer with the sent message.

       dest_addr
	      Destination  address  for	connectionless transfers.  Ignored for
	      connected	endpoints.

       src_addr
	      Source address to	receive	 from  for  connectionless  transfers.
	      Applies  only  to	 connectionless	 endpoints with	the FI_DIRECT-
	      ED_RECV capability enabled, otherwise this field is ignored.  If
	      set to FI_ADDR_UNSPEC, any source	address	may match.

       msg    Message descriptor for send and receive operations.

       flags  Additional flags to apply	for the	send or	receive	operation.

       context
	      User specified pointer to	associate with	the  operation.	  This
	      parameter	 is  ignored if	the operation will not generate	a suc-
	      cessful completion, unless an op flag specifies the context  pa-
	      rameter be used for required input.

DESCRIPTION
       The  send  functions  -	fi_send,  fi_sendv, fi_sendmsg,	fi_inject, and
       fi_senddata - are used to transmit a message from one endpoint  to  an-
       other  endpoint.	  The  main  difference	between	send functions are the
       number and type of parameters that they accept  as  input.   Otherwise,
       they perform the	same general function.	Messages sent using fi_msg op-
       erations	 are received by a remote endpoint into	a buffer posted	to re-
       ceive such messages.

       The receive functions - fi_recv,	fi_recvv, fi_recvmsg  -	 post  a  data
       buffer to an endpoint to	receive	inbound	messages.  Similar to the send
       operations,  receive  operations	 operate asynchronously.  Users	should
       not touch the posted data buffer(s) until  the  receive	operation  has
       completed.

       An  endpoint must be enabled before an application can post send	or re-
       ceive operations	to it.	For connected endpoints, receive  buffers  may
       be  posted  prior  to  connect  or accept being called on the endpoint.
       This ensures that buffers are available to receive incoming data	 imme-
       diately after the connection has	been established.

       Completed  message  operations  are reported to the user	through	one or
       more event collectors associated	with the endpoint.  Users provide con-
       text which are associated with each operation, and is returned  to  the
       user  as	 part of the event completion.	See fi_cq for completion event
       details.

   fi_send
       The call	fi_send	transfers the data contained in	the user-specified da-
       ta buffer to a remote endpoint, with  message  boundaries  being	 main-
       tained.

   fi_sendv
       The  fi_sendv  call  adds support for a scatter-gather list to fi_send.
       The fi_sendv transfers the set of data buffers referenced  by  the  iov
       parameter to a remote endpoint as a single message.

   fi_sendmsg
       The  fi_sendmsg	call  supports	data transfers over both connected and
       connectionless endpoints, with the ability to control the  send	opera-
       tion  per call through the use of flags.	 The fi_sendmsg	function takes
       a struct	fi_msg as input.

	      struct fi_msg {
		  const	struct iovec *msg_iov; /* scatter-gather array */
		  void		     **desc;   /* local	request	descriptors */
		  size_t	     iov_count;/* # elements in	iov */
		  fi_addr_t	     addr;     /* optional endpoint address */
		  void		     *context; /* user-defined context */
		  uint64_t	     data;     /* optional message data	*/
	      };

   fi_inject
       The send	inject call is an optimized version of fi_send with  the  fol-
       lowing characteristics.	The data buffer	is available for reuse immedi-
       ately  on  return from the call,	and no CQ entry	will be	written	if the
       transfer	completes successfully.

       Conceptually, this means	that the fi_inject function behaves as if  the
       FI_INJECT  transfer  flag  were set, selective completions are enabled,
       and the FI_COMPLETION flag is not specified.  Note that	the  CQ	 entry
       will  be	 suppressed even if the	default	behavior of the	endpoint is to
       write CQ	entries	for all	successful completions.	 See the flags discus-
       sion below for more details.  The requested message size	 that  can  be
       used with fi_inject is limited by inject_size.

   fi_senddata
       The send	data call is similar to	fi_send, but allows for	the sending of
       remote CQ data (see FI_REMOTE_CQ_DATA flag) as part of the transfer.

   fi_injectdata
       The  inject data	call is	similar	to fi_inject, but allows for the send-
       ing of remote CQ	data (see  FI_REMOTE_CQ_DATA  flag)  as	 part  of  the
       transfer.

   fi_recv
       The fi_recv call	posts a	data buffer to the receive queue of the	corre-
       sponding	 endpoint.  Posted receives are	searched in the	order in which
       they were posted	in order to match sends.  Message boundaries are main-
       tained.	The order in which the receives	complete is dependent  on  the
       endpoint	type and protocol.  For	connectionless endpoints, the src_addr
       parameter can be	used to	indicate that a	buffer should be posted	to re-
       ceive incoming data from	a specific remote endpoint.

   fi_recvv
       The  fi_recvv  call  adds support for a scatter-gather list to fi_recv.
       The fi_recvv posts the set of data buffers referenced by	the iov	 para-
       meter to	a receive incoming data.

   fi_recvmsg
       The  fi_recvmsg	call  supports posting buffers over both connected and
       connectionless endpoints, with the ability to control the receive oper-
       ation per call through the use of flags.	 The fi_recvmsg	function takes
       a struct	fi_msg as input.

FLAGS
       The fi_recvmsg and fi_sendmsg calls allow the  user  to	specify	 flags
       which  can  change the default message handling of the endpoint.	 Flags
       specified with fi_recvmsg / fi_sendmsg override most  flags  previously
       configured  with	 the endpoint, except where noted (see fi_endpoint.3).
       The  following  list  of	 flags	are  usable  with  fi_recvmsg	and/or
       fi_sendmsg.

       FI_REMOTE_CQ_DATA
	      Applies to fi_sendmsg and	fi_senddata.  Indicates	that remote CQ
	      data  is	available  and	should be sent as part of the request.
	      See fi_getinfo for additional details on FI_REMOTE_CQ_DATA.

       FI_CLAIM
	      Applies to posted	receive	operations  for	 endpoints  configured
	      for  FI_BUFFERED_RECV  or	FI_VARIABLE_MSG.  This flag is used to
	      retrieve a message that was buffered by the provider.   See  the
	      Buffered Receives	section	for details.

       FI_COMPLETION
	      Indicates	 that  a  completion entry should be generated for the
	      specified	operation.  The	endpoint must be bound to a completion
	      queue with FI_SELECTIVE_COMPLETION that corresponds to the spec-
	      ified operation, or this flag is ignored.

       FI_DISCARD
	      Applies to posted	receive	operations  for	 endpoints  configured
	      for  FI_BUFFERED_RECV  or	FI_VARIABLE_MSG.  This flag is used to
	      free a message that was  buffered	 by  the  provider.   See  the
	      Buffered Receives	section	for details.

       FI_MORE
	      Indicates	 that the user has additional requests that will imme-
	      diately be posted	after the current call returns.	 Use  of  this
	      flag  may	 improve performance by	enabling the provider to opti-
	      mize its access to the fabric hardware.

       FI_INJECT
	      Applies to fi_sendmsg.  Indicates	that the outbound data	buffer
	      should  be  returned to user immediately after the send call re-
	      turns, even if the operation is  handled	asynchronously.	  This
	      may require that the underlying provider implementation copy the
	      data  into a local buffer	and transfer out of that buffer.  This
	      flag can only be used with messages smaller than inject_size.

       FI_MULTI_RECV
	      Applies to posted	receive	operations.  This flag allows the user
	      to post a	single buffer that will	receive	multiple incoming mes-
	      sages.  Received messages	will be	packed into the	receive	buffer
	      until the	buffer has been	consumed.  Use of this flag may	 cause
	      a	single posted receive operation	to generate multiple events as
	      messages	are placed into	the buffer.  The placement of received
	      data into	the buffer  may	 be  subjected	to  provider  specific
	      alignment	restrictions.

       The  buffer  will be released by	the provider when the available	buffer
       space falls below the specified	minimum	 (see  FI_OPT_MIN_MULTI_RECV).
       Note  that an entry to the associated receive completion	queue will al-
       ways be generated when the buffer has been consumed, even if other  re-
       ceive  completions  have	 been suppressed (i.e. the Rx context has been
       configured for FI_SELECTIVE_COMPLETION).	 See the FI_MULTI_RECV comple-
       tion flag fi_cq(3).

       FI_INJECT_COMPLETE
	      Applies to fi_sendmsg.  Indicates	that a	completion  should  be
	      generated	when the source	buffer(s) may be reused.

       FI_TRANSMIT_COMPLETE
	      Applies to fi_sendmsg and	fi_recvmsg.  For sends,	indicates that
	      a	 completion  should  not  be generated until the operation has
	      been successfully	transmitted and	is no longer being tracked  by
	      the  provider.  For receive operations, indicates	that a comple-
	      tion may be generated as soon as the message has been  processed
	      by the local provider, even if the message data may not be visi-
	      ble  to  all  processing elements.  See fi_cq(3) for target side
	      completion semantics.

       FI_DELIVERY_COMPLETE
	      Applies to fi_sendmsg.  Indicates	that a	completion  should  be
	      generated	 when the operation has	been processed by the destina-
	      tion.

       FI_FENCE
	      Applies to transmits.  Indicates that the	 requested  operation,
	      also known as the	fenced operation, and any operation posted af-
	      ter the fenced operation will be deferred	until all previous op-
	      erations targeting the same peer endpoint	have completed.	 Oper-
	      ations  posted after the fencing will see	and/or replace the re-
	      sults of any operations initiated	prior to the fenced operation.

       The ordering of operations starting at the posting of the fenced	opera-
       tion (inclusive)	to the posting of a subsequent fenced  operation  (ex-
       clusive)	is controlled by the endpoint's	ordering semantics.

       FI_MULTICAST
	      Applies  to  transmits.	This  flag  indicates that the address
	      specified	as the data transfer destination is  a	multicast  ad-
	      dress.   This  flag  must	be used	in all multicast transfers, in
	      conjunction with a multicast fi_addr_t.

Buffered Receives
       Buffered	receives indicate that the networking layer allocates and man-
       ages the	data buffers used to receive network data transfers.  As a re-
       sult, received messages must be copied from the	network	 buffers  into
       application  buffers  for  processing.  However,	applications can avoid
       this copy if they are able to process the message  in  place  (directly
       from the	networking buffers).

       Handling	buffered receives differs based	on the size of the message be-
       ing  sent.  In general, smaller messages	are passed directly to the ap-
       plication for processing.  However, for large messages, an  application
       will  only  receive  the	 start of the message and must claim the rest.
       The details for how small messages are reported and large messages  may
       be claimed are described	below.

       When  a provider	receives a message, it will write an entry to the com-
       pletion queue associated	with the receiving endpoint.   For  discussion
       purposes,  the  completion  queue  is  assumed  to  be  configured  for
       FI_CQ_FORMAT_DATA.  Since buffered receives are not associated with ap-
       plication posted	buffers, the CQ	 entry	op_context  will  point	 to  a
       struct fi_recv_context.

	      struct fi_recv_context {
		  struct fid_ep	*ep;
		  void *context;
	      };

       The  `ep' field will point to the receiving endpoint or Rx context, and
       `context' will be NULL.	The CQ entry's `buf' will point	to a  provider
       managed	buffer where the start of the received message is located, and
       `len' will be set to the	total size of the message.

       The maximum sized message that a	provider can buffer is limited	by  an
       FI_OPT_BUFFERED_LIMIT.	This  threshold	can be obtained	and may	be ad-
       justed by the application using the fi_getopt and fi_setopt calls,  re-
       spectively.   Any  adjustments  must be made prior to enabling the end-
       point.  The CQ entry `buf' will point to	a buffer of received data.  If
       the sent	message	is larger than	the  buffered  amount,	the  CQ	 entry
       `flags'	will  have  the	FI_MORE	bit set.  When the FI_MORE bit is set,
       `buf' will reference at least FI_OPT_BUFFERED_MIN bytes	of  data  (see
       fi_endpoint.3 for more info).

       After  being notified that a buffered receive has arrived, applications
       must either claim or discard the	message.   Typically,  small  messages
       are  processed and discarded, while large messages are claimed.	Howev-
       er, an application is free to claim or discard any  message  regardless
       of message size.

       To  claim  a message, an	application must post a	receive	operation with
       the FI_CLAIM flag set.  The struct fi_recv_context returned as part  of
       the  notification  must be provided as the receive operation's context.
       The struct fi_recv_context contains a  `context'	 field.	  Applications
       may  modify  this  field	prior to claiming the message.	When the claim
       operation completes, a standard receive completion entry	will be	gener-
       ated on the completion queue.  The `context' of the associated CQ entry
       will be set to the `context' value passed in through  the  fi_recv_con-
       text structure, and the CQ entry	flags will have	the FI_CLAIM bit set.

       Buffered	 receives that are not claimed must be discarded by the	appli-
       cation when it is done processing the CQ	entry data.  To	discard	a mes-
       sage, an	application must post a	receive	operation with the  FI_DISCARD
       flag set.  The struct fi_recv_context returned as part of the notifica-
       tion  must  be  provided	 as the	receive	operation's context.  When the
       FI_DISCARD flag is set for  a  receive  operation,  the	receive	 input
       buffer(s) and length parameters are ignored.

       IMPORTANT:  Buffered  receives must be claimed or discarded in a	timely
       manner.	Failure	to do so may result in increased memory	usage for net-
       work buffering or communication stalls.	Once a	buffered  receive  has
       been  claimed  or  discarded,  the  original  CQ	 entry `buf' or	struct
       fi_recv_context data may	no longer be accessed by the application.

       The use of the FI_CLAIM and FI_DISCARD  operation  flags	 is  also  de-
       scribed	with  respect  to  tagged  message  transfers  in fi_tagged.3.
       Buffered	receives of tagged messages will include the  message  tag  as
       part of the CQ entry, if	available.

       The handling of buffered	receives follows all message ordering restric-
       tions  assigned	to an endpoint.	 For example, completions may indicate
       the order in which received messages arrived at the receiver  based  on
       the endpoint attributes.

Variable Length	Messages
       Variable	 length	 messages,  or simply variable messages, are transfers
       where the size of the message is	unknown	to the receiver	prior  to  the
       message	being sent.  It	indicates that the recipient of	a message does
       not know	the amount of data to expect prior to  the  message  arriving.
       It  is  most  commonly  used  when the size of message transfers	varies
       greatly,	with very large	messages interspersed with much	 smaller  mes-
       sages,  making  receive	side  message  buffering  difficult to manage.
       Variable	messages are not subject to max	 message  length  restrictions
       (i.e. struct  fi_ep_attr::max_msg_size  limits),	 and  may be up	to the
       maximum value of	size_t (e.g. SIZE_MAX) in length.

       Variable	length messages	support	requests that  the  provider  allocate
       and  manage  the	network	message	buffers.  As a result, the application
       requirements and	provider behavior is identical as  those  defined  for
       supporting  the	FI_BUFFERED_RECV  mode	bit.  See the Buffered Receive
       section above for details.  The main difference is  that	 buffered  re-
       ceives  are  limited by the fi_ep_attr::max_msg_size threshold, whereas
       variable	length messages	are not.

       Support for variable messages is	indicated through the  FI_VARIABLE_MSG
       capability bit.

NOTES
       If  an endpoint has been	configured with	FI_MSG_PREFIX, the application
       must include buffer space of size msg_prefix_size, as specified by  the
       endpoint	 attributes.  The prefix buffer	must occur at the start	of the
       data referenced by the buf parameter, or	be referenced by the first  IO
       vector.	 Message prefix	space cannot be	split between multiple IO vec-
       tors.  The size of the prefix buffer should be included as part of  the
       total buffer length.

RETURN VALUE
       Returns 0 on success.  On error,	a negative value corresponding to fab-
       ric  errno is returned.	Fabric errno values are	defined	in rdma/fi_er-
       rno.h.

       See the discussion below	for details handling FI_EAGAIN.

ERRORS
       -FI_EAGAIN
	      Indicates	that the underlying provider currently lacks  the  re-
	      sources needed to	initiate the requested operation.  The reasons
	      for  a provider returning	FI_EAGAIN are varied.  However,	common
	      reasons include insufficient internal buffering or full process-
	      ing queues.

       Insufficient internal buffering is  often  associated  with  operations
       that  use  FI_INJECT.   In  such	cases, additional buffering may	become
       available as posted operations complete.

       Full processing queues may be a temporary state related to  local  pro-
       cessing	(for example, a	large message is being transferred), or	may be
       the result of flow control.  In the latter case,	the queues may	remain
       blocked	until  additional  resources  are made available at the	remote
       side of the transfer.

       In all cases, the operation may be retried after	 additional  resources
       become  available.   It is strongly recommended that applications check
       for transmit and	receive	completions after receiving FI_EAGAIN as a re-
       turn value, independent of the operation	which failed.  This is partic-
       ularly important	in cases where manual progress	is  employed,  as  ac-
       knowledgements or flow control messages may need	to be processed	in or-
       der to resume execution.

SEE ALSO
       fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_cq(3)

AUTHORS
       OpenFabrics.

Libfabric Programmer's Manual	  2021-03-23			     fi_msg(3)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=fi_injectdata&sektion=3&manpath=FreeBSD+Ports+14.3.quarterly>

home | help