fi_rxm(7)		       Libfabric v1.15.1		     fi_rxm(7)

NAME
       fi_rxm -	The RxM	(RDM over MSG) Utility Provider

OVERVIEW
       The RxM provider (ofi_rxm) is a utility provider that supports
       FI_EP_RDM endpoints emulated over the FI_EP_MSG endpoint(s) of an
       underlying core provider.  FI_EP_RDM endpoints have a reliable datagram
       interface, and RxM emulates this by hiding the connection management of
       the underlying FI_EP_MSG endpoints from the user.  Additionally, RxM can
       hide the memory registration requirement of a core provider such as
       verbs if the application does not support it.

REQUIREMENTS
   Requirements	for core provider
       RxM  provider  requires the core	provider to support the	following fea-
       tures:

        MSG endpoints (FI_EP_MSG)

        RMA read/write	(FI_RMA) - Used	for implementing  rendezvous  protocol
	 for large messages.

        FI_OPT_CM_DATA_SIZE of	at least 24 bytes.

   Requirements	for applications
       Since RxM emulates RDM endpoints by hiding connection management, and
       connections are established only on demand (when the application tries
       to send data), the first several data transfer calls may return
       -FI_EAGAIN.  Applications should be aware of this and retry until the
       operation succeeds.

       If an application has chosen manual progress for data progress, it
       should also read the CQ so that connection establishment progresses.
       Not doing so results in a stall.  See also the ERRORS section in
       fi_msg(3).
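
       The retry requirement above can be sketched as follows.  This is a
       minimal illustration, not RxM code: the callback types and the mock
       endpoint are hypothetical, and EAGAIN from <errno.h> stands in for
       FI_EAGAIN.  In a real program the post callback would wrap a data
       transfer call such as fi_send() and the progress callback would wrap a
       CQ read such as fi_cq_read().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callbacks: post wraps a data transfer call, progress wraps
 * a CQ read.  Both names are assumptions made for this sketch. */
typedef ssize_t (*sketch_post_fn)(void *ctx);
typedef void (*sketch_progress_fn)(void *ctx);

/* Retry post() while it reports -EAGAIN, reading the CQ between attempts
 * so that connection establishment can progress under manual progress.
 * Returns the result of the last post(), or -EAGAIN if attempts run out. */
static ssize_t sketch_post_with_retry(sketch_post_fn post,
                                      sketch_progress_fn progress,
                                      void *ctx, size_t max_attempts)
{
    ssize_t ret = -EAGAIN;

    assert(post != NULL && progress != NULL);
    for (size_t i = 0; i < max_attempts; i++) {
        ret = post(ctx);
        if (ret != -EAGAIN)
            return ret;      /* success or a hard error */
        progress(ctx);       /* drive connection setup, then retry */
    }
    return ret;
}

/* Mock endpoint used only to exercise the loop: the connection needs
 * pending_cm_steps CQ reads before a post can succeed. */
struct sketch_mock_ep {
    int pending_cm_steps;
    int cq_reads;
};

static ssize_t sketch_mock_post(void *ctx)
{
    struct sketch_mock_ep *ep = ctx;
    return ep->pending_cm_steps > 0 ? -EAGAIN : 0;
}

static void sketch_mock_progress(void *ctx)
{
    struct sketch_mock_ep *ep = ctx;
    ep->cq_reads++;
    if (ep->pending_cm_steps > 0)
        ep->pending_cm_steps--;
}
```

       Bounding the number of attempts is a design choice of this sketch; an
       application may instead retry indefinitely or apply its own timeout.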

SUPPORTED FEATURES
       The  RxM	 provider  currently  supports	FI_MSG,	 FI_TAGGED, FI_RMA and
       FI_ATOMIC capabilities.

       Endpoint	types
	      The provider supports only FI_EP_RDM.

       Endpoint	capabilities
	      The following data transfer interfaces are supported: FI_MSG,
	      FI_TAGGED, FI_RMA, FI_ATOMIC.

       Progress
	      The   RxM	  provider   supports	both   FI_PROGRESS_MANUAL  and
	      FI_PROGRESS_AUTO.	 Manual	progress in general has	better connec-
	      tion scale-up and	lower CPU utilization since there's  no	 sepa-
	      rate auto-progress thread.

       Addressing Formats
	      FI_SOCKADDR, FI_SOCKADDR_IN

       Memory Region
	      FI_MR_VIRT_ADDR, FI_MR_ALLOCATED, FI_MR_PROV_KEY.  These MR mode
	      bits are required from the application if the core provider
	      requires them.

LIMITATIONS
       When using the RxM provider, some limitations of the underlying MSG
       provider may also apply.  Refer to the corresponding MSG provider man
       pages for details on those limitations.

   Unsupported features
       RxM provider does not support the following features:

        op_flags: FI_FENCE.

        Scalable endpoints

        Shared	contexts

        FABRIC_DIRECT

        FI_MR_SCALABLE

        Authorization keys

        Application error data	buffers

        Multicast

        FI_SYNC_ERR

        Reporting unknown source addr data as part of completions

        Triggered operations

   Progress limitations
       When sending large messages, an application doing an sread or waiting
       on the CQ file descriptor may not get a completion when reading the CQ
       after being woken up from the wait.  The application has to do the
       sread or wait on the file descriptor again.  This is needed because RxM
       uses a rendezvous protocol for large message sends.  An application is
       woken up from waiting on the CQ fd when the rendezvous protocol request
       completes, but it has to wait again for an ACK from the receiver
       indicating that the large message transfer has been completed by a
       remote RMA read.
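
       The wait-again behavior described above can be sketched as a loop that
       tolerates wake-ups that deliver no completion.  The callback type and
       the mock CQ are hypothetical; in a real program the callback would
       wrap a blocking read such as fi_cq_sread().  EAGAIN from <errno.h>
       stands in for FI_EAGAIN, and the assumption is that an empty wake-up
       is reported as -EAGAIN or as zero completions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback: blocks until the CQ is signaled, then returns the
 * number of completions read, 0 or -EAGAIN if none, or another negative
 * value on error. */
typedef ssize_t (*sketch_wait_read_fn)(void *ctx);

/* Wait until at least one completion is read, allowing up to max_wakeups
 * wake-ups.  Returns the completion count, a hard error, or -EAGAIN if no
 * completion arrived within the allowed number of wake-ups. */
static ssize_t sketch_wait_for_completion(sketch_wait_read_fn wait_read,
                                          void *ctx, size_t max_wakeups)
{
    assert(wait_read != NULL);
    for (size_t i = 0; i < max_wakeups; i++) {
        ssize_t ret = wait_read(ctx);
        if (ret > 0)
            return ret;              /* completion(s) available */
        if (ret < 0 && ret != -EAGAIN)
            return ret;              /* hard error */
        /* Woken up without a completion, for example because only the
         * rendezvous request finished; wait again for the ACK. */
    }
    return -EAGAIN;
}

/* Mock CQ used only to exercise the loop: the first empty_wakeups waits
 * return without a completion, the next one returns a single completion. */
struct sketch_mock_cq {
    int empty_wakeups;
    int waits;
};

static ssize_t sketch_mock_sread(void *ctx)
{
    struct sketch_mock_cq *cq = ctx;
    cq->waits++;
    if (cq->empty_wakeups > 0) {
        cq->empty_wakeups--;
        return -EAGAIN;
    }
    return 1;
}
```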

   FI_ATOMIC limitations
       The FI_ATOMIC capability is only listed in the fi_info if the fi_info
       hints parameter specifies FI_ATOMIC.  If FI_ATOMIC is requested, the
       message orders FI_ORDER_RAR, FI_ORDER_RAW, FI_ORDER_WAR, FI_ORDER_WAW,
       FI_ORDER_SAR, and FI_ORDER_SAW cannot be supported.

   Miscellaneous limitations
        RxM protocol peers must have the same endianness; otherwise
	 connections will not complete successfully.  This restriction enables
	 better run-time performance because byte order translations are
	 avoided.

RUNTIME	PARAMETERS
       The ofi_rxm provider checks for the following environment variables.

       FI_OFI_RXM_BUFFER_SIZE
	      Defines the transmit buffer size /  inject  size.	  Messages  of
	      size  less  than this would be transmitted via an	eager protocol
	      and those	above would be transmitted via	a  rendezvous  or  SAR
	      (Segmentation  And Reassembly) protocol.	Transmit data would be
	      copied up	to this	size (default: ~16k).

       FI_OFI_RXM_COMP_PER_PROGRESS
	      Defines the maximum number of MSG	provider CQ entries  (default:
	      1) that would be read per	progress (RxM CQ read).

       FI_OFI_RXM_ENABLE_DYN_RBUF
	      Enables support for dynamic receive buffering, if supported by
	      the message endpoint provider.  This feature allows direct
	      placement of received message data into application buffers,
	      bypassing RxM bounce buffers.  It targets providers that provide
	      internal network buffering, such as the tcp provider (default:
	      false).

       FI_OFI_RXM_SAR_LIMIT
	      Controls the RxM SAR (Segmentation And Reassembly) protocol.
	      Messages of size greater than this (default: 128 KB) are
	      transmitted via the rendezvous protocol.

       FI_OFI_RXM_USE_SRX
	      Set this to 1 to use the shared receive context from the MSG
	      provider, or 0 to disable it.  Shared receive contexts reduce
	      overall memory usage, but may increase message latency.  If not
	      set, verbs will not use shared receive contexts by default, but
	      the tcp provider will.

       FI_OFI_RXM_TX_SIZE
	      Defines default TX context size (default:	1024)

       FI_OFI_RXM_RX_SIZE
	      Defines default RX context size (default:	1024)

       FI_OFI_RXM_MSG_TX_SIZE
	      Defines  FI_EP_MSG  TX  size  that  would	be requested (default:
	      128).

       FI_OFI_RXM_MSG_RX_SIZE
	      Defines FI_EP_MSG	RX size	 that  would  be  requested  (default:
	      128).

       FI_UNIVERSE_SIZE
	      Defines  the  expected number of ranks / peers an	endpoint would
	      communicate with (default: 256).

       FI_OFI_RXM_CM_PROGRESS_INTERVAL
	      Defines the duration of time in microseconds  between  calls  to
	      RxM CM progression functions when	using manual progress.	Higher
	      values may provide less noise for	calls to fi_cq read functions,
	      but may increase connection setup	time (default: 10000)

       FI_OFI_RXM_CQ_EQ_FAIRNESS
	      Defines  the  maximum number of message provider CQ entries that
	      can be consecutively read	across progress	calls without checking
	      to see if	the CM progress	interval has  been  reached  (default:
	      128)
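
       The interaction between FI_OFI_RXM_BUFFER_SIZE and FI_OFI_RXM_SAR_LIMIT
       can be pictured with the sketch below.  It is an illustration of the
       descriptions above, not the provider's implementation: the enum and
       function names are hypothetical, and the handling of sizes exactly
       equal to either threshold is an assumption that follows the wording
       literally ("less than" the buffer size is eager, "greater than" the
       SAR limit is rendezvous).

```c
#include <stddef.h>

/* Hypothetical labels for the three protocols named in this page. */
enum sketch_proto {
    SKETCH_EAGER,
    SKETCH_SAR,
    SKETCH_RENDEZVOUS
};

/* Pick a protocol from the message size and the two thresholds.
 * buffer_size corresponds to FI_OFI_RXM_BUFFER_SIZE and sar_limit to
 * FI_OFI_RXM_SAR_LIMIT.  Boundary handling is an assumption. */
static enum sketch_proto sketch_select_proto(size_t msg_size,
                                             size_t buffer_size,
                                             size_t sar_limit)
{
    if (msg_size < buffer_size)
        return SKETCH_EAGER;         /* copied into a transmit buffer */
    if (msg_size > sar_limit)
        return SKETCH_RENDEZVOUS;    /* transferred via remote RMA read */
    return SKETCH_SAR;               /* segmented and reassembled */
}
```

       For example, with an illustrative buffer size of 16384 bytes and a SAR
       limit of 131072 bytes, a 1024-byte message maps to SKETCH_EAGER, a
       65536-byte message to SKETCH_SAR, and a 262144-byte message to
       SKETCH_RENDEZVOUS.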

TUNING
   Bandwidth
       To  optimize  for  bandwidth, ensure you	use higher values than default
       for  FI_OFI_RXM_TX_SIZE,	 FI_OFI_RXM_RX_SIZE,   FI_OFI_RXM_MSG_TX_SIZE,
       FI_OFI_RXM_MSG_RX_SIZE  subject	to memory limits of the	system and the
       tx and rx sizes supported by the	MSG provider.

       FI_OFI_RXM_SAR_LIMIT is another knob that can be experimented with to
       optimize for bandwidth.

   Memory
       To conserve memory, ensure FI_UNIVERSE_SIZE is set to what is required.
       Similarly, check that the FI_OFI_RXM_TX_SIZE, FI_OFI_RXM_RX_SIZE,
       FI_OFI_RXM_MSG_TX_SIZE and FI_OFI_RXM_MSG_RX_SIZE environment variables
       are set only as high as required.

NOTES
       The data transfer API may return -FI_EAGAIN during on-demand connection
       setup of the core provider's FI_EP_MSG endpoints.  See fi_msg(3) for a
       detailed description of handling FI_EAGAIN.

TROUBLESHOOTING / KNOWN ISSUES
       If an RxM endpoint is expected to communicate with more peers than the
       default value of FI_UNIVERSE_SIZE (256), CQ overruns can happen.  To
       avoid this, set a higher value for FI_UNIVERSE_SIZE.  A CQ overrun can
       make a MSG endpoint unusable.

       At higher rank counts, there may be connection errors due to a node
       running out of memory.  The workaround is to use shared receive contexts
       for the MSG provider (FI_OFI_RXM_USE_SRX=1) or to reduce the eager
       message size (FI_OFI_RXM_BUFFER_SIZE) and the MSG provider TX/RX queue
       sizes (FI_OFI_RXM_MSG_TX_SIZE / FI_OFI_RXM_MSG_RX_SIZE).
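
       The out-of-memory workaround above can also be applied from within a
       program, as in the sketch below.  It assumes that the variables must be
       present in the environment before the provider is initialized (that
       is, before the first libfabric call such as fi_getinfo()); that
       ordering is an assumption of this sketch.  The function name is
       hypothetical, and the numeric values are illustrative placeholders
       rather than recommendations.  setenv() is a POSIX call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <stddef.h>
#include <stdlib.h>

/* Export RxM settings aimed at lowering memory use.  If overwrite is 0,
 * values already present in the environment are left untouched.
 * Returns 0 on success, -1 if any setenv() call fails. */
static int sketch_apply_low_memory_env(int overwrite)
{
    /* Values other than USE_SRX=1 are illustrative assumptions. */
    static const char *const settings[][2] = {
        { "FI_OFI_RXM_USE_SRX",     "1"    },
        { "FI_OFI_RXM_BUFFER_SIZE", "8192" },
        { "FI_OFI_RXM_MSG_TX_SIZE", "64"   },
        { "FI_OFI_RXM_MSG_RX_SIZE", "64"   },
    };

    for (size_t i = 0; i < sizeof(settings) / sizeof(settings[0]); i++) {
        if (setenv(settings[i][0], settings[i][1], overwrite) != 0)
            return -1;
    }
    return 0;
}
```

       Passing overwrite as 0 lets values supplied by the user or a job
       launcher take precedence over the defaults chosen by the program.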

SEE ALSO
       fabric(7), fi_provider(7), fi_getinfo(3)

AUTHORS
       OpenFabrics.

Libfabric Programmer's Manual	  2021-03-22			     fi_rxm(7)
