FreeBSD Manual Pages

home | help
SMR(9)			   Kernel Developer's Manual			SMR(9)

NAME
       smr -- safe memory reclamation for lock-free data structures

SYNOPSIS
       #include	<sys/smr.h>

       smr_seq_t
       smr_advance(smr_t smr);

       smr_t
       smr_create(const	char *name);

       void
       smr_destroy(smr_t smr);

       void
       smr_enter(smr_t smr);

       void
       smr_exit(smr_t smr);

       bool
       smr_poll(smr_t smr, smr_seq_t goal, bool	wait);

       void
       smr_synchronize(smr_t smr);

       void
       smr_wait(smr_t smr, smr_seq_t goal);

DESCRIPTION
       Safe Memory Reclamation (SMR) is	a facility which enables the implemen-
       tation  of  memory-safe	lock-free  data	structures.  In	typical	usage,
       read accesses to	an SMR-protected data structure, such as a hash	 table
       or tree,	are performed in a "read section" consisting of	code bracketed
       by smr_enter() and smr_exit() calls, while mutations of the data	struc-
       ture  are  serialized by	a traditional mutex such as mutex(9).  In con-
       trast with reader-writer	locks such as rwlock(9), rmlock(9), and	sx(9),
       SMR allows readers and writers to access	 the  data  structure  concur-
       rently.	  Readers   can	  always  enter	 a  read  section  immediately
       (smr_enter() never blocks), so mutations	do not introduce read latency.
       Furthermore, smr_enter()	and smr_exit() operate only  on	 per-CPU  data
       and  thus avoid some of the performance problems	inherent in the	imple-
       mentation of traditional	reader-writer mutexes.	SMR can	therefore be a
       useful building block for data structures which are accessed frequently
       but are only rarely modified.

       Note that any SMR-protected data	structure must	be  implemented	 care-
       fully  such  that  operations behave correctly in the absence of	mutual
       exclusion between readers and writers.  The data	structure must be  de-
       signed  to be lock-free;	SMR merely facilitates the implementation, for
       example by making it safe to follow dangling pointers  and  by  helping
       avoid the ABA problem.

       When  shared  accesses to and mutations of a data structure can proceed
       concurrently, writers must take care to ensure that any	items  removed
       from the	structure are not freed	and recycled while readers are access-
       ing them	in parallel.  This requirement results in a two-phase approach
       to  the	removal	 of  items:  first, the	item is	unlinked such that all
       pointers	to the item are	removed	from the structure, preventing any new
       readers from observing the item.	 Then, the  writer  waits  until  some
       mechanism  guarantees  that no existing readers are still accessing the
       item.  At that point the	memory for that	item can be freed  and	reused
       safely.	 SMR  provides	this mechanism:	readers	may access a lock-free
       data structure in between calls to the smr_enter() and smr_exit() func-
       tions, which together create a read  section,  and  the	smr_advance(),
       smr_poll(),  smr_wait(),	and smr_synchronize() functions	can be used to
       wait for	threads	in read	sections to finish.  All  of  these  functions
       operate	on  a  smr_t  state  block which holds both per-CPU and	global
       state.  Readers load global state and modify per-CPU state, while writ-
       ers must	scan all per-CPU states	to detect active readers.  SMR is  de-
       signed to amortize this cost by batching	to give	acceptable performance
       in write-heavy workloads.

   Readers
       Threads	enter  a  read	section	by calling smr_enter().	 Read sections
       should be short,	and many operations are	not permitted while in a  read
       section.	 Specifically, context switching is disabled, and thus readers
       may not acquire blocking	mutexes	such as	mutex(9).  Another consequence
       of  this	 is that the thread is pinned to the current CPU for the dura-
       tion of the read	 section.   Furthermore,  read	sections  may  not  be
       nested:	it  is	incorrect to call smr_enter() with a given smr_t state
       block when already in a read section for	that state block.

   UMA Integration
       To simplify the integration of SMR into consumers,  the	uma(9)	kernel
       memory  allocator  provides some	SMR-specified facilities.  This	elimi-
       nates a good deal of complexity from the	 implementation	 of  consumers
       and automatically batches write operations.

       In  typical  usage,  a  UMA zone	(created with the UMA_ZONE_SMR flag or
       initialized with	the uma_zone_set_smr() function)  is  coupled  with  a
       smr_t state block.  To insert an	item into an SMR-protected data	struc-
       ture,  memory  is  allocated  from  the zone using the uma_zalloc_smr()
       function.  Insertions and removals are serialized using traditional mu-
       tual exclusion and items	are freed using	the uma_zfree_smr()  function.
       Read-only  lookup  operations  are  performed  in  SMR  read  sections.
       uma_zfree_smr() waits for all active readers which may be accessing the
       freed item to finish their read sections	before recycling  that	item's
       memory.

       If  the	zone has an associated per-item	destructor, it will be invoked
       at some point when no readers can be accessing a	given item.   The  de-
       structor	 can  be  used to release additional resources associated with
       the item.  Note however that there is no	guarantee that the  destructor
       will be invoked in a bounded time period.

   Writers
       Consumers are expected to use SMR in conjunction	with UMA and thus need
       only  make use of the smr_enter() and smr_exit()	functions, and the SMR
       helper macros defined in	sys/smr_types.h.  However, an introduction  to
       the write-side interface	of SMR can be useful.

       Internally,  SMR	 maintains a global `write sequence' number.  When en-
       tering a	read section, smr_enter() loads	a copy of the  write  sequence
       and stores it in	per-CPU	memory,	hence `observing' that value.  To exit
       a  read	section,  this	per-CPU	 memory	is overwritten with an invalid
       value, making the CPU inactive.	Writers	perform	 two  operations:  ad-
       vancing	the write sequence number, and polling all CPUs	to see whether
       active readers have observed a given sequence number.  These operations
       are performed by	smr_advance() and smr_poll(), respectively,  which  do
       not require serialization between multiple writers.

       After a writer unlinks an item from a data structure, it	increments the
       write  sequence number and tags the item	with the new value returned by
       smr_advance().  Once all	CPUs have observed the new value,  the	writer
       can  use	smr_poll() to deduce that no active readers have access	to the
       unlinked	item, and thus the item	is safe	to recycle.  Because this pair
       of operations is	relatively expensive, it is generally a	good  idea  to
       amortize	 this  cost  by	accumulating a collection of multiple unlinked
       items and tagging the entire batch with a target	write sequence number.

       smr_poll() is a non-blocking operation and returns true only if all ac-
       tive readers are	guaranteed to have observed the	target sequence	number
       value.  smr_wait() is a variant of smr_poll()  which  waits  until  all
       CPUs have observed the target sequence number value.  smr_synchronize()
       combines	 smr_advance() with smr_wait() to wait for all CPUs to observe
       a new write sequence number.  This is an	expensive operation and	should
       only be used if polling cannot be deferred in some way.

   Memory Ordering
       The smr_enter() function	has  acquire  semantics,  and  the  smr_exit()
       function	  has	release	 semantics.   The  smr_advance(),  smr_poll(),
       smr_wait(), and smr_synchronize() functions should not  be  assumed  to
       have any	guarantees with	respect	to memory ordering.  In	practice, some
       of  these  functions  have  stronger  ordering semantics	than is	stated
       here, but this is specific to the implementation	and should not be  re-
       lied upon.  See atomic(9) for more details.

NOTES
       Outside	of FreeBSD the acronym SMR typically refers to a family	of al-
       gorithms	which enable memory-safe read-only access to a data  structure
       concurrent with modifications to	that data structure.  Here, SMR	refers
       to  a particular	algorithm belonging to this family, as well as its im-
       plementation in FreeBSD.	 See sys/sys/smr.h and sys/kern/subr_smr.c  in
       the  FreeBSD  source  tree for further details on the algorithm and the
       context.

       The acronym SMR is also used to mean "shingled magnetic	recording",  a
       technology  used	to store data on hard disk drives which	requires oper-
       ating system support.  These two	uses of	the acronym are	unrelated.

SEE ALSO
       atomic(9), locking(9), uma(9)

AUTHORS
       The SMR algorithm and its implementation	were provided by Jeff Roberson
       <jeff@FreeBSD.org>.  This manual	page was written by
       Mark Johnston <markj@FreeBSD.org>.

FreeBSD	15.0		       January 17, 2023				SMR(9)
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=smr_wait&sektion=9&manpath=FreeBSD+15.0-RELEASE+and+Ports.quarterly>
home | help
Header And Logo

Peripheral Links

Site Navigation

FreeBSD Manual Pages

Header And Logo

Peripheral Links

Search

Site Navigation

FreeBSD Manual Pages