LOCKING(9)		 BSD Kernel Developer's	Manual		    LOCKING(9)

NAME
     locking -- kernel synchronization primitives

DESCRIPTION
     The FreeBSD kernel is written to run across multiple CPUs and as such
     requires several different synchronization primitives to allow the
     developers to safely access and manipulate the many data types required.

     These include:

     1.	  Spin Mutexes

     2.	  Sleep	Mutexes

     3.   Pool Mutexes

     4.	  Shared-Exclusive locks

     5.	  Reader-Writer	locks

     6.	  Turnstiles

     7.	  Semaphores

     8.	  Condition variables

     9.	  Sleep/wakeup

     10.  Giant

     11.  Lockmanager locks

     The primitives interact and have a number of rules regarding how they can
     and cannot be combined.  There are too many for the average human mind,
     and they keep changing.  (If you disagree, please write replacement
     text.)

     Some of these primitives may be used at the low (interrupt) level and
     some may not.

     There are strict ordering requirements and	for some of the	types this is
     checked using the witness(4) code.

   SPIN	Mutexes
     Mutexes are the basic primitive.  You either hold one or you don't.  If
     you don't own it then you just spin, waiting for the holder (on another
     CPU) to release it.  Hopefully the holder is doing something fast.  You
     must not do anything that deschedules the thread while you are holding a
     SPIN mutex.
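
     As a rough illustration of "spin until the holder releases it", the
     following is a minimal userland sketch using C11 atomics.  It is not
     the kernel's mutex(9) implementation: struct spin_lock, spin_try(),
     spin_acquire() and spin_release() are hypothetical names invented for
     this example, and the real primitive does considerably more.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a spin mutex; not struct mtx. */
struct spin_lock {
	atomic_bool held;
};

#define SPIN_LOCK_INITIALIZER { false }

/* Make a single attempt; returns true if the caller now holds the lock. */
static inline bool
spin_try(struct spin_lock *l)
{
	return (!atomic_exchange_explicit(&l->held, true,
	    memory_order_acquire));
}

/* Busy-wait until the current holder releases the lock. */
static inline void
spin_acquire(struct spin_lock *l)
{
	while (!spin_try(l))
		;	/* spin; the waiter never goes to sleep */
}

/* Release the lock; the caller must be the holder. */
static inline void
spin_release(struct spin_lock *l)
{
	assert(atomic_load_explicit(&l->held, memory_order_relaxed));
	atomic_store_explicit(&l->held, false, memory_order_release);
}
```

     A caller would bracket a short critical section with spin_acquire()
     and spin_release().  Because every waiter burns CPU time for as long
     as the lock is held, the section in between has to be brief.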

   Sleep Mutexes
     Regular (sleep) mutexes deschedule the thread if the mutex cannot be
     acquired.  A non-spin mutex can be considered equivalent to a write lock
     on an rw_lock (see below), and in fact non-spin mutexes and rw_locks may
     soon become the same thing.  As with spin mutexes, you either get it or
     you don't.  You may sleep while holding a sleep mutex only via msleep()
     or the newer mtx_sleep() variant (see sleep(9)).  These atomically drop
     the mutex when going to sleep and reacquire it as part of waking up.
     This is often a BAD idea, however, because it relies on such thorough
     knowledge of the call graph above you, and of the assumptions it makes,
     that there are many ways to make hard-to-find mistakes.  For example,
     you MUST re-test all the assumptions you made before sleeping, all the
     way up the call graph to where you got the lock.  You cannot just assume
     that mtx_sleep() can be inserted anywhere.  If any caller above you
     holds any mutex or rwlock, your sleep will cause a panic.  If the sleep
     happens only rarely, it may be years before the bad code path is found.

   Pool	Mutexes
     A variant of regular mutexes in which the system, rather than the
     caller, handles allocation of the mutex.  See mtx_pool(9).

   Reader-Writer locks
     Reader/writer locks allow shared access to protected data by multiple
     threads, or exclusive access by a single thread.  The threads with shared
     access are known as readers since they should only read the protected
     data.  A thread with exclusive access is known as a writer since it may
     modify protected data.

     Although reader/writer locks look very similar to sx(9) locks (see
     below), their usage pattern is different.  Reader/writer locks can be
     treated as mutexes (see above and mutex(9)) with shared/exclusive
     semantics.  More specifically, a regular mutex can be considered
     equivalent to a write lock on an rw_lock.  In the future this may
     literally become the case.  An rw_lock can be acquired while holding a
     regular mutex, but cannot be held while sleeping.  The rw_lock locks have
     priority propagation like mutexes, but priority can be propagated only to
     an exclusive holder.  This limitation comes from the fact that shared
     owners are anonymous.  Another important property is that shared holders
     of an rw_lock can recurse, but exclusive locks are not allowed to
     recurse.  This ability should not be used lightly and may go away.  Users
     of recursion in any locks should be prepared to defend their decision
     against vigorous criticism.

   Shared-Exclusive locks
     Shared/exclusive locks are used to protect data that are read far more
     often than they are written.  Mutexes are inherently more efficient than
     shared/exclusive locks, so shared/exclusive locks should be used
     prudently.  The main reason for using an sx_lock is that a thread may
     hold a shared or exclusive lock on an sx_lock while sleeping.  As a
     consequence, however, an sx_lock may not be acquired while holding a
     mutex.  The reason is that if one thread slept while holding an sx_lock
     and another thread blocked on the same sx_lock after acquiring a mutex,
     the second thread would effectively end up sleeping while holding a
     mutex, which is not allowed.  The sx_lock should be considered to be
     closely related to sleep(9).  In fact it could in some cases be
     considered a conditional sleep.

   Turnstiles
     Turnstiles are used to hold a queue of threads blocked on non-sleepable
     locks.  Sleepable locks use condition variables to implement their
     queues.  Turnstiles differ from a sleep queue in that turnstile queues
     are assigned to a lock held by an owning thread.  Thus, when one thread
     is enqueued onto a turnstile, it can lend its priority to the owning
     thread.  If this sounds confusing, we need to describe it better.

   Condition variables
     Condition variables are used in conjunction with mutexes to wait for
     conditions to occur.  A thread must hold the mutex before calling the
     cv_wait*() functions.  When a thread waits on a condition, the mutex is
     atomically released before the thread is blocked, then reacquired before
     the function call returns.
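
     The hold, wait and reacquire sequence can be illustrated with the POSIX
     analogues of these calls.  This is a userland sketch and an assumption
     of this example, not condvar(9) itself: pthread_cond_wait() stands in
     for cv_wait(), and struct event, producer(), wait_for_event() and
     run_once() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct event {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	bool ready;		/* the condition, protected by mtx */
	int payload;
};

/* Sets the condition under the mutex, then signals the waiter. */
static void *
producer(void *arg)
{
	struct event *ev = arg;

	pthread_mutex_lock(&ev->mtx);
	ev->payload = 42;
	ev->ready = true;
	pthread_cond_signal(&ev->cv);
	pthread_mutex_unlock(&ev->mtx);
	return (NULL);
}

/* Waits for the condition; the mutex is held before and after the wait. */
static int
wait_for_event(struct event *ev)
{
	int payload;

	pthread_mutex_lock(&ev->mtx);
	while (!ev->ready)	/* re-test the condition after every wakeup */
		pthread_cond_wait(&ev->cv, &ev->mtx);
	assert(ev->ready);	/* true here, and the mutex is held again */
	payload = ev->payload;
	pthread_mutex_unlock(&ev->mtx);
	return (payload);
}

/* Starts one producer thread and returns the payload it published. */
static inline int
run_once(void)
{
	struct event ev;
	pthread_t tid;
	int payload;

	pthread_mutex_init(&ev.mtx, NULL);
	pthread_cond_init(&ev.cv, NULL);
	ev.ready = false;
	ev.payload = 0;
	if (pthread_create(&tid, NULL, producer, &ev) != 0)
		return (-1);
	payload = wait_for_event(&ev);
	pthread_join(tid, NULL);
	pthread_cond_destroy(&ev.cv);
	pthread_mutex_destroy(&ev.mtx);
	return (payload);
}
```

     The while loop matters: the waiter checks the condition itself rather
     than trusting the wakeup, so the result is the same whether the
     producer runs before or after the waiter takes the mutex.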

   Giant
     Giant is a special instance of a sleep lock.  It has several special
     characteristics:

     1.	  It is	recursive.

     2.	  Drivers can request that Giant be locked around them,	but this is
	  going	away.

     3.   You can sleep while it has recursed, but other recursive locks
          cannot.

     4.   Giant must be locked before other locks.

     5.	  There	are places in the kernel that drop Giant and pick it back up
	  again.  Sleep	locks will do this before sleeping.  Parts of the Net-
	  work or VM code may do this as well, depending on the	setting	of a
	  sysctl.  This	means that you cannot count on Giant keeping other
	  code from running if your code sleeps, even if you want it to.

   Sleep/wakeup
     The functions tsleep(), msleep(), msleep_spin(), pause(), wakeup(), and
     wakeup_one() handle event-based thread blocking.  If a thread must wait
     for an external event, it is put to sleep by tsleep(), msleep(),
     msleep_spin(), or pause().  Threads may also wait using one of the
     locking primitive sleep routines mtx_sleep(9), rw_sleep(9), or
     sx_sleep(9).

     The parameter chan	is an arbitrary	address	that uniquely identifies the
     event on which the	thread is being	put to sleep.  All threads sleeping on
     a single chan are woken up	later by wakeup(), often called	from inside an
     interrupt routine,	to indicate that the resource the thread was blocking
     on	is available now.

     Several of	the sleep functions including msleep(),	msleep_spin(), and the
     locking primitive sleep routines specify an additional lock parameter.
     The lock will be released before sleeping and reacquired before the sleep
     routine returns.  If priority includes the	PDROP flag, then the lock will
     not be reacquired before returning.  The lock is used to ensure that a
     condition can be checked atomically, and that the current thread can be
     suspended without missing a change	to the condition, or an	associated
     wakeup.  In addition, all of the sleep routines will fully	drop the Giant
     mutex (even if recursed) while the	thread is suspended and	will reacquire
     the Giant mutex before the	function returns.

   lockmanager locks
     Largely deprecated.  See the lock(9) page for more information.  The
     downsides of lockmanager locks are not yet documented here.

Usage tables.
   Interaction table.
     The following table shows what you can and cannot do while holding one of
     the synchronization primitives discussed here.  (Someone who knows what
     they are talking about should write this table.)

           You have:  You want: Spin_mtx  Slp_mtx sx_lock rw_lock sleep
           SPIN mutex           ok-1      no      no      no      no-3
           Sleep mutex          ok        ok-1    no      ok      no-3
           sx_lock              ok        ok      ok-2    ok      ok-4
           rw_lock              ok        ok      no      ok-2    no-3

     *1 Recursion is defined per lock.  Lock order is important.

     *2 Readers can recurse though writers cannot.  Lock order is important.

     *3 There are calls that atomically release this primitive when going to
     sleep and reacquire it on wakeup (e.g. mtx_sleep(), rw_sleep() and
     msleep_spin()).

     *4 Though one can sleep holding an sx lock, one can also use sx_sleep(),
     which atomically releases this primitive when going to sleep and
     reacquires it on wakeup.
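
     For quick reference, the interaction table can be transcribed into a
     lookup array.  This is a sketch and not a kernel interface: the enum
     names and may_acquire() are hypothetical, the entries are copied from
     the table as printed, and the footnotes (recursion, lock order, the
     atomic release-and-sleep calls) are not encoded.

```c
#include <assert.h>
#include <stdbool.h>

/* Rows of the table: the primitive currently held. */
enum held {
	HELD_SPIN_MTX, HELD_SLEEP_MTX, HELD_SX_LOCK, HELD_RW_LOCK,
	HELD_COUNT
};

/* Columns of the table: what the holder wants to do next. */
enum want {
	WANT_SPIN_MTX, WANT_SLEEP_MTX, WANT_SX_LOCK, WANT_RW_LOCK,
	WANT_SLEEP, WANT_COUNT
};

/* true = "ok" (with or without a footnote), false = "no". */
static const bool allowed[HELD_COUNT][WANT_COUNT] = {
	/*                   spin   slp    sx     rw     sleep */
	[HELD_SPIN_MTX]  = { true,  false, false, false, false },
	[HELD_SLEEP_MTX] = { true,  true,  false, true,  false },
	[HELD_SX_LOCK]   = { true,  true,  true,  true,  true  },
	[HELD_RW_LOCK]   = { true,  true,  false, true,  false },
};

/* Look up one cell of the table. */
static inline bool
may_acquire(enum held h, enum want w)
{
	assert(h < HELD_COUNT && w < WANT_COUNT);
	return (allowed[h][w]);
}
```

     For example, may_acquire(HELD_SLEEP_MTX, WANT_SX_LOCK) is false, which
     matches the rule that an sx_lock may not be acquired while holding a
     mutex.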

   Context mode	table.
     The next table shows what can be used in different	contexts.  At this
     time this is a rather easy	to remember table.

           Context:             Spin_mtx  Slp_mtx sx_lock rw_lock sleep
           interrupt:           ok        no      no      no      no
           idle:                ok        no      no      no      no

SEE ALSO
     condvar(9), lock(9), mtx_pool(9), mutex(9), rwlock(9), sema(9), sleep(9),
     sx(9), LOCK_PROFILING(9), WITNESS(9)

HISTORY
     These functions appeared in BSD/OS 4.1 through FreeBSD 7.0.

BSD				March 14, 2007				   BSD

