KQUEUE(2)		      System Calls Manual		     KQUEUE(2)

NAME
       kqueue, kevent -- kernel	event notification mechanism

LIBRARY
       Standard	C Library (libc, -lc)

SYNOPSIS
       #include	<sys/event.h>

       int
       kqueue(void);

       int
       kqueuex(u_int flags);

       int
       kqueue1(int flags);

       int
       kevent(int kq,	   const struct	kevent *changelist,	 int nchanges,
	   struct kevent *eventlist,				  int nevents,
	   const struct	timespec *timeout);

       EV_SET(kev, ident, filter, flags, fflags, data, udata);

DESCRIPTION
       The  kqueue()  system  call  provides a generic method of notifying the
       user when an event happens or a condition holds,	based on  the  results
       of small pieces of kernel code termed filters.  A kevent is identified
       by the (ident, filter) pair; a given pair may be registered at most
       once per kqueue.

       The filter is executed upon the initial registration of a kevent	in or-
       der  to	detect whether a preexisting condition is present, and is also
       executed	whenever an event is passed to the filter for evaluation.   If
       the  filter  determines that the	condition should be reported, then the
       kevent is placed	on the kqueue for the user to retrieve.

       The filter is also run when the user attempts to	 retrieve  the	kevent
       from the	kqueue.	 If the	filter indicates that the condition that trig-
       gered  the event	no longer holds, the kevent is removed from the	kqueue
       and is not returned.

       Multiple	events which trigger the filter	 do  not  result  in  multiple
       kevents	being placed on	the kqueue; instead, the filter	will aggregate
       the events into a single	struct kevent.	Calling	close()	on a file  de-
       scriptor	will remove any	kevents	that reference the descriptor.

       The kqueue() system call	creates	a new kernel event queue and returns a
       descriptor.   The  queue	 is  not  inherited  by	 a  child created with
       fork(2).	 However, if rfork(2) is called	without	the RFFDG  flag,  then
       the  descriptor table is	shared,	which will allow sharing of the	kqueue
       between two processes.

       The kqueuex() system call also creates a	new kernel  event  queue,  and
       additionally takes a flags argument, which is a bitwise-inclusive OR of
       the following flags:

       KQUEUE_CLOEXEC  The returned file descriptor is automatically closed on
		       execve(2)

       The  kqueue() system call is equivalent to calling kqueuex() with flags
       set to 0.

       The kqueue1() function exists for compatibility with NetBSD.  The flags
       argument	accepts	zero or	more of	the following values:

       O_CLOEXEC  The returned file  descriptor	 is  automatically  closed  on
		  execve(2)

       The kevent() system call	is used	to register events with	the queue, and
       return  any  pending  events to the user.  The changelist argument is a
       pointer to an array of kevent structures, as defined in	<sys/event.h>.
       All  changes contained in the changelist	are applied before any pending
       events are read from the	queue.	The nchanges argument gives  the  size
       of  changelist.	 The  eventlist	 argument  is a	pointer	to an array of
       kevent  structures.   The  nevents  argument  determines	 the  size  of
       eventlist.  When nevents is zero, kevent() returns immediately even if
       a timeout is specified, unlike select(2).  If timeout is a non-NULL
       pointer, it specifies a maximum interval to wait for an event,
       which will be interpreted as a struct timespec.	If timeout is  a  NULL
       pointer,	 kevent()  waits  indefinitely.	 To effect a poll, the timeout
       argument	should be non-NULL, pointing to	a zero-valued timespec	struc-
       ture.  The same array may be used for the changelist and	eventlist.

       The EV_SET() macro is provided for ease of initializing a kevent	struc-
       ture.

       The kevent structure is defined as:

       struct kevent {
	       uintptr_t  ident;       /* identifier for this event */
	       short	 filter;       /* filter for event */
	       u_short	 flags;	       /* action flags for kqueue */
	       u_int	 fflags;       /* filter flag value */
	       int64_t	 data;	       /* filter data value */
	       void	 *udata;       /* opaque user data identifier */
	       uint64_t	 ext[4];       /* extensions */
       };

       The fields of struct kevent are:

       ident	  Value	used to	identify this event.  The exact	interpretation
		  is  determined  by  the attached filter, but often is	a file
		  descriptor.

       filter	  Identifies the kernel	filter used  to	 process  this	event.
		  The pre-defined system filters are described below.

       flags	  Actions to perform on	the event.

       fflags	  Filter-specific flags.

       data	  Filter-specific data value.

       udata	  Opaque  user-defined	value  passed  through	the kernel un-
		  changed.

       ext	  Extended data	passed to and from kernel.  The	meaning	of the
		  ext[0] and ext[1] members is defined by the  filter.	 If  a
		  filter  does	not use	them, these members are	passed through
		  the kernel unchanged.	 The ext[2] and	ext[3] members are al-
		  ways passed through the kernel  unchanged,  providing	 addi-
		  tional user-defined values.

       The flags field can contain the following values:

       EV_ADD	     Adds  the	event  to  the	kqueue.	 Re-adding an existing
		     event will	modify the parameters of the  original	event,
		     and not result in a duplicate entry.  Adding an event au-
		     tomatically  enables it, unless overridden	by the EV_DIS-
		     ABLE flag.

       EV_ENABLE     Permit kevent() to	return the event if it is triggered.

       EV_DISABLE    Disable the event so kevent() will	not  return  it.   The
		     filter itself is not disabled.

       EV_DISPATCH   Disable the event source immediately after	delivery of an
		     event.  See EV_DISABLE above.

       EV_DELETE     Removes  the event	from the kqueue.  Events which are at-
		     tached to file descriptors	are automatically  deleted  on
		     the last close of the descriptor.

       EV_RECEIPT    This  flag	 is useful for making bulk changes to a	kqueue
		     without draining any pending events.  When	passed as  in-
		     put,  it  forces  EV_ERROR	to always be returned.	When a
		     filter is successfully added the data field will be zero.
		     Note that if this flag is encountered and there is	no re-
		     maining space in eventlist	to hold	 the  EV_ERROR	event,
		     then subsequent changes will not get processed.

       EV_ONESHOT    Return  only  the	first  occurrence  of the filter being
		     triggered.	 After the user	retrieves the event  from  the
		     kqueue, it	is deleted.

       EV_CLEAR	     Reset the state of	the event after	it is retrieved	by the
		     user.   This  is  useful  for  filters which report state
		     transitions instead of the	current	state.	Note that some
		     filters may automatically set this	flag internally.

       EV_EOF	     Filters may set this flag to indicate filter-specific EOF
		     condition.

       EV_ERROR	     See "RETURN VALUES" below.

       EV_KEEPUDATA  Preserve the udata	associated  with  an  existing	event.
		     This  allows  other  aspects  of the event	to be modified
		     without requiring the caller to know the udata value pre-
		     viously registered	with the event.	  This	is  especially
		     useful with NOTE_TRIGGER or EV_ENABLE.  This flag may not
		     be	used with EV_ADD.

       The  predefined	system	filters	 are  listed  below.  Arguments	may be
       passed to and from the filter via the fflags and	 data  fields  in  the
       kevent structure.

       EVFILT_READ	   Takes  a  descriptor	as the identifier, and returns
			   whenever there is data available to read.  The  be-
			   havior  of the filter is slightly different depend-
			   ing on the descriptor type.

			   Sockets
			       Sockets which have previously  been  passed  to
			       listen(2) return	when there is an incoming con-
			       nection pending.	 data contains the size	of the
			       listen backlog.

			       Other  socket  descriptors return when there is
			       data to be read,	 subject  to  the  SO_RCVLOWAT
			       value  of the socket buffer.  This may be over-
			       ridden with a per-filter	low water mark at  the
			       time   the  filter  is  added  by  setting  the
			       NOTE_LOWAT flag in fflags, and  specifying  the
			       new  low	 water	mark in	data.  On return, data
			       contains	the number of bytes of	protocol  data
			       available to read.

			       If the read direction of the socket has been
			       shut down, then the filter also sets EV_EOF in
			       flags, and returns the socket error (if any) in
			       fflags.  It is possible for EOF to be returned
			       (indicating the connection is gone) while there
			       is still	data pending in	the socket buffer.

			   Vnodes
			       Returns when the	file pointer is	not at the end
			       of file.	 data contains the offset from current
			       position	to end of file,	and may	be negative.

			       This behavior is	different from poll(2),	 where
			       read events are triggered for regular files un-
			       conditionally.  This event can be triggered un-
			       conditionally  by  setting  the	NOTE_FILE_POLL
			       flag in fflags.

			   Fifos, Pipes
			       Returns when there is data to read; data
			       contains the number of bytes available.

			       When  the  last	writer disconnects, the	filter
			       will set	EV_EOF in flags.  This will be cleared
			       by the filter when a new	 writer	 connects,  at
			       which  point the	filter will resume waiting for
			       data to become available	before returning.

			   BPF devices
			       Returns when the	BPF buffer is  full,  the  BPF
			       timeout	has  expired,  or  when	 the  BPF  has
			       "immediate mode"	enabled	and there is any  data
			       to  read;  data	contains  the  number of bytes
			       available.

			   Eventfds
			       Returns when the	counter	 is  greater  than  0;
			       data  contains the counter value, which must be
			       cast to uint64_t.

			   Kqueues
			       Returns when pending events are present on  the
			       queue;  data  contains  the  number  of	events
			       available.

       EVFILT_WRITE	   Takes a descriptor as the identifier,  and  returns
			   whenever it is possible to write to the descriptor.
			   For sockets,	pipes and fifos, data will contain the
			   amount of space remaining in	the write buffer.  The
			   filter will set EV_EOF when the reader disconnects,
			   and	for the	fifo case, this	will be	cleared	when a
			   new reader connects.	 Note that this	filter is  not
			   supported for vnodes.

			   For	sockets,  the  low water mark and socket error
			   handling is identical to the	EVFILT_READ case.

			   For eventfds, data will contain the	maximum	 value
			   that	can be added to	the counter without blocking.

			   For BPF devices, when the descriptor	is attached to
			   an interface	the filter always indicates that it is
			   possible  to	 write	and  data will contain the MTU
			   size	of the underlying interface.

       EVFILT_EMPTY	   Takes a descriptor as the identifier,  and  returns
			   whenever  there  is	no remaining data in the write
			   buffer.

       EVFILT_AIO	   Events for this  filter  are	 not  registered  with
			   kevent()   directly	but  are  registered  via  the
			   aio_sigevent	member of an asynchronous I/O  request
			   when	it is scheduled	via an asynchronous I/O	system
			   call	 such as aio_read().  The filter returns under
			   the same conditions as aio_error().	For  more  de-
			   tails on this filter	see sigevent(3)	and aio(4).

       EVFILT_VNODE	   Takes  a  file descriptor as	the identifier and the
			   events to watch for in fflags, and returns when one
			   or more of the requested events occurs on  the  de-
			   scriptor.  The events to monitor are:

			   NOTE_ATTRIB		The file referenced by the de-
						scriptor  had  its  attributes
						changed.

			   NOTE_CLOSE		A file descriptor referencing
						the monitored file was closed.
						The closed file descriptor did
						not have write access.

			   NOTE_CLOSE_WRITE	A file descriptor referencing
						the monitored file was closed.
						The closed file descriptor had
						write access.

						Neither this note nor
						NOTE_CLOSE is activated when
						files are closed forcibly by
						unmount(2) or revoke(2).
						Instead, NOTE_REVOKE is sent
						for such events.

			   NOTE_DELETE		The  unlink()  system call was
						called on the file  referenced
						by the descriptor.

			   NOTE_EXTEND		For a regular file, the file
						referenced by the descriptor
						was extended.

						For a directory, reports that
						a directory entry was added or
						removed as the result of a
						rename operation.  The
						NOTE_EXTEND event is not
						reported when a name is
						changed inside the directory.

			   NOTE_LINK		The  link  count  on  the file
						changed.  In  particular,  the
						NOTE_LINK event	is reported if
						a  subdirectory	was created or
						deleted	inside	the  directory
						referenced by the descriptor.

			   NOTE_OPEN		The file referenced by the de-
						scriptor was opened.

			   NOTE_READ		A  read	 occurred  on the file
						referenced by the descriptor.

			   NOTE_RENAME		The file referenced by the de-
						scriptor was renamed.

			   NOTE_REVOKE		Access to the file was revoked
						via revoke(2) or the  underly-
						ing file system	was unmounted.

			   NOTE_WRITE		A  write  occurred on the file
						referenced by the descriptor.

			   On return, fflags contains the events  which	 trig-
			   gered the filter.

       EVFILT_PROC	   Takes  the  process ID to monitor as	the identifier
			   and the events to watch for in fflags, and  returns
			   when	 the  process  performs	one or more of the re-
			   quested events.  If a process can normally see  an-
			   other  process,  it can attach an event to it.  The
			   events to monitor are:

			   NOTE_EXIT	     The process has exited.  The exit
					     status will be stored in data  in
					     the same format as	the status re-
					     turned by wait(2).

			   NOTE_FORK	     The process has called fork().

			   NOTE_EXEC	     The  process  has	executed a new
					     process via execve(2) or a	 simi-
					     lar call.

			   NOTE_TRACK	     Follow  a	process	 across	fork()
					     calls.  The parent	process	regis-
					     ters a new	kevent to monitor  the
					     child   process  using  the  same
					     fflags  as	 the  original	event.
					     The  child	process	will signal an
					     event  with  NOTE_CHILD  set   in
					     fflags  and  the  parent  PID  in
					     data.

					     If	the parent  process  fails  to
					     register  a  new  kevent (usually
					     due to resource limitations),  it
					     will   signal   an	  event	  with
					     NOTE_TRACKERR set in fflags,  and
					     the child process will not	signal
					     a NOTE_CHILD event.

			   On  return,	fflags contains	the events which trig-
			   gered the filter.

       EVFILT_PROCDESC	   Takes the process descriptor	created	 by  pdfork(2)
			   to  monitor	as  the	 identifier  and the events to
			   watch for in	fflags,	and returns when  the  associ-
			   ated	 process performs one or more of the requested
			   events.  The	events to monitor are:

			   NOTE_EXIT	 The process  has  exited.   The  exit
					 status	will be	stored in data.

			   On  return,	fflags contains	the events which trig-
			   gered the filter.

       EVFILT_SIGNAL	   Takes the signal number to monitor as  the  identi-
			   fier	and returns when the given signal is delivered
			   to  the  process.   This coexists with the signal()
			   and sigaction() facilities, and has a lower	prece-
			   dence.   The	filter will record all attempts	to de-
			   liver a signal to a process,	even if	the signal has
			   been	marked as SIG_IGN, except for the SIGCHLD sig-
			   nal,	which, if ignored, will	not be recorded	by the
			   filter.  Event notification	happens	 after	normal
			   signal  delivery processing.	 data returns the num-
			   ber of times	the signal has occurred	since the last
			   call	to kevent().  This filter  automatically  sets
			   the EV_CLEAR	flag internally.

       EVFILT_TIMER	   Establishes an arbitrary timer identified by	ident.
			   When	 adding	 a timer, data specifies the moment to
			   fire	the timer (for NOTE_ABSTIME)  or  the  timeout
			   period.    The   timer   will  be  periodic	unless
			   EV_ONESHOT or NOTE_ABSTIME is  specified.   On  re-
			   turn, data contains the number of times the timeout
			   has	expired	 since the last	call to	kevent().  For
			   non-monotonic  timers,  this	 filter	 automatically
			   sets	the EV_CLEAR flag internally.

			   The	filter	accepts	 the  following	 flags	in the
			   fflags argument:

			   NOTE_SECONDS	     data is in	seconds.

			   NOTE_MSECONDS     data is in	milliseconds.

			   NOTE_USECONDS     data is in	microseconds.

			   NOTE_NSECONDS     data is in	nanoseconds.

			   NOTE_ABSTIME	     The specified expiration time  is
					     absolute.

			   If  fflags is not set, the default is milliseconds.
			   On return, fflags contains the events  which	 trig-
			   gered the filter.

			   Periodic  timers with a specified timeout of	0 will
			   be silently adjusted	to timeout after 1 of the time
			   units  specified  by	 the  requested	 precision  in
			   fflags.   If	an absolute time is specified that has
			   already passed, then	it is treated as if  the  cur-
			   rent	time were specified and	the event will fire as
			   soon	as possible.

			   If  an  existing  timer  is	re-added, the existing
			   timer will be effectively canceled  (throwing  away
			   any	undelivered  record  of	previous timer expira-
			   tion) and re-started	using the new parameters  con-
			   tained in data and fflags.

			   There  is  a	 system	 wide  limit  on the number of
			   timers    which    is     controlled	    by	   the
			   kern.kq_calloutmax sysctl.

       EVFILT_USER	   Establishes	a user event identified	by ident which
			   is not associated with any kernel mechanism but  is
			   triggered by	user level code.  The lower 24 bits of
			   the	fflags	may be used for	user defined flags and
			   manipulated using the following:

			   NOTE_FFNOP	       Ignore the input	fflags.

			   NOTE_FFAND	       Bitwise AND fflags.

			   NOTE_FFOR	       Bitwise OR fflags.

			   NOTE_FFCOPY	       Copy fflags.

			   NOTE_FFCTRLMASK     Control mask for	fflags.

			   NOTE_FFLAGSMASK     User  defined  flag  mask   for
					       fflags.

			   A  user event is triggered for output with the fol-
			   lowing:

			   NOTE_TRIGGER	       Cause the  event	 to  be	 trig-
					       gered.

			   On return, fflags contains the user defined flags
			   in the lower 24 bits.

CANCELLATION BEHAVIOUR
       If nevents is non-zero, i.e., the function is potentially blocking, the
       call is a cancellation point.  Otherwise, i.e., if nevents is zero, the
       call is not  cancellable.   Cancellation	 can  only  occur  before  any
       changes	are  made  to  the kqueue, or when the call was	blocked	and no
       changes to the queue were requested.

RETURN VALUES
       The kqueue() system call	creates	a new kernel event queue and returns a
       file descriptor.  If there was an error creating the kernel event
       queue, a value of -1 is returned and errno is set.

       The  kevent()  system  call  returns the	number of events placed	in the
       eventlist, up to	the value given	by nevents.  If	an error occurs	 while
       processing an element of	the changelist and there is enough room	in the
       eventlist, then the event will be placed	in the eventlist with EV_ERROR
       set  in	flags and the system error in data.  Otherwise,	-1 will	be re-
       turned, and errno will be set to	indicate the error condition.  If  the
       time limit expires, then	kevent() returns 0.

EXAMPLES
       #include <sys/event.h>
       #include <err.h>
       #include <fcntl.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <string.h>
       #include <unistd.h>

       int
       main(int	argc, char **argv)
       {
	   struct kevent event;	   /* Event we want to monitor */
	   struct kevent tevent;   /* Event triggered */
	   int kq, fd, ret;

	   if (argc != 2)
	       errx(EXIT_FAILURE, "Usage: %s path", argv[0]);
	   fd =	open(argv[1], O_RDONLY);
	   if (fd == -1)
	       err(EXIT_FAILURE, "Failed to open '%s'",	argv[1]);

	   /* Create kqueue. */
	   kq =	kqueue();
	   if (kq == -1)
	       err(EXIT_FAILURE, "kqueue() failed");

	   /* Initialize kevent	structure. */
	   EV_SET(&event, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR, NOTE_WRITE,
	       0, NULL);
	   /* Attach event to the kqueue. */
	   ret = kevent(kq, &event, 1, NULL, 0,	NULL);
	   if (ret == -1)
	       err(EXIT_FAILURE, "kevent register");

	   for (;;) {
	       /* Sleep	until something	happens. */
	       ret = kevent(kq,	NULL, 0, &tevent, 1, NULL);
	       if (ret == -1) {
		   err(EXIT_FAILURE, "kevent wait");
	       } else if (ret >	0) {
		   if (tevent.flags & EV_ERROR)
		       errx(EXIT_FAILURE, "Event error: %s",
			   strerror((int)tevent.data));
		   else
		       printf("Something was written in	'%s'\n", argv[1]);
	       }
	   }

	   /* kqueues are destroyed upon close() */
	   (void)close(kq);
	   (void)close(fd);
       }

ERRORS
       The kqueue() system call	fails if:

       [ENOMEM]		  The  kernel failed to	allocate enough	memory for the
			  kernel queue.

       [ENOMEM]		  The RLIMIT_KQUEUES rlimit (see getrlimit(2)) for the
			  current user would be	exceeded.

       [EMFILE]		  The per-process descriptor table is full.

       [ENFILE]		  The system file table	is full.

       The kevent() system call	fails if:

       [EACCES]		  The process does not have permission to  register  a
			  filter.

       [EFAULT]		  There	 was  an  error	 reading or writing the	kevent
			  structure.

       [EBADF]		  The specified	descriptor is invalid.

       [EINTR]		  A signal was delivered before	 the  timeout  expired
			  and  before any events were placed on	the kqueue for
			  return.

       [EINTR]		  A cancellation request was delivered to the  thread,
			  but not yet handled.

       [EINVAL]		  The specified	time limit or filter is	invalid.

       [EINVAL]		  The specified	length of the event or change lists is
			  negative.

       [ENOENT]		  The  event  could  not  be  found  to	be modified or
			  deleted.

       [ENOMEM]		  No memory was	available to register the event	or, in
			  the special case of a	timer, the maximum  number  of
			  timers  has  been exceeded.  This maximum is config-
			  urable via the kern.kq_calloutmax sysctl.

       [ESRCH]		  The specified	process	to attach to does not exist.

       When a kevent() call fails with an EINTR error, all changes in the
       changelist have been applied.

SEE ALSO
       aio_error(2),  aio_read(2), aio_return(2), poll(2), read(2), select(2),
       sigaction(2), write(2), pthread_setcancelstate(3), signal(3)

       Jonathan	Lemon, "Kqueue:	A  Generic  and	 Scalable  Event  Notification
       Facility",  Proceedings	of  the	 FREENIX  Track:  2001	USENIX	Annual
       Technical Conference, USENIX Association, June 25-30, 2001.

HISTORY
       The kqueue() and	kevent() system	calls first appeared in	 FreeBSD  4.1.
       The kqueuex() system call and kqueue1() function first appeared in
       FreeBSD 14.0.

AUTHORS
       The kqueue() system and this manual page	were written by	Jonathan Lemon
       <jlemon@FreeBSD.org>.

BUGS
       In versions older than FreeBSD  12.0,  <sys/event.h>  failed  to	 parse
       without including <sys/types.h> manually.

FreeBSD	14.3			March 26, 2023			     KQUEUE(2)
