Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
KQUEUE(2)		      System Calls Manual		     KQUEUE(2)

NAME
       kqueue, kevent -- kernel	event notification mechanism

LIBRARY
       Standard	C Library (libc, -lc)

SYNOPSIS
       #include	<sys/event.h>

       int
       kqueue(void);

       int
       kqueuex(u_int flags);

       int
       kqueue1(int flags);

       int
       kevent(int kq,	   const struct	kevent *changelist,	 int nchanges,
	   struct kevent *eventlist,				  int nevents,
	   const struct	timespec *timeout);

       EV_SET(kev, ident, filter, flags, fflags, data, udata);

DESCRIPTION
       The  kqueue()  system  call  provides a generic method of notifying the
       user when an event happens or a condition holds,	based on  the  results
       of  small pieces	of kernel code termed filters.	A kevent is identified
       by the (ident, filter) pair; there may only be one  unique  kevent  per
       kqueue.

       The filter is executed upon the initial registration of a kevent	in or-
       der  to	detect whether a preexisting condition is present, and is also
       executed	whenever an event is passed to the filter for evaluation.   If
       the  filter  determines that the	condition should be reported, then the
       kevent is placed	on the kqueue for the user to retrieve.

       The filter is also run when the user attempts to	 retrieve  the	kevent
       from the	kqueue.	 If the	filter indicates that the condition that trig-
       gered  the event	no longer holds, the kevent is removed from the	kqueue
       and is not returned.

       Multiple	events which trigger the filter	 do  not  result  in  multiple
       kevents	being placed on	the kqueue; instead, the filter	will aggregate
       the events into a single	struct kevent.	Calling	close()	on a file  de-
       scriptor	will remove any	kevents	that reference the descriptor.

       The kqueue() system call	creates	a new kernel event queue and returns a
       descriptor.   The  queue	 is  not  inherited  by	 a  child created with
       fork(2).	 However, if rfork(2) is called	without	the RFFDG  flag,  then
       the  descriptor table is	shared,	which will allow sharing of the	kqueue
       between two processes.

       The kqueuex() system call also creates a	new kernel  event  queue,  and
       additionally takes a flags argument, which is a bitwise-inclusive OR of
       the following flags:

       KQUEUE_CLOEXEC  The returned file descriptor is automatically closed on
		       execve(2)

       The  kqueue() system call is equivalent to calling kqueuex() with flags
       set to 0.

       The kqueue1() function exists for compatibility with NetBSD.  The flags
       argument	accepts	zero or	more of	the following values:

       O_CLOEXEC  The returned file  descriptor	 is  automatically  closed  on
		  execve(2)

       The kevent() system call	is used	to register events with	the queue, and
       return  any  pending  events to the user.  The changelist argument is a
       pointer to an array of kevent structures, as defined in	<sys/event.h>.
       All  changes contained in the changelist	are applied before any pending
       events are read from the	queue.	The nchanges argument gives  the  size
       of  changelist.	 The  eventlist	 argument  is a	pointer	to an array of
       kevent  structures.   The  nevents  argument  determines	 the  size  of
       eventlist.  When	nevents	is zero, kevent() will return immediately even
       if there	is a timeout specified unlike select(2).  If timeout is	a non-
       NULL  pointer,  it  specifies  a	maximum	interval to wait for an	event,
       which will be interpreted as a struct timespec.	If timeout is  a  NULL
       pointer,	 kevent()  waits  indefinitely.	 To effect a poll, the timeout
       argument	should be non-NULL, pointing to	a zero-valued timespec	struc-
       ture.  The same array may be used for the changelist and	eventlist.

       The EV_SET() macro is provided for ease of initializing a kevent	struc-
       ture.

       The kevent structure is defined as:

       struct kevent {
	       uintptr_t  ident;       /* identifier for this event */
	       short	 filter;       /* filter for event */
	       u_short	 flags;	       /* action flags for kqueue */
	       u_int	 fflags;       /* filter flag value */
	       int64_t	 data;	       /* filter data value */
	       void	 *udata;       /* opaque user data identifier */
	       uint64_t	 ext[4];       /* extensions */
       };

       The fields of struct kevent are:

       ident	  Value	used to	identify this event.  The exact	interpretation
		  is  determined  by  the attached filter, but often is	a file
		  descriptor.

       filter	  Identifies the kernel	filter used  to	 process  this	event.
		  The pre-defined system filters are described below.

       flags	  Actions to perform on	the event.

       fflags	  Filter-specific flags.

       data	  Filter-specific data value.

       udata	  Opaque  user-defined	value  passed  through	the kernel un-
		  changed.

       ext	  Extended data	passed to and from kernel.  The	meaning	of the
		  ext[0] and ext[1] members is defined by the  filter.	 If  a
		  filter  does	not use	them, these members are	passed through
		  the kernel unchanged.	 The ext[2] and	ext[3] members are al-
		  ways passed through the kernel  unchanged,  providing	 addi-
		  tional user-defined values.

       The flags field can contain the following values:

       EV_ADD	     Adds  the	event  to  the	kqueue.	 Re-adding an existing
		     event will	modify the parameters of the  original	event,
		     and not result in a duplicate entry.  Adding an event au-
		     tomatically  enables it, unless overridden	by the EV_DIS-
		     ABLE flag.

       EV_ENABLE     Permit kevent() to	return the event if it is triggered.

       EV_DISABLE    Disable the event so kevent() will	not  return  it.   The
		     filter itself is not disabled.

       EV_DISPATCH   Disable the event source immediately after	delivery of an
		     event.  See EV_DISABLE above.

       EV_DELETE     Removes  the event	from the kqueue.  Events which are at-
		     tached to file descriptors	are automatically  deleted  on
		     the last close of the descriptor.

       EV_RECEIPT    This  flag	 is useful for making bulk changes to a	kqueue
		     without draining any pending events.  When	passed as  in-
		     put,  it  forces  EV_ERROR	to always be returned.	When a
		     filter is successfully added the data field will be zero.
		     Note that if this flag is encountered and there is	no re-
		     maining space in eventlist	to hold	 the  EV_ERROR	event,
		     then subsequent changes will not get processed.

       EV_ONESHOT    Return  only  the	first  occurrence  of the filter being
		     triggered.	 After the user	retrieves the event  from  the
		     kqueue, it	is deleted.

       EV_CLEAR	     Reset the state of	the event after	it is retrieved	by the
		     user.   This  is  useful  for  filters which report state
		     transitions instead of the	current	state.	Note that some
		     filters may automatically set this	flag internally.

       EV_EOF	     Filters may set this flag to indicate filter-specific EOF
		     condition.

       EV_ERROR	     See "RETURN VALUES" below.

       EV_KEEPUDATA  Preserve the udata	associated  with  an  existing	event.
		     This  allows  other  aspects  of the event	to be modified
		     without requiring the caller to know the udata value pre-
		     viously registered	with the event.	  This	is  especially
		     useful with NOTE_TRIGGER or EV_ENABLE.  This flag may not
		     be	used with EV_ADD.

       The  predefined	system	filters	 are  listed  below.  Arguments	may be
       passed to and from the filter via the fflags and	 data  fields  in  the
       kevent structure.

       EVFILT_READ	   Takes  a  descriptor	as the identifier, and returns
			   whenever there is data available to read.  The  be-
			   havior  of the filter is slightly different depend-
			   ing on the descriptor type.

			   Sockets
			       Sockets which have previously  been  passed  to
			       listen(2) return	when there is an incoming con-
			       nection pending.	 data contains the size	of the
			       listen backlog.

			       Other  socket  descriptors return when there is
			       data to be read,	 subject  to  the  SO_RCVLOWAT
			       value  of the socket buffer.  This may be over-
			       ridden with a per-filter	low water mark at  the
			       time   the  filter  is  added  by  setting  the
			       NOTE_LOWAT flag in fflags, and  specifying  the
			       new  low	 water	mark in	data.  On return, data
			       contains	the number of bytes of	protocol  data
			       available to read.

			       If  the	read direction of the socket has shut-
			       down, then  the	filter	also  sets  EV_EOF  in
			       flags, and returns the socket error (if any) in
			       fflags.	 It is possible	for EOF	to be returned
			       (indicating the connection is gone) while there
			       is still	data pending in	the socket buffer.

			   Vnodes
			       Returns when the	file pointer is	not at the end
			       of file.	 data contains the offset from current
			       position	to end of file,	and may	be negative.

			       This behavior is	different from poll(2),	 where
			       read events are triggered for regular files un-
			       conditionally.  This event can be triggered un-
			       conditionally  by  setting  the	NOTE_FILE_POLL
			       flag in fflags.

			   Fifos, Pipes
			       Returns when the	there is data  to  read;  data
			       contains	the number of bytes available.

			       When  the  last	writer disconnects, the	filter
			       will set	EV_EOF in flags.  This will be cleared
			       by the filter when a new	 writer	 connects,  at
			       which  point the	filter will resume waiting for
			       data to become available	before returning.

			   BPF devices
			       Returns when the	BPF buffer is  full,  the  BPF
			       timeout	has  expired,  or  when	 the  BPF  has
			       "immediate mode"	enabled	and there is any  data
			       to  read;  data	contains  the  number of bytes
			       available.

			   Eventfds
			       Returns when the	counter	 is  greater  than  0;
			       data  contains the counter value, which must be
			       cast to uint64_t.

			   Kqueues
			       Returns when pending events are present on  the
			       queue;  data  contains  the  number  of	events
			       available.

       EVFILT_WRITE	   Takes a descriptor as the identifier,  and  returns
			   whenever it is possible to write to the descriptor.
			   For sockets,	pipes and fifos, data will contain the
			   amount of space remaining in	the write buffer.  The
			   filter will set EV_EOF when the reader disconnects,
			   and	for the	fifo case, this	will be	cleared	when a
			   new reader connects.	 Note that this	filter is  not
			   supported for vnodes.

			   For	sockets,  the  low water mark and socket error
			   handling is identical to the	EVFILT_READ case.

			   For eventfds, data will contain the	maximum	 value
			   that	can be added to	the counter without blocking.

			   For BPF devices, when the descriptor	is attached to
			   an interface	the filter always indicates that it is
			   possible  to	 write	and  data will contain the MTU
			   size	of the underlying interface.

       EVFILT_EMPTY	   Takes a descriptor as the identifier,  and  returns
			   whenever  there  is	no remaining data in the write
			   buffer.

       EVFILT_AIO	   Events for this  filter  are	 not  registered  with
			   kevent()   directly	but  are  registered  via  the
			   aio_sigevent	member of an asynchronous I/O  request
			   when	it is scheduled	via an asynchronous I/O	system
			   call	 such as aio_read().  The filter returns under
			   the same conditions as aio_error().	For  more  de-
			   tails on this filter	see sigevent(3)	and aio(4).

       EVFILT_VNODE	   Takes  a  file descriptor as	the identifier and the
			   events to watch for in fflags, and returns when one
			   or more of the requested events occurs on  the  de-
			   scriptor.  The events to monitor are:

			   NOTE_ATTRIB		The file referenced by the de-
						scriptor  had  its  attributes
						changed.

			   NOTE_CLOSE		A file descriptor  referencing
						the    monitored   file,   was
						closed.	 The closed  file  de-
						scriptor  did  not  have write
						access.

			   NOTE_CLOSE_WRITE	A file descriptor  referencing
						the    monitored   file,   was
						closed.	 The closed  file  de-
						scriptor had write access.

						This	note,	as   well   as
						NOTE_CLOSE, are	not  activated
						when files are closed forcibly
						by  unmount(2)	or  revoke(2).
						Instead, NOTE_REVOKE  is  sent
						for such events.

			   NOTE_DELETE		The  unlink()  system call was
						called on the file  referenced
						by the descriptor.

			   NOTE_EXTEND		For  regular  file,  the  file
						referenced by  the  descriptor
						was extended.

						For  directory,	reports	that a
						directory entry	was  added  or
						removed,  as the result	of re-
						name	  operation.	   The
						NOTE_EXTEND  event  is not re-
						ported when a name is  changed
						inside the directory.

			   NOTE_LINK		The  link  count  on  the file
						changed.  In  particular,  the
						NOTE_LINK event	is reported if
						a  subdirectory	was created or
						deleted	inside	the  directory
						referenced by the descriptor.

			   NOTE_OPEN		The file referenced by the de-
						scriptor was opened.

			   NOTE_READ		A  read	 occurred  on the file
						referenced by the descriptor.

			   NOTE_RENAME		The file referenced by the de-
						scriptor was renamed.

			   NOTE_REVOKE		Access to the file was revoked
						via revoke(2) or the  underly-
						ing file system	was unmounted.

			   NOTE_WRITE		A  write  occurred on the file
						referenced by the descriptor.

			   On return, fflags contains the events  which	 trig-
			   gered the filter.

       EVFILT_PROC	   Takes  the  process ID to monitor as	the identifier
			   and the events to watch for in fflags, and  returns
			   when	 the  process  performs	one or more of the re-
			   quested events.  If a process can normally see  an-
			   other  process,  it can attach an event to it.  The
			   events to monitor are:

			   NOTE_EXIT	     The process has exited.  The exit
					     status will be stored in data  in
					     the same format as	the status re-
					     turned by wait(2).

			   NOTE_FORK	     The process has called fork().

			   NOTE_EXEC	     The  process  has	executed a new
					     process via execve(2) or a	 simi-
					     lar call.

			   NOTE_TRACK	     Follow  a	process	 across	fork()
					     calls.  The parent	process	regis-
					     ters a new	kevent to monitor  the
					     child   process  using  the  same
					     fflags  as	 the  original	event.
					     The  child	process	will signal an
					     event  with  NOTE_CHILD  set   in
					     fflags  and  the  parent  PID  in
					     data.

					     If	the parent  process  fails  to
					     register  a  new  kevent (usually
					     due to resource limitations),  it
					     will   signal   an	  event	  with
					     NOTE_TRACKERR set in fflags,  and
					     the child process will not	signal
					     a NOTE_CHILD event.

			   On  return,	fflags contains	the events which trig-
			   gered the filter.

       EVFILT_PROCDESC	   Takes the process descriptor	created	 by  pdfork(2)
			   to  monitor	as  the	 identifier  and the events to
			   watch for in	fflags,	and returns when  the  associ-
			   ated	 process performs one or more of the requested
			   events.  The	events to monitor are:

			   NOTE_EXIT	 The process  has  exited.   The  exit
					 status	will be	stored in data.

			   On  return,	fflags contains	the events which trig-
			   gered the filter.

       EVFILT_SIGNAL	   Takes the signal number to monitor as  the  identi-
			   fier	and returns when the given signal is delivered
			   to  the  process.   This coexists with the signal()
			   and sigaction() facilities, and has a lower	prece-
			   dence.   The	filter will record all attempts	to de-
			   liver a signal to a process,	even if	the signal has
			   been	marked as SIG_IGN, except for the SIGCHLD sig-
			   nal,	which, if ignored, will	not be recorded	by the
			   filter.  Event notification	happens	 after	normal
			   signal  delivery processing.	 data returns the num-
			   ber of times	the signal has occurred	since the last
			   call	to kevent().  This filter  automatically  sets
			   the EV_CLEAR	flag internally.

       EVFILT_JAIL	   Takes  the jail ID to monitor as the	identifier and
			   the events to watch for in fflags, and returns when
			   the jail performs one  or  more  of	the  requested
			   events.   If	 a process can normally	see a jail, it
			   can attach an event to it.  An identifier  of  zero
			   will	 watch	the process's own jail.	 The events to
			   monitor are:

			   NOTE_JAIL_SET	The jail has been changed  via
						jail_set(2).

			   NOTE_JAIL_ATTACH	A  process has attached	to the
						jail via jail_attach(2)	 or  a
						similar	 call.	The process ID
						will be	stored	in  data.   If
						more  than one process has at-
						tached since the last call  to
						kevent(), data will be zero.

			   NOTE_JAIL_REMOVE	The jail has been removed.

			   NOTE_JAIL_CHILD	A  child  of  the watched jail
						has been created.  Its jail ID
						will be	stored	in  data.   If
						more  than  one	 jail has been
						created	since the last call to
						kevent(), data will be zero.

			   On return, fflags contains the events  which	 trig-
			   gered   the	 filter.    It	 will	also   contain
			   NOTE_JAIL_MULTI if more than	 one  NOTE_JAIL_ATTACH
			   or  NOTE_JAIL_CHILD	event  has been	received since
			   the last call to kevent().

       EVFILT_JAILDESC	   Takes a jail	descriptor returned by jail_set(2)  or
			   jail_get(2)	as  the	 identifier  and the events to
			   watch for in	fflags,	and returns when the jail per-
			   forms one or	more of	 the  requested	 events.   The
			   events  to monitor and the resulting	fflags are the
			   same	as those listed	in EVFILT_JAIL,	above.

       EVFILT_TIMER	   Establishes an arbitrary timer identified by	ident.
			   When	adding a timer,	data specifies the  moment  to
			   fire	 the  timer  (for NOTE_ABSTIME)	or the timeout
			   period.   The  timer	 will	be   periodic	unless
			   EV_ONESHOT  or  NOTE_ABSTIME	 is specified.	On re-
			   turn, data contains the number of times the timeout
			   has expired since the last call to  kevent().   For
			   non-monotonic  timers,  this	 filter	 automatically
			   sets	the EV_CLEAR flag internally.

			   The filter  accepts	the  following	flags  in  the
			   fflags argument:

			   NOTE_SECONDS	     data is in	seconds.

			   NOTE_MSECONDS     data is in	milliseconds.

			   NOTE_USECONDS     data is in	microseconds.

			   NOTE_NSECONDS     data is in	nanoseconds.

			   NOTE_ABSTIME	     The  specified expiration time is
					     absolute.

			   If fflags is	not set, the default is	 milliseconds.
			   On  return,	fflags contains	the events which trig-
			   gered the filter.

			   Periodic timers with	a specified timeout of 0  will
			   be silently adjusted	to timeout after 1 of the time
			   units  specified  by	 the  requested	 precision  in
			   fflags.  If an absolute time	is specified that  has
			   already  passed,  then it is	treated	as if the cur-
			   rent	time were specified and	the event will fire as
			   soon	as possible.

			   If an existing  timer  is  re-added,	 the  existing
			   timer  will	be effectively canceled	(throwing away
			   any undelivered record of  previous	timer  expira-
			   tion)  and re-started using the new parameters con-
			   tained in data and fflags.

			   There is a system  wide  limit  on  the  number  of
			   timers     which	is     controlled    by	   the
			   kern.kq_calloutmax sysctl.

       EVFILT_USER	   Establishes a user event identified by ident	 which
			   is  not associated with any kernel mechanism	but is
			   triggered by	user level code.  The lower 24 bits of
			   the fflags may be used for user defined  flags  and
			   manipulated using the following:

			   NOTE_FFNOP	       Ignore the input	fflags.

			   NOTE_FFAND	       Bitwise AND fflags.

			   NOTE_FFOR	       Bitwise OR fflags.

			   NOTE_FFCOPY	       Copy fflags.

			   NOTE_FFCTRLMASK     Control mask for	fflags.

			   NOTE_FFLAGSMASK     User   defined  flag  mask  for
					       fflags.

			   A user event	is triggered for output	with the  fol-
			   lowing:

			   NOTE_TRIGGER	       Cause  the  event  to  be trig-
					       gered.

			   On return, fflags contains the users	defined	 flags
			   in the lower	24 bits.

CANCELLATION BEHAVIOUR
       If nevents is non-zero, i.e., the function is potentially blocking, the
       call is a cancellation point.  Otherwise, i.e., if nevents is zero, the
       call  is	 not  cancellable.   Cancellation  can	only  occur before any
       changes are made	to the kqueue, or when the call	 was  blocked  and  no
       changes to the queue were requested.

RETURN VALUES
       The kqueue() system call	creates	a new kernel event queue and returns a
       file  descriptor.   If  there  was  an  error creating the kernel event
       queue, a	value of -1 is returned	and errno set.

       The kevent() system call	returns	the number of  events  placed  in  the
       eventlist,  up to the value given by nevents.  If an error occurs while
       processing an element of	the changelist and there is enough room	in the
       eventlist, then the event will be placed	in the eventlist with EV_ERROR
       set in flags and	the system error in data.  Otherwise, -1 will  be  re-
       turned,	and errno will be set to indicate the error condition.	If the
       time limit expires, then	kevent() returns 0.

EXAMPLES
       #include	<sys/event.h>
       #include	<err.h>
       #include	<fcntl.h>
       #include	<stdio.h>
       #include	<stdlib.h>
       #include	<string.h>

       int
       main(int	argc, char **argv)
       {
	   struct kevent event;	   /* Event we want to monitor */
	   struct kevent tevent;   /* Event triggered */
	   int kq, fd, ret;

	   if (argc != 2)
	       err(EXIT_FAILURE, "Usage: %s path\n", argv[0]);
	   fd =	open(argv[1], O_RDONLY);
	   if (fd == -1)
	       err(EXIT_FAILURE, "Failed to open '%s'",	argv[1]);

	   /* Create kqueue. */
	   kq =	kqueue();
	   if (kq == -1)
	       err(EXIT_FAILURE, "kqueue() failed");

	   /* Initialize kevent	structure. */
	   EV_SET(&event, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR, NOTE_WRITE,
	       0, NULL);
	   /* Attach event to the kqueue. */
	   ret = kevent(kq, &event, 1, NULL, 0,	NULL);
	   if (ret == -1)
	       err(EXIT_FAILURE, "kevent register");

	   for (;;) {
	       /* Sleep	until something	happens. */
	       ret = kevent(kq,	NULL, 0, &tevent, 1, NULL);
	       if (ret == -1) {
		   err(EXIT_FAILURE, "kevent wait");
	       } else if (ret >	0) {
		   if (tevent.flags & EV_ERROR)
		       errx(EXIT_FAILURE, "Event error:	%s", strerror(event.data));
		   else
		       printf("Something was written in	'%s'\n", argv[1]);
	       }
	   }

	   /* kqueues are destroyed upon close() */
	   (void)close(kq);
	   (void)close(fd);
       }

ERRORS
       The kqueue() system call	fails if:

       [ENOMEM]		  The kernel failed to allocate	enough memory for  the
			  kernel queue.

       [ENOMEM]		  The RLIMIT_KQUEUES rlimit (see getrlimit(2)) for the
			  current user would be	exceeded.

       [EMFILE]		  The per-process descriptor table is full.

       [ENFILE]		  The system file table	is full.

       The kevent() system call	fails if:

       [EACCES]		  The  process	does not have permission to register a
			  filter.

       [EFAULT]		  There	was an error reading  or  writing  the	kevent
			  structure.

       [EBADF]		  The specified	descriptor is invalid.

       [EINTR]		  A  signal  was  delivered before the timeout expired
			  and before any events	were placed on the kqueue  for
			  return.

       [EINTR]		  A  cancellation request was delivered	to the thread,
			  but not yet handled.

       [EINVAL]		  The specified	time limit or filter is	invalid.

       [EINVAL]		  The specified	length of the event or change lists is
			  negative.

       [ENOENT]		  The event could not  be  found  to  be  modified  or
			  deleted.

       [ENOMEM]		  No memory was	available to register the event	or, in
			  the  special	case of	a timer, the maximum number of
			  timers has been exceeded.  This maximum  is  config-
			  urable via the kern.kq_calloutmax sysctl.

       [ESRCH]		  The specified	process	to attach to does not exist.

       When  kevent()  call  fails  with  EINTR	 error,	 all  changes  in  the
       changelist have been applied.

SEE ALSO
       aio_error(2), aio_read(2), aio_return(2), poll(2), read(2),  select(2),
       sigaction(2), write(2), pthread_setcancelstate(3), signal(3)

       Jonathan	 Lemon,	 "Kqueue:  A  Generic  and Scalable Event Notification
       Facility",  Proceedings	of  the	 FREENIX  Track:  2001	USENIX	Annual
       Technical Conference, USENIX Association, June 25-30, 2001.

HISTORY
       The  kqueue()  and kevent() system calls	first appeared in FreeBSD 4.1.
       The kqueuex() system call and  kevent1()	 function  first  appeared  in
       FreeBSD 14.0.

AUTHORS
       The kqueue() system and this manual page	were written by	Jonathan Lemon
       <jlemon@FreeBSD.org>.

BUGS
       In  versions  older  than  FreeBSD  12.0, <sys/event.h> failed to parse
       without including <sys/types.h> manually.

FreeBSD	15.0		      September	12, 2025		     KQUEUE(2)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=kevent&manpath=FreeBSD+15.0-RELEASE+and+Ports>

home | help