PATTERNS(7)		Miscellaneous Information Manual	   PATTERNS(7)

NAME
       patterns	-- Lua's pattern matching rules

DESCRIPTION
       Pattern	matching in httpd(8) is	based on the implementation of the Lua
       scripting language and provides a simple	and fast  alternative  to  the
       regular expressions (REs) that are described in re_format(7).  Patterns
       are  described by regular strings, which	are interpreted	as patterns by
       the pattern-matching "find" and "match" functions.  This	 document  de-
       scribes	the syntax and the meaning (that is, what they match) of these
       strings.

CHARACTER CLASS
       A character class is used to represent a	set of characters.   The  fol-
       lowing combinations are allowed in describing a character class:

       x       (where  x  is  not  one of the magic characters `^$()%.[]*+-?')
	       represents the character	x itself.

       .       (a dot) represents all characters.

       %a      represents all letters.

       %c      represents all control characters.

       %d      represents all digits.

       %g      represents all printable	characters except space.

       %l      represents all lowercase	letters.

       %p      represents all punctuation characters.

       %s      represents all space characters.

       %u      represents all uppercase	letters.

       %w      represents all alphanumeric characters.

       %x      represents all hexadecimal digits.

       %x      (where x	is  any	 non-alphanumeric  character)  represents  the
	       character  x.   This  is	 the  standard way to escape the magic
	       characters.   Any  non-alphanumeric  character  (including  all
	       punctuation  characters,	 even the non-magical) can be preceded
	       by a `%'	when used to represent itself in a pattern.

       [set]   represents the class which is the union of  all	characters  in
	       set.   A	range of characters can	be specified by	separating the
	       end characters of the range, in ascending order,	 with  a  `-'.
	       All classes `%x'	described above	can also be used as components
	       in set.	All other characters in	set represent themselves.  For
	       example,	`[%w_]'	(or `[_%w]') represents	all alphanumeric char-
	       acters  plus  the underscore, `[0-7]' represents	the octal dig-
	       its, and	`[0-7%l%-]' represents the octal digits	plus the  low-
	       ercase letters plus the `-' character.

	       The  interaction	 between  ranges  and  classes is not defined.
	       Therefore, patterns like	`[%a-z]' or `[a-%%]' have no meaning.

       [^set]  represents the complement of set, where set is  interpreted  as
	       above.

       For all classes represented by single letters (`%a', `%c', etc.), the
       corresponding uppercase letter represents the complement	of the	class.
       For instance, `%S' represents all non-space characters.

       The  definitions	of letter, space, and other character groups depend on
       the current locale.  In particular, the class `[a-z]' may not be	equiv-
       alent to	`%l'.
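
       The single-letter classes and their uppercase complements can be
       sketched as a lookup table.  This is only an illustration: it assumes
       ASCII (roughly the "C" locale) rather than the current locale, and the
       names CLASS_MEMBERS and class_matches are hypothetical, not part of
       httpd(8).

```python
import string

# ASCII-only stand-ins for the locale-dependent classes (an assumption of
# this sketch; the real classes follow the current locale).
CLASS_MEMBERS = {
    "a": set(string.ascii_letters),
    "c": {chr(n) for n in range(32)} | {chr(127)},
    "d": set(string.digits),
    "g": {chr(n) for n in range(33, 127)},
    "l": set(string.ascii_lowercase),
    "p": set(string.punctuation),
    "s": set(string.whitespace),
    "u": set(string.ascii_uppercase),
    "w": set(string.ascii_letters + string.digits),
    "x": set(string.hexdigits),
}


def class_matches(letter, ch):
    """Return True if the single character ch belongs to the class %letter.

    An uppercase letter selects the complement of the lowercase class.
    """
    members = CLASS_MEMBERS[letter.lower()]
    return (ch in members) != letter.isupper()
```

       For example, class_matches("d", "7") is true, while
       class_matches("S", " ") is false because `%S' excludes space
       characters.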

PATTERN	ITEM
       A pattern item can be

          a single character class, which matches any single character	in the
	   class;

          a single character class followed by	`*',  which  matches  zero  or
	   more	 repetitions  of  characters  in  the class.  These repetition
	   items will always match the longest possible	sequence;

          a single character class followed by	`+', which matches one or more
	   repetitions of characters in	the  class.   These  repetition	 items
	   will	always match the longest possible sequence;

          a  single  character	class followed by `-', which also matches zero
	   or more repetitions of characters in	the class.  Unlike `*',	 these
	   repetition items will always	match the shortest possible sequence;

          a single character class followed by	`?', which matches zero	or one
	   occurrence  of a character in the class.  It	always matches one oc-
	   currence if possible;

          `%n', for n between 1 and 9;	such item matches a substring equal to
	   the n-th captured string (see below);

          `%bxy', where x and	y  are	two  distinct  characters;  such  item
	   matches  strings that start with x, end with	y, and where the x and
	   y are balanced.  This means that if one reads the string from  left
	   to  right, counting +1 for an x and -1 for a	y, the ending y	is the
	   first y where the count reaches 0.  For instance, the  item	`%b()'
	   matches expressions with balanced parentheses.

          `%f[set]', a	frontier pattern; such item matches an empty string at
	   any	position  such	that the next character	belongs	to set and the
	   previous character does not belong to set.  The set set  is	inter-
	   preted  as  previously described.  The beginning and	the end	of the
	   subject are handled as if they were the character `\0'.
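
       The counting rule for `%bxy' and the boundary rule for `%f[set]' can
       be sketched as follows.  This is a minimal illustration, not the
       httpd(8) implementation: the function names are hypothetical, the set
       is passed as a predicate (in_set) instead of being parsed from a
       pattern, and positions are 0-based.

```python
def match_balanced(subject, start, open_ch, close_ch):
    """Return the index just past the balanced span that begins at start.

    Counts +1 for open_ch and -1 for close_ch; the span ends at the first
    close_ch where the count reaches 0.  Returns -1 if subject[start] is
    not open_ch or the span never balances.
    """
    if start >= len(subject) or subject[start] != open_ch:
        return -1
    depth = 0
    for i in range(start, len(subject)):
        if subject[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1
        elif subject[i] == open_ch:
            depth += 1
    return -1


def frontier_positions(subject, in_set):
    """Return the 0-based positions where a frontier item would match.

    A position qualifies when the next character is in the set and the
    previous one is not.  Both ends of the subject are padded with '\\0'.
    """
    padded = "\0" + subject + "\0"
    return [
        i
        for i in range(len(subject) + 1)
        if in_set(padded[i + 1]) and not in_set(padded[i])
    ]
```

       With the subject "f(a(b)c)d", match_balanced(subject, 1, "(", ")")
       returns 8, so the balanced span is "(a(b)c)".  Passing str.isalpha as
       in_set makes frontier_positions report the positions where words
       start.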

PATTERN
       A pattern is a sequence of pattern items.  A caret `^' at the beginning
       of a pattern anchors the	match at the beginning of the subject  string.
       A  `$' at the end of a pattern anchors the match	at the end of the sub-
       ject string.  At	other positions, `^' and `$' have no  special  meaning
       and represent themselves.
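
       How anchors and the four repetition suffixes interact can be seen in
       a small backtracking matcher.  This is a sketch under stated
       assumptions, not the httpd(8) code: it handles only literal
       characters, `.', the classes `%a', `%d', `%s' and `%w' (ASCII only),
       `%' escapes, the suffixes `*', `+', `-' and `?', and the two anchors.
       Sets, captures, `%b' and `%f' are left out, malformed patterns are not
       checked, and the names find, _match_here and _single are
       hypothetical.

```python
import string

# Only a few classes are modeled, using ASCII stand-ins (assumption).
CLASSES = {
    "a": string.ascii_letters,
    "d": string.digits,
    "s": string.whitespace,
    "w": string.ascii_letters + string.digits,
}


def _single(ch, item):
    """Test one character against a one-character item."""
    if item == ".":
        return True
    if item[0] == "%":
        if item[1] in CLASSES:
            return ch in CLASSES[item[1]]
        return ch == item[1]  # any other %x is treated as the literal x here
    return ch == item


def _match_here(s, si, p, pi):
    """Return the end index of a match of p[pi:] starting at s[si], or None."""
    if pi == len(p):
        return si
    if p[pi] == "$" and pi + 1 == len(p):
        return si if si == len(s) else None  # '$' anchors only when last
    width = 2 if p[pi] == "%" else 1
    item, nxt = p[pi:pi + width], pi + width
    suffix = p[nxt:nxt + 1]
    run = 0  # length of the longest run of characters matching the item
    while si + run < len(s) and _single(s[si + run], item):
        run += 1
    if suffix == "*":
        counts, rest = range(run, -1, -1), nxt + 1  # longest first
    elif suffix == "+":
        counts, rest = range(run, 0, -1), nxt + 1  # longest first, at least 1
    elif suffix == "-":
        counts, rest = range(0, run + 1), nxt + 1  # shortest first
    elif suffix == "?":
        counts, rest = range(min(run, 1), -1, -1), nxt + 1  # one if possible
    else:
        counts, rest = ([1] if run >= 1 else []), nxt
    for count in counts:
        end = _match_here(s, si + count, p, rest)
        if end is not None:
            return end
    return None


def find(s, p):
    """Return (start, end) of the first match, 0-based and end-exclusive."""
    anchored = p.startswith("^")
    if anchored:
        p = p[1:]
    for start in range(len(s) + 1):
        end = _match_here(s, start, p, 0)
        if end is not None:
            return (start, end)
        if anchored:
            break  # an anchored pattern is tried at the beginning only
    return None
```

       On the subject "a,b,c", find(subject, "a.*,") returns (0, 4) because
       `*' takes the longest sequence, while find(subject, "a.-,") returns
       (0, 2) because `-' takes the shortest.  A `^' or `$' in the middle of
       a pattern, as in "a^b", is matched literally.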

CAPTURES
       A  pattern  can	contain	sub-patterns enclosed in parentheses; they de-
       scribe captures.	 When a	match succeeds,	the substrings of the  subject
       string  that match captures are stored (captured) for future use.  Cap-
       tures are numbered according to their left parentheses.	For  instance,
       in  the	pattern	 "(a*(.)%w(%s*))",  the	 part  of  the string matching
       "a*(.)%w(%s*)" is stored	as the first capture (and therefore has	number
       1); the character matching "." is captured with number 2, and the  part
       matching	"%s*" has number 3.

       As a special case, the empty capture `()' captures the current string
       position (a number).  For instance, applying the pattern "()aa()" to
       the string "flaaap" yields two captures: 2 and 4.  These positions are
       counted from 0; see CAVEATS.
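
       The numbering rule can be checked mechanically by scanning a pattern
       for left parentheses.  The helper below is a hypothetical sketch, not
       part of httpd(8); it assumes a well-formed pattern and treats sets in
       a simplified way (a `]' always closes the set).

```python
def number_captures(pattern):
    """Return the capture bodies of pattern, ordered by left parenthesis.

    Element n - 1 of the result is the sub-pattern of capture number n.
    """
    captures = []    # captures[n - 1] holds the body of capture n
    open_stack = []  # (capture index, body start) for unclosed captures
    in_set = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            # Outside a set, %bxy consumes two delimiter characters;
            # every other escape consumes one.
            is_balance = not in_set and pattern[i + 1:i + 2] == "b"
            i += 4 if is_balance else 2
            continue
        if in_set:
            in_set = ch != "]"
        elif ch == "[":
            in_set = True
        elif ch == "(":
            captures.append(None)
            open_stack.append((len(captures) - 1, i + 1))
        elif ch == ")":
            index, start = open_stack.pop()
            captures[index] = pattern[start:i]
        i += 1
    return captures
```

       Applied to "(a*(.)%w(%s*))", the function returns the three bodies
       "a*(.)%w(%s*)", "." and "%s*", in that order, matching the numbering
       described above.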

SEE ALSO
       fnmatch(3), re_format(7), httpd(8)

       Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes,
       Patterns, Lua 5.3 Reference Manual,
       https://www.lua.org/manual/5.3/manual.html#6.4.1, Lua.org PUC-Rio, June
       2015.

HISTORY
       The first implementation of the pattern rules was introduced with Lua
       2.5.  Almost twenty years later, an implementation based on Lua 5.3.1
       appeared in OpenBSD 5.8.

AUTHORS
       The pattern matching is derived from the	original implementation	of the
       Lua  scripting  language	 written  by  Roberto  Ierusalimschy, Waldemar
       Celes, and Luiz Henrique	de Figueiredo at PUC-Rio.  It was turned  into
       a native	C API for httpd(8) by Reyk Floeter <reyk@openbsd.org>.

CAVEATS
       A notable difference from the Lua implementation is the position in the
       string returned by captures.  It follows C-style indexing (positions
       start at 0) instead of Lua-style indexing (positions start at 1).
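
       The difference amounts to an offset of one.  The sketch below
       reproduces the two positions of the "()aa()" example from CAPTURES
       with an ordinary substring search and converts them to the values Lua
       itself would report.  Both function names are hypothetical.

```python
def position_captures(subject, literal):
    """Return the two positions '()literal()' would capture, counted from 0.

    Returns None when literal does not occur in subject.
    """
    start = subject.find(literal)
    if start < 0:
        return None
    return (start, start + len(literal))


def to_lua_positions(positions):
    """Convert 0-based (C-style) positions to 1-based (Lua-style) ones."""
    return tuple(p + 1 for p in positions)
```

       For the subject "flaaap" and the literal "aa", position_captures
       gives (2, 4) and to_lua_positions turns that into (3, 5).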

FreeBSD	Ports 14.quarterly     November	8, 2023			   PATTERNS(7)
