Tcl_RegExpMatch(3)	    Tcl	Library	Procedures	    Tcl_RegExpMatch(3)

______________________________________________________________________________

NAME
       Tcl_RegExpMatch,	 Tcl_RegExpCompile,  Tcl_RegExpExec,  Tcl_RegExpRange,
       Tcl_GetRegExpFromObj, Tcl_RegExpMatchObj,  Tcl_RegExpExecObj,  Tcl_Reg-
       ExpGetInfo - Pattern matching with regular expressions

SYNOPSIS
       #include	<tcl.h>

       int
       Tcl_RegExpMatchObj(interp, textObj, patObj)

       int
       Tcl_RegExpMatch(interp, text, pattern)

       Tcl_RegExp
       Tcl_RegExpCompile(interp, pattern)

       int
       Tcl_RegExpExec(interp, regexp, text, start)

       void
       Tcl_RegExpRange(regexp, index, startPtr,	endPtr)

       Tcl_RegExp
       Tcl_GetRegExpFromObj(interp, patObj, cflags)

       int
       Tcl_RegExpExecObj(interp, regexp, textObj, offset, nmatches, eflags)

       void
       Tcl_RegExpGetInfo(regexp, infoPtr)

ARGUMENTS
       Tcl_Interp *interp (in)		    Tcl	 interpreter  to use for error
					    reporting.	The interpreter	may be
					    NULL  if no	error reporting	is de-
					    sired.

       Tcl_Obj *textObj	(in/out)	    Refers to the value	from which  to
					    get	 the  text to search.  The in-
					    ternal representation of the value
					    may	 be  converted	to a form that
					    can	be efficiently searched.

       Tcl_Obj *patObj (in/out)		    Refers to the value	from which  to
					    get	a regular expression. The com-
					    piled regular expression is	cached
					    in the value.

       char *text (in)			    Text  to search for	a match	with a
					    regular expression.

       const char *pattern (in)		    String in the form	of  a  regular
					    expression pattern.

       Tcl_RegExp regexp (in)		    Compiled regular expression.  Must
					    have been returned	previously  by
					    Tcl_GetRegExpFromObj or Tcl_RegEx-
					    pCompile.

       char *start (in)			    If text is just a portion of  some
					    other  string, this	argument iden-
					    tifies the beginning of the	larger
					    string.   If it is not the same as
					    text, then no "^" matches will  be
					    allowed.

       int index (in)			    Specifies  which range is desired:
					    0 means the	range  of  the	entire
					    match,  1  or  greater  means  the
					    range that matched a parenthesized
					    sub-expression.

       const char **startPtr (out)	    The	address	of the first character
					    in the range is  stored  here,  or
					    NULL if there is no	such range.

       const char **endPtr (out)	    The	 address of the	character just
					    after the last one in the range is
					    stored  here,  or NULL if there is
					    no such range.

       int cflags (in)			    OR-ed combination of the  compila-
					    tion    flags    TCL_REG_ADVANCED,
					    TCL_REG_EXTENDED,	TCL_REG_BASIC,
					    TCL_REG_EXPANDED,	TCL_REG_QUOTE,
					    TCL_REG_NOCASE,   TCL_REG_NEWLINE,
					    TCL_REG_NLSTOP,    TCL_REG_NLANCH,
					    TCL_REG_NOSUB,  and	  TCL_REG_CAN-
					    MATCH. See below for more informa-
					    tion.

       int offset (in)			    The	character offset into the text
					    where  matching should begin.  The
					    value of the offset	has no	impact
					    on	^  matches.   This behavior is
					    controlled by eflags.

       int nmatches (in)		    The	number of matching  subexpres-
					    sions  that	 should	 be remembered
					    for	later use.  If this  value  is
					    0, then no subexpression match in-
					    formation will  be	computed.   If
					    the	 value	is -1, then all	of the
					    matching  subexpressions  will  be
					    remembered.	  Any other value will
					    be taken as	the maximum number  of
					    subexpressions to remember.

       int eflags (in)			    OR-ed combination of the execution
					    flags      TCL_REG_NOTBOL	   and
					    TCL_REG_NOTEOL. See	below for more
					    information.

       Tcl_RegExpInfo *infoPtr (out)	    The	address	of the location	 where
					    information	about a	previous match
					    should be stored by	Tcl_RegExpGet-
					    Info.
______________________________________________________________________________

DESCRIPTION
       Tcl_RegExpMatch determines whether its text argument matches pattern,
       where pattern is interpreted as a regular expression using the rules in
       the re_syntax reference page.  If there is a match then Tcl_RegExpMatch
       returns 1.  If there is no match	then Tcl_RegExpMatch returns 0.	 If an
       error occurs in the matching process (e.g. pattern is not a valid regu-
       lar expression) then Tcl_RegExpMatch returns -1	and  leaves  an	 error
       message	in  the	 interpreter result.  Tcl_RegExpMatchObj is similar to
       Tcl_RegExpMatch except it operates on the Tcl values textObj and	patObj
       instead of UTF strings.	Tcl_RegExpMatchObj is generally	more efficient
       than Tcl_RegExpMatch, so	it is the preferred interface.

       Tcl_RegExpCompile, Tcl_RegExpExec, and Tcl_RegExpRange  provide	lower-
       level access to the regular expression pattern matcher.	Tcl_RegExpCom-
       pile compiles a regular expression string into the internal  form  used
       for  efficient  pattern matching.  The return value is a	token for this
       compiled	form, which can	be used	in subsequent calls to	Tcl_RegExpExec
       or Tcl_RegExpRange.  If an error	occurs while compiling the regular ex-
       pression	then Tcl_RegExpCompile returns NULL and	leaves an  error  mes-
       sage  in	the interpreter	result.	 Note:	the return value from Tcl_Reg-
       ExpCompile is only valid	up to the next call to Tcl_RegExpCompile;   it
       is not safe to retain these values for long periods of time.

       Tcl_RegExpExec executes the regular expression pattern matcher.	It re-
       turns 1 if text contains	a range	of characters that match regexp, 0  if
       no match	is found, and -1 if an error occurs.  In the case of an	error,
       Tcl_RegExpExec leaves an	error message in the interpreter result.  When
       searching  a  string for	multiple matches of a pattern, it is important
       to distinguish between the start	of the original	string and  the	 start
       of  the current search.	For example, when searching for	the second oc-
       currence	of a match, the	text argument might  point  to	the  character
       just  after  the	first match;  however, it is important for the pattern
       matcher to know that this is not	the start of  the  entire  string,  so
       that  it	 does  not allow "^" atoms in the pattern to match.  The start
       argument	provides this information by pointing  to  the	start  of  the
       overall	string	containing  text.  Start will be less than or equal to
       text;  if it is less than text then no ^	matches	will be	allowed.

       Tcl_RegExpRange may be invoked after Tcl_RegExpExec returns;   it  pro-
       vides detailed information about	what ranges of the string matched what
       parts of	the pattern.  Tcl_RegExpRange returns a	pair  of  pointers  in
       *startPtr and *endPtr that identify a range of characters in the	source
       string for the most recent call	to  Tcl_RegExpExec.   Index  indicates
       which  of  several ranges is desired: if	index is 0, information	is re-
       turned about the	overall	range of characters that  matched  the	entire
       pattern;	 otherwise, information	is returned about the range of charac-
       ters that matched the index'th parenthesized subexpression  within  the
       pattern.	  If  there  is	 no  range corresponding to index then NULL is
       stored in *startPtr and *endPtr.

       Tcl_GetRegExpFromObj, Tcl_RegExpExecObj, and Tcl_RegExpGetInfo are
       value interfaces that provide the most direct control of Henry
       Spencer's regular expression library.  Users who need to modify
       compilation and execution options directly should use these
       interfaces instead of calling the internal regexp functions.  These
       interfaces handle the details of UTF to Unicode translation and
       improve performance by caching in the pattern and string values.

       Tcl_GetRegExpFromObj  attempts  to return a compiled regular expression
       from the	patObj.	 If the	value does not already contain a compiled reg-
       ular  expression	 it  will attempt to create one	from the string	in the
       value and assign	it to the internal representation of the patObj.   The
       return  value of	this function is of type Tcl_RegExp.  The return value
       is a token for this compiled form, which	 can  be  used	in  subsequent
       calls  to  Tcl_RegExpExecObj  or	Tcl_RegExpGetInfo.  If an error	occurs
       while compiling the regular expression  then  Tcl_GetRegExpFromObj  re-
       turns  NULL and leaves an error message in the interpreter result.  The
       regular expression token	can be used as long as the internal  represen-
       tation of patObj	refers to the compiled form.  The cflags argument is a
       bit-wise	OR of zero or more of the following  flags  that  control  the
       compilation of patObj:

	 TCL_REG_ADVANCED
		Compile	advanced regular expressions ("ARE"s).	This mode cor-
		responds to the	normal regular expression syntax  accepted  by
		the Tcl	regexp and regsub commands.

	 TCL_REG_EXTENDED
		Compile	extended regular expressions ("ERE"s).	This mode cor-
		responds to the	regular	expression syntax  recognized  by  Tcl
		8.0 and	earlier	versions.

	 TCL_REG_BASIC
		Compile	 basic regular expressions ("BRE"s).  This mode	corre-
		sponds to the regular expression syntax	recognized  by	common
		Unix  utilities	 like sed and grep.  This is the default if no
		flags are specified.

	 TCL_REG_EXPANDED
		Compile	the regular expression (basic, extended, or  advanced)
		using  an expanded syntax that allows comments and whitespace.
		This mode causes non-backslashed non-bracket-expression	 white
		space and #-to-end-of-line comments to be ignored.

	 TCL_REG_QUOTE
		Compile	a literal string, with all characters treated as ordi-
		nary characters.

	 TCL_REG_NOCASE
		Compile	for matching that ignores  upper/lower	case  distinc-
		tions.

	 TCL_REG_NEWLINE
		Compile	 for  newline-sensitive	matching.  By default, newline
		is a completely	ordinary character with	no special meaning  in
		either	regular	 expressions or	strings.  With this flag, "[^"
		bracket	expressions and	"."  never match newline, "^"  matches
		an  empty  string  after any newline in	addition to its	normal
		function, and "$" matches an empty string before  any  newline
		in  addition  to its normal function.  TCL_REG_NEWLINE is the
		bit-wise OR of TCL_REG_NLSTOP and TCL_REG_NLANCH.

	 TCL_REG_NLSTOP
		Compile	for partial newline-sensitive matching,	with  the  be-
		havior	of "[^"	bracket	expressions and	"."  affected, but not
		the behavior of	"^" and	"$".  In this mode, "[^"  bracket  ex-
		pressions and "."  never match newline.

	 TCL_REG_NLANCH
		Compile	 for  inverse partial newline-sensitive	matching, with
		the behavior of	"^" and	"$" (the "anchors") affected, but  not
		the  behavior  of  "[^"	 bracket expressions and ".".  In this
		mode "^" matches an empty string after any newline in addition
		to its normal function,	and "$"	matches	an empty string	before
		any newline in addition	to its normal function.

	 TCL_REG_NOSUB
		Compile	for matching that reports only success or failure, not
		what  was  matched.  This reduces compile overhead and may im-
		prove performance.  Subsequent calls to	 Tcl_RegExpGetInfo  or
		Tcl_RegExpRange	will not report	any match information.

	 TCL_REG_CANMATCH
		Compile	 for matching that reports the potential to complete a
		partial	match given more text (see below).

       Only one	 of  TCL_REG_EXTENDED,	TCL_REG_ADVANCED,  TCL_REG_BASIC,  and
       TCL_REG_QUOTE may be specified.

       Tcl_RegExpExecObj  executes the regular expression pattern matcher.  It
       returns 1 if textObj contains a range of characters that match regexp, 0
       if no match is found, and -1 if an error	occurs.	 In the	case of	an er-
       ror, Tcl_RegExpExecObj leaves an	error message in the  interpreter  re-
       sult.   The nmatches value indicates to the matcher how many subexpres-
       sions are of interest.  If nmatches is 0, then no  subexpression	 match
       information  is	recorded,  which may allow the matcher to make various
       optimizations.  If the value is -1, then	all of the  subexpressions  in
       the  pattern  are remembered.  If the value is a	positive integer, then
       only that number	of subexpressions will be remembered.  Matching	begins
       at  the	specified  Unicode  character  index  given by offset.	Unlike
       Tcl_RegExpExec, the behavior of anchors is not affected by  the	offset
       value.  Instead the behavior of the anchors is explicitly controlled by
       the eflags argument, which is a bit-wise	OR of zero or more of the fol-
       lowing flags:

	 TCL_REG_NOTBOL
		The starting character will not	be treated as the beginning of
		a line or the beginning	of the string, so "^" will  not	 match
		there.	Note that this flag has	no effect on how "\A" matches.

	 TCL_REG_NOTEOL
		The  last  character  in the string will not be	treated	as the
		end of a line or the end of the	string,	so "$" will not	 match
		there.	Note that this flag has	no effect on how "\Z" matches.

       Tcl_RegExpGetInfo  retrieves information	about the last match performed
       with a given regular expression regexp.	The infoPtr argument  contains
       a pointer to a structure	that is	defined	as follows:

	      typedef struct Tcl_RegExpInfo {
		  int nsubs;
		  Tcl_RegExpIndices *matches;
		  long extendStart;
	      }	Tcl_RegExpInfo;

       The nsubs field contains a count of the number of parenthesized
       subexpressions within the regular expression.  If the TCL_REG_NOSUB
       flag was used, then this value will be zero.  The matches field points
       to an array of nsubs+1 values that indicate the bounds of each subexpression
       matched.	 The first element in the array	refers to the range matched by
       the entire regular expression, and subsequent  elements	refer  to  the
       parenthesized  subexpressions in	the order that they appear in the pat-
       tern.  Each element is a	structure that is defined as follows:

	      typedef struct Tcl_RegExpIndices {
		  long start;
		  long end;
	      }	Tcl_RegExpIndices;

       The start and end values	are Unicode character indices relative to  the
       offset location within textObj where matching began.  The start index
       identifies the first character of the matched subexpression.   The  end
       index  identifies  the first character after the	matched	subexpression.
       If the subexpression matched the	empty string, then start and end  will
       be  equal.  If the subexpression	did not	participate in the match, then
       start and end will be set to -1.

       The extendStart field in	Tcl_RegExpInfo is only set if the TCL_REG_CAN-
       MATCH  flag  was	 used.	It indicates the first character in the	string
       where a match could occur.  If a	match was found, this will be the same
       as  the beginning of the	current	match.	If no match was	found, then it
       indicates the earliest point at which a match might occur if additional
       text is appended to the string.  If no match is possible even
       with further text, this field will be set to -1.

SEE ALSO
       re_syntax(n)

KEYWORDS
       match, pattern, regular expression, string,  subexpression,  Tcl_RegEx-
       pIndices, Tcl_RegExpInfo

Tcl				      8.1		    Tcl_RegExpMatch(3)
