FreeBSD Manual Pages

std::regex_token_iterator(3)  C++ Standard Library std::regex_token_iterator(3)

NAME
       std::regex_token_iterator - std::regex_token_iterator

Synopsis
	  Defined in header <regex>
	  template<
	      class BidirIt,
	      class CharT = typename std::iterator_traits<BidirIt>::value_type,
	      class Traits = std::regex_traits<CharT>
	  > class regex_token_iterator;                          (since C++11)

	  std::regex_token_iterator is a read-only LegacyForwardIterator that
	  accesses the individual sub-matches of every match of a regular
	  expression within the underlying character sequence. It can also be
	  used to access the parts of the sequence that were not matched by
	  the given regular expression (e.g. as a tokenizer).

	  On construction, it constructs a std::regex_iterator, and on every
	  increment it steps through the requested sub-matches from the
	  current match_results, incrementing the underlying
	  std::regex_iterator when incrementing away from the last submatch.

	  The default-constructed std::regex_token_iterator is the
	  end-of-sequence iterator. When a valid std::regex_token_iterator is
	  incremented after reaching the last submatch of the last match, it
	  becomes equal to the end-of-sequence iterator. Dereferencing or
	  incrementing it further invokes undefined behavior.

	  Just before becoming the end-of-sequence iterator, a
	  std::regex_token_iterator may become a suffix iterator, if the index
	  -1 (non-matched fragment) appears in the list of the requested
	  submatch indices. Such an iterator, if dereferenced, returns a
	  std::sub_match corresponding to the sequence of characters between
	  the last match and the end of the sequence.

	  A typical implementation of std::regex_token_iterator holds the
	  underlying std::regex_iterator, a container (e.g. std::vector<int>)
	  of the requested submatch indices, the internal counter equal to the
	  index of the submatch, a pointer to std::sub_match, pointing at the
	  current submatch of the current match, and a std::sub_match object
	  containing the last non-matched character sequence (used in
	  tokenizer mode).

Type requirements
	  - BidirIt must meet the requirements of LegacyBidirectionalIterator.

Specializations
	  Several specializations for common character sequence types are
	  defined:

	  Defined in header <regex>
	  Type                         Definition
	  std::cregex_token_iterator   std::regex_token_iterator<const char*>
	  std::wcregex_token_iterator  std::regex_token_iterator<const wchar_t*>
	  std::sregex_token_iterator   std::regex_token_iterator<std::string::const_iterator>
	  std::wsregex_token_iterator  std::regex_token_iterator<std::wstring::const_iterator>

Member types
	  Member type		   Definition
	  value_type		   std::sub_match<BidirIt>
	  difference_type	   std::ptrdiff_t
	  pointer		   const value_type*
	  reference		   const value_type&
	  iterator_category	   std::forward_iterator_tag
	  iterator_concept (C++20) std::input_iterator_tag
	  regex_type		   std::basic_regex<CharT, Traits>

Member functions
	  constructor                   constructs a new regex_token_iterator
	                                (public member function)
	  destructor                    destructs a regex_token_iterator,
	  (implicitly declared)         including the cached value
	                                (public member function)
	  operator=                     assigns contents
	                                (public member function)
	  operator==                    compares two regex_token_iterators
	  operator!= (removed in C++20) (public member function)
	  operator*                     accesses current submatch
	  operator->                    (public member function)
	  operator++                    advances the iterator to the next submatch
	  operator++(int)               (public member function)

Notes
	  It is the programmer's responsibility to ensure that the
	  std::basic_regex object passed to the iterator's constructor
	  outlives the iterator. Because the iterator stores a
	  std::regex_iterator which stores a pointer to the regex,
	  incrementing the iterator after the regex was destroyed results in
	  undefined behavior.

Example
	#include <algorithm>
	#include <iostream>
	#include <iterator>
	#include <regex>
	#include <string>

	int main()
	{
	    // Tokenization (non-matched fragments)
	    // Note that regex is matched only two times; when the third value
	    // is obtained the iterator is a suffix iterator.
	    const std::string text = "Quick brown fox.";
	    const std::regex ws_re("\\s+"); // whitespace
	    std::copy(std::sregex_token_iterator(text.begin(), text.end(), ws_re, -1),
	              std::sregex_token_iterator(),
	              std::ostream_iterator<std::string>(std::cout, "\n"));

	    std::cout << '\n';

	    // Iterating the first submatches
	    const std::string html = R"(<p><a href="http://google.com">google</a> )"
	                             R"(< a HREF ="http://cppreference.com">cppreference</a>\n</p>)";
	    const std::regex url_re(R"!!(<\s*A\s+[^>]*href\s*=\s*"([^"]*)")!!", std::regex::icase);
	    std::copy(std::sregex_token_iterator(html.begin(), html.end(), url_re, 1),
	              std::sregex_token_iterator(),
	              std::ostream_iterator<std::string>(std::cout, "\n"));
	}

Output:
	Quick
	brown
	fox.

	http://google.com
	http://cppreference.com

Defect reports
	  The following behavior-changing defect reports were applied
	  retroactively to previously published C++ standards.

	  DR         Applied to  Behavior as published         Correct behavior
	  LWG 3698   C++20       regex_token_iterator was a    made input_iterator[1]
	  (P2770R0)              forward_iterator while being
	                         a stashing iterator

	  1. iterator_category was unchanged by the resolution, because
	     changing it to std::input_iterator_tag might break too much
	     existing code.

http://cppreference.com		  2024.06.10	  std::regex_token_iterator(3)