std::regex_token_iterator(3)  C++ Standard Library  std::regex_token_iterator(3)

NAME
       std::regex_token_iterator - iterates through the specified sub-matches
       of all regex matches within a character sequence

Synopsis
       Defined in header <regex>

       template<
           class BidirIt,
           class CharT = typename std::iterator_traits<BidirIt>::value_type,
           class Traits = std::regex_traits<CharT>
       > class regex_token_iterator;                          (since C++11)

       std::regex_token_iterator is a read-only LegacyForwardIterator that
       accesses the individual sub-matches of every match of a regular
       expression within the underlying character sequence. It can also be
       used to access the parts of the sequence that were not matched by the
       given regular expression (e.g. as a tokenizer).

       On construction, it constructs a std::regex_iterator, and on every
       increment it steps through the requested sub-matches from the current
       match_results, incrementing the underlying regex_iterator when
       incrementing away from the last submatch.

       The default-constructed std::regex_token_iterator is the
       end-of-sequence iterator. When a valid std::regex_token_iterator is
       incremented after reaching the last submatch of the last match, it
       becomes equal to the end-of-sequence iterator. Dereferencing or
       incrementing it further invokes undefined behavior.

       Just before becoming the end-of-sequence iterator, a
       std::regex_token_iterator may become a suffix iterator, if the index
       -1 (non-matched fragment) appears in the list of the requested
       submatch indexes. Such an iterator, if dereferenced, returns a
       std::sub_match corresponding to the sequence of characters between the
       last match and the end of the sequence.

       A typical implementation of std::regex_token_iterator holds the
       underlying std::regex_iterator, a container (e.g. std::vector<int>) of
       the requested submatch indexes, an internal counter equal to the index
       of the current submatch within that container, a pointer to
       std::sub_match, pointing at the current submatch of the current match,
       and a std::sub_match object containing the last non-matched character
       sequence (used in tokenizer mode).

Type requirements
       BidirIt must meet the requirements of LegacyBidirectionalIterator.

Specializations
       Several specializations for common character sequence types are
       defined:

       Defined in header <regex>
       Type                    Definition
       cregex_token_iterator   regex_token_iterator<const char*>
       wcregex_token_iterator  regex_token_iterator<const wchar_t*>
       sregex_token_iterator   regex_token_iterator<std::string::const_iterator>
       wsregex_token_iterator  regex_token_iterator<std::wstring::const_iterator>

Member types
       Member type        Definition
       value_type         std::sub_match<BidirIt>
       difference_type    std::ptrdiff_t
       pointer            const value_type*
       reference          const value_type&
       iterator_category  std::forward_iterator_tag
       regex_type         basic_regex<CharT, Traits>

Member functions
       constructor            constructs a new regex_token_iterator
                              (public member function)
       destructor             destructs a regex_token_iterator, including the
       (implicitly declared)  cached value
                              (public member function)
       operator=              assigns contents
                              (public member function)
       operator==             compares two regex_token_iterators
       operator!=             (public member function)
       (removed in C++20)
       operator*              accesses current submatch
       operator->             (public member function)
       operator++             advances the iterator to the next submatch
       operator++(int)        (public member function)
Notes
       It is the programmer's responsibility to ensure that the
       std::basic_regex object passed to the iterator's constructor outlives
       the iterator. Because the iterator stores a std::regex_iterator, which
       stores a pointer to the regex, incrementing the iterator after the
       regex was destroyed results in undefined behavior.

Example

        #include <algorithm>
        #include <iostream>
        #include <iterator>
        #include <regex>
        #include <string>

        int main()
        {
            // Tokenization (non-matched fragments)
            // Note that the regex is matched only two times; when the third
            // value is obtained, the iterator is a suffix iterator.
            const std::string text = "Quick brown fox.";
            const std::regex ws_re("\\s+"); // whitespace
            std::copy(std::sregex_token_iterator(text.begin(), text.end(), ws_re, -1),
                      std::sregex_token_iterator(),
                      std::ostream_iterator<std::string>(std::cout, "\n"));

            std::cout << '\n';

            // Iterating the first submatches
            const std::string html = R"(<p><a href="http://google.com">google</a> )"
                                     R"(< a HREF ="http://cppreference.com">cppreference</a>\n</p>)";
            const std::regex url_re(R"!!(<\s*A\s+[^>]*href\s*=\s*"([^"]*)")!!", std::regex::icase);
            std::copy(std::sregex_token_iterator(html.begin(), html.end(), url_re, 1),
                      std::sregex_token_iterator(),
                      std::ostream_iterator<std::string>(std::cout, "\n"));
        }

Output:
	Quick
	brown
	fox.

	http://google.com
	http://cppreference.com

http://cppreference.com		  2022.07.31	  std::regex_token_iterator(3)
