Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
std::mbrtoc32(3)	      C++ Standard Libary	      std::mbrtoc32(3)

NAME
       std::mbrtoc32 - std::mbrtoc32

Synopsis
	  Defined in header <cuchar>
	  std::size_t mbrtoc32(	char32_t* pc32,

	  const	char* s,			 (since	C++11)
	  std::size_t n,

	  std::mbstate_t* ps );

	  Converts a narrow multibyte character	to its UTF-32 character	repre-
       sentation.

	  If  s	 is not	a null pointer,	inspects at most n bytes of the	multi-
       byte character
	  string, beginning with the byte pointed to by	 s  to	determine  the
       number of bytes
	  necessary  to	 complete  the next multibyte character	(including any
       shift sequences).
	  If the function determines that the next multibyte character in s is
       complete	and
	  valid, converts it to	the corresponding 32-bit character and	stores
       it in *pc32 (if
	  pc32 is not null).

	  If the multibyte character in	*s corresponds to a multi-char32_t se-
       quence (not
	  possible  with  UTF-32), then	after the first	call to	this function,
       *ps is updated in
	  such a way that the next calls to mbrtoc32 will write	out the	 addi-
       tional char32_t,
	  without considering *s.

	  If s is a null pointer, the values of	n and pc32 are ignored and the
       call is
	  equivalent to	std::mbrtoc32(nullptr, "", 1, ps).

	  If the wide character	produced is the	null character,	the conversion
       state *ps
	  represents the initial shift state.

	  The  multibyte  encoding  used  by this function is specified	by the
       currently active	C
	  locale.

Parameters
	  pc32 - pointer to the	location where the resulting 32-bit  character
       will be written
	  s    - pointer to the	multibyte character string used	as input
	  n    - limit on the number of	bytes in s that	can be examined
	  ps   - pointer to the	conversion state object	used when interpreting
       the multibyte
		 string

Return value
	  The first of the following that applies:

	    * 0	if the character converted from	s (and stored in *pc32 if non-
       null) was
	      the null character
	    *  the number of bytes [1...n] of the multibyte character success-
       fully converted
	      from s
	    * -3 if the	next char32_t from a multi-char32_t character has  now
       been written to
	      *pc32. No	bytes are processed from the input in this case.
	    *  -2  if  the  next  n bytes constitute an	incomplete, but	so far
       valid, multibyte
	      character. Nothing is written to *pc32.
	    * -1 if encoding error occurs. Nothing is written  to  *pc32,  the
       value EILSEQ is
	      stored in	errno and the value of *ps is unspecified.

Examples
       // Run this code

	#include <iostream>
	#include <iomanip>
	#include <clocale>
	#include <cstring>
	#include <cwchar>
	#include <cuchar>
	#include <cassert>

	int main()
	{
	    std::setlocale(LC_ALL, "en_US.utf8");

	    std::string	str = "z\u00df\u6c34\U0001F34C"; // or u8"z"

	    std::cout  <<  "Processing	"  <<  str.size()  <<  " bytes:	[ " <<
       std::showbase;
	    for(unsigned char c: str) std::cout	<< std::hex << +c << ' ';
	    std::cout << "]\n";

	    std::mbstate_t state{}; // zero-initialized	to initial state
	    char32_t c32;
	    const char *ptr = str.c_str(), *end	= str.c_str() +	 str.size()  +
       1;

	    while(std::size_t	rc  =  std::mbrtoc32(&c32,  ptr,  end  -  ptr,
       &state))
	    {
		std::cout << "Next UTF-32 char:	" << std::hex
			  << static_cast<int>(c32) << "	obtained from ";
		assert(rc != (std::size_t)-3); // no surrogates	in UTF-32
		if(rc == (std::size_t)-1) break;
		if(rc == (std::size_t)-2) break;
		std::cout << std::dec << rc << " bytes [ ";
		for(std::size_t	n = 0; n < rc; ++n)
		    std::cout	 <<    std::hex	   <<	 +static_cast<unsigned
       char>(ptr[n]) <<	' ';
		std::cout << "]\n";
		ptr += rc;
	    }
	}

Output:
	Processing  10	bytes:	[ 0x7a 0xc3 0x9f 0xe6 0xb0 0xb4	0xf0 0x9f 0x8d
       0x8c ]
	Next UTF-32 char: 0x7a obtained	from 1 bytes [ 0x7a ]
	Next UTF-32 char: 0xdf obtained	from 2 bytes [ 0xc3 0x9f ]
	Next UTF-32 char: 0x6c34 obtained from 3 bytes [ 0xe6 0xb0 0xb4	]
	Next UTF-32 char: 0x1f34c obtained from	4 bytes	[ 0xf0 0x9f 0x8d  0x8c
       ]

See also
	  c32rtomb  convert a 32-bit wide character to narrow multibyte	string
	  (C++11)   (function)
	  do_in	     converts  a  string from externT to internT, such as when
       reading from file
	  [virtual] (virtual protected	member	function  of  std::codecvt<In-
       ternT,ExternT,State>)

http://cppreference.com		  2022.07.31		      std::mbrtoc32(3)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=std::mbrtoc32&sektion=3&manpath=FreeBSD+Ports+15.0>

home | help