Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
std::codecvt(3)		      C++ Standard Libary	       std::codecvt(3)

NAME
       std::codecvt - std::codecvt

Synopsis
	  Defined in header <locale>
	  template<

	  class	InternT,
	  class	ExternT,
	  class	State

	  > class codecvt;

	  Class	 template  std::codecvt	 encapsulates  conversion of character
       strings,	including
	  wide and multibyte, from one encoding	to another. All	file I/O oper-
       ations performed
	  through std::basic_fstream<CharT> use	the std::codecvt<CharT,	 char,
       std::mbstate_t>
	  facet	of the locale imbued in	the stream.

	  std-codecvt-inheritance.svg

					  Inheritance diagram

	  The  following  standalone  (locale-independent) specializations are
       provided	by the
	  standard library:

	  Defined in header <locale>
	  std::codecvt<char, char, std::mbstate_t> identity conversion
	  std::codecvt<char16_t, char,		   conversion  between	UTF-16
       and UTF-8 (since
	  std::mbstate_t>			   C++11)(deprecated in	C++20)
	  std::codecvt<char16_t,  char8_t,	     conversion	between	UTF-16
       and UTF-8 (since
	  std::mbstate_t>			   C++20)
	  std::codecvt<char32_t, char,		   conversion  between	UTF-32
       and UTF-8 (since
	  std::mbstate_t>			   C++11)(deprecated in	C++20)
	  std::codecvt<char32_t,  char8_t,	     conversion	between	UTF-32
       and UTF-8 (since
	  std::mbstate_t>			   C++20)
	  std::codecvt<wchar_t,	char,		   conversion between the sys-
       tem's native wide
	  std::mbstate_t>			   and the single-byte	narrow
       character sets

	  In addition, every locale object constructed in a C++	program	imple-
       ments its own
	  (locale-specific) versions of	above specializations.

Member types
	  Member type Definition
	  intern_type InternT
	  extern_type ExternT
	  state_type  State

Member functions
	  constructor	constructs a new codecvt facet
			(public	member function)
	  destructor	destructs a codecvt facet
			(protected member function)
	  out		invokes	do_out
			(public	member function)
	  in		invokes	do_in
			(public	member function)
	  unshift	invokes	do_unshift
			(public	member function)
	  encoding	invokes	do_encoding
			(public	member function)
	  always_noconv	invokes	do_always_noconv
			(public	member function)
	  length	invokes	do_length
			(public	member function)
	  max_length	invokes	do_max_length
			(public	member function)

Member objects
	  Member name Type
	  id [static] std::locale::id

Protected member functions
	  do_out	    converts a string from internT to externT, such as
       when writing to
	  [virtual]	   file
			   (virtual protected member function)
	  do_in		   converts a string from externT to internT, such  as
       when reading
	  [virtual]	   from	file
			   (virtual protected member function)
	  do_unshift	   generates the termination character sequence	of ex-
       ternT characters
	  [virtual]	   for incomplete conversion
			   (virtual protected member function)
	  do_encoding	    returns the	number of externT characters necessary
       to produce one
	  [virtual]	   internT character, if constant
			   (virtual protected member function)
	  do_always_noconv tests if the	facet encodes an  identity  conversion
       for all valid
	  [virtual]	   argument values
			   (virtual protected member function)
	  do_length	    calculates	the  length of the externT string that
       would be	consumed
	  [virtual]	   by conversion into given internT buffer
			   (virtual protected member function)
	  do_max_length	   returns the maximum number  of  externT  characters
       that could be
	  [virtual]	   converted into a single internT character
			   (virtual protected member function)

       Inherited from std::codecvt_base

	  Member type				      Definition
	  enum	result	{  ok,	partial, error,	noconv }; Unscoped enumeration
       type

	  Enumeration constant Definition
	  ok		       conversion was completed	with no	error
	  partial	       not all source characters were converted
	  error		       encountered an invalid character
	  noconv	       no conversion required, input and output	 types
       are the same

Example
	  The  following  examples reads a UTF-8 file using a locale which im-
       plements	UTF-8
	  conversion in	codecvt<wchar_t, char, mbstate_t> and converts a UTF-8
       string to
	  UTF-16 using one of the standard specializations of std::codecvt.

       // Run this code

	#include <iostream>
	#include <fstream>
	#include <string>
	#include <locale>
	#include <iomanip>
	#include <codecvt>
	#include <cstdint>

	// utility wrapper to adapt locale-bound  facets  for  wstring/wbuffer
       convert
	template<class Facet>
	struct deletable_facet : Facet
	{
	    template<class ...Args>
	    deletable_facet(Args&&	 ...args)	:      Facet(std::for-
       ward<Args>(args)...) {}
	    ~deletable_facet() {}
	};

	int main()
	{
	    // UTF-8 narrow multibyte encoding
	    std::string		data	     =		reinterpret_cast<const
       char*>(+u8"z\u00df\u6c34\U0001f34c");
			       // or reinterpret_cast<const char*>(+u8"z")
			       //					    or
       "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c"

	    std::ofstream("text.txt") << data;

	    // using system-supplied locale's codecvt facet
	    std::wifstream fin("text.txt");
	    // reading from wifstream  will  use  codecvt<wchar_t,  char,  mb-
       state_t>
	    //	this  locale's codecvt converts	UTF-8 to UCS4 (on systems such
       as Linux)
	    fin.imbue(std::locale("en_US.UTF-8"));
	    std::cout << "The UTF-8 file  contains  the	 following  UCS4  code
       units:	";
	    for	(wchar_t c; fin	>> c; )
		std::cout  <<  "U+"  <<	 std::hex << std::setw(4) << std::set-
       fill('0')
			  << static_cast<uint32_t>(c) << ' ';

	    // using standard (locale-independent) codecvt facet
	    std::wstring_convert<
		deletable_facet<std::codecvt<char16_t, char, std::mbstate_t>>,
       char16_t> conv16;
	    std::u16string str16 = conv16.from_bytes(data);

	    std::cout << "\nThe	UTF-8 file contains the	following UTF-16  code
       units: ";
	    for	(char16_t c : str16)
		std::cout  <<  "U+"  <<	 std::hex << std::setw(4) << std::set-
       fill('0')
			  << static_cast<uint16_t>(c) << ' ';
	}

Output:
	The UTF-8 file contains	the following UCS4 code	units:	 U+007a	U+00df
       U+6c34 U+1f34c
	The UTF-8 file contains	the following UTF-16 code units: U+007a	U+00df
       U+6c34 U+d83c U+df4c

See also
	    Character	     locale-defined
	   conversions		      multibyte				 UTF-8
       UTF-16
			    (UTF-8, GB18030)
			   mbrtoc16  /		 codecvt<char16_t,  char,  mb-
       state_t>
	     UTF-16	    c16rtomb(with  C11's  codecvt_utf8_utf16<char16_t>
       N/A
			   DR488)	       codecvt_utf8_utf16<char32_t>
					       codecvt_utf8_utf16<wchar_t>
			   c16rtomb(without		codecvt_utf8<char16_t>
       codecvt_utf16<char16_t>
	      UCS2	   C11's DR488)
					       codecvt_utf8<wchar_t>(Windows)
       codecvt_utf16<wchar_t>(Windows)
					       codecvt<char32_t,   char,   mb-
       state_t>	codecvt_utf16<char32_t>
	     UTF-32	      mbrtoc32	 /   c32rtomb	codecvt_utf8<char32_t>
       codecvt_utf16<wchar_t>(non-Windows)
					       codecvt_utf8<wchar_t>(non-Win-
       dows)
			   mbsrtowcs /
	  system  wide:	     wcsrtombs	UTF-32(non-Windows)  use_facet<codecvt
       No				  No
	  UCS2(Windows)	   <wchar_t, char,
			   mbstate_t>>(locale)

	  codecvt_base		       defines character conversion errors
				       (class template)
	  codecvt_byname		creates	 a codecvt facet for the named
       locale
				       (class template)
	  codecvt_utf8		       converts	between	UTF-8 and UCS2/UCS4
	  (C++11)(deprecated in	C++17) (class template)
	  codecvt_utf16		       converts	between	UTF-16 and UCS2/UCS4
	  (C++11)(deprecated in	C++17) (class template)
	  codecvt_utf8_utf16	       converts	between	UTF-8 and UTF-16
	  (C++11)(deprecated in	C++17) (class template)

http://cppreference.com		  2022.07.31		       std::codecvt(3)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=std::codecvt&sektion=3&manpath=FreeBSD+Ports+15.0>

home | help