ascii2uni(1)		    General Commands Manual		  ascii2uni(1)

NAME
       ascii2uni - convert 7-bit ASCII representations to UTF-8 Unicode

SYNOPSIS
       ascii2uni [options] (<input file name>)

DESCRIPTION
       ascii2uni converts various 7-bit ASCII representations to UTF-8. It
       reads from the standard input and writes to the standard output. The
       representations understood are listed below under the command line
       options. If no format is specified, standard hexadecimal format (e.g.
       0x00e9) is assumed.
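
       The default conversion can be pictured with a short Python sketch.
       It is only an illustration of the idea, not the program's own code:
       the name convert_standard_hex and the choice to accept one to six
       hex digits after "0x" are assumptions made for this example.

```python
import re

# Assumption for this sketch: an escape is "0x" followed by 1-6 hex digits.
HEX_ESCAPE = re.compile(r"0x([0-9A-Fa-f]{1,6})")

MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point


def _replace(match: re.Match) -> str:
    value = int(match.group(1), 16)
    # Leave the text untouched if the number is not a valid code point.
    if value > MAX_CODE_POINT:
        return match.group(0)
    return chr(value)


def convert_standard_hex(text: str) -> str:
    """Replace each 0x... escape with the character it names."""
    return HEX_ESCAPE.sub(_replace, text)


if __name__ == "__main__":
    converted = convert_standard_hex("caf0x00e9")
    # Writing the result as UTF-8 gives the bytes 63 61 66 C3 A9.
    print(converted.encode("utf-8"))
```

       Both "0x00e9" and "0x00E9" are accepted by the pattern, in line with
       the note on upper- and lower-case hexadecimal digits.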

COMMAND LINE OPTIONS
       -a <format>
	      Convert from the specified format. Formats may be specified by
	      means of the following arbitrary single-character codes, by
	      means of names such as "SGML_decimal", and by examples of the
	      desired format.

	      A	 Convert  hexadecimal  numbers with prefix U in	angle-brackets
	      (<U00E9>).

	      B	Convert	\x-escaped hex (e.g. \x00E9)

	      C	 Convert  \x  escaped  hexadecimal  numbers  in	 braces	 (e.g.
	      \x{00E9}).

	      D	 Convert  decimal  HTML	 numeric  character  references	 (e.g.
	      &#0233;)

	      E	Convert	hexadecimal with prefix	U (U00E9).

	      F	Convert	hexadecimal with prefix	u (u00E9).

	      G	Convert	hexadecimal in	single	quotes	with  prefix  X	 (e.g.
	      X'00E9').

	      H	 Convert  hexadecimal  HTML numeric character references (e.g.
	      &#x00E9;)

	      I	Convert hexadecimal UTF-8 with each byte's hex preceded by an
	      =-sign (e.g. =C3=A9). This is the Quoted Printable format
	      defined by RFC 2045.

	      J	Convert hexadecimal UTF-8 with each byte's hex preceded by a
	      %-sign (e.g. %C3%A9). This is the URI escape format defined by
	      RFC 2396.

	      K	Convert	octal UTF-8 with each  byte  escaped  by  a  backslash
	      (e.g.  \303\251)

	      L	Convert	\U followed by eight hex digits	or \u followed by four
	      hex  digits.  \UXXXXXXXX	encoding  a  character	within the BMP
	      (U+0000-U+FFFF) is converted but a warning is issued since  this
	      violates the WWW specification.

	      M	 Convert  hexadecimal  SGML numeric character references (e.g.
	      \#xE9;)

	      N	 Convert  decimal  SGML	 numeric  character  references	 (e.g.
	      \#233;)

	      O	Convert octal escapes for the three low bytes in big-endian
	      order (e.g. \000\000\351)

	      P	Convert	hexadecimal numbers with prefix	U+ (e.g. U+00E9)

	      Q	Convert	HTML character entities	(e.g. &eacute;).

	      R	Convert	raw hexadecimal	numbers	(e.g. 00E9). Requires  the  -p
	      flag.

	      S	Convert hexadecimal escapes for the three low bytes in
	      big-endian order (e.g. \x00\x00\xE9)

	      T	 Convert decimal escapes for the three low bytes in big-endian
	      order (e.g. \d000\d000\d233)

	      U	Convert	\u-escaped hexadecimal numbers (e.g. \u00E9).

	      V	Convert	\u-escaped decimal numbers (e.g. \u00233).

	      X	Convert	standard hexadecimal numbers (e.g. 0x00E9).

	      Y	Convert all three types of HTML escape: hexadecimal and
	      decimal character references and character entities.

	      0	Convert	hexadecimal UTF-8 with each byte's hex enclosed	within
	      angle brackets (e.g.  <C3><A9>).

	      1	Convert	Common Lisp format hexadecimal numbers (e.g. #x00E9).

	      2	Convert	Perl format decimal numbers with prefix	v (e.g.	v233).

	      3	Convert	hexadecimal numbers with prefix	$ (e.g.	$00E9).

	      4	Convert PostScript format hexadecimal numbers with prefix 16#
	      (e.g. 16#00E9).

	      5	Convert	Common Lisp format  hexadecimal	 numbers  with	prefix
	      #16r (e.g. #16r00E9).

	      6	Convert Ada format hexadecimal numbers with prefix 16# and
	      suffix # (e.g. 16#00E9#).

	      7	Convert	Apache log format hexadecimal UTF-8 with  each	byte's
	      hex preceded by a	backslash-x (e.g.  \xC3\xA9).

	      8	Convert	Microsoft OOXML	format hexadecimal numbers with	prefix
	      _x and suffix _ (e.g. _x00E9_).

	      9	Convert	%\u-escaped hexadecimal	numbers	(e.g. %\u00E9).
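
       The formats above fall into two groups: those that spell out the
       code point itself (e.g. X, P, U) and those that spell out the bytes
       of its UTF-8 encoding (e.g. I, J, K, 0, 7). The following Python
       sketch shows how the examples for U+00E9 relate to one another. The
       helper names are invented for this illustration and are not part of
       ascii2uni.

```python
from urllib.parse import unquote

code_point = 0x00E9                  # the value written by formats X, P, U, ...
utf8_bytes = chr(code_point).encode("utf-8")   # the bytes C3 A9


def percent_form(data: bytes) -> str:
    """Spell each byte as %XX, as in the example for format J."""
    return "".join(f"%{byte:02X}" for byte in data)


def octal_form(data: bytes) -> str:
    """Spell each byte as a backslash and three octal digits (format K)."""
    return "".join(f"\\{byte:03o}" for byte in data)


def low_three_bytes_octal(value: int) -> str:
    """Spell the three low bytes of the code point itself, big-endian,
    in octal, as in the example for format O."""
    return octal_form(value.to_bytes(3, "big"))


def decode_percent(text: str) -> str:
    """Turn %XX-escaped UTF-8 back into characters."""
    return unquote(text, encoding="utf-8")


if __name__ == "__main__":
    print(percent_form(utf8_bytes))
    print(octal_form(utf8_bytes))
    print(low_three_bytes_octal(code_point))
    print(decode_percent("caf%C3%A9"))
```

       Note that format K escapes the UTF-8 bytes (\303\251), whereas
       format O escapes the bytes of the code point value (\000\000\351),
       so the two octal forms of the same character differ.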

       -h     Help. Print the usage message and	exit.

       -v     Print program version information	and exit.

       -m     Accept  deprecated  HTML	entities lacking final semicolon, e.g.
	      "&#x00E9"	in place of "&#x00E9;".

       -p     Pure. Assume that	the input consists entirely of escapes	except
	      for arbitrary (but non-null) amounts of separating whitespace.

       -q     Be quiet.	Do not chat unnecessarily.

       -Z <format>
	      Convert input using the supplied format. The format specified
	      will be used as the format string in a call to sscanf(3) with a
	      single argument consisting of a pointer to an unsigned long
	      integer. For example, to obtain the same results as with the U
	      format (-a U), the format would be: \u%04X.
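
       What the format string \u%04X asks for can be approximated in
       Python as follows. This is a rough stand-in for the sscanf(3) call,
       not a reproduction of it: the name scan_u_escape is hypothetical,
       and C's %X conversion is more permissive than this pattern (it also
       accepts, for instance, an optional sign and a 0x prefix).

```python
import re
from typing import Optional

# A literal backslash and "u", then at most four hex digits,
# mirroring the field width of 4 in "%04X".
U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{1,4})")


def scan_u_escape(text: str) -> Optional[int]:
    """Return the number read from the start of text, or None if the
    text does not begin with a \\u escape."""
    match = U_ESCAPE.match(text)
    if match is None:
        return None
    return int(match.group(1), 16)


if __name__ == "__main__":
    value = scan_u_escape(r"\u00E9")
    if value is not None:
        print(hex(value), chr(value))
```

       The value read this way is the code point, which is then written
       out as UTF-8.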

       If the format is Quoted-Printable and an equal-sign occurs at the end
       of an input line, both the equal-sign and the immediately following
       newline are skipped, in accordance with RFC 2045. Strictly speaking,
       this is not conversion of an ASCII escape to Unicode.
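
       The end-of-line rule can be sketched in Python as below. The
       function name decode_qp is an assumption for this example, and the
       sketch handles only =XX escapes and the line-joining rule, not
       every detail of RFC 2045.

```python
import re


def decode_qp(text: str) -> str:
    """Decode =XX escapes and drop "=" followed directly by a newline."""
    # An equal-sign at the end of a line joins it to the next line.
    joined = text.replace("=\r\n", "").replace("=\n", "")
    # Each remaining =XX names one byte of the UTF-8 encoding.
    raw = re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        joined.encode("utf-8"),
    )
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(decode_qp("caf=C3=A9 =\nau lait"))
```

       In the usage line, the equal-sign that ends the first input line
       and the newline after it both disappear, so the two input lines
       come out as one.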

       All  options  that  accept  hexadecimal input recognize both upper- and
       lower-case hexadecimal digits.

EXIT STATUS
       The following values are	returned on exit:

       0 SUCCESS
	      The input	was successfully converted.

       3 INFO
	      The user requested information such as the version number or
	      usage synopsis and this has been provided.

       5 BAD OPTION
	      An incorrect option flag was given on the	command	line.

       7 OUT OF	MEMORY
	      A request for additional memory failed.

       8 BAD RECORD
	      An ill-formed record was detected	in the input.
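
       A calling script can turn these values back into their names. The
       Python sketch below assumes that ascii2uni is installed and on the
       PATH; the names describe_exit and run_ascii2uni are hypothetical
       helpers for this example, and only the -a and -q options described
       above are used.

```python
import subprocess
from typing import Tuple

# Exit values as listed in this section.
EXIT_MEANINGS = {
    0: "SUCCESS",
    3: "INFO",
    5: "BAD OPTION",
    7: "OUT OF MEMORY",
    8: "BAD RECORD",
}


def describe_exit(code: int) -> str:
    """Map an exit value to its name, flagging values not listed here."""
    return EXIT_MEANINGS.get(code, f"UNKNOWN ({code})")


def run_ascii2uni(data: bytes, fmt: str = "X") -> Tuple[bytes, str]:
    """Feed data to ascii2uni on standard input and return its output
    together with the name of the exit value."""
    proc = subprocess.run(
        ["ascii2uni", "-a", fmt, "-q"],
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return proc.stdout, describe_exit(proc.returncode)
```

       For example, run_ascii2uni(b"0x00E9") returns the converted bytes
       and the name of the exit value that the program reported.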

SEE ALSO
       uni2ascii(1)

AUTHOR
       Bill Poser <billposer@alum.mit.edu>

LICENSE
       GNU General Public License

				 August, 2011			  ascii2uni(1)
