utf8trans(1)			   docbook2X			  utf8trans(1)

NAME
       utf8trans - Transliterate UTF-8 characters according to a table

SYNOPSIS
       utf8trans charmap [file]...

DESCRIPTION
       utf8trans transliterates characters in the specified files (or standard
       input, if no files are specified) and writes the output to standard
       output. All input and output is in the UTF-8 encoding.

       This program is usually used to render characters in Unicode text files
       as some markup escapes or ASCII transliterations.  (It is not  intended
       for general charset conversions.)  It provides functionality similar to
       the character maps in XSLT 2.0 (Extensible Stylesheet Language
       Transformations, version 2.0).
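
       The per-character substitution described above can be sketched in
       Python. This is an illustration only, not the utf8trans source: the
       function name transliterate, the two-entry table, and the choice to
       copy characters that have no entry unchanged are all assumptions.

```python
def transliterate(text, table):
    """Replace each character that has a table entry; copy the rest.

    Assumption: characters without an entry pass through unchanged.
    """
    return "".join(table.get(ch, ch) for ch in text)


# Hypothetical two-entry table: an ASCII fallback and a markup escape.
table = {"\u00a9": "(C)", "\u2014": "\\(em"}
print(transliterate("\u00a9 2007 \u2014 docbook2X", table))
# prints: (C) 2007 \(em docbook2X
```

       A translation may be longer than the character it replaces, which is
       why the sketch builds a new string instead of mapping characters one
       to one.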

OPTIONS
       -m, --modify
	      Modifies the given files in-place	with their transliterated out-
	      put, instead of sending it to standard output.

	      This option is useful  for  efficient  transliteration  of  many
	      files at once.

       --help Show brief usage information and exit.

       --version
	      Show version and exit.

USAGE
       Transliteration is done according to the rules in the character map,
       which is read from the file named by the charmap argument. The file
       has the following format:

       1.  Each	line represents	a translation entry, except  for  blank	 lines
	   and comment lines, which are	ignored.

       2.  Any amount of whitespace (space or tab) may precede the start of an
	   entry.

       3.  Comment  lines  begin  with	#.  Everything on the same line	is ig-
	   nored.

       4.  Each entry consists of the Unicode codepoint of the character to
	   translate, in hexadecimal, followed by one space or tab, followed
	   by the translation string, up to the end of the line.

       5.  The translation string is taken literally, including any leading
	   and trailing spaces (except the delimiter between the codepoint and
	   the translation string), and all types of characters. The newline
	   at the end is not included.
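
       Rules 1 to 5 can be sketched as a small Python parser. It is a
       reading of the rules above, not the utf8trans implementation. The
       name parse_charmap is hypothetical, and two behaviours are
       assumptions the rules do not settle: the codepoint is written as
       bare hexadecimal digits (no U+ or 0x prefix), and an entry with
       nothing after the codepoint yields an empty translation.

```python
def parse_charmap(text):
    """Parse character-map text into a {character: translation} dict."""
    table = {}
    for raw in text.split("\n"):
        # Rule 2: whitespace may precede the start of an entry.
        line = raw.lstrip(" \t")
        # Rules 1 and 3: blank lines and comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        # Rule 4: the hexadecimal codepoint runs up to the first space or tab.
        end = 0
        while end < len(line) and line[end] not in " \t":
            end += 1
        codepoint = int(line[:end], 16)  # assumption: bare hex digits
        # Rule 5: skip exactly one delimiter; the rest is taken literally,
        # including any further leading or trailing spaces.
        table[chr(codepoint)] = line[end + 1:]
    return table


sample = "# comment line\n\n  A9 (C)\n2014\t--\n20AC  EUR \n"
for char, translation in parse_charmap(sample).items():
    print("U+%04X -> %r" % (ord(char), translation))
```

       In the sample, the entry for 20AC has two spaces after the
       codepoint. Only the first is the delimiter, so the translation is
       " EUR " with its leading and trailing space kept.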

       The above format is intended to be restrictive, to keep utf8trans
       simple. If an XML-based format is desired, the xmlcharmap2utf8trans
       script that comes with the docbook2X distribution converts character
       maps in XSLT 2.0 format to the utf8trans format.

LIMITATIONS
       o   utf8trans does not work with binary files, because malformed UTF-8
	   sequences in the input are substituted with U+FFFD characters.
	   However, null characters in the input are handled correctly. This
	   limitation may be removed in the future.

       o   There is no way to include a newline or null in the substitution
	   string.
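
       The reason binary input is not preserved can be shown with Python's
       own UTF-8 decoder, used here only as a stand-in. The helper name
       decode_lossy is hypothetical, and the number of U+FFFD characters
       that utf8trans emits for a given malformed sequence is not stated
       above and may differ from what Python produces.

```python
def decode_lossy(data):
    """Decode bytes as UTF-8, replacing malformed sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# Valid UTF-8, then a stray 0xFF byte, then a null byte.
data = b"caf\xc3\xa9 \xff\x00 end"
text = decode_lossy(data)

print(ascii(text))                    # the 0xFF byte has become U+FFFD
print(text.encode("utf-8") == data)   # prints: False
```

       The null byte survives the round trip, but the 0xFF byte comes back
       as the three-byte encoding of U+FFFD, so the original bytes cannot
       be recovered.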

AUTHOR
       Steve Cheng <stevecheng@users.sourceforge.net>.

docbook2X 0.8.8			 3 March 2007			  utf8trans(1)
