unihist(1)		    General Commands Manual		    unihist(1)

NAME
       unihist - Generate a histogram of the characters	in a Unicode file

SYNOPSIS
       unihist [-c] [-g] [-h] [-u] [-v]

DESCRIPTION
       unihist generates a histogram of the characters in its input, which
       must be encoded in UTF-8. By default, for each distinct character it
       prints the frequency of the character as a percentage of the total,
       the absolute number of occurrences of the character in the input, the
       UTF-32 code in hexadecimal, and, if the character is displayable, the
       glyph itself in UTF-8. Command line flags allow unwanted information
       to be suppressed. In particular, suppressing the percentages and
       counts (-c) yields a list of the unique characters in the input.
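
       The figures described above can be approximated with standard tools.
       The following is a minimal sketch, not the method unihist uses: it
       handles ASCII input only (unihist decodes UTF-8), omits the code
       column, and the variable names text and hist are illustrative.

```shell
# ASCII-only illustration; unihist itself decodes UTF-8, which this sketch does not.
text='abcabcaa'

# Put one character on each line, count identical lines, then derive
# each percentage as 100 * count / total.
hist=$(printf '%s\n' "$text" | fold -w 1 | LC_ALL=C sort | uniq -c |
    awk '{ count[$2] = $1; total += $1 }
         END { for (ch in count)
                   printf "%.3f %d %s\n", 100 * count[ch] / total, count[ch], ch }' |
    LC_ALL=C sort -k3)

# Columns: percentage, count, character (ordered by character).
printf '%s\n' "$hist"
```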

       Output is ordered by character code. To sort it in descending order
       of frequency, pipe the output into the command:

	      sort -k1 -n -r
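
       The effect of this sort can be tried without unihist installed. The
       sample lines below are hypothetical: they follow the documented
       column order (percentage, count, code, glyph), but the exact spacing
       and number format of real unihist output are assumptions.

```shell
# Hypothetical sample of unihist output, ordered by character code.
sample='25.000 2 0x000061 a
12.500 1 0x000062 b
62.500 5 0x000063 c'

# Sort numerically on the leading percentage, largest first.
sorted=$(printf '%s\n' "$sample" | sort -k1 -n -r)
printf '%s\n' "$sorted"
```

       The sort relies on the percentage being the first column, so it does
       not apply when -c has suppressed the counts and percentages.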

       By default, unihist handles all of Unicode. To reduce memory usage and
       increase speed, it may be compiled to handle only the Basic
       Multilingual Plane (plane 0, U+0000 through U+FFFF) by defining
       BMPONLY at compile time.

COMMAND	LINE FLAGS
       -c     Suppress printing	of counts and percentages.

       -g     Suppress printing	of glyphs.

       -h     Print usage information.

       -u     Suppress printing	of the Unicode code as text.

       -v     Print version information.
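
       As an illustration of combining these flags, the sketch below lists
       the distinct characters of a small file. It assumes that unihist
       reads its input from standard input, which this page implies but
       does not state; the file name sample.txt is hypothetical.

```shell
# Build a small UTF-8 sample file ("naïve café"); the name is hypothetical.
printf 'na\303\257ve caf\303\251\n' > sample.txt

# Run only if unihist is installed; reading standard input is an assumption.
if command -v unihist >/dev/null 2>&1; then
    # -c drops counts and percentages, leaving one line per distinct character.
    unihist -c < sample.txt
    # Adding -u also drops the code, leaving only the glyphs.
    unihist -c -u < sample.txt
else
    echo "unihist not found; skipping" >&2
fi
```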

SEE ALSO
       uniname(1)

REFERENCES
       Unicode Standard, version 5.0

AUTHOR
       Bill Poser
       billposer@alum.mit.edu

LICENSE
       GNU General Public License

				   May,	2008			    unihist(1)
