FreeBSD Manual Pages


       hxnormalize - pretty-print an HTML file

       hxnormalize [ -x ] [ -X ] [ -e ] [ -d ] [ -s ] [ -L ] [ -i indent ]
       [ -l line-length ] [ -c commentmagic ] [ file-or-URL ]

       The hxnormalize command pretty-prints an	HTML or	 XML  file,  and  also
       tries to	fix small HTML errors. The output is the same file, but	with a
       maximum line length and with optional indentation to indicate the nest-
       ing level of each line.

       The following options are supported:

       -x	 Applies  XML  conventions:  empty elements are	written	with a
		 slash at the end (e.g., <IMG />) and, if the input  is	 HTML,
		 any  `<' and `&' inside <style> and <script> elements are es-
		 caped as `&lt;' and `&amp;'. (The input is assumed to be HTML
		 unless	the -X option is present.) Implies -e.

       -e	 Always	 inserts  endtags,  even if HTML does not require them
		 (for example: </p> and	</li>).

       -X	 Makes hxnormalize assume the input  is	 well-formed  XML.  It
		 does not try to infer omitted HTML tags, does not assume ele-
		 ments such as <img> and <br> are empty, and  does  not	 treat
		 `<' and `&' inside <style> and	<script> as normal characters.

       -d	 Omit the DOCTYPE from the output.

       -i indent Set  the  number  of spaces to	indent each nesting level. De-
		 fault is 2.  Not all elements cause an	 indent.  In  general,
		 elements that can occur in a block environment	are started on
		 a new line and	cause an indent, but inline elements, such  as
		 EM and SPAN, do not cause an indent.

       -l line-length
		 Sets  the  maximum  length  of	 lines.	 hxnormalize will wrap
		 lines so that all lines are  as  long	as  possible,  but  no
		 longer	than this length. Default is 72. Words that are	longer
		 than the line length will not be broken, and will extend past
		 this length. (A `word' is a sequence of characters delimited
		 by white space.) The content of the STYLE, SCRIPT and PRE
		 elements will not be line-wrapped.

       -s	 Omit <span> tags that don't have any attributes.

       -L	 Remove	 redundant  `lang'  and	 `xml:lang' attributes.	(I.e.,
		 those whose value is the same as the language inherited  from
		 the parent element.)

       -c commentmagic
		 Comments  are normally	placed right after the preceding text.
		 That is usually correct for short comments, but some comments
		 are meant to be on a separate line.  commentmagic is a	string
		 and when that string occurs  inside  a	 comment,  hxnormalize
		 will output an	empty line before that comment.	E.g. -c	"===="
		 can be	used to	put all	comments that contain `====' on	a sep-
		 arate	line,  preceded	 by an empty line. By default, no com-
		 ments are treated that	way.
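
       The rule behind the -L option can be modelled in a few lines. The
       following Python sketch is only an illustration, not hxnormalize's
       own code: it assumes well-formed XML input, compares language values
       as exact strings, and the function name strip_redundant_lang is made
       up for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def strip_redundant_lang(elem, inherited=None):
    """Remove lang/xml:lang attributes that repeat the inherited language.

    Hypothetical helper: values are compared as exact strings (assumption).
    """
    current = inherited
    for name in ("lang", XML_LANG):
        value = elem.get(name)
        if value is None:
            continue
        if value == inherited:
            del elem.attrib[name]  # same as the parent's language: redundant
        else:
            current = value  # this element switches language
    for child in elem:
        strip_redundant_lang(child, current)


root = ET.fromstring(
    '<div lang="en"><p lang="en">one</p><p lang="fr">deux</p></div>'
)
strip_redundant_lang(root)
cleaned = ET.tostring(root, encoding="unicode")
print(cleaned)
```

       Here the first <p> loses its `lang' attribute because it repeats the
       value of the enclosing <div>, while the second <p> keeps its own.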

       The following operand is	supported:

       file-or-URL
		 The name or URL of an HTML file. If absent, standard input is
		 read instead.

       The following exit values are returned:

       0	 Successful completion.

       > 0	 An error occurred in the parsing of the HTML file.  hxnormal-
		 ize will try to correct the error and produce output anyway.
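
       Because a non-zero exit status does not mean that output is missing,
       a calling script may want to keep the output and merely record that
       the input had errors. A minimal Python sketch, assuming hxnormalize
       is on the PATH; the helper name run_normalizer and its cmd parameter
       are hypothetical:

```python
import subprocess


def run_normalizer(data, cmd=("hxnormalize",)):
    """Feed bytes to the command on standard input.

    Returns (had_errors, output). `cmd` defaults to plain hxnormalize with
    no options; pass another argument list to add options.
    """
    result = subprocess.run(
        list(cmd), input=data, stdout=subprocess.PIPE, check=False
    )
    # Any non-zero status is treated as an error here, but the output that
    # was produced is returned to the caller regardless.
    return result.returncode != 0, result.stdout
```

       A caller would read a file in binary mode, pass its contents to
       run_normalizer, and decide for itself whether output produced from
       faulty input is acceptable.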

       To use a	proxy to retrieve remote files,	set the	environment  variables
       http_proxy and ftp_proxy.  E.g.,	http_proxy="http://localhost:8080/"

       The error recovery for incorrect	HTML is	primitive.

       hxnormalize  will  not omit an endtag if	the white space	after it could
       possibly	be significant.	E.g., it will not remove the first  </p>  from
       `<div><p>text</p> <p>text</p></div>'.

       hxnormalize  can	 currently  only  retrieve  remote files over HTTP. It
       doesn't handle password-protected files,	nor files  whose  content  de-
       pends on	HTTP `cookies.'

       When converting from XML to HTML (option -X without option -x), any
       pairs of `<![CDATA[' and `]]>' are removed and the character entities
       &lt; &gt; &quot; &apos; and &amp; are expanded (to `<', `>', `"', `''
       and `&', respectively), but any other character entities are not
       expanded. To expand other character entities, pipe the input through
       hxunent(1).

       To limit	lines to a given  number  of  characters,  hxnormalize	breaks
       lines  at spaces	(or inside tags). Some writing systems do not use spa-
       ces between words and thus hxnormalize may not be able to break	lines,
       except at already existing line breaks.
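
       The wrapping rule (fill each line as far as possible, break only at
       white space, never split a word) can be approximated with Python's
       standard textwrap module. This is a sketch of the rule for plain
       text only, not of hxnormalize's implementation: it ignores tags, and
       the function name wrap_words is invented for the example.

```python
import textwrap


def wrap_words(text, line_length=72):
    """Greedy wrap at white space; a word longer than the limit stays whole."""
    return textwrap.wrap(
        text,
        width=line_length,
        break_long_words=False,  # over-long words extend past the limit
        break_on_hyphens=False,  # only white space separates words
    )


for line in wrap_words("a bb ccccccccccc dd", line_length=5):
    print(line)
```

       With a limit of 5, this prints `a bb', then the over-long word
       `ccccccccccc' on a line of its own, then `dd'. A text without any
       white space comes back as a single line, which mirrors the
       limitation described above.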

       To  make	short lines longer, hxnormalize	will combine lines and replace
       a line break by a space,	except in writing systems that do not put spa-
       ces between words, where	the line break is replaced by nothing.	hxnor-
       malize currently	only does the latter for  Japanese,  Chinese,  Korean,
       Khmer  and  Thai.  (The text must be correctly marked up	with `lang' or

       asc2xml(1), xml2asc(1), hxunent(1), UTF-8 (RFC 2279)

7.x				  10 Jul 2011			HXNORMALIZE(1)

