HXINDEX(1)			HTML-XML-utils			    HXINDEX(1)

NAME
       hxindex - insert	an index into an HTML document

SYNOPSIS
       hxindex [-t] [-x] [-n|-N] [-f] [-r] [-c class[,class...]] [-b base] [-i
       indexdb]	 [-s  template]	[-u phrase] [-O	element[,element...]] [-X ele-
       ment[,element...]] [--] [file-or-URL]

DESCRIPTION
       The hxindex command looks for terms to be indexed in a document,
       collects them, turns them into target anchors and creates a sorted
       index as an HTML list, which is inserted at the place of a placeholder
       in the document. The resulting document is written to standard output.

       The index is inserted at	the place of a comment of the form

	   <!--index-->

       or between two comments of the form

	   <!--begin-index-->
	   ...
	   <!--end-index-->

       In the latter case, all existing	content	between	the  two  comments  is
       removed first.
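
       The placeholder rules above can be illustrated with a short Python
       sketch. It is not hxindex's own code: the function name insert_index is
       hypothetical, and wrapping the result in begin/end comments is an
       assumption, made so that the output can be processed again.

```python
import re

BEGIN = "<!--begin-index-->"
END = "<!--end-index-->"


def insert_index(document: str, index_html: str) -> str:
    """Replace the index placeholder in `document` with `index_html`."""
    # Assumption: the inserted index is delimited by begin/end comments.
    wrapped = f"{BEGIN}\n{index_html}\n{END}"
    # Case 1: a begin/end pair; everything between the comments is dropped.
    pair = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pair.search(document):
        return pair.sub(lambda match: wrapped, document, count=1)
    # Case 2: a single <!--index--> comment.
    return document.replace("<!--index-->", wrapped, 1)
```

       Calling insert_index() a second time on its own result replaces the
       old list instead of adding a second one.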

       Index  terms are	either elements	of type	<dfn> or elements with a class
       attribute of "index". (For backward compatibility, also	class  attrib-
       utes  "index-inst" and "index-def" are recognized.) <dfn> elements (and
       class "index-def") are considered more  important  than	elements  with
       class "index" and will appear in	bold in	the generated index.

       The option -c adds additional classes that are treated as aliases for
       "index".

       By  default,  the  contents of the element are taken as the index term.
       Here are	two examples of	occurrences of the index term "shoe":

	   A <dfn>shoe</dfn> is	a piece	of clothing that...
	   completed by	a leather <span	class="index">shoe</span>...

       If the term to be indexed is not	equal to the contents of the  element,
       the title attribute can be used to give the correct term:

	   ... <dfn title="shoe">Shoes</dfn> are pieces	of clothing that...
	   ... with two	leather	<span class="index" title="shoe">shoes</span>...

       The  title attribute must also be used when the index term is a subterm
       of another. Subterms appear indented in the  index,  under  their  head
       term.  To  define a subterm, use	a title	attribute with two exclamation
       marks ("!!") between the	term and the subterm, like this:

	   <dfn	title="shoe!!leather">...</dfn>
	   <dfn	title="shoe!!invention of">...</dfn>
	   <em class="index" title="shoe!!protective!!steel nosed">...</em>

       As the last example above shows,	there can be multiple levels  of  sub-
       subterms.

       The  title  attribute also allows multiple index	terms to be associated
       with a single occurrence. The multiple terms are	separated with a  ver-
       tical bar ("|").	Compare	the following examples with the	ones above:

	   <dfn	title="shoe|boot">...</dfn>
	   <dfn	title="shoe!!invention of|inventions!!shoe">...</dfn>

       These  two elements both	insert two terms into the index. Note that the
       second example above combines subterms and multiple terms.
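
       As an illustration of the title syntax, the following Python sketch
       splits a title attribute into its terms and subterms. The function
       name parse_title is hypothetical, and the sketch assumes that
       whitespace around the separators is kept as written; hxindex itself
       may treat it differently.

```python
def parse_title(title: str) -> list[list[str]]:
    """Split a title attribute into index terms.

    Each term is returned as a path: the head term first, followed by
    its subterm, sub-subterm, and so on.
    """
    # "|" separates independent terms; "!!" separates the levels of one term.
    return [term.split("!!") for term in title.split("|")]


if __name__ == "__main__":
    for path in parse_title("shoe!!invention of|inventions!!shoe"):
        # Print each path as an indented outline, like the generated index.
        for depth, part in enumerate(path):
            print("  " * depth + part)
```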

       It is possible to run index on a	file that already has  an  index.  The
       old  target  anchors and	the old	index will be removed before being re-
       generated.

OPTIONS
       The following options are supported:

       -t	 By default, hxindex adds an ID	attribute to the element  that
		 contains the occurrence of a term and also inserts an <a> el-
		 ement	inside	it with	a name attribute equal to the ID. This
		 is to allow old browsers that ignore ID attributes,  such  as
		 Netscape  4,  to  find	the target as well. The	-t option sup-
		 presses the <a> element.

       -x	 This option turns on XML syntax conventions:  empty  elements
		 will end in />	instead	of > as	in HTML.  -x implies -t.

       -i indexdb
		 hxindex  can  read an initial index from a file and write the
		 merged	collection of index terms back to that file. This  al-
		 lows  an  index  to  span several documents. The -i option is
		 used to give the name of the file that	contains the index.

       -b base	 This option is	useful in combination with -i to give the base
		 URL reference of the document.	By default, hxindex will store
		 links to occurrences in the indexdb file in the form #anchor,
		 but when -b is	given, the links will  look  like  base#anchor
		 instead.

		 When used in combination with -n, the title attributes	of the
		 links	will  contain  the title of the	document that contains
		 the term. The title is	inserted before	the template (see  op-
		 tion  -s)  and	 separated  from  it with a comma and a	space.
		 E.g., if hxindex is called with

		     hxindex -i termdb -n -b myfile.html myfile.html

		 and the termdb already contains an entry for "foo" in section
		 "3.1" of a document called "file2.html" with title "The foos",
		 then the generated index will contain an entry such as this:

		     foo, <a href="file2.html#foo"
		       title="The foos,	section	3.1">3.1</a>
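
		 The way such a link is put together can be sketched in
		 Python. This is only an illustration of the rules described
		 here, not hxindex's implementation; the function index_link
		 and its parameter names are hypothetical.

```python
from html import escape


def index_link(anchor, secno, doc_title=None, base="", template="section %s"):
    """Build one index link in the style produced with option -n."""
    # Without a base the link is "#anchor"; with one it is "base#anchor".
    href = f"{base}#{anchor}"
    # The template's %s placeholder is replaced by the section number.
    title = template.replace("%s", secno, 1)
    # A document title, if known, goes first, followed by ", ".
    if doc_title is not None:
        title = f"{doc_title}, {title}"
    return f'<a href="{escape(href)}" title="{escape(title)}">{escape(secno)}</a>'


if __name__ == "__main__":
    print(index_link("foo", "3.1", doc_title="The foos", base="file2.html"))
```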

       -c class,class,...
		 Normal	 index	terms are recognized because they have a class
		 of "index". The -c option  adds  additional,  comma-separated
		 class	names  that  will  be  considered aliases for "index".
		 E.g., -c  instance  will  make	 sure  that  <span  class="in-
		 stance">term</span> is	recognized as a	term for the index.

       -n	 By default, the index consists of links with "#" as the
		 anchor text. Option -n causes the link text to consist of
		 the section numbers of the sections in which the terms
		 occur, falling back to a placeholder phrase (see option -u
		 below) if no section number could be found. Section numbers
		 are found by looking for the nearest preceding start tag
		 with a class of "secno" or "no-num". In the case of "secno",
		 the contents of that element are taken as the section
		 number. In the case of "no-num", the section is assumed to
		 have no number and hxindex prints the placeholder phrase
		 instead. These classes are also used by hxnum(1), so it is
		 useful to run hxindex after hxnum, e.g.,

		     hxnum myfile.html | hxindex -n >mynewfile.html
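
		 The lookup of the nearest preceding section number can be
		 sketched with Python's html.parser module. This is a
		 simplified illustration, not hxindex's code: the class and
		 function names are hypothetical, and it assumes that
		 "secno" elements and index terms contain plain text only.

```python
from html.parser import HTMLParser


class SecnoTracker(HTMLParser):
    """Pair each index term with the section number in effect before it."""

    def __init__(self, unknown="??"):
        super().__init__()
        self.unknown = unknown  # phrase used when there is no number
        self.current = None     # most recent section number, if any
        self.mode = None        # "secno" or "term" while collecting text
        self.text = ""
        self.found = []         # (term, section number or phrase) pairs

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "secno" in classes:
            self.mode, self.text = "secno", ""
        elif "no-num" in classes:
            self.current = None  # this section has no number
        elif tag == "dfn" or "index" in classes:
            self.mode, self.text = "term", ""

    def handle_data(self, data):
        if self.mode:
            self.text += data

    def handle_endtag(self, tag):
        if self.mode == "secno":
            self.current = self.text.strip()
        elif self.mode == "term":
            self.found.append((self.text, self.current or self.unknown))
        self.mode = None


def section_numbers(html, unknown="??"):
    tracker = SecnoTracker(unknown)
    tracker.feed(html)
    tracker.close()
    return tracker.found
```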

       -N	 With this option, the anchor text of the links	in  the	 index
		 is  the  full	title of the section in	which the term occurs.
		 The title of the section is the nearest preceding H1, H2, H3,
		 H4, H5	or H6 element, or the document's title if there	is  no
		 preceding H* element. This option and -n are mutually
		 exclusive; if both are given, the last one specified wins.

       -s template
		 When option -n	is used, the link will have a title  attribute
		 and  the template determines what it contains.	The default is
		 "section %s", where the %s is a placeholder for  the  section
		 number.  In  other words, the index will contain entries like
		 this:

		     term, <a href="#term" title="section 7.8">7.8</a>

		 Some examples:

		     hxindex -n	-s 'chapter %s'
		     hxindex -n	-s 'part %s'
		     hxindex -n	-s 'hoofdstuk %s' -u 'zonder nummer'

		 This option is only useful in combination with -n.

       -u phrase When option -n	is used	to display section numbers, references
		 for which no section number can be found are shown as	phrase
		 instead. The default is "??".

		 This option is only useful in combination with -n.

       -f	 Remove title attributes that were used for the index as well
		 as the comments that delimit the inserted index. This
		 prevents browsers from displaying these attributes. Note
		 that hxindex cannot be run again on its own output if this
		 option is used. (Mnemonic: "freeze" or "final".)

       -r	 Do not	ignore trailing	punctuation when sorting index	terms.
		 E.g., if two terms are	written	as

		     <dfn>foo,</dfn>...	<span class=index>foo</span>

		 hxindex  will normally	ignore the comma and treat them	as the
		 same term, but	with -r, they are treated as  different.  This
		 affects trailing commas (,), semicolons (;), colons (:),
		 exclamation marks (!), question marks (?) and full stops (.).
		 A  final  full	stop is	never ignored if there are two or more
		 in the	term, to protect abbreviations ("B.C.")	 and  ellipsis
		 ("more...").  This  does  not	affect	how  the index term is
		 printed (it is	always printed as it  appears  in  the	text),
		 only how it is	compared to similar terms. (Mnemonic: "raw".)
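
		 The comparison rule can be written down as a small Python
		 function. It is a sketch of the behaviour described above,
		 not the actual implementation; the name comparison_key is
		 hypothetical, and it assumes that at most one trailing
		 punctuation character is ignored.

```python
TRAILING = ",;:!?."


def comparison_key(term: str, raw: bool = False) -> str:
    """Return the form of `term` that is used when comparing terms."""
    # With raw=True (option -r) the term is compared exactly as written.
    if raw or not term or term[-1] not in TRAILING:
        return term
    # A final full stop stays if the term has two or more full stops,
    # which protects abbreviations ("B.C.") and ellipses ("more...").
    if term[-1] == "." and term.count(".") >= 2:
        return term
    return term[:-1]
```

		 With this rule, "foo," and "foo" share one key, while
		 "B.C." keeps its final full stop.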

       -O element,element,...
		 If  -O	is present, only elements with the given names will be
		 indexed. E.g.,

		     hxindex -O	span,i,em

		 means that hxindex will  only	look  for  class="index"  (and
		 other	classes,  according to -c) on the elements span, i and
		 em.  The argument of -O must be a comma-separated list	of el-
		 ement names.  Note that this does not affect the element dfn.
		 It will always	be indexed as a	defining instance.

       -X element,element,...
		 The option -X excludes	the given elements from	being indexed.
		 E.g.,

		     hxindex -X	ul,ol

		 makes sure that ul and	ol elements are	not indexed,  even  if
		 they  have  a	class="index" attribute. This does not exclude
		 their children	from being indexed. E.g.,

		     <ul class=index>
		      <li class=index>foo
		      <li class=index>bar
		      <li>baz
		     </ul>

		 will add foo and bar to the index, but	not the	whole  content
		 of the	ul element (foo	bar baz).  If both -O and -X are given
		 and  an  element occurs in both options, it will be excluded.
		 E.g.,

		     hxindex -X	p,h1,ul	-O em,span,h1,h2

		 will cause hxindex to only look for class attributes  on  em,
		 span and h2, because h1 is excluded.
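
		 The interaction of -O, -X and -c can be summarized in a
		 Python sketch. The function is_candidate and its parameters
		 are hypothetical. Whether -X also applies to dfn elements
		 is not stated here; the sketch assumes that it does.

```python
def is_candidate(element, classes, only=None, exclude=(), aliases=()):
    """Decide whether an element is looked at as an index term.

    only    -- element names given with -O, or None if -O is absent
    exclude -- element names given with -X
    aliases -- extra class names given with -c
    """
    # Exclusion wins over everything else (assumed to include dfn).
    if element in exclude:
        return False
    # dfn is always a defining instance, whatever -O says.
    if element == "dfn":
        return True
    if only is not None and element not in only:
        return False
    wanted = {"index", "index-inst", "index-def", *aliases}
    return any(name in wanted for name in classes)


if __name__ == "__main__":
    only = {"em", "span", "h1", "h2"}
    exclude = {"p", "h1", "ul"}
    for name in ("em", "span", "h1", "h2"):
        print(name, is_candidate(name, ["index"], only, exclude))
```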

OPERANDS
       The following operand is	supported:

       file-or-URL
		 The name of an	HTML or	XML file or the	URL of one. If absent,
		 or if the file	is "-",	standard input is read instead.

EXIT STATUS
       The following exit values are returned:

       0	 Successful completion.

       >0	 An error occurred in parsing the HTML file.

ENVIRONMENT
       The  input is assumed to	be in UTF-8, but the current locale is used to
       determine the sorting order of the index	terms. I.e., hxindex looks  at
       the  LANG,  LC_ALL  and/or  LC_COLLATE  environment  variables. See lo-
       cale(1).
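
       The effect of the collation locale on the order of terms can be
       explored with Python's locale module, which relies on the same kind
       of locale data. This is only an analogy, not hxindex's own sorting
       code; the function name sort_terms is hypothetical.

```python
import locale


def sort_terms(terms, collate_locale=""):
    """Sort terms by the collation rules of the given locale.

    An empty locale name means: take it from the environment (LC_ALL,
    LC_COLLATE or LANG), which is also where hxindex looks.
    """
    saved = locale.setlocale(locale.LC_COLLATE)
    locale.setlocale(locale.LC_COLLATE, collate_locale)
    try:
        return sorted(terms, key=locale.strxfrm)
    finally:
        # Restore the previous setting so that callers are not affected.
        locale.setlocale(locale.LC_COLLATE, saved)
```

       In the "C" locale, terms sort by character code, so all capitalized
       terms come before lowercase ones; most language-specific locales
       interleave them instead.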

       To use a	proxy to retrieve remote files,	set the	environment  variables
       http_proxy or ftp_proxy.	 E.g., http_proxy="http://localhost:8080/"

BUGS
       Assumes	UTF-8  as input. Doesn't expand	character entities (apart from
       the standard ones: "&amp;", "&lt;", "&gt;" and "&quot;"). Instead, pipe
       the  input  through hxunent(1) and, if needed, asc2xml(1) to convert it
       to UTF-8.

       Remote files (specified with a URL) are currently  only	supported  for
       HTTP.  Password-protected  files	or files that depend on	HTTP "cookies"
       are not handled.	(You can use tools such	as curl(1) or wget(1)  to  re-
       trieve such files.)

       The  accessibility  of an index,	even when generated with option	-n, is
       poor.

SEE ALSO
       asc2xml(1), hxnormalize(1), hxnum(1), hxprune(1), hxtoc(1), hxunent(1),
       xml2asc(1), locale(1), UTF-8 (RFC 2279)

7.x				  10 Jul 2011			    HXINDEX(1)
