ezmlm-archive(1)	    General Commands Manual	      ezmlm-archive(1)

NAME
       ezmlm-archive  -	 create	 thread	 and  author  index for	a mailing list
       archive

SYNOPSIS
       ezmlm-archive [ -cCFsSTvV ][ -f msg1 ][ -t msg2 ] dir

DESCRIPTION
       ezmlm-archive reads the index files from	a message archive, and creates
       a subject index,	a collection of	subject	files, and a collection	of au-
       thor files. These files are suitable as an index	for WWW	access to, and
       navigation through a mailing list archive by ezmlm-cgi(1).

       The index files read are created by ezmlm-idx(1) on a per-list basis
       and by ezmlm-send(1) on a per-message basis for an indexed list.

       The output files	created	are:

       dir/archive/threads/yyyymm
	      The  thread  index.  It  contains	one line per subject, starting
	      with the number of the first message with	 that  subject	within
	      the set investigated, ``:'', a 20	character subject hash,	blank,
	      ``[n]''  where  ``n''  is	 the number of messages	in the thread,
	      blank, and the subject.  The file	 ``yyyymm''  contains  entries
	      for  all	threads	 that have messages in the month ``yyyymm'' or
	      that have	messages both before and after that month.   The  sub-
	      ject hash	is a key to the	subject	files; the message number is a
	      key to the index file.  The lines	are in ascending order by mes-
	      sage  number  when  the  index is	created	de novo	on an existing
	      archive. When messages are added one by one, as in normal
	      archive operation, ``n'' is the number of messages in the
	      thread for the particular month, and threads are ordered by
	      their latest message, i.e. the most recently extended thread
	      is shown last. The message number accompanying a thread is
	      always that of a message within the thread: the first message
	      in archives created on existing lists, and the last message
	      in incrementally created archives.  Use the
	      corresponding  subject  index file to get	a list of all messages
	      in the thread in ascending order.
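
       The line layout described above can be read back with a short
       parser. The following is a minimal sketch, not part of ezmlm: the
       names ThreadEntry and parse_thread_line are hypothetical, and the
       sample line (message number, hash and subject) is invented for
       illustration.

```python
import re
from typing import NamedTuple, Optional


class ThreadEntry(NamedTuple):
    msgnum: int        # key to the index file
    subject_hash: str  # 20-character key to the subject files
    count: int         # the "n" in "[n]"
    subject: str


# message number, ":", 20-character hash, blank, "[n]", blank, subject
_THREAD_LINE = re.compile(r"^(\d+):(\S{20}) \[(\d+)\] (.*)$")


def parse_thread_line(line: str) -> Optional[ThreadEntry]:
    """Return the parsed entry, or None if the line does not match."""
    m = _THREAD_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    return ThreadEntry(int(m.group(1)), m.group(2), int(m.group(3)), m.group(4))


# Invented sample line; a real index is read from dir/archive/threads/yyyymm.
sample = "1042:abcdefghijklmnopabcd [3] How do I rebuild the index?"
print(parse_thread_line(sample))
```

       Because the message number may be either the first or the last
       message of a thread, depending on how the index was built, the
       sketch stores it without interpreting it.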

       dir/archive/subjects/xx/yyyyyyyyyyyyyyyyyy
	      A	subject	file. The first	line is	the subject hash, a space, and
	      the subject.  This is followed by	one line per message with this
	      subject, in the format message  number,  ``:'',  date  (yyyymm),
	      ``:'',  author  hash,  blank,  author  from  line. The lines are
	      sorted by	message	number.	The author hash	is a key to the	author
	      files; the message number is a key to the index file.  The file
	      in the example would be for the subject hash
	      ``xxyyyyyyyyyyyyyyyyyy''.

       dir/archive/authors/xx/yyyyyyyyyyyyyyyyyy
	      An author	file. The first	line is	the author hash, a space,  and
	      the  author from line.  This is followed by one line per message
	      with this	author,	in the	format	message	 number,  ``:'',  date
	      (yyyymm),	 ``:'',	 subject  hash,	 blank,	subject. The lines are
	      sorted by message number. The subject hash is a key to the
	      subject files; the message number is a key to the index file.
	      The file in the example would be for the author hash
	      ``xxyyyyyyyyyyyyyyyyyy''.
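
       Subject and author files share the same shape, so one reader can
       handle both. The sketch below is an illustration only: Ref,
       hash_to_relpath and parse_index_file are hypothetical names, and
       the hashes and addresses in the sample are invented. It assumes
       the 20-character hash is split into a 2-character directory and
       an 18-character file name, as the example paths above suggest.

```python
from typing import List, NamedTuple, Tuple


class Ref(NamedTuple):
    msgnum: int    # key to the index file
    yyyymm: str    # date of the message
    key_hash: str  # author hash (subject file) or subject hash (author file)
    text: str      # author from line or subject


def hash_to_relpath(kind: str, h: str) -> str:
    """Map a hash to its path below dir; kind is 'subjects' or 'authors'."""
    if len(h) != 20:
        raise ValueError("expected a 20-character hash")
    return f"archive/{kind}/{h[:2]}/{h[2:]}"


def parse_index_file(text: str) -> Tuple[str, str, List[Ref]]:
    """Split a subject or author file into (own hash, title, entries)."""
    lines = text.splitlines()
    own_hash, _, title = lines[0].partition(" ")
    refs = []
    for line in lines[1:]:
        if not line:
            continue
        # maxsplit=2 keeps any ':' inside the trailing text intact
        msgnum, yyyymm, rest = line.split(":", 2)
        key_hash, _, label = rest.partition(" ")
        refs.append(Ref(int(msgnum), yyyymm, key_hash, label))
    return own_hash, title, refs


# Invented subject file content for illustration.
sample = (
    "abcdefghijklmnopabcd How do I rebuild the index?\n"
    "1042:200401:ponmlkjihgfedcbaponm Alice Example <alice@example.org>\n"
    "1045:200401:aaaabbbbccccddddeeee Bob Example <bob@example.org>\n"
)
own_hash, title, refs = parse_index_file(sample)
print(hash_to_relpath("subjects", own_hash))
for ref in refs:
    print(ref.msgnum, hash_to_relpath("authors", ref.key_hash))
```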

       dir/archnum keeps track of the last message processed. Normally,
       ezmlm-archive processes entries for messages from one above the
       number stored in this file up to and including the message number
       in dir/num.

OPTIONS
       ezmlm-archive  writes messages in a crash-proof manner when run in nor-
       mal mode. When overriding the normal message range with any of the  op-
       tions  listed, the normal sync(3) of the	output files is	suppressed for
       efficiency. Should the computer crash during this time the state	of the
       indices is not defined. Use the -s option in the	(extremely rare) cases
       where this would	be a problem.

       -c     Create a new index. This	overrides dir/archnum  causing	ezmlm-
	      archive  to start	with the first message in the archive. Synonym
	      for -f0.	NOTE: ezmlm-archive does not remove files in  the  in-
	      dex. While it will overwrite/update old files it will not	remove
	      files that are obsolete for other	reasons.

       -C     (Default.)   Process entries starting with the message after the
	      message listed in	dir/archnum.

       -f msg1
	      Process messages from the	archive	section	(set of	100  messages)
	      containing  message  msg1.   This	 is useful if you have removed
	      part of the archive, as it will shorten processing time and  de-
	      crease memory use.  NOTE:	ezmlm-archive does not remove files in
	      the  index. While	it will	overwrite/update old files it will not
	      remove files that are obsolete for other reasons. The number of
	      messages per thread will be incorrect when use of the -f and
	      -t switches leads to partial re-indexing of already indexed
	      messages.

       -F     (Default.)  Do not change	the starting message from the  default
	      (see -C).

       -s     Always sync files.

       -S     (Default.)  Sync files, except when one of the message range
	      modifying options is used.

       -t msg2
	      Process  messages	to message msg2	instead	of the last message in
	      the archive. Again, files that are written are updated, but
	      other files are not explicitly removed.

       -T     (Default.)  Process entries for messages up to the last  message
	      in the archive.

       -v     Display ezmlm-archive version info.

       -V     Display ezmlm-archive version info.
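
       The default message range and the effect of -c, -f and -t can be
       summarized in a few lines. This is a sketch of the behavior as
       described on this page, not ezmlm-archive's own code:
       message_range and its parameters are hypothetical names, and the
       sketch assumes that archive sections are aligned on multiples of
       100.

```python
from typing import Optional, Tuple

SECTION_SIZE = 100  # an archive section is a set of 100 messages


def message_range(
    archnum: int,
    num: int,
    create: bool = False,            # -c, a synonym for -f0
    from_msg: Optional[int] = None,  # -f msg1
    to_msg: Optional[int] = None,    # -t msg2
) -> Tuple[int, int]:
    """Return the inclusive (first, last) message numbers to process."""
    if create:
        from_msg = 0
    if from_msg is None:
        # Default (-C, -F): start one above the contents of dir/archnum.
        first = archnum + 1
    else:
        # Assumption: start at the beginning of the section holding msg1.
        first = (from_msg // SECTION_SIZE) * SECTION_SIZE
    # Default (-T): stop at the message number in dir/num.
    last = num if to_msg is None else to_msg
    return first, last


print(message_range(1234, 1240))                 # normal incremental run
print(message_range(1234, 1240, from_msg=1150))  # -f 1150
print(message_range(1234, 1240, create=True))    # -c
```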

MEMORY USAGE
       ezmlm-archive stores its linked lists in memory. On a 32-bit
       architecture, it uses 12 bytes per message, 28 bytes per thread
       (plus one copy of the subject), and 20 bytes per author (plus one
       copy of the author from line).

       In normal list use, it processes at most a few messages at a time,
       but  for	initial	processing of a	large archive, considerable amounts of
       memory may be used. Assuming 40 bytes for subject/from line, 5 messages
       per thread, 100,000 messages, and 1000 authors, this  is	 2.5  MB.  For
       1,000,000 messages this is about	20 MB.
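
       The estimate can be reproduced from the per-item sizes given
       above. The function below is a rough sketch under the stated
       assumptions (32-bit sizes, 40 bytes of text per subject or from
       line, 5 messages per thread, 1000 authors); estimate_bytes is a
       hypothetical name. With these inputs the formula gives about 2.6
       MB for 100,000 messages, and somewhat more than 20 MB for
       1,000,000 messages if the author count is held at 1000, so the
       figures quoted above are best read as orders of magnitude.

```python
def estimate_bytes(
    messages: int,
    msgs_per_thread: int = 5,
    authors: int = 1000,
    text_bytes: int = 40,
) -> int:
    """Approximate memory use in bytes on a 32-bit architecture."""
    threads = messages // msgs_per_thread
    per_message = 12 * messages
    per_thread = (28 + text_bytes) * threads  # plus one copy of the subject
    per_author = (20 + text_bytes) * authors  # plus one copy of the from line
    return per_message + per_thread + per_author


for n in (100_000, 1_000_000):
    print(n, "messages:", round(estimate_bytes(n) / 1e6, 2), "MB")
```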

       Thus,  for  large  archives,  it	 may be	useful to use the -t switch to
       process the archive in multiple subsets,	starting with e.g.  the	 first
       100,000,	then the next, and so on.
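
       One way to plan such a staged run is to compute the -t end points
       first. The sketch below only prints the commands; it does not run
       them. batch_endpoints, the list directory and the archive size
       are hypothetical, and the sketch assumes that each run continues
       from the position left in dir/archnum by the previous one.

```python
from typing import List


def batch_endpoints(last_msg: int, step: int = 100_000) -> List[int]:
    """Return the -t values for successive runs, ending at last_msg."""
    ends = list(range(step, last_msg, step))
    ends.append(last_msg)
    return ends


list_dir = "/var/lists/mylist"  # hypothetical list directory
for end in batch_endpoints(250_000):
    argv = ["ezmlm-archive", "-t", str(end), list_dir]
    print(" ".join(argv))
```

       Keeping each subset to a fixed number of messages bounds the
       memory used by any single run.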

SEE ALSO
       ezmlm-cgi(1), ezmlm-idx(1), ezmlm-send(1), ezmlm(5)

