MU INDEX(1)		    General Commands Manual		   MU INDEX(1)

NAME
       mu-index	- index	e-mail messages	stored in Maildirs

SYNOPSIS
       mu [COMMON-OPTIONS] index

DESCRIPTION
       mu  index is the	mu command for scanning	the messages in	Maildir	direc-
       tories and storing the results in a Xapian database. The	data can  then
       be queried using	mu-find(1).

       Before  you  can	 run mu	index, you must	initialize mu and its database
       with mu-init(1).

       mu index understands Maildirs as defined by Daniel Bernstein for
       qmail(7). In addition, it understands recursive Maildirs (Maildirs
       within Maildirs) and Maildir++. It also supports VFAT-based Maildirs,
       which use "!" or ";" as the separator instead of the standard ":".

       E-mail messages are only	considered for indexing	if they	 reside	 in  a
       directory  named	 either	 cur  or new. The special (as per the Maildir-
       specification) directory	tmp is ignored,	as are some cache  directories
       for some	other mail-clients. Other directories are scanned recursively.
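
       As a rough illustration of the rule above, the following sketch lists
       the files that would qualify, using only find(1). It is not how mu
       itself scans: the helper name list_candidates and the example tree are
       made up, and neither the cache-directory exclusions nor .noindex files
       are modelled.

```shell
# list_candidates ROOT: print files whose parent directory is named "cur" or
# "new". Symlinks are followed (-L) and "tmp" directories are pruned.
# Assumes directory names contain no newlines.
list_candidates() {
    find -L "$1" -type d -name tmp -prune -o \
        -type d \( -name cur -o -name new \) -print |
    while IFS= read -r d; do
        find -L "$d" -maxdepth 1 -type f
    done
}

# Build a throwaway example tree (all names are made up).
demo=$(mktemp -d)
mkdir -p "$demo/inbox/cur" "$demo/inbox/new" "$demo/inbox/tmp" \
         "$demo/inbox/.lists/cur"
touch "$demo/inbox/cur/msg1" "$demo/inbox/new/msg2" \
      "$demo/inbox/tmp/partial" "$demo/inbox/.lists/cur/msg3" \
      "$demo/inbox/notes.txt"

list_candidates "$demo" | sort
```

       In this example tree, only msg1, msg2 and msg3 are listed; the file in
       tmp and the stray notes.txt are not.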

       Symbolic	 links	are  followed,	and the	directories can	be spread over
       multiple	filesystems; however, note that	moving files  around  is  much
       faster  when  they  all	reside	on a single file-system. Be careful to
       avoid self-referential symlinks!

       If there	is a file called .noindex in a directory, the contents of that
       directory and all of its	subdirectories will be ignored.	 This  can  be
       useful  to  exclude  certain directories	from the indexing process, for
       example directories with	spam-messages.

       If there is a file called .noupdate in a directory, the contents of
       that directory and all of its subdirectories will be ignored. This can
       be useful to speed things up if you have some maildirs that never
       change.

       .noupdate does not  affect  already-indexed  messages:  you  can	 still
       search  for  them. .noupdate is ignored when you	start indexing with an
       empty database (such as directly	after mu init).
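
       A minimal sketch of placing the two marker files. The variable MAILROOT
       and the folder names Spam and Archive-2010 are assumptions chosen for
       illustration; when MAILROOT is unset, the snippet works in a scratch
       directory so that it can be tried safely.

```shell
# Hypothetical layout: Spam should never be indexed, and Archive-2010 never
# changes. Both folder names are examples only.
MAILROOT="${MAILROOT:-$(mktemp -d)}"   # falls back to a scratch directory
mkdir -p "$MAILROOT/Spam/cur" "$MAILROOT/Archive-2010/cur"

touch "$MAILROOT/Spam/.noindex"            # always skip this subtree
touch "$MAILROOT/Archive-2010/.noupdate"   # skip it when updating

# Show which directories currently carry a marker file.
find "$MAILROOT" \( -name .noindex -o -name .noupdate \) -print
```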

       There is also the option --lazy-check, which can greatly speed up
       indexing; see below for details.

       A first run of mu index may take a few minutes if you have a lot of
       mail (tens or hundreds of thousands of messages). Fortunately, such a
       full scan needs to be done only rarely; after that, it suffices to
       index just the changes, which is much faster. See PERFORMANCE below
       for more information.

       The optional cleanup phase of the indexing process removes messages
       from the database for which there is no longer a corresponding file in
       the Maildir. If you do not want this, you can use -n, --nocleanup.

       When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM
       (e.g., when you press Ctrl-C during the indexing process), it attempts
       to shut down gracefully: it saves and commits data, closes the
       database, and so on. If it receives another signal (e.g., when you
       press Ctrl-C once more), mu index terminates immediately.

INDEX OPTIONS
   --lazy-check
       In lazy-check mode, mu skips messages in directories whose time-stamp
       (ctime) has not changed since the previous time the directory was
       checked.

       This is much faster than the non-lazy check, but it won't update
       messages that have changed (rather than having been added or removed),
       since merely editing a message does not update the directory
       time-stamp. Of course, you can run mu index occasionally without
       --lazy-check to pick up such messages.

       Furthermore, in lazy-check mode, files with a ctime earlier than the
       time the previous indexing operation completed are ignored.
       This  helps  for	 the  use-case	where  new  messages can appear	in big
       maildirs.
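
       The directory time-stamp behaviour that lazy-check relies on can be
       observed with a small experiment. This sketch does not call mu and does
       not reproduce its implementation. The helper dir_changed and the stamp
       file, which stands in for "the previous time this directory was
       checked", are assumptions. It uses find's -cnewer test, which is
       available in GNU and BSD find but is not required by POSIX.

```shell
dir=$(mktemp -d)
mkdir "$dir/cur"
echo "original" > "$dir/cur/msg1"
sleep 1
stamp=$(mktemp)     # stands in for the time of the previous check
sleep 1

# dir_changed DIR: succeed if DIR's ctime is newer than the stamp file.
dir_changed() {
    [ -n "$(find "$1" -maxdepth 0 -cnewer "$stamp")" ]
}

# Editing a message in place leaves the directory itself untouched.
echo "edited" >> "$dir/cur/msg1"
if dir_changed "$dir/cur"; then echo "after edit: rescan"; else echo "after edit: skipped"; fi

# Adding a message changes the directory, so it would be rescanned.
touch "$dir/cur/msg2"
if dir_changed "$dir/cur"; then echo "after add: rescan"; else echo "after add: skipped"; fi
```

       On a typical filesystem this should report the directory as skipped
       after the edit and as needing a rescan after the new file is added,
       which is why an occasional run without --lazy-check is needed to pick
       up edited messages.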

   --nocleanup
       Disable the database cleanup that mu does by default after indexing.

   --reindex
       Perform a complete reindexing of	all the	messages in the	maildir.

   --muhome
       Use a non-default directory to store and	read the database,  write  the
       logs,  etc.   By	 default, mu uses the XDG Base Directory Specification
       (e.g. on	GNU/Linux this defaults	to ~/.cache/mu and ~/.config/mu). Ear-
       lier  versions  of  mu  defaulted  to   ~/.mu,	which	now   requires
       --muhome=~/.mu.

       The environment variable MUHOME can be used as an alternative to
       --muhome. If both are set, --muhome takes precedence.
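
       The precedence described above can be modelled in a few lines of
       shell. This is only an illustration of the documented rule, not code
       taken from mu; the function name effective_muhome and the example
       paths are hypothetical, and treating an empty MUHOME as unset is an
       assumption.

```shell
# effective_muhome [CLI_VALUE]: print the directory that wins under the
# documented precedence: the --muhome value first, then $MUHOME. An empty
# line means neither is set, in which case mu uses its XDG default locations.
effective_muhome() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${MUHOME:-}" ]; then
        printf '%s\n' "$MUHOME"
    else
        printf '\n'
    fi
}

MUHOME=/tmp/mu-env effective_muhome /tmp/mu-cli   # prints /tmp/mu-cli
MUHOME=/tmp/mu-env effective_muhome               # prints /tmp/mu-env
```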

COMMON OPTIONS
   -d, --debug
       Makes mu	generate extra debug information,  useful  for	debugging  the
       program	itself.	 Debug	information goes to the	standard logging loca-
       tion; see mu(1).

   -q, --quiet
       Causes mu not to	output informational messages and progress information
       to standard output, but only to the log file. Error messages will still
       be sent to standard error. Note that  mu	 index	is  much  faster  with
       --quiet,	 so  it	 is recommended	you use	this option when using mu from
       scripts etc.

   --log-stderr
       Causes mu to output log messages to standard error, in addition to
       sending them to the standard logging location.

   --nocolor
       Do  not	use ANSI colors. The environment variable NO_COLOR can be used
       as an alternative to --nocolor.

   -V, --version
       Prints mu version and copyright information.

   -h, --help
       Lists the various command line options.

ENCRYPTION
       mu index	does not decrypt messages, and	only  the  metadata  (such  as
       headers)	 of  encrypted	messages makes it to the database. mu view and
       mu4e can	decrypt	messages, but those work with the message directly and
       the information is not added to the database.

PERFORMANCE
   indexing in ancient times (2009?)
       As a non-scientific benchmark, a	simple test on the author's machine (a
       Thinkpad	X61s laptop using Linux	2.6.35 and an ext3 file	 system)  with
       no existing database, and a maildir with	27273 messages:

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	time mu	index --quiet
	      66,65s user 6,05s	system 27% cpu 4:24,20 total

       (about 103 messages per second)

       A  second run, which is the more	typical	use case when there is a data-
       base already, goes much faster:

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	time mu	index --quiet
	      0,48s user 0,76s system 10% cpu 11,796 total

       (more than 56818 messages per second of user CPU time; about 2312 per
       second of wall-clock time)

       Note that each test flushes the caches first; a more  common  use  case
       might  be to run	mu index when new mail has arrived; the	cache may stay
       quite `warm' in that case:

	      $	time mu	index --quiet
	      0,33s user 0,40s system 80% cpu 0,905 total

       which is	more than 30000	messages per second.
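
       The throughput figures in this subsection can be re-derived from the
       message count and the time(1) output. The sketch below assumes awk is
       available; the helper name rate is made up. Note that the result
       depends on which time is used as the divisor: wall-clock ("total") or
       user CPU time.

```shell
# rate MESSAGES SECONDS: messages per second, rounded to a whole number.
rate() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.0f\n", n / s }'
}

rate 27273 264.20   # first run, wall-clock 4:24,20  -> 103
rate 27273 11.796   # second run, wall-clock         -> 2312
rate 27273 0.48     # second run, user CPU time      -> 56819
rate 27273 0.905    # warm cache, wall-clock         -> 30136
```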

   indexing in 2012
       As of June 2012, we ran the same non-scientific benchmark, this time
       with an Intel i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir
       with 22589 messages. We start without an existing database.

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	time mu	index --quiet
	      27,79s user 2,17s	system 48% cpu 1:01,47 total

       (about 813 messages per second of user CPU time; about 367 per second
       of wall-clock time)

       A second	run, which is the more typical use case	when there is a	 data-
       base already, goes much faster:

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	time mu	index --quiet
	      0,13s user 0,30s system 19% cpu 2,162 total

       (more than 173000 messages per second of user CPU time; about 10448
       per second of wall-clock time)

   indexing in 2016
       As of July 2016, we ran the same non-scientific benchmark, again with
       the Intel i5-2500 CPU @ 3.30GHz and an ext4 file system. This time, the
       maildir contains 72525 messages.

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	time mu	index --quiet
	      40,34s user 2,56s	system 64% cpu 1:06,17 total

       (about 1099 messages per	second).

   indexing in 2022
       A  few  years  later  and it is June 2022. There's a lot	more happening
       during indexing,	but indexing became multi-threaded  and	 machines  are
       faster;	e.g. this is with an AMD Ryzen Threadripper 1950X (16 cores) @
       3.399GHz.

       The instructions	are a little different since we	have a proper  repeat-
       able benchmark now. After building,

	      $	sudo sh	-c 'sync && echo 3 > /proc/sys/vm/drop_caches'
	      $	THREAD_NUM=4 build/lib/tests/bench-indexer -m perf
	      #	random seed: R02Sf5c50e4851ec51adaf301e0e054bd52b
	      1..1
	      #	Start of bench tests
	      #	Start of indexer tests
	      indexed 5000 messages in 20 maildirs in 3763ms; 752 µs/message; 1328 messages/s (4 thread(s))
	      ok 1 /bench/indexer/4-cores
	      #	End of indexer tests
	      #	End of bench tests

       Things are again	a little faster, even though the index does a lot more
       now  (text-normalization,  and  pre-generating message-sexps). A	faster
       machine helps, too!

   recent releases
       Indexing	the same 93000-message mail corpus with	the last few releases:

      +-----------------------------------------------------------------------+
      |	      release	time (sec)   notes				      |
      +-----------------------------------------------------------------------+
      |		  1.4	160s						      |
      |		  1.6	178s						      |
      |		  1.8	97s						      |
      |		 1.10	120s	     adds html indexing, sexp-caching	      |
      |	1.11 (master)	96s	     adds language-guessing, batch-size=50000 |
      +-----------------------------------------------------------------------+

       Quite some variation!

       Over time new features /	refactoring can	change	the  timings  quite  a
       bit. At least for now, the latest code is both the fastest and the most
       feature-rich.

EXIT CODE
       This  command  returns 0	upon successful	completion, or a non-zero exit
       code otherwise.

       0.  success

       2.  no matches found. Try a different query

       11. database schema mismatch. You need to  re-initialize	 mu,  see  mu-
	   init(1)

       19. failed  to acquire lock. Some other program has exclusive access to
	   the mu database

       99. caught an exception
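
       In scripts, the exit codes listed above can be turned into readable
       diagnostics. The following is a sketch only: the function name
       explain_mu_exit is made up, the messages paraphrase the list above,
       and mu index is invoked only if a mu binary is found on the PATH.

```shell
# explain_mu_exit CODE: translate an exit code using the list above.
explain_mu_exit() {
    case "$1" in
        0)  echo "success" ;;
        2)  echo "no matches found" ;;
        11) echo "database schema mismatch; re-initialize with mu init" ;;
        19) echo "failed to acquire lock; another program has the database open" ;;
        99) echo "caught an exception" ;;
        *)  echo "unlisted exit code $1" ;;
    esac
}

# Only attempt a run when mu is actually installed.
if command -v mu >/dev/null 2>&1; then
    mu index --quiet
    status=$?
    [ "$status" -eq 0 ] || echo "mu index: $(explain_mu_exit "$status")" >&2
fi
```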

REPORTING BUGS
       Please report bugs at https://github.com/djcb/mu/issues.

AUTHOR
       Dirk-Jan	C. Binnema <djcb@djcbsoftware.nl>

COPYRIGHT
       This manpage is part of mu 1.12.15.

       Copyright  2008-2026 Dirk-Jan C.	Binnema. License GPLv3+: GNU GPL  ver-
       sion  3	or later https://gnu.org/licenses/gpl.html. This is free soft-
       ware: you are free to change and	redistribute it. There is NO WARRANTY,
       to the extent permitted by law.

SEE ALSO
       maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)

								   MU INDEX(1)
