DICTD(8)							      DICTD(8)

NAME
       dictd - a dictionary database server

SYNOPSIS
       dictd [options]

DESCRIPTION
       dictd  is  a  server  for  the Dictionary Server	Protocol (DICT), a TCP
       transaction based query/response	protocol that allows a client  to  ac-
       cess  dictionary	 definitions from a set	of natural language dictionary
       databases.

       For security reasons, dictd drops root permissions after startup.  If
       user dictd exists on the system, the daemon will run as that user and
       group dictd; otherwise it will run as user nobody, group nobody or
       nogroup (depending on the operating system distribution).

       Since  startup  time is significant, the	server is designed to run con-
       tinuously, and should not be run	from inetd(8).	(However, with a  fast
       processor, it is	feasible to do so.)

       Databases are distributed separately from the server.

       By default, dictd assumes that the index files are sorted
       alphabetically, and that only alphanumeric characters from the 7-bit
       ASCII character set are used for search.  This default may be
       overridden by a header in the data file.  The only such headers
       implemented at this time are "00-database-allchars", which tells dictd
       that non-alphanumeric characters may also be used for search;
       "00-database-utf8", which indicates that the database uses UTF-8
       encoding; and "00-database-8bit-new", which indicates that the
       database is encoded and sorted according to a locale that uses an
       8-bit encoding.

BACKGROUND
       For many	years, the Internet community has relied on the	"webster" pro-
       tocol for access	to natural language definitions.  The webster protocol
       supports	access to a single dictionary and  (optionally)	 to  a	single
       thesaurus.   In	recent years, the number of publicly available webster
       servers on the Internet has dramatically	decreased.

       Fortunately, several  freely-distributable  dictionaries	 and  lexicons
       have recently become available on the Internet.	However, these freely-
       distributable databases are not accessible via a	uniform	interface, and
       are not accessible from a single	site.  They are	often small and	incom-
       plete  individually,  but would collectively provide an interesting and
       useful database of English words.  Examples include  the	 Jargon	 file,
       the  WordNet  database,	MICRA's	 version of the	1913 Webster's Revised
       Unabridged Dictionary, and the Free  Online  Dictionary	of  Computing.
       (See  the DICT protocol specification (RFC) for references.)  Translat-
       ing and non-English dictionaries	are also becoming available (for exam-
       ple, the	FOLDOC dictionary is being translated into Spanish).

       The webster protocol is not suitable for	providing access  to  a	 large
       number  of separate dictionary databases, and extensions	to the current
       webster protocol	were not felt to be a clean solution to	the dictionary
       database	problem.

       The DICT	protocol is designed to	provide	access to multiple  databases.
       Word  definitions can be	requested, the word index can be searched (us-
       ing an easily extended set of algorithms), information about the	server
       can be provided (e.g., which index search strategies are	supported,  or
       which databases are available), and information about a database	can be
       provided	 (e.g.,	 copyright,  citation,	or  distribution information).
       Further,	the DICT protocol has hooks that can be	used to	 restrict  ac-
       cess to some or all of the databases.
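
       The request/response exchange described above can be sketched from
       the client side.  This is a minimal illustration, not part of dictd:
       the helper names dict_command and parse_status are hypothetical, and
       the wire details (CRLF line endings, double-quoting of arguments that
       contain spaces, a numeric status code at the start of each response)
       are assumptions taken from RFC 2229 that should be checked against
       that document.

```python
def dict_command(*words):
    """Join command words into one CRLF-terminated request line.

    Assumption: arguments containing whitespace are sent in double quotes.
    """
    parts = []
    for word in words:
        if any(ch.isspace() for ch in word):
            word = '"' + word.replace('"', '\\"') + '"'
        parts.append(word)
    return (" ".join(parts) + "\r\n").encode("utf-8")


def parse_status(line):
    """Split a response status line into (numeric code, remaining text)."""
    code, _, text = line.strip().partition(" ")
    return int(code), text
```

       For example, dict_command("DEFINE", "wn", "dictionary") builds a
       request for one word in the database named wn, and parse_status can
       be applied to the first line the server sends back.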

       dictd(8)	 is  a	server that implements the DICT	protocol.  Bret	Martin
       implemented another server, and several people (including Bret and  my-
       self) have implemented clients in a variety of languages.

OPTIONS
       -V or --version
	      Display version information.

       --license
	      Display copyright	and license information.

       -h or --help
	      Display help information.

       -v or --verbose or  -dverbose
	      Be verbose.

       -c file or --config file
	      Specify the configuration file.  The default is
	      /usr/local/etc/dictd.conf, but this may be changed in the defs.h
	      file at compile time (DICTD_CONFIG_FILE).

       -p port or --port port
	      Overrides	the keyword port in Global Settings Specification sec-
	      tion of configuration file.

       -i or --inetd
	      Communicate  on standard input/output, suitable for use from in-
	      etd.  Although, due to its rather	large startup time, this  dae-
	      mon was not intended to run from inetd, with a fast processor it
	      is feasible to do	so. This option	also implies --fast-start.

       --pp prog
	      Sets a preprocessor for the configuration file, such as m4 or
	      cpp.  See the examples/dictd_complex.conf file from the
	      distribution.  By default, the configuration file is parsed
	      without a preprocessor.

       --depth length
	      Overrides	 the  keyword  depth  in Global	Settings Specification
	      section of configuration file.

       --delay seconds
	      Overrides	the keyword delay  in  Global  Settings	 Specification
	      section of configuration file.

       --facility facility
	      The  same	as syslog_facility keyword in Global Settings Specifi-
	      cation of	configuration files.

       -f or --force
	      Force the	daemon to start	even if	an instance of the  daemon  is
	      already  running.	 (This is of little value unless a non-default
	      port is specified	with -p, since,	if one instance	is bound to  a
	      port, the	second one fails when it can not bind to the port.)

       --limit children
	      Overrides	 the  keyword  limit  in Global	Settings Specification
	      section of configuration file.

       --listen-to host
	      Overrides	the keyword listen_to in Global	Settings Specification
	      section of configuration file.

       --address-family	family
	      Overrides	the keyword address_family in Global Settings Specifi-
	      cation section of	configuration file.

       --locale	locale
	      Overrides	the keyword locale in  Global  Settings	 Specification
	      section of configuration file.

       -s     The  same	 as syslog keyword in Global Settings Specification of
	      configuration files.

       -L file or --logfile file
	      The same as log_file keyword in Global Settings Specification of
	      configuration files.

       --pid-file file
	      The same as pid_file keyword in Global Settings Specification of
	      configuration files.

       -m minutes  or --mark minutes
	      Overrides	the keyword timestamp in Global	Settings Specification
	      section of configuration file.

       --default-strategy strategy
	      Overrides	the keyword default_strategy in	Global Settings	Speci-
	      fication section of configuration	file.

       --without-strategy strat1,strat2,...
	      The same as without_strategy keyword in Global Settings Specifi-
	      cation of	configuration files.

       --add-strategy strategy_name:description
	      The same as add_strategy keyword in Global  Settings  Specifica-
	      tion of configuration files.

       --fast-start
	      The  same	as fast_start keyword in Global	Settings Specification
	      of configuration files.

       --without-mmap
	      The same as without_mmap keyword in Global  Settings  Specifica-
	      tion of configuration files.

       --stdin2stdout
	      When  applied  with --inetd, each	command	obtained from stdin is
	      output to	stdout.	This option is useful for debugging.

       -l option or --log option
	      The same as log_option keyword in	Global Settings	 Specification
	      of configuration files.

       -d option
	      The  same	 as debug_option keyword in Global Settings Specifica-
	      tion of configuration files.

CONFIGURATION FILE
       Introduction
	      The configuration	file defaults to /usr/local/etc/dictd.conf but
	      can be specified on the command line with	 the  -c  option  (see
	      above).

	      The  configuration  file	is read	into memory at startup,	and is
	      not referenced again by dictd unless a signal 1 (SIGHUP) is  re-
	      ceived, which will cause dictd to	reread the configuration file.
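
	      A reload can be triggered from a script by reading the pid file
	      and sending SIGHUP.  The sketch below is only an illustration:
	      the helper names are hypothetical, the path is the pid_file
	      default documented later on this page, and it assumes the file
	      holds the process id as plain decimal text.

```python
import os
import signal
from pathlib import Path


def read_pid(pid_file="/var/run/dictd.pid"):
    """Return the process id stored in pid_file (assumed decimal text)."""
    return int(Path(pid_file).read_text().strip())


def reload_dictd(pid_file="/var/run/dictd.pid"):
    """Send SIGHUP (signal 1) to the process named in pid_file."""
    os.kill(read_pid(pid_file), signal.SIGHUP)
```

	      Calling reload_dictd() requires permission to signal the dictd
	      process and raises an exception if the pid file is missing.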

	      The  file	 is  divided into sections.  The Access	Section	should
	      come first, followed by the Database Section, and	the User  Sec-
	      tion.   The  Database  Section  is  required; the	others are op-
	      tional, but they must be in the order listed here.

       Syntax The following keywords are valid in a  configuration  file:  ac-
	      cess,  allow,  deny,  group, database, data, index, filter, pre-
	      filter, postfilter, name,	include, user, authonly,  site.	  Key-
	      words  are case sensitive.  String arguments that	contain	spaces
	      should be	surrounded by double quotes.  Without quoting, strings
	      may contain alphanumeric characters and _, -, ., and *, but  not
	      spaces.	Strings	 can  be continued between lines.  \", \\, \n,
	      \<NL> are	treated	as double quote, backslash, new	 line  and  no
	      symbol  respectively.   Comments	start with # and extend	to the
	      end of the line.
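
	      The quoting rules above can be expressed as a small helper for
	      scripts that generate a configuration file.  This is a sketch
	      under stated assumptions, not dictd code: conf_string is a
	      hypothetical name, and "alphanumeric" is taken to mean ASCII
	      letters and digits.

```python
import re

# Characters the Syntax paragraph allows in an unquoted string
# (assumption: ASCII letters and digits only).
_BARE = re.compile(r"[A-Za-z0-9_.*-]+")


def conf_string(value):
    """Return value as it could be written in a dictd configuration file."""
    if _BARE.fullmatch(value):
        return value  # safe without quotes
    # Escape backslash first so the escapes added below are not doubled.
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return '"' + escaped + '"'
```

	      A name such as web1913 is returned unchanged, while a string
	      containing spaces comes back wrapped in double quotes.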

       Global Settings Section

	      global { global settings specification }
		     Used to set global dictd settings such as the log file,
		     syslog facility, locale and so on.

	      EXAMPLE:
		     See examples/dictd4.conf file from	the distribution.

       Access Section

	      access { access specification }
		     This section contains access restrictions for the	server
		     and all of	the databases collectively.  Per-database con-
		     trol is specified in the Database Section.

	      EXAMPLE:
		     See examples/dictd3.conf file from	the distribution.

       Database	Section

	      database string {	database specification }
		     The  string  specifies the	name of	the database (e.g., wn
		     or	web1913).  (This is an arbitrary name selected by  the
		     administrator, and	is not necessarily related to the file
		     name  or any name listed in the data file.	 A short, easy
		     to	type name is often selected for	 easy  use  with  dict
		     -d.)

		     EXAMPLE: See examples/dictd*.conf files from the distrib-
		     ution.

		     NOTE:  If	the files specified in the database specifica-
		     tion do not exist on the system, dictd may	silently fail.

	      database_virtual string {	virtual	database specification }
		     This section specifies the	virtual	database.  The	string
		     specifies the name	of the database	(e.g., en-ru or	fren).

		     EXAMPLE:	See   examples/dictd_virtual.conf   or	 exam-
		     ples/dictd_complex.conf files from	the distribution.

	      database_plugin string { plugin specification }
		     This section specifies the	plugin.	 The string  specifies
		     the name of the database.

		     EXAMPLE:	See  examples/dictd_plugin_dbi.conf  or	 exam-
		     ples/dictd_complex.conf files from	the distribution.

	      database_mime string { mime specification	}
		     Traditionally, databases created for dictd contained
		     plain text only, because dictd releases before 1.10.0
		     did not fully support the OPTION MIME command (see RFC
		     2229).  This section describes a special database that
		     behaves differently depending on whether the OPTION MIME
		     command was received from the client.  That is, the
		     database created by this section can return either plain
		     text or specially formatted content, depending on whether
		     the DICT client supports (or wants to receive) MIME
		     content.  The string specifies the name of the database.

		     NOTE: This applies to the DEFINE command only.  The MATCH,
		     SHOW DB, SHOW STRAT, SHOW INFO, SHOW SERVER and HELP
		     commands return text with only an empty line prepended.

		     EXAMPLE: See examples/dictd_mime.conf file	from the  dis-
		     tribution.

	      database_exit
		     Excludes the databases that follow from the '*' database.
		     By default, '*' means all available databases.  See the
		     'examples/dictd_virtual.conf' file for an example
		     configuration.

		     NOTE: If you use 'virtual'	dictionaries, you  should  use
		     this  directive,  otherwise you will search the same dic-
		     tionary twice.

	      User Section

		     user string string
			    The	first string specifies the username,  and  the
			    second string specifies the	shared secret for this
			    username.	When  the  AUTH	 command  is used, the
			    client will	provide	the username and a hashed ver-
			    sion of the	shared secret.	If the	shared	secret
			    matches,  the  user	is said	to have	authenticated,
			    and	will have access  to  databases	 whose	access
			    specifications  allow  that	 user  (by name, or by
			    wildcard).	If present, this section  must	appear
			    last in the	configuration file.  There may be many
			    user  entries.   The  shared secret	should be kept
			    secret, as anyone who has access to	it can	access
			    the	 shared	 databases (assuming access is not de-
			    nied by domain name).
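
			    The "hashed version of the shared secret" can be
			    sketched as follows.  This assumes the APOP-style
			    scheme described in RFC 2229, in which the client
			    sends the MD5 checksum of the msg-id from the
			    server banner concatenated with the shared
			    secret; auth_string is a hypothetical helper
			    name, and the exact form of the msg-id should be
			    checked against the RFC.

```python
import hashlib


def auth_string(msg_id, shared_secret):
    """Return the hex MD5 digest of msg_id followed by shared_secret.

    Assumption: msg_id is passed exactly as it appears in the banner.
    """
    data = (msg_id + shared_secret).encode("utf-8")
    return hashlib.md5(data).hexdigest()
```

			    The result is a 32-character hexadecimal string
			    that a client would send together with the
			    username in the AUTH command.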

	      Access Specification
		     Access specifications may occur in	the Access Section  or
		     in	 the  Database Section.	 The access specification will
		     be	described here.

		     For allow,	deny, and authonly, a star (*) may be used  as
		     a	wild  card  that  matches any number of	characters.  A
		     question mark (?) may be used as a	wildcard that  matches
		     a	single character.  For example,	10.0.0.* and *.edu are
		     valid strings.

		     Further, a	range of IP addresses and an IP	 address  fol-
		     lowed  by	a  netmask  may	 be  specified.	  For example,
		     10.0.0.0:10.0.0.255, 10.0.0.0/24, and 10.0.0.* all	 spec-
		     ify  the  same  range  of IP numbers.  Notation cannot be
		     combined on the same line.	 If the	notation does not make
		     sense, access will	be denied by default.  Use the --debug
		     auth option to debug related problems.
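
		     The equivalence of the three notations can be checked
		     with a short sketch.  It does not reproduce dictd's own
		     matching code: the function names are hypothetical,
		     only IPv4 is handled, and wildcard matching is assumed
		     to be case sensitive.

```python
import ipaddress
from fnmatch import fnmatchcase


def in_colon_range(addr, spec):
    """True if addr lies in a 'low:high' range (IPv4 only)."""
    low, high = (ipaddress.ip_address(p) for p in spec.split(":"))
    return low <= ipaddress.ip_address(addr) <= high


def in_netmask(addr, spec):
    """True if addr lies in an 'address/prefix-length' network."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(spec)


def matches_wildcard(text, pattern):
    """True if text matches a pattern using * and ? wildcards.

    Note: fnmatchcase also treats [...] as a character set, which the
    access specification text above does not mention.
    """
    return fnmatchcase(text, pattern)
```

		     All three helpers accept every address from 10.0.0.0 to
		     10.0.0.255 and reject 10.0.1.0 when given the example
		     strings from the text.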

		     Note that these specifications take only one  string  per
		     specification line.  However, you can have	multiple lines
		     of	each type.

		     The syntax	is as follows:

		     allow string
			    The	 string	 specifies a domain name or IP address
			    which is allowed access to the server (in the  Ac-
			    cess  Section)  or	to a database (in the Database
			    Section).  Note that only one string is permitted
			    on a single "allow" line, but any number of
			    "allow" lines may appear in the configuration
			    file.

		     deny string
			    The	string specifies a domain name or  IP  address
			    which  is  denied access to	the server (in the Ac-
			    cess Section) or to	a database  (in	 the  Database
			    Section).	Note  that if reverse DNS is not work-
			    ing, then only the	IP  number  will  be  checked.
			    Therefore,	it is essential	to deny	networks based
			    on IP number, since	a denial based on domain  name
			    may	not always be checked.

		     authonly string
			    This  form	is  only useful	in the Access Section.
			    The	string specifies a domain name or  IP  address
			    which  is  allowed access to the server but	not to
			    any	of the databases.  All commands	are valid  ex-
			    cept  DEFINE,  MATCH,  and SHOW DB.	 More specifi-
			    cally AUTH is a valid command, and commands	 which
			    access the databases are not allowed.

		     user string
			    This  form is only useful in the Database Section.
			    The	string specifies a username that is allowed to
			    access this	database after a successful AUTH  com-
			    mand is executed.

       Global Settings Specification
	      This section describes the following parameters:

	      port string_or_number
		     Specifies the port	or service name	(e.g., 2628).  The de-
		     fault is 2628, as specified in the	DICT Protocol RFC, but
		     may  be  changed  in  the	defs.h	file  at  compile time
		     (DICT_DEFAULT_SERVICE).

	      site string
		     Used to specify the filename  for	the  site  information
		     file,  a  flat  text  file	which will be displayed	in re-
		     sponse to the SHOW	SERVER command.

		     EXAMPLE: See examples/dictd4.conf file from the distribu-
		     tion.

	      site_no_banner boolean
		     By default, the SHOW SERVER command outputs information
		     about the dictd version and the operating system type.
		     This option disables this.

	      site_no_uptime boolean
		     By default, the SHOW SERVER command outputs information
		     about the uptime of dictd, the number of forks since
		     startup, and forks per hour.  This option disables this.

	      site_no_dblist boolean
		     By default, the SHOW SERVER command outputs internal
		     information about databases, such as the number of
		     headwords, index size and so on.  This option disables
		     this.

	      delay number
		     Specifies the number of seconds a client may be idle  be-
		     fore  the server will close the connection.  Idle time is
		     defined to	be the time the	server is  waiting  for	 input
		     and does not include the time the server spends searching
		     the  database.  The  default is 0 seconds (no limit), but
		     may be  changed  in  the  defs.h  file  at	 compile  time
		     (DICT_DEFAULT_DELAY).

		     NOTE: Setting the delay option disables the limit_time
		     option.  Only one of them (the last one specified in
		     dictd.conf) is in effect.

		     NOTE:  Connections	 are  closed  without warning since no
		     provision for premature connection	termination is	speci-
		     fied in the DICT protocol RFC.

	      depth number
		     Specify  the  queue  length for listen(2).	 Specifies the
		     number of pending socket connections which	are queued  by
		     the   operating   system.	 Some  operating  systems  may
		     silently limit this value to 5 (older BSD systems)	or 128
		     (Linux).  The default is 10 but may  be  changed  in  the
		     defs.h file at compile time (DICT_QUEUE_DEPTH).

	      limit_childs number
		     Specifies	the  number of daemons that may	be running si-
		     multaneously.  Each daemon	services a single  connection.
		     If	 the limit is exceeded,	a (serialized) connection will
		     be	made by	the server process, and	a  response  code  420
		     (server  temporarily  unavailable)	 will  be  sent	to the
		     client.  This parameter should be adjusted	to prevent the
		     server machine from being overloaded by dict clients, but
		     should not	be set so low that  many  clients  are	denied
		     useful  connections.  The	default	 is  100,  but	may be
		     changed in	the defs.h file	 at  compile  time  (DICT_DAE-
		     MON_LIMIT_CHILDS).

	      limit number
		     Synonym  for  limit_childs.   For	backward compatibility
		     only.

	      limit_matches number
		     Specifies the maximum number of matches that can  be  re-
		     turned  by	 MATCH query. Zero means no limit. The default
		     is	2000.

	      limit_definitions	number
		     Specifies the maximum number of definitions that  can  be
		     returned  by  DEFINE  query. Zero means no	limit. The de-
		     fault is 200.

	      limit_time number
		     Specifies the number of seconds a client may talk to  the
		     server  before the	server will close the connection.  The
		     default is	600 seconds (10	minutes), but may  be  changed
		     in	  the	defs.h	 file	at   compile   time  (DICT_DE-
		     FAULT_LIMIT_TIME).

		     NOTE: Setting the limit_time option disables the delay
		     option.  Only one of them (the last one specified in
		     dictd.conf) is in effect.

		     NOTE: Connections are closed  without  warning  since  no
		     provision	for premature connection termination is	speci-
		     fied in the DICT protocol RFC.

	      limit_queries number
		     Specifies the number of queries (MATCH, DEFINE, SHOW DB,
		     etc.) that a client may send to the server before the
		     server will close the connection.  Zero means no limit.
		     The  default  is  2000,  but may be changed in the	defs.h
		     file at compile time (DICT_DEFAULT_LIMIT_QUERIES).

	      timestamp	number
		     How often a timestamp should be logged (in minutes).
		     (This  is effective only if logging has been enabled with
		     the -s or -L option, or with a debugging option.)

	      log_option option
		     Specify a logging option.  This is effective only if
		     logging has been enabled with the -s or -L option or in
		     the configuration file, or if logging to the console has
		     been activated with a debugging option (e.g., --debug
		     nodetach).  Only one option may be set with each
		     invocation of this option; however, multiple invocations
		     may be made in the configuration file or on the dictd
		     command line.  For instance:
		     dictd -s --log stats --log	found --log notfound
		     is	a valid	command	line, and sets three logging options.
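
		     A wrapper script that starts dictd can build such a
		     command line by repeating --log once per option.  The
		     helper below is a hypothetical sketch; it only assembles
		     the argument list and does not start the daemon.

```python
def dictd_argv(log_options, use_syslog=True):
    """Build a dictd argument list with one --log per logging option."""
    argv = ["dictd"]
    if use_syslog:
        argv.append("-s")  # enable logging via syslog
    for option in log_options:
        argv += ["--log", option]  # one option per --log invocation
    return argv
```

		     Passing ["stats", "found", "notfound"] yields the same
		     arguments as the command line shown above.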

		     Some of the more verbose logging options are used primar-
		     ily  for debugging	the server code, and are not practical
		     for normal	use.

		     server Log	server diagnostics.  This  is  extremely  ver-
			    bose.

		     connect
			    Log	all connections.

		     stats  Log	all children terminations.

		     command
			    Log	all commands.  This is extremely verbose.

		     client Log	results	of CLIENT command.

		     found  Log	all words found	in the databases.

		     notfound
			    Log	all words not found in the databases.

		     timestamp
			    When  logging to a file, use a full	timestamp like
			    that which syslog would  produce.	Otherwise,  no
			    timestamp is made, making the files	shorter.

		     host   Log	name of	foreign	host.

		     auth   Log	authentication failures.

		     min    Set	 a  minimal  number of options.	 If logging is
			    activated (to a file, or via syslog), and  no  op-
			    tions  are	set,  then  the	minimal	set of options
			    will be used.  If options are set, then only those
			    options specified will be used.

		     all    Set	all of the options.

		     none   Clear all of the options.

		     To	facilitate location of interesting information in  the
		     log  file,	 entries are marked with initial letters indi-
		     cating the	class of the line being	logged:

		     I	    Information	about the server, connections, or ter-
			    mination statistics.  These	 lines	are  generally
			    not	designed to be parsed automatically.

		     E	    Error messages.

		     C	    CLIENT command information.

		     D	    Definitions	found in the databases searched.

		     M	    Matches found in the database searched.

		     N	    Matches  which  were  not  found  in the databases
			    searched.

		     T	    Trace of exact line	sent by	client.

		     A	    Authentication information.

		     To	preserve anonymity of the client, do not use the  con-
		     nect  or  host options.  Clients may or may not send host
		     information using the CLIENT command, but this should  be
		     an	option that is selectable on the client	side.

	      debug_option string
		     Activate  a  debugging option.  There are several,	all of
		     which are only useful to developers.  They	are documented
		     here for completeness.  A list can	be  obtained  interac-
		     tively by using -d	with an	illegal	option.

		     verbose
			    The	 same  as  -v or --verbose.  Adds verbosity to
			    other options.

		     scan   Debug the scanner for the configuration file.

		     parse  Debug the parser for the configuration file.

		     search Debug the character	folding	and binary search rou-
			    tines.

		     init   Report database initialization.

		     port   Log	client-side port number	to the log file.

		     lev    Debug Levenshtein search algorithm.

		     auth   Debug the authorization routines.

		     nodetach
			    Do not detach as a	background  process.   Implies
			    that  a  copy  of  the log file will appear	on the
			    standard output.

		     nofork Do not fork	daemons	to  service  requests.	 Be  a
			    single-threaded server.  This option implies node-
			    tach,  and	is most	useful for using a debugger to
			    find the point at which daemon processes are dump-
			    ing	core.

		     alt    Debugs altcompare in index.c.

	      locale string
		     Specifies the locale used for searching.  If no locale is
		     specified,	the "C"	locale is used.	 The locale  used  for
		     the  server  should  be the same as that used for dictfmt
		     when the database was built (specifically,	the locale un-
		     der which the index was sorted).  The  locale  should  be
		     specified for both 8-bit and UTF-8 formats.  If the
		     locale name contains the substring utf8 or utf-8, UTF-8
		     format is expected.  Note that if your database is not in
		     7-bit ASCII or UTF-8 format, then the dictd server will
		     not comply with RFC 2229.

		     NOTE: If utf-8 or 8-bit dictionaries are included in the
		     configuration  file, and the appropriate --locale has not
		     been specified, dictd will	fail to	start.	 This  implies
		     that dictd	will not run with both utf-8 and 8-bit dictio-
		     naries in the configuration file.

	      add_strategy strategy_name description
		     Adds the strategy strategy_name with the description
		     description.  This new search strategy may be implemented
		     with the help of plugins.  Both strategy_name and
		     description are strings.

	      default_strategy string
		     Set the server's default search strategy for MATCH	search
		     type.  The	compiled-in default is 'lev'.  It is also pos-
		     sible to set default  strategy  per  database.   See  de-
		     fault_strategy keyword in Database	specification section.

	      disable_strategy string
		     Disable specified strategies.  By default all implemented
		     search  strategies	 are  enabled.	It is also possible to
		     disable strategies	per  database.	 See  disable_strategy
		     keyword in	Database specification section.

	      listen_to	host
		     Local  host  name or IP address for bind.	If unspecified
		     or	*, dictd will  bind  to	 all  interfaces.   Otherwise,
		     dictd will	bind to	this address only.

	      address_family family
		     If	4, address family is IPv4 (the default), if 6, address
		     family is IPv6.

	      syslog string
		     Log using the syslog(3) facility.

	      syslog_facility string
		     Specifies	the  syslog  facility to use.  The use of this
		     option implies the	-s option to turn on logging via  sys-
		     log.   When  the  operating system	libraries support SYS-
		     LOG_NAMES,	the names used for this	option should be those
		     listed in syslog.conf(5).	Otherwise, the following names
		     are used (assuming	the particular facility	is defined  in
		     the  header  files):  auth,  authpriv, cron, daemon, ftp,
		     kern, lpr,	mail, news, syslog, user,  uucp,  local0,  lo-
		     cal1, local2, local3, local4, local5, local6, and local7.

	      log_file string
		     Specify  the file for logging.  The filename specified is
		     recomputed	on each	use using the strftime(3)  call.   For
		     example, a	filename ending	in ".%Y%m%d" will write	to log
		     files  ending  in	the year, month, and date that the log
		     entry was written.
		     NOTE: If dictd does not have write	 permission  for  this
		     file, it will silently fail.
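
		     The effect of the strftime(3) expansion can be previewed
		     with the sketch below.  The function name and the log
		     path used in the usage note are hypothetical; only the
		     ".%Y%m%d" suffix comes from the text above.

```python
from datetime import datetime


def log_file_name(template, when):
    """Expand strftime conversions in template for the time 'when'."""
    return when.strftime(template)
```

		     With the template "/var/log/dictd.log.%Y%m%d", an entry
		     written on 9 March 2024 would go to
		     /var/log/dictd.log.20240309.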

	      pid_file string
		     The  specified  filename  will  be	created	to contain the
		     process id	of the main  dictd  process.  The  default  is
		     /var/run/dictd.pid

	      fast_start
		     By default, dictd creates an additional in-memory index
		     to make searching faster.  This option disables that
		     behaviour and makes startup faster.

	      without_mmap
		     Do not use the mmap(2) function; read entire files into
		     memory instead.  Use this option only if you know exactly
		     what you are doing.

       Database	Specification
	      The database specification describes the database:

	      data string
		     Specifies the filename for the flat text database.  If
		     the filename does not begin with '.' or '/', $datadir/ is
		     prepended to it.  $datadir is a compile-time option; you
		     can change it by editing the Makefile or by running
		     ./configure --datadir=...
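
		     The path rule can be summarized in a few lines.  This is
		     a sketch of the rule as stated above, not dictd's
		     implementation; resolve_data_path is a hypothetical
		     name, and the caller supplies whatever value $datadir
		     had at compile time.

```python
def resolve_data_path(filename, datadir):
    """Apply the $datadir rule to a data or index filename."""
    if filename.startswith((".", "/")):
        return filename  # explicit relative or absolute path: used as is
    return datadir.rstrip("/") + "/" + filename
```

		     A bare name such as wn.dict.dz is looked up under
		     $datadir, while ./wn.dict.dz and /opt/dict/wn.dict.dz
		     are left unchanged.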

	      index string
		     Specifies the filename for the index file.  Paths are
		     handled as described above for the "data" option.

	      index_suffix string
		     This is an optional index file that makes the 'suffix'
		     search strategy faster (binary search).  It is generated
		     by 'dictfmt_index2suffix'.  Run "dictfmt_index2suffix
		     --help" for more information.  Paths are handled as
		     described above for the "data" option.
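
		     One way a sorted auxiliary index can turn suffix search
		     into a binary search is to store each headword reversed,
		     so that a suffix becomes a prefix.  The sketch below
		     shows that idea only; it is an assumption for
		     illustration and does not describe the file format that
		     dictfmt_index2suffix actually writes.

```python
from bisect import bisect_left


def build_suffix_index(headwords):
    """Return the headwords reversed and sorted."""
    return sorted(word[::-1] for word in headwords)


def suffix_search(index, suffix):
    """Return all headwords ending in suffix, using binary search."""
    key = suffix[::-1]
    start = bisect_left(index, key)  # first entry that is >= key
    found = []
    for entry in index[start:]:
        if not entry.startswith(key):
            break  # sorted order: no later entry can match
        found.append(entry[::-1])
    return found
```

		     The binary search locates the first candidate in
		     logarithmic time; the loop then visits only matching
		     entries.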

	      index_word string
		     This is an optional index file that makes the 'word'
		     search strategy faster (binary search).  It is generated
		     by 'dictfmt_index2word'.  Run "dictfmt_index2word --help"
		     for more information.  Paths are handled as described
		     above for the "data" option.

	      prefilter	string
		     Specifies the  prefilter command.	When  a	chunk  of  the
		     compressed	 database  is  read, it	will be	filtered  with
		     this filter before	being decompressed.  This may be  used
		     to	provide	 some additional compression  that knows about
		     the data and can provide better compression than the LZ77
		     algorithm used by zlib.

	      postfilter string
		     Specifies the postfilter command.	When a	chunk  of  the
		     compressed	 database  is  read,  it will be filtered with
		     this filter before	the offset and length  for  the	 entry
		     are  used	to access data.	 This is provided for symmetry
		     with the prefilter	command, and may also  be  useful  for
		     providing additional database compression.

	      filter string
		     Specifies	the  filter  command.	After the entry	is ex-
		     tracted from the database,	it will	be filtered with  this
		     filter.   This  may be used to provide formatting for the
		     entry (e.g., for html).

	      name string
		     Specifies the short name of  the  database	 (e.g.,	 "1913
		     Webster's").  If the string begins	with @,	then it	speci-
		     fies  the	headword  to look up in	the dictionary to find
		     the  short	 name  of  the	database.   The	  default   is
		     "@00-database-short",  but	 this  may  be	changed	in the
		     defs.h file at compile time (DICT_SHORT_ENTRY_NAME).
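
		     The "@" convention can be sketched as follows.  The
		     function and its lookup argument are hypothetical:
		     lookup stands for whatever retrieves the text stored
		     under a headword in the database.

```python
def short_name(name_setting, lookup):
    """Resolve the 'name' setting to the database's short name."""
    if name_setting.startswith("@"):
        # The rest of the string is a headword to look up in the database.
        return lookup(name_setting[1:])
    return name_setting  # a literal short name
```

		     With the default "@00-database-short", the short name is
		     whatever the database stores under the headword
		     00-database-short.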

	      info string
		     Specifies the information about database.	If the	string
		     begins  with @, then it specifies the headword to look up
		     in	the dictionary to find information.   The  default  is
		     "@00-database-info",  but	this  may  be  changed	in the
		     defs.h file at compile time (DICT_INFO_ENTRY_NAME).

	      invisible
		     Makes the dictionary invisible to clients, i.e., this
		     dictionary will not be recognized or shown by the DEFINE,
		     MATCH, SHOW INFO, SHOW SERVER and SHOW DB commands.  If
		     definitions or matches are found in an invisible
		     dictionary, the name of the enclosing visible virtual
		     dictionary is returned.  The dictionaries '*' and '!' do
		     not include invisible ones.  NOTE: Invisible dictionaries
		     are completely inaccessible (and invisible) to the client
		     unless they are included in a virtual or MIME dictionary
		     (see the database_virtual and database_mime database
		     sections).

	      disable_strategy string
		     Disables the specified strategy for the database.  This
		     may be useful for slow dictionaries (plugins) or for
		     dictionaries included in virtual ones.  For an example,
		     see the file examples/dictd_complex.conf.

	      default_strategy string
		     Specifies the strategy that will be used if the database
		     is accessed using the strategy '.'.  That is, this
		     directive sets the preferred search strategy per
		     database.  For example, instead of the strategy lev, the
		     strategy word may be preferred for databases that mainly
		     contain multiword phrases rather than single words.

       Virtual Database	Specification
	      The  virtual  database specification describes the virtual data-
	      base:

	      database_list string
		     Specifies a list of databases which are included in the
		     virtual database.  Database names are listed in the
		     string and are separated by commas.

	      name string
		     Specifies the short name of the database.  See Database
		     Specification for more information.

	      info string
		     Specifies information about the database.  See Database
		     Specification for more information.

	      invisible
		     Makes the dictionary invisible to clients.  See Database
		     Specification for more information.

	      disable_strategy string
		     Disables the specified strategy for the database.  See
		     Database Specification for more information.

       Plugin Specification

	      plugin string
		     Specifies the filename of the plugin.

	      data string
		     Specifies data for initializing the plugin.

	      name string
		     Specifies the short name of the database.  See Database
		     Specification for more information.

	      info string
		     Specifies information about the database.  See Database
		     Specification for more information.

	      invisible
		     Makes the dictionary invisible to clients.  See Database
		     Specification for more information.

	      disable_strategy string
		     Disables the specified strategy for the database.  See
		     Database Specification for more information.

	      default_strategy string
		     Sets the default search strategy for the database.  See
		     Database Specification for more information.

       Mime Specification

	      dbname_nomime string
		     Specifies the real database name that is used when the
		     OPTION MIME command was NOT received from a client.

	      dbname_mime string
		     Specifies the real database name that is used when the
		     OPTION MIME command WAS received from a client.  The
		     necessary MIME header is set when the database is
		     created.  See dictfmt(1) for the --mime-header option.

	      name string
		     Specifies the short name of the database.  See Database
		     Specification for more information.

	      info string
		     Specifies information about the database.  See Database
		     Specification for more information.

	      invisible
		     Makes the dictionary invisible to clients.  See Database
		     Specification for more information.

	      disable_strategy string
		     Disables the specified strategy for the database.  See
		     Database Specification for more information.

	      default_strategy string
		     Sets the default search strategy for the database.  See
		     Database Specification for more information.

       include string
	      The text of the file "string" (usually a database	specification)
	      will be read as if it appeared at	this location in the  configu-
	      ration file.  Nested includes are	not permitted.

DETERMINATION OF ACCESS	LEVEL
       When a client connects, the global access specification is scanned, in
       order, until a specification matches.  If no access specification
       exists, all access is allowed (i.e., the action is the same as if
       "allow *" were the only item in the specification).  For each item,
       both the hostname and the IP address are checked.  For example,
       consider the following access specification:
	      allow 10.42.*
	      authonly *.edu
	      deny *
       With  this  specification, all clients in the 10.42 network will	be al-
       lowed access to unrestricted databases; all clients  from  *.edu	 sites
       will be allowed to authenticate,	but will be denied access to all data-
       bases,  even  those  which  are	otherwise  unrestricted; and all other
       clients will have their connection terminated immediately.   The	 10.42
       network	clients	can send an AUTH command and gain access to restricted
       databases.  The *.edu clients must send an AUTH command to gain	access
       to any databases, restricted or unrestricted.
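
       The first-match scan described above can be sketched in Python.  This
       is a minimal illustration, not dictd's implementation: the function
       name access_level, the (action, pattern) rule tuples, and the use of
       shell-style matching from fnmatch are assumptions made for the
       example, and the result None for "no item matched" is a placeholder
       because that case is not described here.

```python
from fnmatch import fnmatchcase


def access_level(rules, hostname, ip):
    """Return the action of the first rule that matches the client.

    rules    -- list of (action, pattern) tuples in configuration order
    hostname -- client host name, or None if reverse DNS failed
    ip       -- client IP address as a string
    """
    if not rules:
        # No access specification: same as a single "allow *".
        return "allow"
    for action, pattern in rules:
        # Each item is checked against both the IP address and the host name.
        if fnmatchcase(ip, pattern):
            return action
        if hostname is not None and fnmatchcase(hostname.lower(), pattern.lower()):
            return action
    return None  # no item matched (not covered by this sketch)


rules = [("allow", "10.42.*"), ("authonly", "*.edu"), ("deny", "*")]
print(access_level(rules, "pc1.example.com", "10.42.1.5"))  # allow
print(access_level(rules, "cs.example.edu", "192.0.2.7"))   # authonly
print(access_level(rules, None, "192.0.2.8"))               # deny
```

       Passing None as the hostname models a client whose reverse DNS lookup
       failed: a rule that names only a domain can then never match it.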

       When  the  AUTH	command	 is sent, the access list for each database is
       scanned,	in order, just as the global access list is scanned.  However,
       after authentication, the client	has an associated username.  For exam-
       ple, consider the following access specification:
	      user u1
	      deny *.com
	      user u2
	      allow *
       If the client authenticated as u1, then the client will have access  to
       this  database,	even  if  the client comes from	a *.com	site.  In con-
       trast, if the client authenticated as u2, the client will only have ac-
       cess if it does not come	from a *.com site.  In this  case,  the	 "user
       u2" is redundant, since that client would also match "allow *".
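
       The per-database scan with "user" items can be sketched the same way.
       Again this is only an illustration under assumptions: the function
       name database_access and the tuple representation are invented, only
       the "user", "allow" and "deny" items from the example are handled,
       and returning False when nothing matches is a choice made for the
       sketch.

```python
from fnmatch import fnmatchcase


def database_access(rules, username, hostname, ip):
    """Scan a per-database access list after AUTH; the first match wins."""
    for kind, value in rules:
        if kind == "user":
            # A "user" item grants access to that authenticated user.
            if username == value:
                return True
        elif fnmatchcase(ip, value) or fnmatchcase(hostname, value):
            return kind == "allow"
    return False  # assumption: nothing matched means no access


rules = [("user", "u1"), ("deny", "*.com"), ("user", "u2"), ("allow", "*")]
print(database_access(rules, "u1", "host.example.com", "192.0.2.1"))  # True
print(database_access(rules, "u2", "host.example.com", "192.0.2.1"))  # False
print(database_access(rules, "u2", "host.example.org", "192.0.2.2"))  # True
```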

       Warning:	 Checks	 are  performed	for domain names and for IP addresses.
       However,	if reverse DNS for a specific site is not working, it is  pos-
       sible  that a domain name may not be available for checking.  Make sure
       that all	denials	use IP addresses.  (And	consider a future enhancement:
       if a domain name	is not available, should denials that depend on	a  do-
       main name match anything?  This is the more conservative	viewpoint, but
       it is not currently implemented.)

SEARCH ALGORITHMS
       The DICT	standard specifies a few search	algorithms that	must be	imple-
       mented, and permits others to be	supported on a server-dependent	basis.
       The  following  search  strategies  are supported by this server.  Note
       that all	strategies are case  insensitive.   Most  ignore  non-alphanu-
       meric, non-whitespace characters.

       exact  An  exact	match.	This algorithm uses a binary search and	is one
	      of the fastest search algorithms available.

       lev    The Levenshtein algorithm	(string	edit distance of  one).	  This
	      algorithm	 searches  for all words which are within an edit dis-
	      tance of one from	the target word.  An "edit"  means  an	inser-
	      tion, deletion, or transposition.	 This is a rapid algorithm for
	      correcting  spelling  errors,  since  many  spelling  errors are
	      within a Levenshtein distance of one from	the original word.

       prefix Prefix match.  This algorithm also uses a	binary search  and  is
	      very fast.

       nprefix
	      Like prefix, but returns only the specified range of matches.
	      For example, when the prefix strategy would return 1000
	      matches, you can skip the first 800 and get only the next 100.
	      The limits are specified in the query like this: 800#100#app,
	      where 800 is the skip count, 100 is the number of matches you
	      want to get, and "app" is your query.  This strategy makes it
	      possible to implement a DICT client with fast autocompletion
	      (although it is not trivial), as many standalone dictionary
	      programs do.

	      NOTE: If you access the dictionary "*" (or a virtual one) with
	      the nprefix strategy, the same range is applied to each
	      database in it separately, not globally to all matches found
	      in all databases.

	      NOTE: If you access a non-English dictionary, the returned
	      matches may not be (and mostly will not be) in alphabetical
	      order.

       re     POSIX 1003.2 (modern) regular expression search.  Modern regular
	      expressions are the ones used by egrep(1).  These regular
	      expressions allow predefined character classes (e.g.,
	      [[:alnum:]], [[:alpha:]], [[:digit:]], and [[:xdigit:]] are
	      useful for this application); use * to match a sequence of 0 or
	      more matches of the previous atom; use + to match a sequence of
	      1 or more matches of the previous atom; use ? to match a
	      sequence of 0 or 1 matches of the previous atom; use ^ to match
	      the beginning of a word; use $ to match the end of a word; and
	      allow nested subexpressions and alternation with () and |.  For
	      example, "(foo|bar)" matches all words that contain either
	      "foo" or "bar".  To match these special characters literally,
	      they must be quoted with two backslashes (due to the quoting
	      characteristics of the server).  Warning: Regular expression
	      matches can take 10 to 300 times longer than substring matches.
	      On a busy server, with many databases, this can require more
	      than 5 minutes of waiting time, depending on the complexity of
	      the regular expression.

       regexp Old (basic)  regular  expressions.   These  regular  expressions
	      don't  support  |,  +,  or  ?.   Groups use escaped parentheses.
	      While modern regular expressions are generally  easier  to  use,
	      basic  regular  expressions have a back reference	feature.  This
	      can be used to match a second occurrence of something  that  was
	      already  matched.	  For  example,	the following expression finds
	      all words	that begin and end with	the same three letters:
		  ^\\(...\\).*\\1$

	      Note the use of the double backslashes  to  escape  the  special
	      characters.  This	is required by the DICT	protocol string	speci-
	      fication (a single backslash quotes the next character --	we use
	      two  to get a single backslash through to	the regular expression
	      engine).  Warning: The use of back references is even slower
	      than the use of general regular expressions.

       soundex
	      The  Soundex  algorithm,	a  classic algorithm for finding words
	      that sound similar to each other.	 The  algorithm	 encodes  each
	      word  using the first letter of the word and up to three digits.
	      Since the first letter is known, this search is relatively fast,
	      and it is sometimes good for correcting spelling errors when the
	      Levenshtein algorithm doesn't help.

       substring
	      Match  a substring anywhere in the headword.  This search	strat-
	      egy uses a modified Boyer-Moore-Horspool	algorithm.   Since  it
	      must search the whole index file,	it is not as fast as the exact
	      and prefix matches.

       suffix Suffix  match.  This search strategy also	uses a modified	Boyer-
	      Moore-Horspool algorithm,	 and  is  as  fast  as	the  substring
	      search.	If  the	optional index_suffix string file is listed in
	      the configuration	file this search is much faster.

       word   Match any	single word, even if part of a multi-word  entry.   If
	      the  optional index_word string file is listed in	the configura-
	      tion file	this search strategy works much	faster.

       first  Match the	first word that	begins a multi-word entry.

       last   Match the	last word that ends a multi-word entry.	  If  the  op-
	      tional  index_suffix  string file	is listed in the configuration
	      file this	search strategy	works much faster.
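
       Several of the strategies above can be sketched in a few lines of
       Python.  The block below is an illustration under stated assumptions,
       not the server's code: all function names are invented for the
       example; the headwords live in an in-memory list rather than an index
       file; lev generates only the three kinds of edit listed above
       (insertion, deletion, transposition); soundex is the common textbook
       variant, which may differ in detail from the one the server uses;
       nprefix handles only the full skip#count#query form; and dict_unquote
       shows the single-backslash quoting rule described under regexp.

```python
import string

SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def lev_candidates(word, alphabet=string.ascii_lowercase):
    """All strings one insertion, deletion, or transposition away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    insertions = {a + c + b for a, b in splits for c in alphabet}
    return (deletions | transpositions | insertions) - {word}


def lev(word, headwords):
    """Headwords within one edit of word (exact matches are left out)."""
    return sorted(lev_candidates(word.lower()) & set(headwords))


def nprefix(query, headwords):
    """Answer a 'skip#count#prefix' query against a list of headwords."""
    skip, count, prefix = query.split("#", 2)
    matches = [w for w in sorted(headwords) if w.startswith(prefix)]
    return matches[int(skip):int(skip) + int(count)]


def soundex(word):
    """Classic Soundex code: first letter plus up to three digits."""
    letters = [ch for ch in word.lower() if ch in string.ascii_lowercase]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        previous = digit
    return (code + "000")[:4]


def dict_unquote(text):
    """Undo DICT string quoting: a backslash quotes the next character."""
    out, chars = [], iter(text)
    for ch in chars:
        out.append(next(chars, "") if ch == "\\" else ch)
    return "".join(out)


words = ["app", "apple", "application", "apply", "apt", "the"]
print(nprefix("1#2#app", words))             # ['apple', 'application']
print(lev("teh", words))                     # ['the']
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(dict_unquote(r"^\\(...\\).*\\1$"))     # ^\(...\).*\1$
```

       The last line shows why the regexp example is written with double
       backslashes: after unquoting, the regular expression engine receives
       the basic regular expression with single backslashes.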

DATABASE FORMAT
       Databases for dictd are distributed separately.	A database consists of
       two files.  One is a flat text file, the	other is the index.

       The flat	text file contains dictionary entries (or any  other  suitable
       data),  and  the	 index contains	tab-delimited tuples consisting	of the
       headword, the byte offset at which this entry begins in the  flat  text
       file,  and the length of	the entry in bytes.  The offset	and length are
       encoded using base 64 encoding using the	64-character subset of	Inter-
       national	 Alphabet  IA5	discussed in RFC 1421 (printable encoding) and
       RFC 1522	(base64	MIME).	Encoding the offsets in	base 64	saves  consid-
       erable space when compared with the usual base 10 encoding, while still
       permitting tab characters (ASCII	9) to be used for delimiting fields in
       a  record.   Each  record  ends with a newline (ASCII 10), so the index
       file is human readable.
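
       As an illustration of the record layout just described, the following
       Python sketch parses one index line.  The names b64_number and
       parse_index_line are invented for the example, and the digit order
       (most significant digit first) is an assumption that should be
       checked against a real index file.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64_number(text):
    """Decode a base 64 number, most significant digit first (assumed)."""
    value = 0
    for ch in text:
        value = value * 64 + B64_ALPHABET.index(ch)
    return value


def parse_index_line(line):
    """Split one tab-delimited index record into (headword, offset, length)."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, b64_number(offset), b64_number(length)


print(parse_index_line("apple\tBA\tDw\n"))  # ('apple', 64, 240)
```

       With an uncompressed flat text file, the entry could then be read by
       seeking to the decoded offset and reading the decoded number of
       bytes.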

       Some headwords have a special meaning for dictd:

       00-database-info
	      Contains the information about the database that is returned by
	      the SHOW INFO command, unless it is specified in the
	      configuration file.

       00-database-short
	      Contains the short name of the database that is returned by the
	      SHOW DB command, unless it is specified in the configuration
	      file.  See dictfmt -s.

       00-database-url
	      The URL from which the original dictionary sources were
	      obtained.  See dictfmt -u.  This headword is not used by dictd.

       00-database-utf8
	      Present if the dictionary is encoded using UTF-8.  See
	      dictfmt --utf8.

       00-database-8bit-new
	      Present if the dictionary is encoded using an 8-bit character
	      set (neither ASCII nor UTF-8).  See dictfmt --locale.

       The flat	text file may be compressed using gzip(1) (not recommended) or
       dictzip(1) (highly recommended).	 Optimal speed will be obtained	 using
       an  uncompressed	 file.	 However, the gzip compression algorithm works
       very well on plain text,	and can	result in space	savings	typically  be-
       tween  60  and 80%.  Using a file compressed with gzip(1) is not	recom-
       mended, however,	because	random access on the file can only  be	accom-
       plished	by  serially  decompressing the	whole file, a process which is
       prohibitively slow.  dictzip(1) uses the	same compression algorithm and
       file format as does gzip(1), but	provides a table that can be  used  to
       randomly	 access	 compressed  blocks  in	 the file.  The	use of 50-64kB
       blocks for compression typically	degrades compression by	less than 10%,
       while maintaining acceptable random access capabilities for all data in
       the file.  As an	added benefit, files compressed	with dictzip(1)	can be
       decompressed with gzip(1) or zcat(1).  (Note: recompressing a dictzip'd
       file using, for example,	znew(1)	will destroy the random	access charac-
       teristics of the	file.  Always compress data files using	dictzip(1).)
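
       The benefit of block-wise compression can be seen from a small
       calculation: to read one entry, only the blocks that overlap the
       entry need to be decompressed.  The Python sketch below assumes
       fixed-size blocks of uncompressed data; the function name and the
       block size of 60000 bytes are hypothetical values chosen from the
       50-64kB range mentioned above.

```python
def chunks_for_entry(offset, length, chunk_size):
    """Indices of the blocks covering bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


CHUNK_SIZE = 60000  # hypothetical block size in bytes
print(list(chunks_for_entry(130000, 500, CHUNK_SIZE)))  # [2]
print(list(chunks_for_entry(119900, 500, CHUNK_SIZE)))  # [1, 2]
```

       A typical entry therefore costs one or two block decompressions,
       whereas a plain gzip(1) file would have to be decompressed from the
       start up to the entry.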

SIGNALS
       SIGHUP causes dictd to reread the configuration file and reinitialize
       the databases.

       SIGUSR1 causes dictd to unload the databases.  dictd then returns a 420
       status (instead of 220).  To load the databases again, send the SIGHUP
       signal.  Because database files are mapped with mmap(2), it is
       impossible to update them while dictd is running.  So, if you need to
       update database files and reread the configuration file, first send the
       SIGUSR1 signal to dictd to unload the databases, then update the files,
       and then send the SIGHUP signal to load them again.
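
       The update sequence can be scripted.  The Python sketch below is one
       possible way to do it, under assumptions: the function name
       update_databases and the replace_files callback are invented for the
       example, and the sketch does not wait for dictd to finish unloading
       (whether a delay is needed is not stated here).

```python
import os
import signal


def update_databases(pid, replace_files, kill=os.kill):
    """Unload the databases, replace the files, then load them again."""
    kill(pid, signal.SIGUSR1)  # dictd unloads its databases
    replace_files()            # caller-supplied step that updates the files
    kill(pid, signal.SIGHUP)   # dictd rereads its configuration and reloads


# Example (not run here); the pid would come from /var/run/dictd.pid:
# with open("/var/run/dictd.pid") as f:
#     update_databases(int(f.read()), my_replace_step)
```

       The kill parameter exists only so that the sequence can be exercised
       without signalling a real process.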

COPYING
       The  main source	files for the dictd server and the dictzip compression
       program were written by Rik Faith (faith@dict.org) and are  distributed
       under the terms of the GNU General Public License.  If you need to dis-
       tribute under other terms, write	to the author.

       The  main  libraries  used  by these programs (zlib, regex, libmaa) are
       distributed under different terms, so you may be	able to	 use  the  li-
       braries	for applications which are incompatible	with the GPL --	please
       see the copyright notices and license information that  come  with  the
       libraries  for  more information, and consult with your attorney	to re-
       solve these issues.

BUGS
       The regular expression searches do not ignore  non-whitespace,  non-al-
       phanumeric  characters  as  do  the  other searches.  In	practice, this
       isn't much of a problem.

WARNINGS
       Conformance of the regular expressions (used by the 're' and 'regexp'
       search strategies) to ERE and BRE depends on the library you build
       dictd with.  Whether the 're' and 'regexp' strategies support UTF-8
       also depends on the library you build dictd with.

FILES
       /usr/local/etc/dictd.conf
	      dictd configuration file

       /usr/local/sbin/dictd
	      dictd daemon itself

       /var/run/dictd.pid
	      File for storing pid of dictd daemon

       /usr/local/share
	      The default directory for	dictd databases	(.index	and .dict[.dz]
	      files)

SEE ALSO
       examples/dictd*.conf,  dictfmt(1),  dict(1),   dictzip(1),   gunzip(1),
       zcat(1),	webster(1), RFC	2229

				 29 March 2002			      DICTD(8)
