ESTCMD(1)			Hyper Estraier			     ESTCMD(1)

NAME
       estcmd -	command	line interface of the core API

SYNOPSIS
       estcmd  create  [-tr] [-apn|-acc] [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa]
       [-attr name type] db

       estcmd  put  [-tr]  [-cl]  [-ws]  [-apn|-acc]  [-xs|-xl|-xh|-xh2|-xh3]
       [-sv|-si|-sa] db	[file]

       estcmd out [-cl]	[-pc enc] db expr

       estcmd edit [-pc	enc] db	expr name [value]

       estcmd get [-nl|-nb] [-pidx path] [-pc enc] db expr [attr]

       estcmd list [-nl|-nb] [-lp] db

       estcmd uriid [-nl|-nb] [-pidx path] [-pc	enc] db	expr

       estcmd meta db [name [value]]

       estcmd inform [-nl|-nb] db

       estcmd optimize [-onp] [-ond] db

       estcmd merge [-cl] db target

       estcmd repair [-rst|-rsh] db

       estcmd	   search     [-nl|-nb]	    [-pidx     path]	 [-ic	  enc]
       [-vu|-va|-vf|-vs|-vh|-vx|-dd] [-sn wnum hnum anum] [-kn num] [-um] [-ec
       rn] [-gs|-gf|-ga] [-cd] [-ni] [-sf|-sfr|-sfu|-sfi] [-hs]	 [-attr	 expr]
       [-ord  expr]  [-max  num] [-sk num] [-aux num] [-dis name] [-sim	id] db
       [phrase]

       estcmd gather [-tr] [-cl] [-ws] [-no] [-fe|-ft|-fh|-fm] [-fx sufs  cmd]
       [-fz]  [-fo]  [-rm sufs]	[-ic enc] [-il lang] [-bc] [-lt	num] [-lf num]
       [-pc    enc]    [-px    name]	[-aa	name	value]	   [-apn|-acc]
       [-xs|-xl|-xh|-xh2|-xh3]	[-sv|-si|-sa] [-ss name] [-sd] [-cm] [-cs num]
       [-ncm] [-kn num]	[-um] db [file|dir]

       estcmd purge [-cl] [-no]	[-fc] [-pc enc]	[-attr expr] db	[prefix]

       estcmd extkeys [-no] [-fc] [-dfdb file] [-ncm] [-ni]  [-kn  num]	 [-um]
       [-attr expr] db [prefix]

       estcmd words [-nl|-nb] [-dfdb file] [-kw|-kt] db

       estcmd  draft  [-ft|-fh|-fm]  [-ic enc] [-il lang] [-bc]	[-lt num] [-kn
       num] [-um] [file]

       estcmd break [-ic enc] [-il lang] [-apn|-acc] [-wt] [file]

       estcmd iconv [-ic enc] [-il lang] [-oc enc] [file]

       estcmd regex [-inv] [-repl str] expr [file]

       estcmd scandir [-tf|-td]	[-pa|-pu] [dir]

       estcmd multi [-db db] [-nl|-nb] [-ic  enc]  [-gs|-gf|-ga]  [-cd]	 [-ni]
       [-sf|-sfr|-sfu|-sfi]  [-hs]  [-hu]  [-attr expr]	[-ord expr] [-max num]
       [-sk num] [-aux num] [-dis name]	[phrase]

       estcmd randput [-ren|-rla|-reu|-ror|-rjp|-rch] [-cs num]	db dnum

       estcmd wicked db	dnum

       estcmd regression db

       estcmd version

DESCRIPTION
       estcmd is an aggregation	of sub commands.  The name of a	sub command is
       specified by the	first argument.	 Other arguments are parsed  according
       to each sub command.  The argument db specifies the path	of an index.

       estcmd create [-tr] [-apn|-acc] [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa]
       [-attr name type] db
	      Create an	index.
	      If -tr is specified, a new index is created regardless of whether
	      one already exists.
	      If -apn is specified, N-gram analysis is also applied to European
	      text.
	      If -acc is specified, character category analysis is performed
	      instead of N-gram analysis.
	      If -xs is specified, the index is tuned to register fewer than
	      50000 documents.
	      If -xl is	specified, the index is	tuned to  register  more  than
	      300000 documents.
	      If  -xh  is  specified, the index	is tuned to register more than
	      1000000 documents.
	      If -xh2 is specified, the	index is tuned to register  more  than
	      5000000 documents.
	      If  -xh3	is specified, the index	is tuned to register more than
	      10000000 documents.
	      If -sv is	specified, scores are stored as	void.
	      If -si is	specified, scores are stored as	32-bit integer.
	      If -sa is specified, scores are stored as-is and marked not to
	      be tuned at search time.
	      -attr  specifies an attribute index and its data type.  This op-
	      tion can be specified multiple times.

       estcmd put [-tr] [-cl] [-ws] [-apn|-acc] [-xs|-xl|-xh|-xh2|-xh3]
       [-sv|-si|-sa] db [file]
	      Register a document in document draft format into an index.
	      file specifies a target file.  If	it is  omitted,	 the  standard
	      input is read.
	      If -tr is specified, a new index is created regardless of whether
	      one already exists.
	      If -cl is specified, regions of an overwritten document are
	      cleaned up.
	      If -ws is specified, scores are weighted statically with the
	      score weighting attribute.
	      If -apn is specified, N-gram analysis is also applied to European
	      text.
	      If -acc is specified, character category analysis is performed
	      instead of N-gram analysis.
	      If -xs is specified, the index is tuned to register fewer than
	      50000 documents.
	      If -xl is	specified, the index is	tuned to  register  more  than
	      300000 documents.
	      If  -xh  is  specified, the index	is tuned to register more than
	      1000000 documents.
	      If -xh2 is specified, the	index is tuned to register  more  than
	      5000000 documents.
	      If  -xh3	is specified, the index	is tuned to register more than
	      10000000 documents.
	      If -sv is	specified, scores are stored as	void.
	      If -si is	specified, scores are stored as	32-bit integer.
	      If -sa is specified, scores are stored as-is and marked not to
	      be tuned at search time.

       estcmd out [-pc enc] [-cl] db expr
	      Remove information of a document from an index.
	      expr  specifies  the  ID number, the URI,	or the local path of a
	      document.
	      If -cl is	specified, regions of the document are cleaned up.
	      -pc specifies the	encoding of file paths.	  By  default,	it  is
	      ISO-8859-1.

       estcmd edit [-pc	enc] db	expr name [value]
	      Edit an attribute	of a document in an index.
	      expr  specifies  the  ID number, the URI,	or the local path of a
	      document.
	      name specifies the name of an attribute.
	      value specifies the value	of the attribute.  If it  is  omitted,
	      the attribute is removed.
	      -pc  specifies  the  encoding of the file	path and the attribute
	      value.  By default, it is	ISO-8859-1.

       estcmd get [-nl|-nb] [-pidx path] [-pc enc] db expr [attr]
	      Output the document draft of a document in an index.
	      expr specifies the ID number, the	URI, or	the local  path	 of  a
	      document.
	      If attr is specified, only the value of the attribute is output.
	      If -nl is	specified, the index is	opened without file locking.
	      If -nb is	specified, file	locking	is performed without blocking.
	      -pidx  specifies the path	of a pseudo index.  This option	can be
	      specified	multiple times.
	      -pc specifies the	encoding of file paths.	  By  default,	it  is
	      ISO-8859-1.

       estcmd list [-nl|-nb] [-lp] db
	      Output a list of all documents in an index.
	      If -nl is specified, the index is opened without file locking.
	      If -nb is specified, file locking is performed without blocking.
	      If -lp is specified, the local path equivalent to each "file://"
	      URL is output.

       estcmd uriid [-nl|-nb] [-pidx path] [-pc	enc] db	expr
	      Output the ID number of a	document specified by URI.
	      expr specifies the URI or	the local path of a document.
	      If -nl is	specified, the index is	opened without file locking.
	      If -nb is	specified, file	locking	is performed without blocking.
	      -pidx specifies the path of a pseudo index.  This	option can  be
	      specified	multiple times.
	      -pc  specifies  the  encoding  of	file paths.  By	default, it is
	      ISO-8859-1.

       estcmd meta db [name [value]]
	      Handle meta data.
	      name specifies the name of a piece of meta data.	If it is omit-
	      ted, a list of all names is output.
	      value specifies the value	of the meta data to be	recorded.   If
	      it  is  omitted, the current value is output.  If	it is an empty
	      string, the meta data is removed.

       estcmd inform [-nl|-nb] db
	      Output the number	of documents and the number of unique words in
	      an index.
	      If -nl is	specified, the index is	opened without file locking.
	      If -nb is	specified, file	locking	is performed without blocking.

       estcmd optimize [-onp] [-ond] db
	      Optimize an index	and clean up dispensable regions.
	      If -onp is specified, cleaning up dispensable regions is skipped.
	      If -ond is specified, optimizing the database files is skipped.

       estcmd merge [-cl] db target
	      Merge another index.
	      target specifies the path	of another index.
	      If -cl  is  specified,  regions  of  overwritten	documents  are
	      cleaned up.

       estcmd repair [-rst|-rsh] db
	      Repair a broken index.
	      If -rst is specified, strict consistency check is	performed.
	      If -rsh is specified, consistency	check is omitted.

       estcmd search [-nl|-nb] [-pidx path] [-ic enc]
       [-vu|-va|-vf|-vs|-vh|-vx|-dd] [-sn wnum hnum anum] [-kn num] [-um] [-ec
       rn] [-gs|-gf|-ga] [-cd] [-ni] [-sf|-sfr|-sfu|-sfi] [-hs]	[-attr expr]
       [-ord expr] [-max num] [-sk num]	[-aux num] [-dis name] [-sim id] db
       [phrase]
	      Search an	index for documents.
	      phrase specifies the search phrase.
	      If -nl is	specified, the index is	opened without file locking.
	      If -nb is	specified, file	locking	is performed without blocking.
	      -pidx  specifies the path	of a pseudo index.  This option	can be
	      specified	multiple times.
	      -ic specifies the	input encoding.	 By default, it	is UTF-8.
	      If -vu is	specified, TSV of ID number and	URI are	output.
	      If -va is	specified, multipart format  including	attributes  is
	      output.
	      If  -vf  is specified, multipart format including	document draft
	      is output.
	      If -vs is	specified, multipart format including  attributes  and
	      snippets is output.
	      If  -vh is specified, human readable format including attributes
	      and snippets is output.
	      If -vx is specified, XML including attributes and snippets is
	      output.
	      If -dd is specified, document draft data are dumped and saved
	      into separate files.
	      -sn specifies the whole width of a snippet, the width of the
	      string picked up from the beginning of the text, and the width
	      of the strings picked up around each highlighted word.
	      -kn specifies the	number of keywords to be  extracted.   By  de-
	      fault, keyword extraction	is not performed.
	      If  -um  is specified, morphological analyzers are used for key-
	      word extraction.
	      -ec specifies the lower limit of similarity eclipse.
	      If -gs is specified, every key of N-gram is checked.  By default,
	      keys are checked alternately.
	      If -gf is specified, keys of N-gram are checked every three.
	      If -ga is specified, keys of N-gram are checked every four.
	      If -cd is specified, each document is checked to confirm that it
	      definitely matches the search phrase.
	      If -ni is	specified, TF-IDF tuning is omitted.
	      If -sf is	specified, the phrase is treated as a simplified form.
	      If -sfr is specified, the	phrase is treated as a rough form.
	      If -sfu is specified, the	phrase is treated as a union form.
	      If -sfi is specified, the	phrase is treated as  an  intersection
	      form.
	      If  -hs  is  specified,  score  information  is output as	an at-
	      tribute.
	      -attr specifies an attribute search condition.  This option  can
	      be specified multiple times.
	      -ord specifies the order expression.  By default,	it is descend-
	      ing by score.
	      -max  specifies the maximum number of shown documents.  Negative
	      means unlimited.	By default, it is 10.
	      -sk specifies the	number of documents to	be  skipped.   By  de-
	      fault, it	is 0.
	      -aux  specifies  permission to adopt result of the auxiliary in-
	      dex.  If it is not more than 0, the auxiliary index is not used.
	      By default, it is	32.
	      -dis specifies the name of the distinct attribute.
	      -sim specifies the ID number of the seed document	for similarity
	      search.

       estcmd gather [-tr] [-cl] [-ws] [-no] [-fe|-ft|-fh|-fm] [-fx sufs cmd]
       [-fz] [-fo] [-rm	sufs] [-ic enc]	[-il lang] [-bc] [-lt num] [-lf	num]
       [-pc enc] [-px name] [-aa name value] [-apn|-acc]
       [-xs|-xl|-xh|-xh2|-xh3] [-sv|-si|-sa] [-ss name]	[-sd] [-cm] [-cs num]
       [-ncm] [-kn num]	[-um] db [file|dir]
	      Scan the local file system and register documents	into an	index.
	      If the third argument is the name of a file, a list of paths of
	      target documents is read from it.  If it is "-", the list is read
	      from the standard input.
	      If the third argument is the name of a directory, all files under
	      the directory are treated as target documents.
	      If -tr is specified, a new index is created regardless of whether
	      one already exists.
	      If -cl is specified, regions of overwritten documents are
	      cleaned up.
	      If -ws is specified, scores are weighted statically with the
	      score weighting attribute.
	      If -no is specified, operations are printed but not actually
	      executed.
	      If -fe is	specified, target files	are treated as document	draft.
	      By  default,  the	format is detected by the suffix of each docu-
	      ment.
	      If -ft is	specified, target files	are treated as plain text.
	      If -fh is	specified, target files	are treated as HTML.
	      If -fm is	specified, target files	are treated as MIME.
	      If -fx is specified, target files with the specified suffixes
	      are processed by the specified outer command.  "*" matches any
	      file.  If the command is prefixed with "T@", the output of the
	      command is treated as plain text.  If the command is prefixed
	      with "H@", the output is treated as HTML.  If the command is
	      prefixed with "M@", the output is treated as MIME.  Otherwise,
	      the output is treated as document draft.  This option can be
	      specified multiple times.
	      If -fz is specified, documents that do not match any condition
	      of -fx are ignored.
	      If -fo is specified, target files are not read.  This is useful
	      for efficient processing by the outer command.
	      If  -rm  is  specified, target files with	the specified suffixes
	      are removed.  "*"	matches	any file.  This	option can  be	speci-
	      fied multiple times.
	      -ic  specifies  the  input encoding.  By default,	it is detected
	      automatically.
	      -il specifies the	preferred input	language.  By default, English
	      is preferred.
	      If -bc is	specified, binary files	are detected and ignored.
	      -lt specifies the text size limit in kilobytes.  By default, it
	      is 128KB.  If it is negative, the size is unlimited.
	      -lf specifies the file size limit in megabytes.  By default, it
	      is 32MB.  If it is negative, the size is unlimited.
	      -pc specifies the	encoding of file paths.	  By  default,	it  is
	      ISO-8859-1.
	      -px specifies the name of an attribute read from the list of
	      paths.  The list of paths can be in TSV format: the first field
	      is treated as the path of a target document, and the second and
	      following fields are attribute values.  -px specifies the name
	      for each value in the second and following fields.  This option
	      can be specified multiple times.
	      -aa specifies the name and the value of an additional attribute.
	      This option can be specified multiple times.
	      If -apn is specified, N-gram analysis is also applied to European
	      text.
	      If -acc is specified, character category analysis is performed
	      instead of N-gram analysis.
	      If -xs is specified, the index is tuned to register fewer than
	      50000 documents.
	      If -xl is	specified, the index is	tuned to  register  more  than
	      300000 documents.
	      If  -xh  is  specified, the index	is tuned to register more than
	      1000000 documents.
	      If -xh2 is specified, the	index is tuned to register  more  than
	      5000000 documents.
	      If  -xh3	is specified, the index	is tuned to register more than
	      10000000 documents.
	      If -sv is	specified, scores are stored as	void.
	      If -si is	specified, scores are stored as	32-bit integer.
	      If -sa is specified, scores are stored as-is and marked not to
	      be tuned at search time.
	      -ss specifies the	name of	an attribute for substitute score.
	      If  -sd  is  specified,  the  modification  date of each file is
	      recorded as an attribute.
	      If -cm is	specified, documents whose modification	date  has  not
	      changed are ignored.
	      -cs specifies the size of cache memory in megabytes.  By default,
	      it is 64MB.
	      If -ncm is specified, checking availability of the virtual  mem-
	      ory is omitted.
	      -kn  specifies  the  number of keywords to be extracted.	By de-
	      fault, keyword extraction	is not performed.
	      If -um is	specified, morphological analyzers are used  for  key-
	      word extraction.

       estcmd purge [-cl] [-no]	[-fc] [-pc enc]	[-attr expr] db	[prefix]
	      Purge  information  of  documents	which do not exist on the file
	      system.
	      If prefix is specified, only documents whose URIs begin with it
	      are targeted.  It can be specified as the local path of a
	      directory.
	      If -cl is specified, regions of the deleted documents are
	      cleaned up.
	      If -no is specified, operations are printed but not actually
	      executed.
	      If -fc is specified, information of all target documents is
	      deleted.
	      -pc  specifies  the  encoding  of	file paths.  By	default, it is
	      ISO-8859-1.
	      -attr specifies an attribute search condition.  This option  can
	      be specified multiple times.

       estcmd extkeys [-no] [-fc] [-dfdb file] [-ncm] [-ni] [-kn num] [-um]
       [-attr expr] db [prefix]
	      Create a database	of keywords extracted from documents.
	      If prefix is specified, only documents whose URIs begin with it
	      are targeted.
	      If -no is specified, operations are printed but not actually
	      executed.
	      If -fc is specified, all target documents are processed whether
	      or not they have existing records.
	      -dfdb specifies an outer database of document frequency.  By
	      default, document	frequency is calculated	dynamically  according
	      to the index.
	      If  -ncm is specified, checking availability of the virtual mem-
	      ory is omitted.
	      If -ni is	specified, TF-IDF tuning is omitted.
	      -kn specifies the	number of keywords to be  extracted.   By  de-
	      fault, it	is 32.
	      If  -um  is specified, morphological analyzers are used for key-
	      word extraction.
	      -attr specifies an attribute search condition.  This option  can
	      be specified multiple times.

       estcmd words [-nl|-nb] [-dfdb file] [-kw|-kt] db
	      Output a list of all unique words and each record size which is
	      treated as document frequency.
	      If -nl is	specified, the index is	opened without file locking.
	      If -nb is	specified, file	locking	is performed without blocking.
	      -dfdb specifies an outer database	where the  result  is  stored.
	      By  default, the result is output	to the standard	output as TSV.
	      If the outer database already exists, the	value of  each	record
	      is incremented.
	      If -kw is	specified, keywords and	numbers	of corresponding docu-
	      ments are	output.
	      If  -kt  is specified, keywords and their	related	terms are out-
	      put.

       estcmd draft [-ft|-fh|-fm] [-ic enc] [-il lang] [-bc] [-lt num] [-kn
       num] [-um] [file]
	      For test and debug.

       estcmd break [-ic enc] [-il lang] [-apn|-acc] [-wt] [file]
	      For test and debug.

       estcmd iconv [-ic enc] [-il lang] [-oc enc] [file]
	      For test and debug.

       estcmd regex [-inv] [-repl str] expr [file]
	      For test and debug.

       estcmd scandir [-tf|-td]	[-pa|-pu] [dir]
	      For test and debug.

       estcmd multi [-db db] [-nl|-nb] [-ic enc] [-gs|-gf|-ga] [-cd] [-ni]
       [-sf|-sfr|-sfu|-sfi] [-hs] [-hu]	[-attr expr] [-ord expr] [-max num]
       [-sk num] [-aux num] [-dis name]	[phrase]
	      For test and debug.

       estcmd randput [-ren|-rla|-reu|-ror|-rjp|-rch] [-cs num]	db dnum
	      For test and debug.

       estcmd wicked db	dnum
	      For test and debug.

       estcmd regression db
	      For test and debug.

       estcmd version
	      Show the version information.
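
       As an illustration of the TSV path list accepted by gather with -px,
       the sketch below writes a two-line list and passes it to gather.  The
       function names, the document paths, and the attribute names "category"
       and "owner" are hypothetical and chosen only for this example; the
       field-to-name mapping follows the description of -px above.

```shell
#!/bin/sh
# Write a TSV path list: field 1 is the document path, the second and
# following fields are attribute values. All paths and values are made up.
write_path_list() {
    list=$1
    printf '%s\t%s\t%s\n' \
        "/docs/report.txt" "finance" "alice" \
        "/docs/manual.html" "support" "bob" > "$list"
}

# Register the listed documents. One -px is given per extra field:
# here field 2 is named "category" and field 3 is named "owner"
# (both names are assumptions for illustration).
gather_from_list() {
    db=$1
    list=$2
    estcmd gather -px category -px owner "$db" "$list"
}
```

       A call such as "write_path_list paths.tsv" followed by
       "gather_from_list casket paths.tsv" would then register both documents,
       assuming the files exist and "casket" is the intended index path.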

       All sub commands return 0 if the operation succeeds, and 1 otherwise.
       The sub commands put, out, gather, purge, randput, wicked, and
       regression close the database before exiting when they catch signal
       1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), 13 (SIGPIPE), or 15 (SIGTERM).
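
       Because success and failure are reported through the exit status, a
       calling script can branch on it.  The following is a minimal sketch;
       the function name run_estcmd and its error message are hypothetical
       and not part of Hyper Estraier.

```shell
#!/bin/sh
# Hypothetical wrapper: run an estcmd sub command and report a failure.
# It relies only on the documented exit status (0 on success, 1 otherwise).
run_estcmd() {
    if estcmd "$@"; then
        return 0
    fi
    # $1 is the sub command name, e.g. "inform" or "optimize".
    echo "estcmd $1 failed" >&2
    return 1
}
```

       For example, "run_estcmd optimize casket || exit 1" would stop a
       maintenance script as soon as a step fails ("casket" is a placeholder
       index path).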

       The data type of an attribute index specified by the -attr option of
       the create sub command should be "seq" for sequential type, "str" for
       string type, or "num" for number type.
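
       For instance, an index with one attribute index of each data type
       could be created roughly as follows.  The function name and the
       attribute names "serial", "label", and "price" are placeholders for
       this sketch, not names that estcmd requires.

```shell
#!/bin/sh
# Sketch: create an index with three attribute indexes, one per data type.
# The attribute names are hypothetical; only the type keywords
# ("seq", "str", "num") come from the manual text.
create_with_attr_indexes() {
    db=$1
    estcmd create -tr \
        -attr serial seq \
        -attr label str \
        -attr price num \
        "$db"
}
```

       Note that -tr is used here, so an existing index at the same path
       would be replaced by a new one.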

       Each  pseudo  index specified by	-pidx option of	search sub command and
       so on is	a directory containing files of	document draft.	 If you	search
       a main index with pseudo	indexes, meta search of	 the  main  index  and
       pseudo indexes is performed.
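
       A search that combines a main index with one pseudo index might look
       like the sketch below.  The function name and the example arguments
       are assumptions; only the options -vu and -pidx are taken from the
       description of search.

```shell
#!/bin/sh
# Sketch: search a main index together with a directory of document drafts.
# $1: path of the main index
# $2: directory used as a pseudo index
# $3: search phrase
search_with_pseudo_index() {
    db=$1
    pidx_dir=$2
    phrase=$3
    # -vu outputs the ID number and URI of each hit as TSV.
    estcmd search -vu -pidx "$pidx_dir" "$db" "$phrase"
}
```

       Since -pidx can be specified multiple times, further draft directories
       could be added with additional -pidx options.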

       The encoding name specified by the -ic option should be a name
       registered with the IETF, such as UTF-8 or ISO-8859-1.  The language
       name specified by the -il option should be one of "en" (English), "ja"
       (Japanese), "zh" (Chinese), or "ko" (Korean).

       The outer command specified by the -fx option of gather receives the
       path of the target document as the first argument and the path for
       output as the second argument.  The original path of the target
       document is given as the value of the environment variable
       `ESTORIGFILE'.
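
       The calling convention above can be sketched with a small filter that
       turns gzip-compressed text into plain text.  The function name
       gz_to_text and the "Source:" header line are inventions for this
       example; only the two positional arguments and ESTORIGFILE come from
       the text.

```shell
#!/bin/sh
# Hypothetical outer command body for "estcmd gather -fx".
# $1: path of the target document, as passed by estcmd
# $2: path where the converted output must be written
# ESTORIGFILE: original path of the target document, set by estcmd
gz_to_text() {
    infile=$1
    outfile=$2
    {
        # Record the original path; fall back to $1 when the variable is unset.
        printf 'Source: %s\n' "${ESTORIGFILE:-$infile}"
        # Append the decompressed text; a gzip failure becomes the exit status.
        gzip -dc "$infile"
    } > "$outfile"
}
```

       Saved as an executable script that ends with a line such as
       gz_to_text "$1" "$2", it would be registered with a "T@" prefix on the
       command so that its output is treated as plain text.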

       Note that similarity search is very slow, by default.  To  improve  the
       performance  of	similarity search, running "estcmd extkeys" beforehand
       is strongly recommended.
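
       The recommended order of operations can be sketched as two steps.
       The function names are hypothetical, and the seed document ID is
       whatever ID the caller wants to use as the starting point.

```shell
#!/bin/sh
# Step 1 (run beforehand): create the keyword database for the index.
prepare_similarity() {
    db=$1
    estcmd extkeys "$db"
}

# Step 2: similarity search seeded by a document ID, output as TSV of
# ID number and URI (-vu).
similar_to() {
    db=$1
    seed_id=$2
    estcmd search -vu -sim "$seed_id" "$db"
}
```

       Running prepare_similarity again after new documents are registered
       is presumably needed for those documents to benefit as well.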

SEE ALSO
       estconfig(1), estmaster(1), estcall(1), estwaver(1), estraier(3),
       estnode(3)

       Please see http://hyperestraier.sourceforge.net/uguide-en.html for
       details.

Man Page			  2007-03-06			     ESTCMD(1)
