timbl(1)		    General Commands Manual		      timbl(1)

NAME
       timbl - Tilburg Memory Based Learner

SYNOPSIS
       timbl [options]

       timbl -f	data-file -t test-file

DESCRIPTION
       TiMBL is an open source software package implementing several
       memory-based learning algorithms, among which IB1-IG, an
       implementation of k-nearest neighbor classification with feature
       weighting suitable for symbolic feature spaces, and IGTree, a
       decision-tree approximation of IB1-IG. All implemented algorithms
       have in common that they store some representation of the training
       set explicitly in memory. During testing, new cases are classified
       by extrapolation from the most similar stored cases.
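
       As a rough illustration of the memory-based idea described above,
       the following Python sketch keeps the training cases verbatim and
       classifies a new case by majority vote among its k closest stored
       cases under an unweighted overlap metric. The function names and
       the toy data are hypothetical; this is not TiMBL's implementation,
       and it ignores feature weighting and TiMBL's own tie handling.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two cases differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, case, k=1):
    """Return the majority class among the k stored cases closest to `case`.

    `memory` is a list of (features, label) pairs that is kept as-is,
    mirroring the 'store the training set explicitly' idea.
    """
    ranked = sorted(memory, key=lambda item: overlap_distance(item[0], case))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: four symbolic features and a class label.
memory = [
    (("small", "red", "round", "smooth"), "apple"),
    (("small", "yellow", "long", "smooth"), "banana"),
    (("large", "green", "round", "rough"), "melon"),
]
query = ("small", "red", "round", "rough")
print(classify(memory, query, k=1))
```

       The query differs from the first stored case in one feature, from
       the third in two and from the second in three, so with k=1 the
       label of the first stored case is returned.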

OPTIONS
       -a <n> or -a <string>
	      determines the classification algorithm.

	      Possible values are:

	      0	or IB
	       the IB1 (k-NN) algorithm	(default)

	      1	or IGTREE
	       a decision-tree-based approximation of IB1

	      2	or TRIBL
	       a hybrid	of IB1 and IGTREE

	      3	or IB2
	       an incremental editing version of IB1

	      4	or TRIBL2
               a non-parametric version of TRIBL

       -b n
	      number of	lines used for bootstrapping (IB2 only)

       -B n
	      number of	bins used for discretization of	numeric	feature	values
	      (Default B=20)

       --Beam=<n>
	      limit +v db output to n highest-vote classes

       --clones=<n>
              number of threads to use for parallel testing

       -c n
	      clipping frequency for prestoring	MVDM matrices

       +D
	      store distributions on all nodes (necessary for using +v db with
	      IGTree, but wastes memory	otherwise)

       --Diversify
	      rescale weight (see docs)

       -d val
              weigh neighbors as a function of their distance:
               Z      : equal weights to all (default)
               ID     : Inverse Distance
               IL     : Inverse Linear
               ED:a   : Exponential Decay with factor a (no whitespace!)
               ED:a:b : Exponential Decay with factors a and b (no whitespace!)
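
              To give a feel for what the four schemes above do, the
              Python sketch below maps a list of neighbor distances to
              vote weights. The formulas are the commonly cited textbook
              forms (1/d for inverse distance, a linear ramp between the
              nearest and farthest neighbor for inverse linear, and
              exp(-a * d^b) for exponential decay). Whether TiMBL uses
              exactly these formulas and constants is an assumption, and
              the function name and the eps parameter are hypothetical.

```python
import math


def neighbor_weights(distances, scheme="Z", a=1.0, b=1.0, eps=1e-9):
    """Map neighbor distances to vote weights.

    scheme: "Z" (equal), "ID" (inverse distance), "IL" (inverse linear),
    or "ED" (exponential decay with factors a and b).
    """
    if scheme == "Z":
        return [1.0 for _ in distances]
    if scheme == "ID":
        # eps avoids division by zero for exact matches (an assumption).
        return [1.0 / (d + eps) for d in distances]
    if scheme == "IL":
        nearest, farthest = min(distances), max(distances)
        if farthest == nearest:
            # All neighbors are equally far away: fall back to equal votes.
            return [1.0 for _ in distances]
        return [(farthest - d) / (farthest - nearest) for d in distances]
    if scheme == "ED":
        return [math.exp(-a * d ** b) for d in distances]
    raise ValueError(f"unknown scheme: {scheme!r}")


print(neighbor_weights([1.0, 2.0, 3.0], "IL"))
```

              Under the inverse linear ramp the nearest neighbor gets
              weight 1 and the farthest gets weight 0, so the farthest
              neighbor no longer contributes to the vote.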

       -e n
	      estimate time until n patterns tested

       -f file
              read from data file 'file' OR use filenames from 'file' for
              cross validation test

       -F format
              assume the specified input format (Compact, C4.5, ARFF, Columns,
              Binary, Sparse)

       -G normalization
              normalize distributions (+v db option only)

              Supported normalizations are:

              Probability or 0
               normalize between 0 and 1

              addFactor:<f> or 1:<f>
               add f to all possible targets, then normalize between 0 and 1
               (default f=1.0)

              logProbability or 2
               add 1 to the target weight, take the base-10 logarithm, and
               then normalize between 0 and 1
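
              The three normalizations can be sketched in Python as
              follows. The sketch assumes that "normalize between 0 and
              1" means dividing every class weight by the sum of all
              weights; that reading, the function name and the
              all_targets parameter are assumptions, not something this
              page confirms.

```python
import math


def normalize(dist, mode="Probability", f=1.0, all_targets=None):
    """Rescale a {class: weight} distribution so that its values sum to 1.

    mode is one of "Probability", "addFactor" or "logProbability".
    all_targets lists every possible class (used by "addFactor" only).
    """
    if mode == "Probability":
        scores = dict(dist)
    elif mode == "addFactor":
        targets = all_targets if all_targets is not None else list(dist)
        # Classes absent from the distribution start from weight 0.
        scores = {t: dist.get(t, 0.0) + f for t in targets}
    elif mode == "logProbability":
        scores = {t: math.log10(w + 1.0) for t, w in dist.items()}
    else:
        raise ValueError(f"unknown normalization: {mode!r}")
    total = sum(scores.values())
    if total == 0:
        return scores  # nothing to rescale
    return {t: s / total for t, s in scores.items()}


print(normalize({"A": 3.0, "B": 1.0}))
print(normalize({"A": 3.0, "B": 1.0}, "addFactor", all_targets=["A", "B", "C"]))
```

              With addFactor, a class that received no votes at all still
              ends up with a small non-zero share, which is the usual
              motivation for this kind of smoothing.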

       +H or -H
	      write hashed trees (default +H)

       -i file
              read the InstanceBase from 'file' (skips phases 1 and 2)

       -I file
	      dump the InstanceBase in 'file'

       -k n
              search for the n nearest neighbors (default n = 1)

       -L n
              set value frequency threshold to back off from MVDM to Overlap
              at level n

       -l n
	      fixed feature value length (Compact format only)

       -m string
	      use feature metrics as specified in 'string':
	       The format is : GlobalMetric:MetricRange:MetricRange
			 e.g.: mO:N3:I2,5-7

	       C: cosine distance. (Global only. numeric features implied)
	       D: dot product. (Global only. numeric features implied)
	       DC: Dice	coefficient
	       O: weighted overlap (default)
               E: Euclidean distance
	       L: Levenshtein distance
	       M: modified value difference
	       J: Jeffrey divergence
	       S: Jensen-Shannon divergence
	       N: numeric values
	       I: Ignore named	values
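
              The metric string format can be illustrated with a small
              Python parser. It assumes that the leading "m" in the
              example above is the option letter written without a
              space, so the string it handles is "O:N3:I2,5-7". The
              function name is hypothetical and the parsing rules are
              inferred from the example only, not taken from TiMBL's
              source.

```python
import re


def parse_metric_string(spec):
    """Split e.g. 'O:N3:I2,5-7' into a global metric and per-feature overrides.

    Returns (global_metric, {feature_number: metric}). Feature numbers are
    taken exactly as written in the string.
    """
    parts = spec.split(":")
    global_metric, overrides = parts[0], {}
    for part in parts[1:]:
        # One or more metric letters followed by numbers, commas and dashes.
        match = re.fullmatch(r"([A-Za-z]+)([\d,\-]+)", part)
        if match is None:
            raise ValueError(f"cannot parse metric range: {part!r}")
        metric, ranges = match.groups()
        for chunk in ranges.split(","):
            lo, _, hi = chunk.partition("-")
            for feature in range(int(lo), int(hi or lo) + 1):
                overrides[feature] = metric
    return global_metric, overrides


print(parse_metric_string("O:N3:I2,5-7"))
```

              Read this way, the example selects weighted overlap
              globally, treats feature 3 as numeric, and ignores
              features 2, 5, 6 and 7.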

       --matrixin=file
	      read ValueDifference Matrices from file 'file'

       --matrixout=file
	      store ValueDifference Matrices in	'file'

       -n file
	      create a C4.5-style names	file 'file'

       -M n
	      size of MaxBests Array

       -N n
	      number of	features (default 2500)

       -o s
	      use s as output filename

       --occurrences=<value>
              the input file contains occurrence counts (at the last
              position); value can be one of: train, test or both

       -O path
	      save output using	'path'

       -p n
	      show progress every n lines (default p = 100,000)

       -P path
	      read data	using 'path'

       -q n
	      set TRIBL	threshold at level n

       -R n
	      solve ties at random with	seed n

       -s
	      use the exemplar weights from the	input file

       -s0
	      ignore the exemplar weights from the input file

       -T n
	      use feature n as the class label.	(default: the last feature)

       -t file
	      test using 'file'

       -t leave_one_out
              test with the leave-one-out testing regimen (IB1 only). You may
              add --sloppy to speed up leave-one-out testing (but see docs)
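
              Leave-one-out testing in general can be sketched in Python
              as below: each case is classified using all other cases as
              memory, and the fraction of correct predictions is
              reported. This is a generic illustration with hypothetical
              names and an unweighted overlap metric; it says nothing
              about how TiMBL implements the regimen or what --sloppy
              changes.

```python
from collections import Counter


def overlap(a, b):
    """Count the feature positions at which two cases differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def leave_one_out_accuracy(cases, k=1):
    """Classify every case against all the others and return the accuracy.

    `cases` is a list of (features, label) pairs.
    """
    correct = 0
    for i, (features, label) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]  # hold out the case under test
        ranked = sorted(rest, key=lambda item: overlap(item[0], features))
        votes = Counter(lab for _, lab in ranked[:k])
        if votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(cases)


# Hypothetical toy data: two symbolic features and a class label.
cases = [
    (("a", "x"), "P"),
    (("a", "y"), "P"),
    (("b", "z"), "Q"),
    (("b", "w"), "Q"),
]
print(leave_one_out_accuracy(cases, k=1))
```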

       -t cross_validate
	      perform cross-validation test (IB1 only)

       -t @file
              test using files and options described in 'file'. Supported
              options: d e F k m o p q R t u v w x % -

       --Treeorder=value
	      ordering of the Tree:
	       DO: none
	       GRO: using GainRatio
	       IGO: using InformationGain
	       1/V: using 1/# of Values
               G/V: using GainRatio/# of Values
               I/V: using InfoGain/# of Values
	       X2O: using X-square
	       X/V: using X-square/# of	Values
	       SVO: using Shared Variance
	       S/V: using Shared Variance/# of Values
	       GxE: using GainRatio * SplitInfo
	       IxE: using InformationGain * SplitInfo
	       1/S: using 1/SplitInfo

       -u file
	      read value-class probabilities from 'file'

       -U file
	      save value-class probabilities in	'file'

       -V
	      Show VERSION

       +v level	or -v level
	      set or unset verbosity level, where level	is:

	       s:  work	silently
	       o:  show	all options set
	       b:  show	node/branch count and branching	factor
	       f:  show	calculated feature weights (default)
	       p:  show	value difference matrices
	       e:  show	exact matches
	       as: show	advanced statistics (memory consuming)
	       cm: show	confusion matrix (implies +vas)
	       cs: show	per-class statistics (implies +vas)
	       cf: add confidence to output file (needs	-G)
	       di: add distance	to output file
	       db: add distribution of best matched to output file
	       md: add matching	depth to output	file.
               k:  add a summary for all k neighbors to output file (sets -x)
               n:  add nearest neighbors to output file (sets -x)

		You may	combine	levels using '+' e.g. +v p+db or -v o+di

       -w n
	      weighting
	       0 or nw:	no weighting
	       1 or gr:	weigh using gain ratio (default)
	       2 or ig:	weigh using information	gain
	       3 or x2:	weigh using the	chi-square statistic
	       4 or sv:	weigh using the	shared variance	statistic
               5 or sd: weigh using standard deviation (all features must be
                        numeric)
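
              The difference between the information gain and gain ratio
              weightings can be sketched in Python using their standard
              definitions: information gain is the reduction in class
              entropy after splitting on a feature, and gain ratio
              divides that by the entropy of the feature's own value
              distribution (the split info). The function names and toy
              data are hypothetical, and the sketch makes no claim about
              TiMBL's exact computation.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def information_gain(values, labels):
    """Class entropy minus the expected class entropy per feature value."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain divided by the split info of the feature."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # a feature with a single value carries no information
    return information_gain(values, labels) / split_info


labels = ["x", "x", "y", "y"]
two_valued = ["a", "a", "b", "b"]   # predicts the class with 2 values
four_valued = ["a", "b", "c", "d"]  # predicts the class with 4 values
print(information_gain(two_valued, labels), gain_ratio(two_valued, labels))
print(information_gain(four_valued, labels), gain_ratio(four_valued, labels))
```

              Both toy features have the same information gain, but the
              four-valued one gets a lower gain ratio, which shows how
              gain ratio penalizes features with many values.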

       -w file
	      read weights from	'file'

       -w file:n
	      read weight n from 'file'

       -W file
	      calculate	and save all weights in	'file'

       +% or -%
	      do or don't save test result (%) to file

       +x or -x
	      do or don't use the exact	match shortcut
		 (IB1 and IB2 only, default is -x)

       -X file
	      dump the InstanceBase as XML in 'file'

BUGS
       possibly

AUTHORS
       Ko van der Sloot	Timbl@uvt.nl

       Antal van den Bosch Timbl@uvt.nl

SEE ALSO
       timblserver(1)

				2017 November 9			      timbl(1)
