LT-PROC(1)                  General Commands Manual                 LT-PROC(1)

NAME
     lt-proc -- lexical processor for Apertium

SYNOPSIS
     lt-proc [-a | -b | -o | -c | -d | -e | -g | -n | -p | -s | -t | -v | -h |
             -z | -w] [-W] [-N N] [-L N] [-i icx_file] fst_file
             [input_file [output_file]]

DESCRIPTION
     lt-proc is the application responsible for providing the four lexical
     processing functionalities:

     •   morphological analyser (option -a)
     •   lexical transfer (option -b)
     •   morphological generator (option -g)
     •   post-generator (option -p)

     It accomplishes these tasks by reading binary files containing a compact
     and efficient representation of dictionaries (a class of finite-state
     transducers called augmented letter transducers).  These files are
     generated by lt-comp(1).

     Some characters (`[', `]', `$', `^', `/', `+') are special: they are used
     for format and encapsulation, and must be escaped to be used literally.
     For instance, text enclosed in `['...`]' is ignored, and a lexical unit
     is delimited as `^...$'.

OPTIONS
     -a, --analysis
             Tokenizes the text into surface forms (lexical units as they
             appear in texts) and delivers, for each surface form, one or
             more lexical forms consisting of lemma, lexical category and
             morphological inflection information.  Tokenization is not
             straightforward because of contractions on the one hand and
             multi-word lexical units on the other.  For contractions, the
             system reads in a single surface form and delivers the
             corresponding sequence of lexical forms.  Multi-word surface
             forms are analysed in a left-to-right, longest-match fashion.
             They may be invariable (such as a multi-word preposition or
             conjunction) or inflected (for example, in Spanish, "echaban de
             menos", "they missed", is a form of the imperfect indicative
             tense of the verb "echar de menos", "to miss").  Limited support
             for some kinds of discontinuous multi-word units is also
             available.
             Analysis of single-word surface forms produces output such as:

                   "cantar" -> "^cantar/cantar<vblex><inf>$"
                   "daba" -> "^daba/dar<vblex><pii><p1><sg>/dar<vblex><pii><p3><sg>$"

     -b, --bilingual
             Does lexical transfer, attaching queues of morphological symbols
             not specified in the dictionaries.  Like analysis mode, it
             supports multiple lexical forms in the target language for a
             given lexical form in the source language.  Typically works with
             the output of apertium-pretransfer(1).

     -o, --surf-bilingual
             As with -b, but takes input from apertium-tagger(1) -p, with
             surface forms, and if the lexical form is not found in the
             bilingual dictionary, it outputs the surface form of the word.

     -c, --case-sensitive
             Use the literal case of the incoming characters.

     -d, --debugged-gen
             Morphological generation with all the stuff.

     -e, --decompose-compounds
             Try to treat unknown words as compounds, and decompose them.

     -w, --dictionary-case
             Use the case information contained in the lexicon, instead of
             the surface case (only applied in analysis mode).

     -g, --generation
             Delivers a target-language surface form for each target-language
             lexical form, by suitably inflecting it.

     -n, --non-marked-gen
             Morphological generation (like -g) but without unknown word
             marks (asterisk `*').

     -l, --tagged-gen
             Morphological generation (like -g) but retaining part-of-speech
             tags.

     -p, --post-generation
             Performs orthographical operations such as contractions and
             apostrophations.  The post-generator is usually dormant (it just
             copies the input to the output) until a special alarm symbol
             contained in some target-language surface forms wakes it up to
             perform a particular string transformation if necessary; then it
             goes back to sleep.

     -s, --sao
             Input processing is in orthoepikon (previously sao) annotation
             system format: https://orthoepikon.sf.net.
     -t, --transliteration
             Apply a transliteration dictionary.

     -i icx_file, --ignored-chars icx_file
             Ignores characters specified in the file icx_file.

     -z, --null-flush
             Flush output on the null character.

     -C, --careful-case
             Use dictionary case if present, else surface case.

     -N N, --analyses N
             Output no more than N analyses (if the transducer is weighted,
             the N best analyses).

     -L N, --weight-classes N
             Output no more than N best weight classes (where analyses with
             equal weight constitute a class).

     -W, --show-weights
             Print final analysis weights (if any).

     -v, --version
             Display the version number.

     -h, --help
             Display this help.

FILES
     fst_file
             The input compiled dictionary.

SEE ALSO
     apertium(1), apertium-tagger(1), lt-comp(1), lt-expand(1)

COPYRIGHT
     Copyright (C) 2005, 2006 Universitat d'Alacant / Universidad de
     Alicante.  This is free software.  You may redistribute copies of it
     under the terms of the GNU General Public License:
     https://www.gnu.org/licenses/gpl.html.

BUGS
     Many... lurking in the dark and waiting for you!

Apertium                        March 23, 2006                      LT-PROC(1)
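
EXAMPLES
     The analysis output shown under -a packs a surface form and its lexical
     forms into one `^...$' unit, separated by `/'.  The following is a
     minimal Python sketch of how such a unit could be unpacked.  It is not
     part of lt-proc: the function name parse_lexical_unit is hypothetical,
     and the sketch assumes the unit contains no escaped special characters.

```python
import re

# One lexical unit: ^surface/analysis1/analysis2$
# Assumption: no backslash-escaped special characters inside the unit.
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")
# One morphological tag: <vblex>, <inf>, ...
TAG = re.compile(r"<([^<>]+)>")


def parse_lexical_unit(unit):
    """Split '^surface/lemma<tag>...$' into (surface, [(lemma, tags), ...]).

    Hypothetical helper for illustration only.
    """
    match = LEXICAL_UNIT.fullmatch(unit)
    if match is None:
        raise ValueError(f"not a lexical unit: {unit!r}")
    # The first field is the surface form; the rest are lexical forms.
    surface, *analyses = match.group(1).split("/")
    parsed = []
    for analysis in analyses:
        lemma = analysis.split("<", 1)[0]  # text before the first tag
        tags = TAG.findall(analysis)       # tag names without brackets
        parsed.append((lemma, tags))
    return surface, parsed


surface, analyses = parse_lexical_unit(
    "^daba/dar<vblex><pii><p1><sg>/dar<vblex><pii><p3><sg>$"
)
print(surface)
for lemma, tags in analyses:
    print(lemma, tags)
```

     For the "daba" example, the sketch yields the surface form "daba" and
     two lexical forms of the lemma "dar" that differ only in the person
     tag (p1 versus p3).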
