FreeBSD Manual Pages

home | help
LLnextgen(1)		  LLnextgen parser generator		  LLnextgen(1)

NAME
       LLnextgen - an Extended-LL(1) parser generator

SYNOPSIS
       LLnextgen [OPTIONS] [FILES]

DESCRIPTION
       LLnextgen  is  a	 (partial) reimplementation of the LLgen ELL(1)	parser
       generator created by D. Grune and C.J.H.	Jacobs (note: this is not  the
       same as the LLgen parser	generator by Fischer and LeBlanc). It takes an
       EBNF-like description of	the grammar as input(s), and produces a	parser
       in C.

       Input  files  are  expected to end in .g. The output files will have .g
       removed and .c and .h added. If the input file does not end in .g,  the
       extensions  .c  and  .h	will  simply be	added to the name of the input
       file. Output files can also be given a different	base  name  using  the
       option --base-name (see below).

OPTIONS
       LLnextgen accepts the following options:

       -c, --max-compatibility
	      Set  options  required  for  maximum source-level	compatibility.
	      This is different	from running as	LLgen, as all  extensions  are
	      still  allowed.  LLreissue and the prototypes in the header file
	      are still	generated. This	option turns on	the --llgen-arg-style,
	      --llgen-escapes-only and --llgen-output-style options.

       -e, --warnings-as-errors
	      Treat warnings as	errors.

       -Enum, --error-limit=num
	      Set the maximum number of	errors,	before	LLnextgen  aborts.  If
	      num  is  set  0,	the error limit	is set to infinity. This is to
	      override the error limit option specified	in the grammar file.

       -h[which], --help[=which]
	      Print out	a help message,	describing the options.	 The  optional
	      which argument allows selection of which options to print. which
	      can be set to all, depend, error,	and extra.

       -V, --version
	      Print the	program	version	and copyright information, and exit.

       -v[level], --verbose[=level]
	      Increase	(without  explicit level) or set (with explicit	level)
	      the verbosity level. LLnextgen uses this option differently than
	      LLgen. At	level 1, LLnextgen will	output traces of the conflicts
	      to standard error. At level 2, LLnextgen will also write a  file
	      named LL.output with the rules containing	conflicts. At level 3,
	      LLnextgen	will include the entire	grammar	in LL.output.
	      LLgen  will  write  the  LL.output file from level 1, but	cannot
	      generate conflict	traces.	It also	has  an	 intermediate  setting
	      between LLnextgen	levels 2 and 3.

       -w[warnings], --suppress-warnings[=warnings]
	      Suppress	all or selected	warnings. Available warnings are: arg-
	      separator,   option-override,   unbalanced-c,   multiple-parser,
	      eofile,  unused[:<identifier>],  datatype	and unused-retval. The
	      unused warning can suppress all warnings about unused tokens and
	      non-terminals, or	can be used to suppress	 warnings  about  spe-
	      cific  tokens or non-terminals by	adding a colon and a name. For
	      example, to suppress warning messages about FOO not being	 used,
	      use -wunused:FOO.	Several	comma separated	warnings can be	speci-
	      fied with	one option on the command line.

       --abort
	      Generate the LLabort function.

       --base-name=name
	      Set  the base name for the output	files. Normally	LLnextgen uses
	      the name of the first input file without any trailing .g as  the
	      base  name. This option can be used to override the default. The
	      files created will be name.c and name.h.	This option cannot  be
	      used in combination with --llgen-output-style.

       --depend[=modifiers]
	      Generate	dependency  information	to be used by the make(1) pro-
	      gram. The	modifiers can be used to change	the make targets (tar-
	      gets:<targets>,  and  extra-targets:<targets>)  and  the	output
	      (file:<file>).  The  default are to use the output names as they
	      would be created by running with the same	arguments as  targets,
	      and  to  output  to standard output. Using the targets modifier,
	      the list of targets can be specified manually. The extra-targets
	      modifier allows targets to be added to the default list of  tar-
	      gets. Finally, the phony modifier	will add phony targets for all
	      dependencies to avoid make(1) problems when removing or renaming
	      dependencies. This is like the gcc(1) -MP	option.

       --depend-cpp
	      Dump  all	 top-level C-code to standard out. This	can be used to
	      generate dependency information for the generated	files by  pip-
	      ing  the	output	from LLnextgen through the C preprocessor with
	      the appropriate options.

       --dump-lexer-wrapper
	      Write the	lexer wrapper function to standard output, and exit.

       --dump-llmessage
	      Write the	default	LLmessage function  to	standard  output,  and
	      exit.

       --dump-tokens[=modifier]
	      Dump  %token  directives	for unknown identifiers	that match the
	      --token-pattern pattern. The default is  to  generate  a	single
	      %token  directive	 with all the unknown identifiers separated by
	      comma's. This default can	be overridden by modifier.  The	 modi-
	      fier  separate  produces	a  separate  %token directive for each
	      identifier, while	label produces a %label	directive. The text of
	      the label	will be	the name of the	identifier.  If	the label mod-
	      ifier and	the --lowercase-symbols	option are both	specified  the
	      label will contain only lowercase	characters.
	      Note: this option	is not always available. It requires the POSIX
	      regex API. If the	POSIX regex API	is not available on your plat-
	      form,  or	 the LLnextgen binary was compiled without support for
	      the API, you will	not be able to use this	option.

       --extensions=list
	      Specify the extensions to	be used	for the	generated  files.  The
	      list  must  be comma separated, and should not contain the . be-
	      fore the extension. The first item in the	list is	the  C	source
	      file  and	 the  second item is the header	file. You can omit the
	      extension	for the	C source file and only specify	the  extension
	      for the header file.

       --generate-lexer-wrapper[=yes|no]
	      Indicate whether to generate a wrapper for the lexical analyser.
	      As  LLnextgen requires a lexical analyser	to return the last to-
	      ken returned after detecting an error which requires inserting a
	      token to repair, most lexical analysers require a	wrapper	to ac-
	      commodate	LLnextgen. As it is identical for almost each grammar,
	      LLnextgen	can provide one. Use --dump-lexer-wrapper to  see  the
	      code.  If	 you do	specifiy this option LLnextgen will generate a
	      warning, to help remind you that a wrapper is required.
	      If you do	not want the automatically generate wrapper you	should
	      specifiy this option followed by =no.

       --generate-llmessage
	      Generate an LLmessage function. LLnextgen	requires  programs  to
	      provide  a  function  for	informing the user about errors	in the
	      input. When developing a parser, it is often desirable to	have a
	      default LLmessage.  The provided LLmessage is  very  simple  and
	      should  be  replaced by a	more elaborate one, once the parser is
	      beyond the first testing phase. Use --dump-llmessage to see  the
	      code.  This  option automatically	turns on --generate-symbol-ta-
	      ble.

       --generate-symbol-table
	      Generate a symbol	table. The symbol table	will  contain  strings
	      for  all	tokens	and character literals.	By default, the	symbol
	      table contains the token name as specified in  the  grammar.  To
	      change  the  string, for both tokens and character literals, use
	      the %label directive.

       --gettext[=macro,guard]
	      Add gettext support. A macro call	is added around	 symbol	 table
	      entries  generated from %label directives. The macro will	expand
	      to the string itself.  This is meant to allow xgettext(1)	to ex-
	      tract the	strings. The default is	N_, because that is what  most
	      people use. A guard will be included such	that compilation with-
	      out  gettext is possible by not defining the guard. The guard is
	      set to USE_NLS by	default. Translations will be  done  automati-
	      cally  in	 LLgetSymbol in	the generated parser through a call to
	      gettext.

       --keep-dir
	      Do not remove directory component	of the	input  file-name  when
	      creating	the  output file-name. By default, outputs are created
	      in the current directory.	 This option will generate the	output
	      in the directory of the input.

       --llgen-arg-style
	      Use semicolons as	argument separators in rule headers. LLnextgen
	      uses comma's by default, as this is what ANSI C does.

       --llgen-escapes-only
	      Only  allow  the	escape sequences defined by LLgen in character
	      literals.	 By default LLnextgen also allows \a, \v, \?, \",  and
	      hexadecimal constants with \x.

       --llgen-output-style
	      Generate	one  .c	 output	 per  input, and the files Lpars.c and
	      Lpars.h, instead of one .c and one .h file based on the name  of
	      the first	input.

       --lowercase-symbols
	      Convert  the token names used for	generating the symbol table to
	      lower case.  This	only applies to	tokens for which no %label di-
	      rective has been specified.

       --no-allow-label-create
	      Do not allow the %label directive	to  create  new	 tokens.  Note
	      that  this  requires  that  the token being labelled is either a
	      character	literal	or a %token directive creating the named token
	      has preceded the %label directive.

       --no-arg-count
	      Do not check argument counts for rules. LLnextgen	checks whether
	      a	rule is	used with the same number of arguments as  it  is  de-
	      fined.  LLnextgen	 also checks that any rules for	which a	%start
	      directive	is specified, the number of arguments is 0.

       --no-eof-zero
	      Do not use 0 as end-of-file token. (f)lex(1) uses	0 as the  end-
	      of-file token. Other lexical-analyser generators may use -1, and
	      may use 0	for something else (e.g. the nul character).

       --no-init-llretval
	      Do  not  initialise LLretval with	0 bytes. Note that you have to
	      take care	of initialisation of LLretval yourself when using this
	      option.

       --no-line-directives
	      Do not generate #line directives in the output. This  means  all
	      errors  will be reported relative	to the output file. By default
	      LLnextgen	generates #line	directives to make the C compiler gen-
	      erate errors relative to the LLnextgen input file.

       --no-llreissue
	      Do not generate the LLreissue variable, which is used  to	 indi-
	      cate when	a token	should be reissued by the lexical analyser.

       --no-prototypes-header
	      Do not generate prototypes for the parser	and other functions in
	      the header file.

       --not-only-reachable
	      Do  not  only analyse reachable rules. LLnextgen by default does
	      not take unreachable rules  into	account	 when  doing  conflict
	      analysis,	as these can cause spurious conflicts. However,	if the
	      unreachable  rules will be used in the future, one might already
	      want to be notified of problems with these rules.	 LLgen by  de-
	      fault does analyse unreachable rules.
	      Note:  in	 the case where	a rule is unreachable because the only
	      alternative of another reachable rule that mentions it is	 never
	      chosen (because of a %avoid directive), the rule is still	deemed
	      reachable	for the	analysis. The only way to avoid	this behaviour
	      is  by  doing the	complete analysis twice, which is an excessive
	      amount of	work to	do for a very rare case.

       --reentrant
	      Generate a reentrant parser.  By	default,  LLnextgen  generates
	      non-reentrant parsers. A reentrant parser	can be called from it-
	      self, but	not from another thread. Use --thread-safe to generate
	      a	thread-safe parser.
	      Note  that  when	multiple  parsers are specified	in one grammar
	      (using multiple %start directives), and  one  of	these  parsers
	      calls  another,  either  the --reentrant option or the --thread-
	      safe option is also required. If these parsers are  only	called
	      when none	of the others is running, the option is	not necessary.
	      Use only in combination with a reentrant lexical analyser.

       --show-dir
	      Show  directory  names of	source files in	error and warning mes-
	      sages. These are usually omitted for readability,	but may	 some-
	      times be necessary for tracing errors.

       --thread-safe
	      Generate a thread-safe parser. Thread-safe parsers can be	run in
	      parallel in different threads of the same	program. The interface
	      of  a thread-safe	parser is different from the regular (and then
	      reentrant) version. See the detailed manual for more details.

       --token-pattern=pattern
	      Specify a	regular	expression to match with  unknown  identifiers
	      used in the grammar. If an unknown identifier matches, LLnextgen
	      will  generate  a	token declaration for the identifier. This op-
	      tion is primarily	implemented to aid in the first	stages of  de-
	      velopment, to allow for quick testing for	conflicts without hav-
	      ing  to specify all the tokens yet. A list of tokens can be gen-
	      erated with the --dump-tokens option.
	      Note: this option	is not always available. It requires the POSIX
	      regex API. If the	POSIX regex API	is not available on your plat-
	      form, or the LLnextgen binary was	compiled without  support  for
	      the API, you will	not be able to use this	option.

       By  running  LLnextgen using the	name LLgen, LLnextgen goes into	LLgen-
       mode. This is implemented by turning off	all default extra  functional-
       ity  like  LLreissue,  and disallowing all extensions to	the LLgen lan-
       guage. When running as LLgen, LLnextgen accepts the  following  options
       from LLgen:

       -a     Ignored. LLnextgen only generates	ANSI C.

       -hnum  Ignored.	LLnextgen  leaves optimisation of jump tables entirely
	      up to the	C-compiler.

       -j[num]
	      Ignored. LLnextgen leaves	optimisation of	jump  tables  entirely
	      up to the	C-compiler.

       -l[num]
	      Ignored.	LLnextgen  leaves optimisation of jump tables entirely
	      up to the	C-compiler.

       -v     Increase the verbosity level. See	the description	of the -v  op-
	      tion above for details.

       -w     Suppress all warnings.

       -x     Ignored.	LLnextgen  will	only generate token sets in LL.output.
	      The extensive error-reporting mechanisms in LLnextgen make  this
	      feature obsolete.

       LLnextgen  cannot  create  parsers  with	non-correcting error-recovery.
       Therefore, using	the -n or -s options will cause	LLnextgen to print  an
       error message and exit.

COMPATIBILITY WITH LLGEN
       At  this	 time  the  basic LLgen	functionality is implemented. This in-
       cludes everything apart from the	extended user error-handling with  the
       %onerror	directive and the non-correcting error-recovery.

       Although	 I've  tried to	copy the behaviour of LLgen accurately,	I have
       implemented some	aspects	slightly differently. The following is a  list
       of the differences in behaviour between LLgen and LLnextgen:

       •      LLgen generated both K&R style C code and	ANSI C code. LLnextgen
	      only supports generation of ANSI C code.

       •      There  is	a minor	difference in the determination	of the default
	      choices.	LLnextgen simply chooses the first production with the
	      shortest possible	terminal production, while  LLgen  also	 takes
	      the complexity in	terms of non-terminals and terms into account.
	      There  is	 also  a  minor	difference when	there is more than one
	      shortest alternative and some of them are	 marked	 with  %avoid.
	      Both  differences	are not	very important as the user can specify
	      which alternative	should be the default,	thereby	 circumventing
	      the differences in the algorithms.

       •      The  default behaviour of	generating one output C	file per input
	      and Lpars.c and Lpars.h has been changed in favour of generating
	      one .c file and one .h file. The rationale  given	 for  creating
	      multiple	output	files in the first place was that it would re-
	      duce the compilation time	for the	generated parser. As  computa-
	      tion  power  has	become	much  more abundant this feature is no
	      longer necessary,	and the	difficult interaction  with  the  make
	      program  makes it	undesirable. The LLgen behaviour is still sup-
	      ported through a command-line switch.

       •      in LLgen one could have a	parser and a  %first  macro  with  the
	      same  name.   LLnextgen forbids this, as it leads	to name	colli-
	      sions in the new file naming scheme. For the old LLgen file nam-
	      ing scheme it could also easily lead  to	name  collisions,  al-
	      though  they  could be circumvented by not mentioning the	parser
	      in any of	the C code in the .g files.

       •      LLgen names the labels it	generates L_X, where X	is  a  number.
	      LLnextgen	names these LL_X.

       •      LLgen  parsers are always	reentrant. As this feature is not used
	      very often, LLnextgen parsers are	non-reentrant unless  the  op-
	      tion --reentrant is used.

       Furthermore,  LLnextgen has many	extended features, for easier develop-
       ment.

BUGS
       If you think you	have found a bug, please check that you	are using  the
       latest  version of LLnextgen [http://os.ghalkes.nl/LLnextgen]. When re-
       porting bugs, please include a minimal grammar  that  demonstrates  the
       problem.

AUTHOR
       G.P. Halkes <llnextgen@ghalkes.nl>

COPYRIGHT
       Copyright (C) 2005-2008 G.P. Halkes
       LLnextgen is licensed under the GNU General Public License version 3.
       For more	details	on the license,	see the	file COPYING in	the documenta-
       tion	directory.     On     Un*x    systems	 this	 is    usually
       /usr/share/doc/LLnextgen-0.5.5.

SEE ALSO
       LLgen(1), bison(1), yacc(1), lex(1), flex(1).

       A detailed manual for LLnextgen is available as part of	the  distribu-
       tion.   It includes the syntax for the grammar files, details on	how to
       use the generated parser	in your	programs, and details on the  workings
       of the generated	parsers. This manual can be found in the documentation
       directory.      On      Un*x	 systems      this	is     usually
       /usr/share/doc/LLnextgen-0.5.5.

Version	0.5.5			  31-12-2011			  LLnextgen(1)
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=LLnextgen&sektion=1&manpath=FreeBSD+Ports+15.0>
home | help
Header And Logo

Peripheral Links

Site Navigation

FreeBSD Manual Pages

Header And Logo

Peripheral Links

Search

Site Navigation

FreeBSD Manual Pages