FreeBSD Manual Pages

runawk(1)							     runawk(1)

NAME
       runawk -	wrapper	for AWK	interpreter

SYNOPSIS
       runawk [options]	program_file

       runawk -e program

MOTIVATION
       After years of using AWK for programming I've found that, despite its
       simplicity and limitations, AWK is good enough for scripting a wide
       range of different tasks. AWK is not as powerful as its bigger
       counterparts like Perl, Ruby, TCL and others, but it has its own
       advantages like compactness, simplicity and availability on almost all
       UNIX-like systems. I personally also like its data-driven nature and
       token orientation, very useful techniques for text processing
       utilities.

       Unfortunately, awk interpreters lack some important features and
       sometimes do not work as well as they could.

       Problems	I see (some of them, of	course)

       1.
         AWK lacks support for modules. Even if I create small programs, I
         often want to use functions created earlier and already used in other
         scripts. That is, it would be great to organise functions into
         so-called libraries (modules).

       2.
         In order to pass arguments to a "#!/usr/bin/awk -f" script (not to the
         awk interpreter), it is necessary to prepend the list of arguments
         with -- (two minus signs). In my view, this looks bad. Such behaviour
         also violates the POSIX/SUS "Utility Syntax Guidelines".

	 Example:

	 awk_program:

	     #!/usr/bin/awk -f

	     BEGIN {
		for (i=1; i < ARGC; ++i){
		   printf "ARGV	[%d]=%s\n", i, ARGV [i]
		}
	     }

	 Shell session:

	     % awk_program --opt1 --opt2
	     /usr/bin/awk: unknown option --opt1 ignored

	     /usr/bin/awk: unknown option --opt2 ignored

	     % awk_program -- --opt1 --opt2
	     ARGV [1]=--opt1
	     ARGV [2]=--opt2
	     %

	 In my opinion awk_program script should work like this

	     % awk_program --opt1 --opt2
	     ARGV [1]=--opt1
	     ARGV [2]=--opt2
	     %

       3.
         When a "#!/usr/bin/awk -f" script handles arguments (options) and
         wants to read from stdin, it is necessary to add /dev/stdin (or `-')
         explicitly as the last argument.

	 Example:

	 awk_program:

	     #!/usr/bin/awk -f

	     BEGIN {
		if (ARGV [1] ==	"--flag"){
		   flag	= 1
		   ARGV	[1] = "" # to not read file named "--flag"
		}
	     }

	     {
		print "flag=" flag " $0=" $0
	     }

	 Shell session:

	     % echo test | awk_program -- --flag
	     % echo test | awk_program -- --flag /dev/stdin
	     flag=1 $0=test
	     %

	 Ideally awk_program should work like this

	     % echo test | awk_program --flag
	     flag=1 $0=test
	     %

       4.
         igawk(1), which is shipped with GNU awk, cannot be used in a shebang
         line. On most (all?) UNIXes scripts beginning with

	     #!/usr/local/bin/igawk -f

	 will not work.

       runawk was created to solve all these problems.

OPTIONS
       -d    Turn on debugging mode.

       -e program
	     Specify program. If -e is not specified, the  AWK	code  is  read
	     from program_file.

       -f awk_module
	     Activate awk_module. This works the same way as

		 #use "awk_module.awk"

	     directive in the code. Multiple -f	options	are allowed.

       -F fs Set the input field separator FS to the regular expression	fs.

       -h    Display help information.

       -t    If this option is given, runawk creates a temporary directory and
             passes the path to it to the awk child process. The temporary
             directory is created under ${RUNAWK_TMPDIR} (if it is set),
             otherwise under ${TMPDIR} (if it is set), otherwise under /tmp.
             This option is activated automatically if #use "tmpfile.awk" is
             detected in the program.

       -T    Set FS to the TAB character. This is equivalent to -F'\t'.

       -V    Display version information.

       -v var=val
	     Assign the	value val to the variable var before execution of  the
	     program begins.
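
       The base-directory fallback documented for -t can be sketched in shell.
       This only illustrates the stated precedence and is not runawk's own
       code: the function name tmp_base is made up, and treating an empty
       variable the same as an unset one is an assumption.

```shell
# Hypothetical helper: print the directory under which the temporary
# directory would be created, following the precedence documented for -t.
# Assumption: an empty variable counts as "not set".
tmp_base() {
    if [ -n "${RUNAWK_TMPDIR:-}" ]; then
        printf '%s\n' "$RUNAWK_TMPDIR"
    elif [ -n "${TMPDIR:-}" ]; then
        printf '%s\n' "$TMPDIR"
    else
        printf '%s\n' /tmp
    fi
}
```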

DETAILS/INTERNALS
   Standalone script
       On UNIX-like systems you can use runawk by beginning your script with
       the line

          #!/usr/local/bin/runawk

       or something similar, instead of

          #!/usr/bin/awk -f

       or similar.

   AWK modules
       To activate modules, add them to the awk script like this:

         #use "module1.awk"
         #use "module2.awk"

       That is, a line that specifies a module name is treated as a comment
       line by a normal AWK interpreter but is processed specially by runawk.

       Unless you run runawk with the -e option, #use must begin in column 0;
       that is, no spaces or tabs are allowed before it, and no characters are
       allowed between # and use.

       Note also that AWK modules can "use" other modules, and so forth. All
       of them are collected in depth-first order, and each one is added to
       the list of awk interpreter arguments, prepended with the -f option.
       That is, the #use directive is *NOT* similar to #include in the C
       programming language: runawk's module code is not inserted in place of
       #use. Runawk's modules are closer to Perl's "use" command. If a module
       is mentioned more than once, only one -f is added for it, i.e.
       duplicates are removed automatically.

       The position of a #use directive in a source file does matter: the
       earlier a module is mentioned, the earlier its -f option is generated.

       Example:

	 file prog:
	    #!/usr/local/bin/runawk

	    #use "A.awk"
	    #use "B.awk"
	    #use "E.awk"

	    PROG code
	    ...

	 file B.awk:
	    #use "A.awk"
	    #use "C.awk"
	    B code
	    ...

	 file C.awk:
	    #use "A.awk"
	    #use "D.awk"

	    C code
	    ...

         A.awk and D.awk do not contain #use directives.

       If you run

	 runawk	prog file1 file2

       or

	 /path/to/prog file1 file2

       the following command

	 awk -f	A.awk -f D.awk -f C.awk	-f B.awk -f E.awk -f prog -- file1 file2

       will actually run.

       You can check this by running

	 runawk	-d prog	file1 file2
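
       The ordering in this example can be reproduced with a small shell
       sketch of the depth-first collection. It is an illustration under
       stated assumptions, not runawk's implementation: the names collect,
       seen, order and MODDIR are made up, all modules are assumed to live in
       one directory, and module names are assumed to contain no whitespace.

```shell
# Sketch of the depth-first collection of #use directives.
# Assumptions (not taken from runawk): every module lives in $MODDIR and
# module names contain no whitespace or glob characters.
MODDIR=${MODDIR:-.}
seen=" "     # space-delimited set of modules already visited
order=""     # accumulated "-f module" options, in generation order

collect() {
    # Replace this call's positional parameters with the modules named in
    # file "$1". Positional parameters are per-call, so the list survives
    # the recursive calls below.
    set -- $(sed -n 's/^#use "\([^"]*\)".*/\1/p' "$1")
    while [ $# -gt 0 ]; do
        case "$seen" in
            *" $1 "*) ;;                     # mentioned before: no second -f
            *)  seen="$seen$1 "
                collect "$MODDIR/$1"         # its own dependencies come first
                order="$order -f $1" ;;
        esac
        shift
    done
}
```

       With the files from the example in the current directory, running
       collect prog and then printing "awk$order -f prog -- file1 file2"
       yields the command line shown above.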

   Module search strategy
       A module is first searched for in the directory where the main program
       (or the module in which the #use directive is specified) is located. If
       it is not found there, the AWKPATH environment variable is checked.
       AWKPATH holds a colon-separated list of search directories. Finally,
       the module is searched for in the system runawk modules directory, by
       default PREFIX/share/runawk, but this can be changed at compile time.

       An absolute path	to the module can also be specified.
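
       The search order above can be sketched as a shell function. This is a
       rough illustration, not runawk's code: find_module and RUNAWK_MODDIR
       are made-up names, the fallback /usr/local/share/runawk assumes that
       PREFIX is /usr/local, and empty AWKPATH entries are simply skipped.

```shell
# Hypothetical helper: print the path of module "$2" as referenced from
# file "$1", or fail if it is not found.
# RUNAWK_MODDIR is a made-up stand-in for the compile-time system directory.
find_module() (
    user=$1
    mod=$2
    case "$mod" in
        /*) # An absolute path is used as given.
            if [ -f "$mod" ]; then printf '%s\n' "$mod"; exit 0; fi
            exit 1 ;;
    esac
    # Running in a subshell, so changing IFS and globbing here is harmless.
    IFS=:
    set -f
    for dir in "$(dirname "$user")" ${AWKPATH:-} \
               "${RUNAWK_MODDIR:-/usr/local/share/runawk}"; do
        [ -n "$dir" ] || continue
        if [ -f "$dir/$mod" ]; then
            printf '%s\n' "$dir/$mod"
            exit 0
        fi
    done
    exit 1
)
```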

   Program as an argument
       Like some other interpreters  runawk  can  obtain  the  script  from  a
       command line like this

	/path/to/runawk	-e '
	#use "alt_assert.awk"

	{
	  assert($1 >= 0 && $1 <= 10, "Bad value: " $1)

	  # your code below
	  ...
	}'

       runawk can also be used for writing one-liners:

	runawk -f abs.awk -e 'BEGIN {print abs(-1)}'

   Selecting a preferred AWK interpreter
       For some reason you may prefer one AWK interpreter over another. The
       reason may be efficiency for a particular task, useful but non-standard
       extensions, or anything else. To tell runawk which AWK interpreter to
       use, use the #interp directive:

	 file prog:
	    #!/usr/local/bin/runawk

	    #use "A.awk"
	    #use "B.awk"

	    #interp "/usr/pkg/bin/nbawk"

	    # your code	here
	    ...

       Note that the #interp directive should also begin in column 0: no
       spaces are allowed before it or between # and interp.

       Sometimes it also makes sense to give users the ability to select their
       preferred AWK interpreter without changing the source code. In runawk
       this is possible using the special directive #interp-var, which names
       an environment variable that the user can set to specify an AWK
       interpreter. For example, the following script

	 file foobar:
	    #!/usr/bin/env runawk

	    #interp-var	"FOOBAR_AWK"

	    BEGIN {
	       print "This is a	FooBar application"
	    }

       can be run as

	    env	FOOBAR_AWK=mawk	foobar

       or just

	    foobar

       In the former case mawk will be used as the AWK interpreter; in the
       latter, the default AWK interpreter.
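
       The behaviour of the foobar example can be sketched in shell as a
       lookup through a variable whose name is given at run time. This is
       only a sketch: pick_interp and DEFAULT_AWK are made-up names, and
       DEFAULT_AWK merely stands in for whatever runawk's default interpreter
       is.

```shell
# DEFAULT_AWK is a hypothetical stand-in for runawk's default interpreter.
DEFAULT_AWK=${DEFAULT_AWK:-awk}

# Hypothetical helper: "$1" is the variable name given to #interp-var,
# for example FOOBAR_AWK. It is assumed to be a valid shell identifier.
pick_interp() {
    eval "val=\${$1:-}"                    # read the variable named by $1
    printf '%s\n' "${val:-$DEFAULT_AWK}"   # fall back to the default
}
```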

   Using existing modules only
       In the UNIX world it is common practice to write configuration files in
       the programming language of the application. That is, if an application
       is written in Bourne shell, its configuration files are often written
       in Bourne shell as well. Using RunAWK one can do the same for
       applications written in AWK. For example, the following code will use
       the ~/.foobarrc file if it exists; otherwise /etc/foobar.conf will be
       used if it exists.

	 file foobar:
	   #!/usr/bin/env runawk

	   #safe-use "~/.foobarrc" "/etc/foobar.conf"

	   BEGIN {
	     print foo,	bar, baz
	   }

	 file ~/.foobarrc:
	   BEGIN {
	     foo = "foo10"
	     bar = "bar20"
	     baz = 123
	   }

       Of course, the #safe-use directive may be used for other purposes as
       well. It accepts as many modules as you want, but at most one is
       included using the awk option -f; the others are silently ignored.
       Modules are analysed from left to right. A leading tilde in the module
       name is replaced with the user's home directory. Another example:

	 file foobar:
	   #!/usr/bin/env runawk

	   #use	"/usr/share/foobar/default.conf"
	   #safe-use "~/.foobarrc" "/etc/foobar.conf"

	   your	code is	here

       Here the	default	settings are  set  in  /usr/share/foobar/default.conf,
       and configuration files (if any)	are used for overriding	them.
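
       The selection rule for #safe-use (scan from left to right, take at most
       one existing file, expand a leading tilde) can be sketched in shell.
       The function name first_existing is made up, and only the "~/" form of
       a leading tilde is handled here; this is not runawk's own code.

```shell
# Hypothetical helper: print the first existing file among the arguments,
# scanning left to right; a leading "~/" is replaced with $HOME/.
# Prints nothing and fails if none of the files exists.
first_existing() {
    for f do
        case "$f" in
            "~/"*) f="$HOME/${f#??}" ;;    # ${f#??} drops the leading "~/"
        esac
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}
```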

   Setting environment
       In some cases you may want to run the AWK interpreter with a specific
       environment. For example, your script may be designed to process ASCII
       text only. In this case you can run AWK with LC_CTYPE=C in the
       environment and use regexp ranges.

       runawk provides the #env directive for this. The string inside double
       quotes is passed to the putenv(3) libc function.

       Example:

	 file prog:
	    #!/usr/local/bin/runawk

	    #env "LC_ALL=C"

	    $1 ~ /^[A-Z]+$/ { #	A-Z is valid if	LC_CTYPE=C
		print $1
	    }
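
       For comparison, a rough plain-shell counterpart of this example sets
       the variable for the awk child only. It is a sketch of the effect of
       #env, not of how runawk applies it; the function name filter_upper is
       made up.

```shell
# Run awk with LC_ALL=C so that the range A-Z covers only ASCII uppercase
# letters, then print first fields that consist solely of such letters.
filter_upper() {
    LC_ALL=C awk '$1 ~ /^[A-Z]+$/ { print $1 }'
}

# Of the three input lines only the first has an all-uppercase first field.
printf 'ABC one\nabc two\nXyz three\n' | filter_upper
```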

EXIT STATUS
       If the AWK interpreter exits normally, runawk exits with its exit
       status. If the AWK interpreter was killed by a signal, runawk exits
       with exit status 128+signal.
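
       The 128+signal convention is the same one that shells such as bash and
       dash use when reporting a child killed by a signal, so it can be
       observed without runawk. The sketch below assumes one of those shells
       and that SIGTERM is signal 15; exit_status_of is a made-up name.

```shell
# Hypothetical helper: run a command and print the status the shell reports.
exit_status_of() {
    "$@"
    echo $?
}

exit_status_of sh -c 'exit 3'             # normal exit: prints 3
exit_status_of sh -c 'kill -TERM $$'      # killed by SIGTERM: prints 143
```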

ENVIRONMENT
       AWKPATH
             Colon-separated list of directories that are searched for awk
             modules.

       RUNAWK_AWKPROG
             Sets the path to the AWK interpreter used by default, i.e. this
             variable overrides the compile-time default. Note that the
             #interp directive overrides this.

       RUNAWK_KEEPTMP
	     If	set, temporary files are not deleted.

AUTHOR
       Copyright (c) 2007-2014 Aleksey Cheusov <vle@gmx.net>

BUGS/FEEDBACK
       Please send any comments, questions, bug reports etc. to me by e-mail
       or register them at the sourceforge project home. Feature requests are
       also welcome.

HOME
       <http://sourceforge.net/projects/runawk/>

SEE ALSO
       awk(1)
				  2019-02-15			     runawk(1)