AWK(1)			    General Commands Manual			AWK(1)

NAME
       awk -- pattern-directed scanning	and processing language

SYNOPSIS
       awk    [-safe]	 [-version]    [-d[n]]	  [-F	fs]   [-v   var=value]
	   [prog | -f progfile]	file ...

DESCRIPTION
       awk scans each input file for lines that	match any of a set of patterns
       specified literally in prog or in one or	more  files  specified	as  -f
       progfile.   With	 each  pattern	there can be an	associated action that
       will be performed when a	line of	a file matches the pattern.  Each line
       is matched against the pattern portion of every	pattern-action	state-
       ment; the associated action is performed	for each matched pattern.  The
       file name `-' means the standard	input.	Any file of the	form var=value
       is  treated  as	an  assignment,	not a filename,	and is executed	at the
       time it would have been opened if it were a filename.

       The options are as follows:

       -d[n]   Debug mode.  Set	debug level to n, or 1 if n is not  specified.
	       A value greater than 1 causes awk to dump core on fatal errors.

       -F fs   Define  the  input field	separator to be	the regular expression
	       fs.

       -f progfile
	       Read program code from the specified file progfile  instead  of
	       from the	command	line.

       -safe   Disable	file output (print >, print >>), process creation (cmd
	       | getline, print	|,  system)  and  access  to  the  environment
	       (ENVIRON; see the section on variables below).  This is a first
	       (and  not  very	reliable) approximation	to a "safe" version of
	       awk.

       -version
	       Print the version number	of awk to standard output and exit.

       -v var=value
	       Assign value to variable	var before prog	is executed; any  num-
	       ber of -v options may be	present.
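
       The difference in timing between -v and a var=value operand can be
       seen in the following sketch.  The variable name tag and the sample
       input are made up for illustration; the behaviour shown is the one
       described above (-v takes effect before prog runs, an operand only
       when it is reached in the argument list).

```sh
# -v assigns before the program starts, so BEGIN already sees the value.
early=$(awk -v tag=A 'BEGIN { print "BEGIN sees [" tag "]" }')

# An operand of the form var=value is assigned only when awk reaches it in
# the argument list, so BEGIN sees an empty value; "-" names standard input.
late=$(printf 'one\ntwo\n' |
    awk 'BEGIN { print "BEGIN sees [" tag "]" } { print tag ": " $0 }' tag=B -)

printf '%s\n' "$early" "$late"
```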

       The  input  is  normally	 made up of input lines	(records) separated by
       newlines, or by the value of RS.	 If RS is null,	 then  any  number  of
       blank  lines are	used as	the record separator, and newlines are used as
       field separators	(in addition to	the value of FS).  This	is  convenient
       when working with multi-line records.
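
       A minimal sketch of multi-line records follows.  The address data is
       invented, and setting FS to a newline is an illustrative choice that
       makes each line of a record one field; it is not required by the
       null RS mode itself.

```sh
# Records are separated by one or more blank lines because RS is null.
people=$(printf 'Alice\n12 Elm St\n\n\nBob\n9 Oak Ave\n' |
    awk 'BEGIN { RS = ""; FS = "\n" } { print NR ": " $1 " lives at " $2 }')
printf '%s\n' "$people"
```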

       An input	line is	normally made up of fields separated by	whitespace, or
       by  the	regular	 expression  FS.   The fields are denoted $1, $2, ...,
       while $0	refers to the entire line.  If FS is null, the input  line  is
       split into one field per	character.

       Normally,  any  number  of blanks separate fields.  In order to set the
       field separator to a single blank, use the -F option with  a  value  of
       `[ ]'.	If  a field separator of `t' is	specified, awk treats it as if
       `\t' had	been specified and uses	<TAB> as the field separator.  In  or-
       der to use a literal `t'	as the field separator,	use the	-F option with
       a value of `[t]'.
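
       The separator rules above can be checked with a few one-liners.  This
       is only a sketch; the input strings and shell variable names are
       arbitrary.

```sh
# Default splitting: a run of blanks counts as one separator (2 fields).
default_nf=$(printf 'a  b\n' | awk '{ print NF }')

# -F '[ ]': every single blank separates, so an empty field appears (3 fields).
single_nf=$(printf 'a  b\n' | awk -F '[ ]' '{ print NF }')

# -F '[t]': split on the literal letter t rather than on <TAB>.
literal_t=$(printf 'atb\n' | awk -F '[t]' '{ print $1, $2 }')

printf '%s\n' "$default_nf" "$single_nf" "$literal_t"
```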

       A pattern-action	statement has the form

	     pattern { action }

       A  missing  {  action  }	means print the	line; a	missing	pattern	always
       matches.	 Pattern-action	statements are separated by newlines or	 semi-
       colons.

       Newlines	 are  permitted	 after	a terminating statement	or following a
       comma (`,'), an open brace (`{'), a logical AND (`&&'),	a  logical  OR
       (`||'),	after the `do' or `else' keywords, or after the	closing	paren-
       thesis of an `if', `for', or `while' statement.	Additionally, a	 back-
       slash (`\') can be used to escape a newline between tokens.

       An  action  is a	sequence of statements.	 A statement can be one	of the
       following:

	     if	(expression) statement [else statement]
	     while (expression)	statement
	     for (expression; expression; expression) statement
	     for (var in array)	statement
	     do	statement while	(expression)
	     break
	     continue
	     { [statement ...] }
	     expression	# commonly var = expression
	     print [expression-list] [>expression]
	     printf format [, expression-list] [>expression]
	     return [expression]
	     next # skip remaining patterns on this input line
	     nextfile #	skip rest of this file,	open next, start at top
	     delete array[expression] #	delete an array	element
	     delete array # delete all elements	of array
	     exit [expression] # exit immediately; status is expression

       Statements are terminated by semicolons,	newlines or right braces.   An
       empty  expression-list  stands for $0.  String constants	are quoted "",
       with the	usual C	escapes	recognized within (see printf(1)  for  a  com-
       plete  list of these).  Expressions take	on string or numeric values as
       appropriate,  and  are  built  using  the  operators  +	-  *  /	 %   ^
       (exponentiation), and concatenation (indicated by whitespace).  The op-
       erators ! ++ -- += -= *=	/= %= ^= > >= <	<= == != ?: are	also available
       in  expressions.	  Variables  may  be  scalars, array elements (denoted
       x[i]) or	fields.	 Variables are initialized to the null string.	 Array
       subscripts  may be any string, not necessarily numeric; this allows for
       a form of associative memory.  Multiple subscripts such as [i,j,k]  are
       permitted; the constituents are concatenated, separated by the value of
       SUBSEP (see the section on variables below).
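
       The following sketch illustrates multiple subscripts and the initial
       value of variables.  The array name count and the variable unset_var
       are hypothetical names chosen for the example.

```sh
subs=$(awk 'BEGIN {
    count["apple", "red"] = 3              # stored under "apple" SUBSEP "red"
    for (key in count) {
        split(key, part, SUBSEP)           # recover the two subscripts
        print part[1], part[2], count[key]
    }
    if (("apple", "red") in count) print "found"
    print "[" unset_var "]", unset_var + 0 # uninitialized: empty string and 0
}')
printf '%s\n' "$subs"
```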

       The  print statement prints its arguments on the	standard output	(or on
       a file if >file or >>file is present or on a pipe if | cmd is present),
       separated by the	current	output field separator,	and terminated by  the
       output  record  separator.  file	and cmd	may be literal names or	paren-
       thesized	expressions; identical string values in	 different  statements
       denote the same open file.  The printf statement	formats	its expression
       list according to the format (see printf(1)).
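
       As a rough illustration of the output separators and of printing to
       a pipe, consider the sketch below.  It assumes a sort(1) utility is
       available on the PATH; the sample words are arbitrary.

```sh
# OFS goes between the arguments of print, ORS after each print statement.
joined=$(printf 'a b c\n' |
    awk 'BEGIN { OFS = "-"; ORS = "!\n" } { print $1, $2, $3 }')

# Every print to the same command string feeds the same open pipe.
sorted=$(printf 'pear\napple\nfig\n' |
    awk '{ print | "sort" } END { close("sort") }')

printf '%s\n' "$joined" "$sorted"
```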

       Patterns	 are  arbitrary	Boolean	combinations (with ! ||	&&) of regular
       expressions and relational expressions.	awk supports extended  regular
       expressions  (EREs).   See re_format(7) for more	information on regular
       expressions.  Isolated regular expressions in a pattern	apply  to  the
       entire  line.  Regular expressions may also occur in relational expres-
       sions, using the	operators ~ and	!~.  /re/ is a	constant  regular  ex-
       pression;  any  string  (constant or variable) may be used as a regular
       expression, except in the position of an	isolated regular expression in
       a pattern.

       A pattern may consist of	two patterns separated by  a  comma;  in  this
       case,  the  action is performed for all lines from an occurrence	of the
       first pattern through an	occurrence of the second.

       A relational expression is one of the following:

	     expression	matchop	regular-expression
	     expression	relop expression
	     expression	in array-name
	     (expr, expr, ...) in array-name

       where a relop is	any of the  six	 relational  operators	in  C,	and  a
       matchop is either ~ (matches) or	!~ (does not match).  A	conditional is
       an  arithmetic expression, a relational expression, or a	Boolean	combi-
       nation of these.

       The special patterns BEGIN and END may be used to capture  control  be-
       fore the	first input line is read and after the last.  BEGIN and	END do
       not combine with	other patterns.
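
       The pattern forms described above can be combined in one program, as
       in this sketch.  The input lines and the counter n are invented for
       the example.

```sh
report=$(printf 'alpha 5\nbeta 12\ngamma 20\n' | awk '
    BEGIN                 { print "start" }
    $2 > 10 && $1 !~ /^g/ { print "match:", $1 }
    /alpha/, /beta/       { n++ }
    END                   { print "range lines:", n }')
printf '%s\n' "$report"
```

       Only the beta line satisfies both the relational test and the !~
       test, and the range pattern covers the first two lines.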

       Variable	names with special meanings:

       ARGC	  Argument count, assignable.
       ARGV	  Argument  array,  assignable;	 non-null members are taken as
		  filenames.
       CONVFMT	  Conversion format when converting numbers (default "%.6g").
       ENVIRON	  Array	of environment variables; subscripts are names.
       FILENAME	  The name of the current input	file.
       FNR	  Ordinal number of the	current	record in the current file.
       FS	  Regular expression used to separate fields; also settable by
		  option -F fs.
       NF	  Number of fields in the current record.  $NF can be used  to
		  obtain the value of the last field in	the current record.
       NR	  Ordinal number of the	current	record.
       OFMT	  Output format	for numbers (default "%.6g").
       OFS	  Output field separator (default blank).
       ORS	  Output record	separator (default newline).
       RLENGTH	  The length of	the string matched by the match() function.
       RS	  Input	record separator (default newline).
       RSTART	  The  starting	 position of the string	matched	by the match()
		  function.
       SUBSEP	  Separates multiple subscripts	(default 034).

FUNCTIONS
       The awk language	has  a	variety	 of  built-in  functions:  arithmetic,
       string, input/output, general, and bit-operation.

       Functions  may  be  defined (at the position of a pattern-action	state-
       ment) thusly:

	     function foo(a, b,	c) { ...; return x }

       Parameters are passed by	value if scalar, and  by  reference  if	 array
       name; functions may be called recursively.  Parameters are local	to the
       function;  all other variables are global.  Thus	local variables	may be
       created by providing excess parameters in the function definition.
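
       A short sketch of these rules follows.  The function names fact and
       fill are hypothetical; the extra parameter i in fill() is the usual
       idiom for a local variable, conventionally set off by extra spaces.

```sh
result=$(awk '
function fact(n) { return (n <= 1) ? 1 : n * fact(n - 1) }

# i is declared as an extra parameter, so it is local to fill().
function fill(arr, n,    i) {
    for (i = 1; i <= n; i++) arr[i] = i * i
}

BEGIN {
    i = 99
    fill(squares, 3)    # arrays are passed by reference
    print fact(5), squares[3], i
}')
printf '%s\n' "$result"
```

       The global i keeps its value of 99 because the loop counter inside
       fill() is a separate, local variable.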

   Arithmetic Functions
       atan2(y,	x)  Return the arctangent of y/x in radians.

       cos(x)	    Return the cosine of x, where x is in radians.

       exp(x)	    Return the exponential of x.

       int(x)	    Return x truncated to an integer value.

       log(x)	    Return the natural logarithm of x.

       rand()	    Return a random number, n, such that 0<=n<1.

       sin(x)	    Return the sine of x, where	x is in	radians.

       sqrt(x)	    Return the square root of x.

       srand(expr)  Sets seed for rand() to  expr  and	returns	 the  previous
		    seed.   If	expr  is  omitted, the time of day is used in-
		    stead.
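
       The sketch below exercises a few of the arithmetic functions.  The
       seed 42 is arbitrary; the example only relies on the same seed
       producing the same first value, not on any particular value of
       rand().

```sh
math=$(awk 'BEGIN {
    print int(3.7), int(-3.7), sqrt(16), exp(0)
    printf "%.4f\n", atan2(0, -1)          # pi, to four decimal places
    srand(42); first = rand()
    srand(42); second = rand()
    same = (first == second && first >= 0 && first < 1)
    print (same ? "repeatable" : "differs")
}')
printf '%s\n' "$math"
```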

   String Functions
       gsub(r, t, s)	The same as sub() except that all occurrences  of  the
			regular	 expression  are replaced.  gsub() returns the
			number of replacements.

       index(s,	t)	The position in	s where	the string t occurs, or	 0  if
			it does	not.

       length(s)	The  length of s taken as a string, or of $0 if	no ar-
			gument is given.

       match(s,	r)	The position in	s where	the regular expression	r  oc-
			curs, or 0 if it does not.  The	variable RSTART	is set
			to  the	starting position of the matched string	(which
			is the same as the returned value) or zero if no match
			is found.  The variable	RLENGTH	is set to  the	length
			of the matched string, or -1 if	no match is found.

       split(s,	a, fs)	Splits	the  string  s into array elements a[1], a[2],
			..., a[n] and returns n.  The separation is done  with
			the  regular expression	fs or with the field separator
			FS if fs is not	given.	An empty string	as field sepa-
			rator splits the string	into  one  array  element  per
			character.

       sprintf(fmt, expr, ...)
			The string resulting from formatting expr, ... accord-
			ing to the printf(1) format fmt.

       sub(r, t, s)     Substitutes t for the first occurrence of the regular
                        expression r in the string s.  If s is not given, $0
                        is used.  An ampersand (`&') in t is replaced by the
                        text in s that matched r.  A literal ampersand can be
                        specified by preceding it with two backslashes
                        (`\\&').  A literal backslash can be specified by
                        preceding it with another backslash (`\\').  sub()
                        returns the number of replacements.

       substr(s, m, n)	Return at most the n-character substring of s that be-
			gins at	position m counted from	1.  If n  is  omitted,
			or if n	specifies more characters than are left	in the
			string,	 the length of the substring is	limited	by the
			length of s.

       tolower(str)	Returns	a copy of str with all	upper-case  characters
			translated  to	their corresponding lower-case equiva-
			lents.

       toupper(str)	Returns	a copy of str with all	lower-case  characters
			translated  to	their corresponding upper-case equiva-
			lents.
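
       The string functions above might be used as in the following sketch.
       The sample string and the variable names are arbitrary.

```sh
strings=$(awk 'BEGIN {
    s = "hello world"
    if (match(s, /o w/)) print RSTART, RLENGTH
    print substr(s, 7), substr(s, 1, 5), index(s, "wor")
    t = s; n = gsub(/o/, "[&]", t); print n, t    # & stands for the match
    u = s; sub(/world/, "\\&", u); print u        # \\& is a literal ampersand
    n = split("a:b:c", parts, ":"); print n, parts[3]
    print toupper(s), length(s)
}')
printf '%s\n' "$strings"
```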

   Input/Output	and General Functions
       close(expr)	     Closes the	file or	pipe expr.  expr should	 match
			     the  string  that	was  used  to open the file or
			     pipe.

       cmd | getline [var]   Read a record of input from a stream  piped  from
			     the  output of cmd.  If var is omitted, the vari-
			     ables $0 and NF are set.  Otherwise var  is  set.
			     If	the stream is not open,	it is opened.  As long
			     as	the stream remains open, subsequent calls will
			     read  subsequent  records	from  the stream.  The
			     stream remains open until explicitly closed  with
			     a	call to	close().  getline returns 1 for	a suc-
			     cessful input, 0 for end of file, and -1  for  an
			     error.

       fflush([expr])	     Flushes  any buffered output for the file or pipe
			     expr, or all open files or	pipes if expr is omit-
			     ted.  expr	should match the string	that was  used
			     to	open the file or pipe.

       getline		     Sets $0 to	the next input record from the current
			     input  file.  This	form of	getline	sets the vari-
			     ables NF, NR, and FNR.  getline returns 1	for  a
			     successful	 input,	 0 for end of file, and	-1 for
			     an	error.

       getline var           Sets variable var to the next input record from
                             the current input file.  This form of getline
                             sets the variables NR and FNR; $0 and NF are
                             left unchanged.  getline returns 1 for a
                             successful input, 0 for end of file, and -1 for
                             an error.

       getline [var]  <file  Reads the next record from file.  If var is
                             omitted, the variables $0 and NF are set.
                             Otherwise var is set.  If file is not open, it
                             is opened.  As long as file remains open,
                             subsequent calls will read subsequent records
                             from file.  file remains open until explicitly
                             closed with a call to close().  getline returns
                             1 for a successful input, 0 for end of file,
                             and -1 for an error.

       system(cmd)	     Executes cmd and returns its exit status.
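
       A sketch of the getline forms, close(), and system() follows.  The
       command strings and variable names are made up for the example, and
       it assumes a POSIX sh is used to run the commands.

```sh
piped=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0) print "got", line
    close(cmd)                  # closing lets the next use rerun cmd
    cmd | getline first
    close(cmd)
    print "first again:", first
    print "status", system("exit 3")
}')

# getline var reads the next input record into var without touching $0.
pairs=$(printf 'k1\nv1\nk2\nv2\n' |
    awk '{ key = $0; if ((getline val) > 0) print key "=" val }')

printf '%s\n' "$piped" "$pairs"
```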

   Bit-Operation Functions
       compl(x)	     Returns the bitwise complement of integer argument	x.

       and(v1, v2, ...)
		     Performs  a bitwise AND on	all arguments provided,	as in-
		     tegers.  There must be at least two values.

       or(v1, v2, ...)
		     Performs a	bitwise	OR on all arguments provided, as inte-
		     gers.  There must be at least two values.

       xor(v1, v2, ...)
		     Performs a	bitwise	Exclusive-OR  on  all  arguments  pro-
		     vided, as integers.  There	must be	at least two values.

       lshift(x, n)  Returns integer argument x	shifted	by n bits to the left.

       rshift(x, n)  Returns  integer  argument	 x  shifted  by	 n bits	to the
		     right.

EXIT STATUS
       The awk utility exits 0 on success, and >0 if an	error occurs.

       Note, however, that an exit statement with an expression sets the exit
       status to the value of that expression.
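
       For instance, a script can report a condition found in its input
       through the exit status.  This is a sketch only; the pattern /bad/
       and the status value 2 are arbitrary choices.

```sh
status=0
printf 'ok\nbad\n' |
    awk '/bad/ { found = 1 } END { exit (found ? 2 : 0) }' || status=$?
echo "awk exited with status $status"
```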

EXAMPLES
       Print lines longer than 72 characters:

	     length($0)	> 72

       Print first two fields in opposite order:

	     { print $2, $1 }

       Same, with input	fields separated by comma and/or blanks	and tabs:

	     BEGIN { FS	= ",[ \t]*|[ \t]+" }
		   { print $2, $1 }

       Add up first column, print sum and average:

	     { s += $1 }
	     END { print "sum is", s, " average is", s/NR }

       Print all lines between start/stop pairs:

	     /start/, /stop/

       Simulate	echo(1):

	     BEGIN { # Simulate	echo(1)
		     for (i = 1; i < ARGC; i++)	printf "%s ", ARGV[i]
		     printf "\n"
		     exit }

       Print an	error message to standard error:

	     { print "error!" >	"/dev/stderr" }

SEE ALSO
       cut(1), lex(1), printf(1), sed(1), re_format(7)

       A. V. Aho, B. W.	Kernighan, and P. J. Weinberger, The  AWK  Programming
       Language, Addison-Wesley, 1988, ISBN 0-201-07981-X.

STANDARDS
       The  awk	utility	is compliant with the IEEE Std 1003.1-2008 ("POSIX.1")
       specification, except awk does not support {n,m}	pattern	matching.

       The flags -d, -safe, and	-version  as  well  as	the  commands  fflush,
       compl,  and, or,	xor, lshift, rshift, are extensions to that specifica-
       tion.

HISTORY
       An awk utility appeared in Version 7 AT&T UNIX.

BUGS
       There are no explicit conversions  between  numbers  and	 strings.   To
       force  an expression to be treated as a number add 0 to it; to force it
       to be treated as	a string concatenate ""	to it.
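
       The two idioms can be sketched as follows.  The variable names are
       hypothetical; the values are string constants, so the first
       comparison is done on strings.

```sh
conv=$(awk 'BEGIN {
    a = "10"; b = "9"
    as_strings = (a < b)            # "10" sorts before "9", so 1
    as_numbers = (a + 0 < b + 0)    # 10 < 9 is false, so 0
    n = 17
    text = n ""                     # concatenating "" yields the string "17"
    print as_strings, as_numbers, length(text)
}')
printf '%s\n' "$conv"
```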

       The scope rules for variables in	functions are a	botch; the  syntax  is
       worse.

FreeBSD	13.1			 June 6, 2020				AWK(1)
