SORT(1)			    General Commands Manual		       SORT(1)

NAME
       sort -- sort or merge records (lines) of	text and binary	files

SYNOPSIS
       sort  [-bcCdfghiRMmnrsuVz]  [-k field1[,field2]]	 [-S memsize] [-T dir]
	    [-t	char] [-o output] [file	...]
       sort --help
       sort --version

DESCRIPTION
       The sort	utility	sorts text and binary files by lines.	A  line	 is  a
       record  separated  from the subsequent record by	a newline (default) or
       NUL '\0'	character (-z option).	A record can contain any printable  or
       unprintable characters.	Comparisons are	based on one or	more sort keys
       extracted from each line	of input, and are performed lexicographically,
       according  to  the  current  locale's collating rules and the specified
       command-line options that can tune the actual sorting behavior.	By de-
       fault, if keys are not given, sort uses entire lines for	comparison.

       The command line	options	are as follows:

       -c, --check, -C,	--check=silent|quiet
	       Check that the single input file	is sorted.  If the file	is not
	       sorted, sort produces the appropriate error messages and	 exits
	       with  code  1, otherwise	returns	0.  If -C or --check=silent is
	       specified, sort produces	no output.  This is a "silent" version
	       of -c.
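
               A minimal sketch of testing sortedness from a script, assuming
               two small sample files.  The file names are hypothetical and
               are created only for this illustration.

```shell
# Hypothetical sample files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/sorted.txt"
printf 'b\na\nc\n' > "$tmpdir/unsorted.txt"

# -C is the silent form of -c: only the exit status reports the result.
status_sorted=0
LC_ALL=C sort -C "$tmpdir/sorted.txt" || status_sorted=$?

status_unsorted=0
LC_ALL=C sort -C "$tmpdir/unsorted.txt" || status_unsorted=$?

echo "sorted.txt:   exit status $status_sorted"
echo "unsorted.txt: exit status $status_unsorted"
rm -r "$tmpdir"
```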

       -m, --merge
               Merge only.  The input files are assumed to be pre-sorted.
               If they are not sorted, the output order is undefined.
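
               As an illustration, two already sorted files can be merged
               into one sorted stream.  The file names below are
               hypothetical.

```shell
# Hypothetical pre-sorted input files.
tmpdir=$(mktemp -d)
printf '1\n3\n5\n' > "$tmpdir/odd.txt"
printf '2\n4\n6\n' > "$tmpdir/even.txt"

# -m only interleaves the inputs; it does not re-sort them.
merged=$(LC_ALL=C sort -m "$tmpdir/odd.txt" "$tmpdir/even.txt")
printf '%s\n' "$merged"
rm -r "$tmpdir"
```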

       -o output, --output=output
	       Print  the  output  to  the output file instead of the standard
	       output.

       -S size,	--buffer-size=size
               Use size as the maximum size of the memory buffer.  The size
               modifiers %, b, K, M, G, T, P, E, Z and Y can be used.  If a
               memory limit is not explicitly specified, sort takes up to
               about 90% of available memory.  If the file is too big to fit
               into the memory buffer, temporary disk files are used to
               perform the sorting.

       -T dir, --temporary-directory=dir
	       Store temporary files in	the directory dir.  The	 default  path
	       is  the value of	the environment	variable TMPDIR	or /var/tmp if
	       TMPDIR is not defined.

       -u, --unique
	       Unique keys.  Suppress all lines	that have a key	that is	 equal
	       to an already processed one.  This option, similarly to -s, im-
	       plies  a	 stable	sort.  If used with -c or -C, sort also	checks
	       that there are no lines with duplicate keys.

       -s      Stable sort.  This option maintains the original	 record	 order
	       of records that have an equal key.  This	is a non-standard fea-
	       ture, but it is widely accepted and used.
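
               The effect of -s and -u on lines with equal keys can be
               sketched as follows.  The sample data is made up.  Without
               -s, POSIX specifies that lines whose keys compare equal are
               ordered by a final comparison of the whole line.

```shell
# Made-up sample data: two lines per key, deliberately out of order.
input=$(printf 'b 2\na 2\nb 1\na 1')

# Default: ties on the key are broken by comparing the whole line.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1)

# -s: lines with equal keys keep their input order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k1,1)

# -u: one line is kept per distinct key.
unique_keys=$(printf '%s\n' "$input" | LC_ALL=C sort -u -k1,1)

printf 'default:\n%s\nstable:\n%s\nunique:\n%s\n' \
    "$default_order" "$stable_order" "$unique_keys"
```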

       --version
               Print the version and exit silently.

       --help  Print the help text and exit silently.

       The following options override the default ordering rules.  When	order-
       ing  options appear independently of key	field specifications, they ap-
       ply globally to all sort	keys.  When attached to	a  specific  key  (see
       -k),  the ordering options override all global ordering options for the
       key they	are attached to.

       -b, --ignore-leading-blanks
	       Ignore leading blank characters when comparing lines.

       -d, --dictionary-order
	       Consider	only blank spaces and alphanumeric characters in  com-
	       parisons.

       -f, --ignore-case
	       Convert	all lowercase characters to their uppercase equivalent
	       before comparison, that is, perform case-independent sorting.
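
               A short illustration with made-up input.  The C locale is
               forced here so that the default order is plain byte order, in
               which uppercase letters sort before lowercase ones.

```shell
# Made-up sample data with mixed case.
input=$(printf 'apple\nBanana\ncherry')

# Byte order in the C locale: "B" sorts before "a".
case_sensitive=$(printf '%s\n' "$input" | LC_ALL=C sort)

# With -f, case is ignored when comparing.
case_folded=$(printf '%s\n' "$input" | LC_ALL=C sort -f)

printf 'default:\n%s\nwith -f:\n%s\n' "$case_sensitive" "$case_folded"
```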

       -g, --general-numeric-sort, --sort=general-numeric
               Sort by general numerical value.  As opposed to -n, this
               option handles general floating-point numbers.  It accepts a
               more permissive format than -n, but it has a significant
               performance drawback.
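
               The difference from -n shows up with exponent notation, which
               -n does not interpret.  A sketch with made-up values, run in
               the C locale so that "." is the decimal point:

```shell
# Made-up values: 1e3 is 1000, 2.5e1 is 25.
input=$(printf '1e3\n5\n2.5e1')

# -g understands the exponents.
general=$(printf '%s\n' "$input" | LC_ALL=C sort -g)

# -n stops reading at the "e", so it sees 1, 5 and 2.5.
numeric=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

printf 'sort -g:\n%s\nsort -n:\n%s\n' "$general" "$numeric"
```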

       -h, --human-numeric-sort, --sort=human-numeric
	       Sort  by	 numerical value, but take into	account	the SI suffix,
	       if present.  Sort first by numeric  sign	 (negative,  zero,  or
	       positive);  then	 by SI suffix (either empty, or	`k' or `K', or
	       one of `MGTPEZY', in that order); and finally by	numeric	value.
	       The SI suffix must immediately follow the number.  For example,
               '12345K' sorts before '1M', because M is "larger" than K.
               This sort option is useful for sorting the output of a single
               invocation of the df command with the -h or -H option
               (human-readable).
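
               The suffix-first ordering can be seen with a few made-up
               sizes:

```shell
# Made-up sizes; the suffix is compared before the numeric value.
human_sorted=$(printf '1M\n12345K\n2G\n512\n' | LC_ALL=C sort -h)
printf '%s\n' "$human_sorted"
```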

       -i, --ignore-nonprinting
	       Ignore all non-printable	characters.

       -M, --month-sort, --sort=month
	       Sort  by	 month	abbreviations.	Unknown	strings	are considered
	       smaller than the	month names.
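
               A sketch with made-up input in the C locale, where the month
               abbreviations are the English ones.  The string "xyz" is not
               a month name and therefore sorts first.

```shell
# Made-up input: three month abbreviations and one unknown string.
month_sorted=$(printf 'Mar\nxyz\nJan\nFeb\n' | LC_ALL=C sort -M)
printf '%s\n' "$month_sorted"
```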

       -n, --numeric-sort, --sort=numeric
               Sort fields numerically by arithmetic value.  Fields are
               expected to consist of optional leading blanks, an optional
               minus sign, and zero or more digits (possibly including a
               decimal point and thousands separators).
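
               The contrast with the default lexicographic comparison can be
               sketched with made-up numbers:

```shell
# Made-up input.
input=$(printf '10\n9\n100')

# Default comparison is character by character, so "10" sorts before "9".
lexicographic=$(printf '%s\n' "$input" | LC_ALL=C sort)

# -n compares the arithmetic values.
numeric=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

printf 'default:\n%s\nsort -n:\n%s\n' "$lexicographic" "$numeric"
```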

       -R, --random-sort, --sort=random
               Sort in a random order.  This is a random permutation of the
               inputs, except that equal keys sort together.  It is
               implemented by hashing the input keys and sorting the hash
               values.  The hash function is chosen randomly, seeded by the
               content of /dev/random or by the content of the file
               specified by --random-source.  Even if multiple sort fields
               are specified, the same random hash function is used for all
               of them.

       -r, --reverse
	       Sort in reverse order.

       -V, --version-sort
               Sort version numbers.  The input lines are treated as file
               names of the form PREFIX VERSION SUFFIX, where SUFFIX matches
               the regular expression "(.([A-Za-z~][A-Za-z0-9~]*)?)*".  The
               files are compared by their prefixes and versions (leading
               zeros are ignored in version numbers, see the example below).
               If an input string does not match the pattern, it is compared
               using the byte compare function.  All string comparisons are
               performed in the C locale; the locale environment setting is
               ignored.

               Example:

               $ ls sort* | sort -V
               sort-1.022.tgz
               sort-1.23.tgz
               sort-1.23.1.tgz
               sort-1.024.tgz
               sort-1.024.003.
               sort-1.024.003.tgz
               sort-1.024.07.tgz
               sort-1.024.009.tgz

       The treatment of	field separators can be	altered	using these options:

       -b, --ignore-leading-blanks
	       Ignore leading blank space when determining the start  and  end
	       of  a  restricted sort key (see -k).  If	-b is specified	before
	       the first -k option, it applies globally	to all key  specifica-
	       tions.	Otherwise,  -b	can  be	attached independently to each
               field argument of the key specifications.

       -k field1[,field2], --key=field1[,field2]
	       Define a	restricted sort	key that  has  the  starting  position
	       field1,	and  optional  ending  position	field2 of a key	field.
	       The -k option may be specified multiple times,  in  which  case
	       subsequent  keys	 are compared when earlier keys	compare	equal.
	       The -k option replaces the obsolete options  +pos1  and	-pos2,
	       but the old notation is also supported.

       -t char,	--field-separator=char
               Use char as a field separator character.  The initial char is
               not considered to be part of a field when determining key
               offsets.  Each occurrence of char is significant (for
               example, "charchar" delimits an empty field).  If -t is not
               specified, the default field separator is a sequence of blank
               space characters, and consecutive blank spaces do not delimit
               an empty field.  In that case, however, the initial blank
               space is considered part of a field when determining key
               offsets.  To use NUL as the field separator, use -t '\0'.
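
               A sketch of sorting colon-separated records on the third
               field, using made-up data.  The second record has an empty
               second field ("::"), which still counts as a field, so its
               third field is 1001.

```shell
# Made-up colon-separated records; "alice::1001" has an empty second field.
by_third_field=$(printf 'root:x:0\nalice::1001\nbin:x:2\n' |
    LC_ALL=C sort -t: -k3,3n)
printf '%s\n' "$by_third_field"
```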

       -z, --zero-terminated
               Use NUL as the record separator.  By default, records in the
               files are expected to be separated by newline characters.
               With this option, NUL ('\0') is used as the record separator
               character.
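
               For illustration, three NUL-terminated records are sorted and
               the NUL characters are then turned into commas so that the
               result is visible on a terminal.  The data is made up.

```shell
# Three made-up records, each terminated by NUL instead of newline.
# tr replaces each NUL with a comma only to make the result printable.
zero_sorted=$(printf 'b\0a\0c\0' | LC_ALL=C sort -z | tr '\0' ',')
printf '%s\n' "$zero_sorted"
```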

       Other options:

       --batch-size=num
               Specify the maximum number of files that sort can open at
               once.  This option affects behavior when there are many input
               files or when temporary files are used.  The default value is
               16.

       --compress-program=PROGRAM
	       Use PROGRAM to compress temporary files.	 PROGRAM must compress
	       standard	 input	to  standard output, when called without argu-
	       ments.  When called with	argument -d it must  decompress	 stan-
	       dard  input  to	standard  output.  If PROGRAM fails, sort must
	       exit with error.	 An example of PROGRAM that can	be  used  here
	       is bzip2.

       --random-source=filename
	       In  random  sort, the file content is used as the source	of the
	       'seed' data for the hash	function choice.  Two  invocations  of
	       random  sort  with  the	same  seed data	will use the same hash
	       function	and will produce the same result if the	input is  also
	       identical.  By default, file /dev/random	is used.
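
               A sketch of a repeatable shuffle, assuming a fixed seed file.
               The seed file name and its content are hypothetical; any
               readable file whose content stays the same between runs
               should serve.

```shell
tmpdir=$(mktemp -d)
# Hypothetical seed file; only its content matters, and it must not change.
awk 'BEGIN { for (i = 0; i < 1024; i++) print "seed data" }' > "$tmpdir/seed"

# The same input and the same seed data are used for both runs.
first=$(printf '1\n2\n3\n4\n5\n' | sort -R --random-source="$tmpdir/seed")
second=$(printf '1\n2\n3\n4\n5\n' | sort -R --random-source="$tmpdir/seed")

if [ "$first" = "$second" ]; then
    echo "both runs produced the same order"
fi
rm -r "$tmpdir"
```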

       --debug
	       Print  some  extra information about the	sorting	process	to the
	       standard	output.

       --files0-from=filename
	       Take the	input file list	from  the  file	 filename.   The  file
	       names must be separated by NUL (like the	output produced	by the
	       command "find ... -print0").
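
               A sketch of feeding a NUL-separated file list to sort.  The
               directory layout and file names are hypothetical.

```shell
# Hypothetical input directory with two small files.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/in"
printf 'pear\napple\n' > "$tmpdir/in/a.txt"
printf 'fig\n' > "$tmpdir/in/b.txt"

# find writes the file names separated by NUL; sort reads that list.
find "$tmpdir/in" -type f -print0 > "$tmpdir/list"
combined=$(LC_ALL=C sort --files0-from="$tmpdir/list")
printf '%s\n' "$combined"
rm -r "$tmpdir"
```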

       --radixsort
	       Try  to	use radix sort,	if the sort specifications allow.  The
	       radix sort can only be used for trivial locales (C and  POSIX),
	       and it cannot be	used for numeric or month sort.	 Radix sort is
	       very fast and stable.

       --mergesort
	       Use  mergesort.	 This is a universal algorithm that can	always
	       be used,	but it is not always the fastest.

       --qsort
	       Try to use quick	sort, if the sort specifications allow.	  This
	       sort algorithm cannot be	used with -u and -s.

       --heapsort
	       Try  to	use heap sort, if the sort specifications allow.  This
	       sort algorithm cannot be	used with -u and -s.

       --mmap  Try to use the file memory mapping system call.  It may
               increase speed in some cases.

       The following operands are available:

       file    The pathname of a file to be sorted, merged, or checked.	 If no
	       file  operands  are  specified,	or if a	file operand is	-, the
	       standard	input is used.

       A field is defined as a maximal sequence	of characters other  than  the
       field  separator	 and  record  separator	(newline by default).  Initial
       blank spaces are	included in the	field unless -b	 has  been  specified;
       the  first  blank space of a sequence of	blank spaces acts as the field
       separator and is	included in the	field (unless -t is  specified).   For
       example,	 all blank spaces at the beginning of a	line are considered to
       be part of the first field.

       Fields are specified by the -k field1[,field2] command-line option.  If
       field2 is missing, the end of the key defaults to the end of the	line.
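
       The effect of omitting field2 can be sketched with two made-up lines.
       With -k2 the key runs from the second field to the end of the line, so
       the third field takes part in the comparison.  With -k2,2 the keys are
       equal and the tie is broken by comparing the whole lines.

```shell
# Two made-up lines that differ in the first and third fields only.
input=$(printf 'a 1 z\nb 1 y')

# Key is field 2 through end of line: " 1 y" sorts before " 1 z".
to_end_of_line=$(printf '%s\n' "$input" | LC_ALL=C sort -k2)

# Key is field 2 only: the keys tie, so the whole lines decide.
second_field_only=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2)

printf 'sort -k2:\n%s\nsort -k2,2:\n%s\n' "$to_end_of_line" "$second_field_only"
```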

       The arguments field1 and	field2 have the	form m.n (m,n >	0) and can  be
       followed	 by  one  or  more of the modifiers b, d, f, i,	n, g, M	and r,
       which correspond to the options discussed above.  When b is specified,
       it applies only to the field1 or field2 it is attached to, while the
       rest of the modifiers apply to the whole key field, regardless of
       whether they are given with field1, field2, or both.  A field1 position
       specified by m.n	is interpreted as the nth character from the beginning
       of the mth field.  A missing .n in field1 means	`.1',  indicating  the
       first  character	 of the	mth field; if the -b option is in effect, n is
       counted from the	first non-blank	 character  in	the  mth  field;  m.1b
       refers  to  the first non-blank character in the	mth field.  1.n	refers
       to the nth character from the beginning of the line; if	n  is  greater
       than the	length of the line, the	field is taken to be empty.

       nth  positions are always counted from the field	beginning, even	if the
       field is	shorter	than the number	of specified positions.	 Thus, the key
       can really start	from a position	in a subsequent	field.

       A field2	position specified by m.n is interpreted as the	nth  character
       (including  separators) from the	beginning of the mth field.  A missing
       .n indicates the	last character of the mth field; m = 0 designates  the
       end of a	line.  Thus the	option -k v.x,w.y is synonymous	with the obso-
       lete  option +v-1.x-1 -w-1.y; when y is omitted,	-k v.x,w is synonymous
       with +v-1.x-1 -w.0.  The	obsolete +pos1	-pos2  option  is  still  sup-
       ported, except for -w.0b, which has no -k equivalent.
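
       The m.n character positions described above can be sketched with
       made-up identifiers whose last two characters carry the sort key.
       The key -k1.3,1.4 covers the third and fourth characters of the first
       field.

```shell
# Made-up identifiers: two letters followed by a two-digit code.
by_code=$(printf 'zz07\naa11\nmm02\n' | LC_ALL=C sort -k1.3,1.4)
printf '%s\n' "$by_code"
```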

ENVIRONMENT
       LC_COLLATE  Locale  settings  to	be used	to determine the collation for
		   sorting records.

       LC_CTYPE    Locale settings used for case conversion and character
                   classification, that is, which characters are considered
                   whitespace, and so on.

       LC_MESSAGES
		   Locale settings that	determine the language of output  mes-
		   sages that sort prints out.

       LC_NUMERIC  Locale  settings  that  determine the number	format used in
		   numeric sort.

       LC_TIME	   Locale settings that	determine the  month  format  used  in
		   month sort.

       LC_ALL	   Locale  settings that override all of the above locale set-
		   tings.  This	environment variable can be used  to  set  all
		   these settings to the same value at once.

       LANG        Used as a last resort to determine different kinds of
                   locale-specific behavior if neither the respective
                   environment variable nor LC_ALL is set.

       NLSPATH	   Path	to NLS catalogs.

       TMPDIR	   Path	 to  the  directory  in	 which temporary files will be
		   stored.  Note that TMPDIR may be overridden by the  -T  op-
		   tion.

       GNUSORT_NUMERIC_COMPATIBILITY
                   If defined, -t will not override the locale numeric
                   symbols, that is, thousands separators and decimal
                   separators.  By default, if -t is specified with the same
                   symbol as the thousands separator or decimal point, the
                   symbol is treated as the field separator.  The older
                   behavior was less definite: the symbol was treated as
                   both field separator and numeric separator
                   simultaneously.  This environment variable enables the
                   old behavior.

FILES
       /var/tmp/.bsdsort.PID.*		 Temporary files.
       /dev/random			 Default  seed	file  for  the	random
					 sort.

EXIT STATUS
       The sort	utility	shall exit with	one of the following values:

       0     The input files were sorted successfully or, if used with -c or
             -C, the input file already met the sorting criteria.
       1     On	disorder (or non-uniqueness) with the -c or -C options.
       2     An	error occurred.

SEE ALSO
       comm(1),	join(1), uniq(1)

STANDARDS
       The sort	utility	is compliant with the IEEE Std 1003.1-2008 ("POSIX.1")
       specification.

       The flags [-ghRMSsTVz] are extensions to	the POSIX specification.

       All long options are extensions to the specification.  Some of them
       are provided for compatibility with GNU versions, and some are
       extensions specific to this implementation.

       The old key notations +pos1 and -pos2 come from older versions of  sort
       and are still supported but their use is	highly discouraged.

HISTORY
       A sort command first appeared in	Version	1 AT&T UNIX.

AUTHORS
       Gabor Kovesdan <gabor@FreeBSD.org>,

       Oleg Moskalenko <mom040267@gmail.com>

NOTES
       This  implementation  of	sort has no limits on input line length	(other
       than imposed by available memory) or any	restrictions on	bytes  allowed
       within lines.

       The performance depends highly on locale settings, an efficient choice
       of sort keys, and key complexity.  The fastest sort is with locale C,
       on whole lines, with option -s.  In general, locale C is the fastest,
       single-byte locales follow, and multi-byte locales are the slowest,
       but the correct collation order is always respected.  As for the key
       specification, the simpler the lines are to process, the faster the
       sort will be.

       When  sorting by	arithmetic value, using	-n results in much better per-
       formance	than -g	so its use is encouraged whenever possible.

FreeBSD	13.1		       September 4, 2019		       SORT(1)
