Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
FILE(1)			    General Commands Manual		       FILE(1)

NAME
       file -- determine file type

SYNOPSIS
       file  [-bcdEhiklLNnprsSvzZ0]  [--apple] [--exclude-quiet] [--extension]
	    [--mime-encoding]  [--mime-type]  [-e  testname]  [-F   separator]
	    [-f	namefile] [-m magicfiles] [-P name=value] file ...
       file -C [-m magicfiles]
       file [--help]

DESCRIPTION
       This manual page	documents version "5.39" of the	file command.

       file tests each argument	in an attempt to classify it.  There are three
       sets  of	tests, performed in this order:	filesystem tests, magic	tests,
       and language tests.  The	first test that	succeeds causes	the file  type
       to be printed.

       The  type  printed will usually contain one of the words	text (the file
       contains	only printing characters and a few common  control  characters
       and  is	probably  safe	to read	on an ASCII terminal), executable (the
       file contains the result	of compiling a program in a  form  understand-
       able  to	 some  UNIX  kernel or another), or data meaning anything else
       (data is	usually	"binary" or non-printable).  Exceptions	are well-known
       file formats (core files, tar archives) that are	known to  contain  bi-
       nary data.  When	modifying magic	files or the program itself, make sure
       to preserve these keywords.  Users depend on knowing that all the read-
       able  files  in	a directory have the word "text" printed.  Don't do as
       Berkeley	did and	change "shell commands text" to	"shell script".

       The filesystem tests are	based on examining the return from  a  stat(2)
       system  call.   The  program  checks to see if the file is empty, or if
       it's some sort of special file.	Any known file	types  appropriate  to
       the  system you are running on (sockets,	symbolic links,	or named pipes
       (FIFOs) on those	systems	that implement them) are intuited if they  are
       defined in the system header file <sys/stat.h>.

       The  magic  tests  are  used to check for files with data in particular
       fixed formats.  The canonical example of	this is	 a  binary  executable
       (compiled  program)  a.out  file,  whose	 format	is defined in <elf.h>,
       <a.out.h> and possibly <exec.h>	in  the	 standard  include  directory.
       These files have	a "magic number" stored	in a particular	place near the
       beginning  of  the  file	 that tells the	UNIX operating system that the
       file is a binary	executable, and	which of several types	thereof.   The
       concept	of a "magic" has been applied by extension to data files.  Any
       file with some invariant	identifier at a	small fixed  offset  into  the
       file can	usually	be described in	this way.  The information identifying
       these	files	 is    read    from    the    compiled	  magic	  file
       /usr/share/misc/magic.mgc,   or	 the   files	in    the    directory
       /usr/share/misc/magic  if  the  compiled	file does not exist.  In addi-
       tion, if	$HOME/.magic.mgc or $HOME/.magic exists, it will  be  used  in
       preference to the system	magic files.

       If  a  file  does not match any of the entries in the magic file, it is
       examined	to see if it seems to be a text	file.  ASCII, ISO-8859-x, non-
       ISO 8-bit extended-ASCII	character sets (such as	those used  on	Macin-
       tosh  and  IBM  PC systems), UTF-8-encoded Unicode, UTF-16-encoded Uni-
       code, and EBCDIC	character sets can be distinguished by	the  different
       ranges  and  sequences  of bytes	that constitute	printable text in each
       set.  If	a file passes any of these tests, its  character  set  is  re-
       ported.	ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identi-
       fied  as	"text" because they will be mostly readable on nearly any ter-
       minal; UTF-16 and EBCDIC	are only "character data" because, while  they
       contain text, it	is text	that will require translation before it	can be
       read.   In  addition, file will attempt to determine other characteris-
       tics of text-type files.	 If the	lines of a file	are terminated by  CR,
       CRLF,  or  NEL, instead of the Unix-standard LF,	this will be reported.
       Files that contain embedded escape sequences or overstriking will  also
       be identified.

       Once file has determined	the character set used in a text-type file, it
       will  attempt  to  determine in what language the file is written.  The
       language	tests look for particular strings (cf.	 <names.h>)  that  can
       appear  anywhere	 in  the first few blocks of a file.  For example, the
       keyword .br indicates that the file is most  likely  a  troff(1)	 input
       file,  just  as	the keyword struct indicates a C program.  These tests
       are less	reliable than the previous two groups, so they	are  performed
       last.   The  language test routines also	test for some miscellany (such
       as tar(1) archives, JSON	files).

       Any file	that cannot be identified as having been written in any	of the
       character sets listed above is simply said to be	"data".

OPTIONS
       --apple
	       Causes the file command to output the  file  type  and  creator
	       code  as	 used  by  older MacOS versions.  The code consists of
	       eight letters, the first	describing the file type,  the	latter
	       the  creator.  This option works	properly only for file formats
	       that have the apple-style output	defined.

       -b, --brief
	       Do not prepend filenames	to output lines	(brief mode).

       -C, --compile
	       Write a magic.mgc output	file that contains a  pre-parsed  ver-
	       sion of the magic file or directory.

       -c, --checking-printout
	       Cause a checking	printout of the	parsed form of the magic file.
	       This is usually used in conjunction with	the -m flag to debug a
	       new magic file before installing	it.

       -d      Prints internal debugging information to	stderr.

       -E      On  filesystem errors (file not found etc), instead of handling
	       the error as regular output as POSIX mandates and  keep	going,
	       issue an	error message and exit.

       -e, --exclude testname
	       Exclude	the test named in testname from	the list of tests made
	       to determine the	file type.  Valid test names are:

	       apptype	 EMX application type (only on EMX).

	       ascii	 Various types of text files (this test	 will  try  to
			 guess	the text encoding, irrespective	of the setting
			 of the	`encoding' option).

	       encoding	 Different text	encodings for soft magic tests.

	       tokens	 Ignored for backwards compatibility.

	       cdf	 Prints	details	of Compound Document Files.

	       compress	 Checks	for, and looks inside, compressed files.

	       csv	 Checks	Comma Separated	Value files.

	       elf	 Prints	ELF file details, provided  soft  magic	 tests
			 are enabled and the elf magic is found.

	       json	 Examines  JSON	 (RFC-7159)  files by parsing them for
			 compliance.

	       soft	 Consults magic	files.

	       tar	 Examines tar files by verifying the checksum  of  the
			 512 byte tar header.  Excluding this test can provide
			 more  detailed	 content description by	using the soft
			 magic method.

	       text	 A synonym for `ascii'.

       --exclude-quiet
	       Like --exclude but ignore tests that file does not know	about.
	       This is intended	for compatilibity with older versions of file.

       --extension
	       Print  a	 slash-separated list of valid extensions for the file
	       type found.

       -F, --separator separator
	       Use the specified string	as the separator between the  filename
	       and the file result returned.  Defaults to `:'.

       -f, --files-from	namefile
	       Read  the  names	of the files to	be examined from namefile (one
	       per line) before	the argument  list.   Either  namefile	or  at
	       least  one filename argument must be present; to	test the stan-
	       dard input, use `-' as a	filename argument.  Please  note  that
	       namefile	 is unwrapped and the enclosed filenames are processed
	       when this option	is encountered and before any further  options
	       processing  is done.  This allows one to	process	multiple lists
	       of files	with different command line arguments on the same file
	       invocation.  Thus if you	want to	set the	delimiter, you need to
	       do it before you	specify	the list of  files,  like:  "-F	 @  -f
	       namefile", instead of: "-f namefile -F @".

       -h, --no-dereference
	       option causes symlinks not to be	followed (on systems that sup-
	       port  symbolic  links).	This is	the default if the environment
	       variable	POSIXLY_CORRECT	is not defined.

       -i, --mime
	       Causes the file command to output mime type strings rather than
	       the more	traditional human readable  ones.   Thus  it  may  say
	       `text/plain; charset=us-ascii' rather than "ASCII text".

       --mime-type, --mime-encoding
	       Like -i,	but print only the specified element(s).

       -k, --keep-going
	       Don't  stop at the first	match, keep going.  Subsequent matches
	       will be have the	string `\012- '	prepended.   (If  you  want  a
	       newline,	 see the -r option.)  The magic	pattern	with the high-
	       est strength (see the -l	option)	comes first.

       -l, --list
	       Shows a list of patterns	and their strength  sorted  descending
	       by  magic(5)  strength which is used for	the matching (see also
	       the -k option).

       -L, --dereference
	       option causes symlinks to be followed, as the like-named	option
	       in ls(1)	(on systems that support symbolic links).  This	is the
	       default if the environment variable POSIXLY_CORRECT is defined.

       -m, --magic-file	magicfiles
	       Specify an alternate list of files and  directories  containing
	       magic.	This  can be a single item, or a colon-separated list.
	       If a compiled magic file	is found alongside a  file  or	direc-
	       tory, it	will be	used instead.

       -N, --no-pad
	       Don't pad filenames so that they	align in the output.

       -n, --no-buffer
	       Force  stdout  to be flushed after checking each	file.  This is
	       only useful if checking a list of files.	 It is intended	to  be
	       used by programs	that want filetype output from a pipe.

       -p, --preserve-date
	       On  systems that	support	utime(3) or utimes(2), attempt to pre-
	       serve the access	time of	files analyzed,	to pretend  that  file
	       never read them.

       -P, --parameter name=value
	       Set various parameter limits.

		     Name	  Default    Explanation
		     bytes	  1048576    max  number of bytes to read from
									    file
		     elf_notes	  256	     max ELF notes processed
		     elf_phnum	  2048	     max    ELF	   program    sections
									    processed
		     elf_shnum	  32768	     max ELF sections processed
		     indir	  50	     recursion	 limit	 for  indirect
									    magic
		     name	  50	     use  count	 limit	for   name/use
									    magic
		     regex	  8192	     length limit for regex searches

       -r, --raw
	       Don't  translate	unprintable characters to \ooo.	 Normally file
	       translates unprintable characters to  their  octal  representa-
	       tion.

       -s, --special-files
	       Normally,  file only attempts to	read and determine the type of
	       argument	files which stat(2) reports are	ordinary files.	  This
	       prevents	problems, because reading special files	may have pecu-
	       liar  consequences.   Specifying	 the  -s option	causes file to
	       also read argument files	which are block	or  character  special
	       files.	This is	useful for determining the filesystem types of
	       the data	in raw disk partitions,	which are block	special	files.
	       This option also	causes file to disregard the file size as  re-
	       ported  by stat(2) since	on some	systems	it reports a zero size
	       for raw disk partitions.

       -S, --no-sandbox
	       On	      systems		  where		    libseccomp
	       (https://github.com/seccomp/libseccomp)	is  available,	the -S
	       flag disables sandboxing	which is enabled by default.  This op-
	       tion is needed for file to execute external decompressing  pro-
	       grams,  i.e. when the -z	flag is	specified and the built-in de-
	       compressors are not available.  On systems where	sandboxing  is
	       not available, this option has no effect.

       -v, --version
	       Print the version of the	program	and exit.

       -z, --uncompress
	       Try to look inside compressed files.

       -Z, --uncompress-noreport
	       Try  to	look  inside  compressed files,	but report information
	       about the contents only not the compression.

       -0, --print0
	       Output a	null character `\0' after the  end  of	the  filename.
	       Nice to cut(1) the output.  This	does not affect	the separator,
	       which is	still printed.

	       If  this	 option	 is  repeated more than	once, then file	prints
	       just the	filename followed by a NUL followed by the description
	       (or ERROR: text)	followed by a second NUL for each entry.

       --help  Print a help message and	exit.

ENVIRONMENT
       The environment variable	MAGIC can be used to  set  the	default	 magic
       file name.  If that variable is set, then file will not attempt to open
       $HOME/.magic.  file adds	".mgc" to the value of this variable as	appro-
       priate.	 The environment variable POSIXLY_CORRECT controls (on systems
       that support symbolic links), whether file will attempt to follow  sym-
       links  or  not.	 If  set, then file follows symlink, otherwise it does
       not.  This is also controlled by	the -L and -h options.

FILES
       /usr/share/misc/magic.mgc  Default compiled list	of magic.
       /usr/share/misc/magic	  Directory containing default magic files.

EXIT STATUS
       file will exit with 0 if	the operation was successful or	>0 if an error
       was encountered.	 The following errors cause diagnostic	messages,  but
       don't  affect  the  program exit	code (as POSIX requires), unless -E is
       specified:
	     	 A file	cannot be found
	     	 There is no permission	to read	a file
	     	 The file type cannot be determined

EXAMPLES
	     $ file file.c file	/dev/{wd0a,hda}
	     file.c:   C program text
	     file:     ELF 32-bit LSB executable, Intel	80386, version 1 (SYSV),
		       dynamically linked (uses	shared libs), stripped
	     /dev/wd0a:	block special (0/0)
	     /dev/hda: block special (3/0)

	     $ file -s /dev/wd0{b,d}
	     /dev/wd0b:	data
	     /dev/wd0d:	x86 boot sector

	     $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
	     /dev/hda:	 x86 boot sector
	     /dev/hda1:	 Linux/i386 ext2 filesystem
	     /dev/hda2:	 x86 boot sector
	     /dev/hda3:	 x86 boot sector, extended partition table
	     /dev/hda4:	 Linux/i386 ext2 filesystem
	     /dev/hda5:	 Linux/i386 swap file
	     /dev/hda6:	 Linux/i386 swap file
	     /dev/hda7:	 Linux/i386 swap file
	     /dev/hda8:	 Linux/i386 swap file
	     /dev/hda9:	 empty
	     /dev/hda10: empty

	     $ file -i file.c file /dev/{wd0a,hda}
	     file.c:	  text/x-c
	     file:	  application/x-executable
	     /dev/hda:	  application/x-not-regular-file
	     /dev/wd0a:	  application/x-not-regular-file

SEE ALSO
       hexdump(1), od(1), strings(1), magic(5),	fstyp(8)

STANDARDS CONFORMANCE
       This program is believed	to exceed the System V Interface Definition of
       FILE(CMD), as near as one can determine from the	 vague	language  con-
       tained  therein.	  Its  behavior	is mostly compatible with the System V
       program of the same name.  This version knows more magic,  however,  so
       it will produce different (albeit more accurate)	output in many cases.

       The  one	 significant  difference  between this version and System V is
       that this version treats	any white space	as a delimiter,	so that	spaces
       in pattern strings must be escaped.  For	example,

	     >10     string  language impress	     (imPRESS data)

       in an existing magic file would have to be changed to

	     >10     string  language\ impress	     (imPRESS data)

       In addition, in this version, if	a pattern string contains a backslash,
       it must be escaped.  For	example

	     0	     string	     \begindata	     Andrew Toolkit document

       in an existing magic file would have to be changed to

	     0	     string	     \\begindata     Andrew Toolkit document

       SunOS releases 3.2 and later from Sun Microsystems include a file  com-
       mand  derived  from  the	 System	V one, but with	some extensions.  This
       version differs from Sun's only in minor	ways.  It includes the	exten-
       sion of the `&' operator, used as, for example,

	     >16     long&0x7fffffff >0		     not stripped

SECURITY
       On  systems where libseccomp (https://github.com/seccomp/libseccomp) is
       available, file is enforces limiting system calls to only the ones nec-
       essary for the operation	of the program.	  This	enforcement  does  not
       provide	any  security  benefit	when file is asked to decompress input
       files running external programs with the	-z option.  To	enable	execu-
       tion  of	 external decompressors, one needs to disable sandboxing using
       the -S flag.

MAGIC DIRECTORY
       The magic file entries have been	collected from various sources,	mainly
       USENET, and contributed by various authors.  Christos  Zoulas  (address
       below) will collect additional or corrected magic file entries.	A con-
       solidation of magic file	entries	will be	distributed periodically.

       The  order  of  entries in the magic file is significant.  Depending on
       what system you are using, the order that they are put together may  be
       incorrect.   If	your  old file command uses a magic file, keep the old
       magic   file   around   for   comparison	  purposes   (rename   it   to
       /usr/share/misc/magic.orig).

HISTORY
       There  has  been	 a  file command in every UNIX since at	least Research
       Version 4 (man page dated November, 1973).  The System V	version	intro-
       duced one significant major change: the external	list of	 magic	types.
       This slowed the program down slightly but made it a lot more flexible.

       This  program, based on the System V version, was written by Ian	Darwin
       <ian@darwinsys.com> without looking at anybody else's source code.

       John Gilmore revised the	code extensively, making it  better  than  the
       first  version.	 Geoff Collyer found several inadequacies and provided
       some magic file entries.	 Contributions of  the	`&'  operator  by  Rob
       McMahon,	<cudcv@warwick.ac.uk>, 1989.

       Guy  Harris,  <guy@netapp.com>,	made  many  changes  from  1993	to the
       present.

       Primary development and maintenance from	1990 to	the present by	Chris-
       tos Zoulas <christos@astron.com>.

       Altered by Chris	Lowth <chris@lowth.com>, 2000: handle the -i option to
       output  mime type strings, using	an alternative magic file and internal
       logic.

       Altered by Eric Fischer <enf@pobox.com>,	July, 2000, to identify	 char-
       acter codes and attempt to identify the languages of non-ASCII files.

       Altered	by  Reuben  Thomas  <rrt@sc3d.org>, 2007-2011, to improve MIME
       support,	merge MIME and non-MIME	magic, support directories as well  as
       files  of  magic,  apply	many bug fixes,	update and fix a lot of	magic,
       improve the build system, improve the documentation,  and  rewrite  the
       Python bindings in pure Python.

       The  list of contributors to the	`magic'	directory (magic files)	is too
       long to include here.  You know who you are; thank you.	Many contribu-
       tors are	listed in the source files.

LEGAL NOTICE
       Copyright (c) Ian F. Darwin, Toronto, Canada,  1986-1999.   Covered  by
       the  standard  Berkeley	Software  Distribution copyright; see the file
       COPYING in the source distribution.

       The files tar.h and is_tar.c were written by John Gilmore from his pub-
       lic-domain tar(1) program, and are not covered by the above license.

BUGS
       Please  report  bugs  and  send	patches	 to   the   bug	  tracker   at
       https://bugs.astron.com/	 or  the  mailing  list	 at  <file@astron.com>
       (visit https://mailman.astron.com/mailman/listinfo/file first  to  sub-
       scribe).

TODO
       Fix  output  so	that tests for MIME and	APPLE flags are	not needed all
       over the	place, and actual output is only  done	in  one	 place.	  This
       needs  a	 design.  Suggestion: push possible outputs on to a list, then
       pick the	last-pushed (most specific, one	hopes) value at	 the  end,  or
       use  a default if the list is empty.  This should not slow down evalua-
       tion.

       The handling of MAGIC_CONTINUE and printing \012-  between  entries  is
       clumsy and complicated; refactor	and centralize.

       Some of the encoding logic is hard-coded	in encoding.c and can be moved
       to the magic files if we	had a !:charset	annotation

       Continue	to squash all magic bugs.  See Debian BTS for a	good source.

       Store  arbitrarily  long	 strings, for example for %s patterns, so that
       they can	be printed out.	 Fixes Debian bug #271672.  This can  be  done
       by  allocating strings in a string pool,	storing	the string pool	at the
       end of the magic	file and converting all	the string pointers  to	 rela-
       tive offsets from the string pool.

       Add  syntax  for	 relative  offsets  after  current  level  (Debian bug
       #466037).

       Make file -ki work, i.e.	give multiple MIME types.

       Add a zip library so we can peek	inside Office2007 documents  to	 print
       more details about their	contents.

       Add an option to	print URLs for the sources of the file descriptions.

       Combine	script	searches and add a way to map executable names to MIME
       types (e.g. have	a magic	value for !:mime which	causes	the  resulting
       string  to  be looked up	in a table).  This would avoid adding the same
       magic repeatedly	for each new hash-bang interpreter.

       When a file descriptor is available, we can skip	and adjust the	buffer
       instead of the hacky buffer management we do now.

       Fix  "name"  and	"use" to check for consistency at compile time (dupli-
       cate "name", "use" pointing to undefined	"name" ).  Make	"name" / "use"
       more efficient by keeping a sorted list of names.   Special-case	 ^  to
       flip  endianness	 in the	parser so that it does not have	to be escaped,
       and document it.

       If the offsets specified	internally in the file exceed the buffer  size
       (  HOWMANY  variable in file.h),	then we	don't seek to that offset, but
       we give up.  It would be	better if buffer managements was done when the
       file descriptor is available so move around  the	 file.	 One  must  be
       careful	though because this has	performance (and thus security consid-
       erations).

AVAILABILITY
       You can obtain the original author's latest version by anonymous	FTP on
       ftp.astron.com in the directory /pub/file/file-X.YZ.tar.gz.

FreeBSD	13.0			 June 7, 2020			       FILE(1)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=file&sektion=1&manpath=FreeBSD+13.0-RELEASE+and+Ports>

home | help