FFE(1)									FFE(1)

NAME
       ffe - flat file extractor

SYNOPSIS
       ffe [options]...

DESCRIPTION
       ffe  is a program for extracting	fields from flat file records and dis-
       playing them in different formats. ffe relies on	the configuration file
       to control input	file structure and the output format.

OPTIONS
       ffe accepts the following options:

       -c, --configuration=file
	      Read the configuration from file,	default	is ~/.fferc.

       -s, --structure=STRUCTURE
	      Input file is processed using the	structure STRUCTURE.

       -p, --print=FORMAT
	      Use output format	FORMAT for printing. All printing can be  sup-
	      pressed  using  format no. Original data is printed using	format
	      raw.

       -o, --output=NAME
	      Write output to NAME instead of standard output.

       -f, --field-list=LIST
	      Print only fields	and constants  specified  in  comma  separated
	      list LIST.

       -e, --expression=EXPRESSION
	      Print  only  those records for which the EXPRESSION evaluates to
	      true.

       -a, --and
	      Expressions are combined with logical and,  default  is  logical
	      or.

       -X, --casecmp
	      Expressions are evaluated	case insensitive.

       -v, --invert-match
	      Print only those records which don't match the expression.

       -l, --loose
              An invalid input line does not cause the program to abort.

       -r, --replace=FIELD=VALUE
              Replace FIELD's contents with VALUE in the output. VALUE can
              contain the same directives as the output option data.

       -d, --debug
	      All invalid input	lines are written to file ffe_error_<pid>.log.

       -I, --info
	      Show the structure information in	configuration file and exit.

       -?, --help
	      List all available options and their meanings and	exit.

       -V, --version
	      Show version of program and exit.

       All remaining arguments are names of input files; if no input files are
       specified, then the standard input is read.

   Expressions (option -e, --expression)
       Expressions select specific records by comparing field values.

       If the value starts with the string "file:", the rest of the value is
       treated as a file name. Every line in the file is used as a value in
       the comparison. The record is selected if one or more of the values
       evaluate to true.

       Expression notation:

       field=value
	      A	record will be selected	if the field field  is	equal  to  the
	      value value.

       field^value
	      A	 record	 will  be  selected if the field field starts with the
	      value value.

       field~value
	      A	record will be selected	if the field field contains the	 value
	      value.

       field!value
	      A	record will be selected	if the field field is not equal	to the
	      value value.

       field?value
	      A	record will be selected	if the field field matches the regular
	      expression in value.
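
       As an illustrative sketch, the personnel structure from EXAMPLES
       could be combined with expressions as shown below. The file names
       people.fferc and people.txt are assumptions made for this sketch.
       The invocations appear as configuration comments and use only the
       options described above.

```
# people.fferc (hypothetical file name)
structure personnel {
    type fixed
    record person {
        field FirstName 9
        field LastName  13
        field Age 2
    }
}

output default {
    data "%n=%t\n"
}

# Possible invocations (people.txt is an assumed input file):
#   ffe -c people.fferc -s personnel -e Age=23 people.txt
#       selects records whose Age is equal to 23
#   ffe -c people.fferc -s personnel -e LastName^T -e Age=41 people.txt
#       selects records matching either expression (logical or is default)
#   ffe -c people.fferc -s personnel -a -e LastName^M -e Age=41 people.txt
#       selects records matching both expressions (-a)
#   ffe -c people.fferc -s personnel -v -e FirstName~ar people.txt
#       selects records whose FirstName does not contain "ar"
```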

FFE CONFIGURATION
       ffe  uses  the  configuration file for extracting fields	from the input
       file and	for formatting the fields for output.  Every  line  or	binary
       block  of  the input file is considered as a record. Default configura-
       tion file is ~/.fferc but another file can be given with	'-c' option.

       The configuration file for ffe is a text file. The file may contain
       empty lines. Commands are case-sensitive. Comments begin with the
       # character and end at the end of the line. The string and char
       definitions can be enclosed in double quotation '"' characters. char
       is a single character. string and char can contain the following
       escape codes: '\a', '\b', '\t', '\n', '\v', '\f', '\r', '\"' and '\#'.
       Character '\' can be escaped as '\\'.

       Command Substitution allows the output of a command to replace parts of
       the configuration file. Syntax for command substitution is:
       `command`
       The command is executed and the `command` is substituted	with the stan-
       dard output of the command, with	any trailing newlines deleted. Command
       substitutions may not be	nested.

       Before executing the command, ffe sets a few environment variables:

       FFE_STRUCTURE
	      The name of the structure	given using -s,--structure.

       FFE_OUTPUT
	      The name of the output file given	using -o,--output.

       FFE_FORMAT
	      The name of the output format given using	-p,--print.

       FFE_FIRST_FILE
	      The name of the first input file.

       FFE_FILES
	      A	list of	all input files.

       If a variable is already set, it is not replaced.
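
       A minimal sketch of command substitution follows. It assumes that the
       commands whoami and echo are available, that the command is run by a
       shell which expands environment variables, and that ffe is invoked
       with -s so that FFE_STRUCTURE is not empty. The constant names are
       hypothetical; constants are described under Constants.

```
# The command between the grave accents is replaced by its standard
# output, so this constant holds the name of the user running ffe.
const RunUser `whoami`

# FFE_STRUCTURE is set by ffe before the command is executed,
# unless it was already set in the environment.
const StructureName `echo $FFE_STRUCTURE`
```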

   Input file structure
       Input file structures are specified with	keyword	structure:

       structure name {options...}

       Each option must end with a newline. The options are:

       type fixed|binary|separated [char] [*]
	      Fields  in  the input are	fixed length text fields, fixed	length
	      binary fields or text fields separated by	char. If *  is	given,
	      multiple	sequential  separators	are considered as one. Default
	      separator	is comma.

       quoted [char]
              Fields may be quoted with char; the default quotation mark is
              the double quotation mark '"'. A quotation mark is assumed to be
              escaped as \char or by doubling the mark as charchar in input.
              Unescaped quotation marks are not preserved in output.

       header first|all|no
	      Controls the occurrence of the header line. Default  is  no.  If
	      set  as  first or	all, the first line of the first input file is
	      considered as header line	containing the names  of  the  fields.
	      First  means  that  only	the first file has a header, all means
	      that all files have a header, although the names are still taken
              from the header of the first file. The header line is handled
              according to the record definition, meaning that the name
              positions, separators etc. are the same as for the fields.

       output name
              All records belonging to this structure are printed according
              to output format name. The default is to use the output named
              'default'.

       record name {options...}
	      Defines one record for a structure. A structure can contain sev-
	      eral record types.
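
       The structure options above might be combined as in the following
       sketch for semicolon separated input whose first line holds the field
       names. The structure and record names are hypothetical.

```
structure people_csv {
    # Fields are separated by a semicolon instead of the default comma.
    type separated ";"
    # Fields may be quoted with the default double quotation mark.
    quoted
    # Only the first input file starts with a header line.
    header first
    record person {
        # With header first, * takes the field names from the header line.
        field *
        field *
        field *
    }
}
```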

   Record options:
       id position string

       rid position regexp
	      Identifies a record in the input file. Records are identified by
	      the  string  or  by  the	regular	 expression in regexp in input
	      record position position.	For fixed length and binary input  the
	      position	is the byte position of	the input record and for sepa-
	      rated input the position means the position'th field of the  in-
	      put record. Positions start from one.

              Ids are required only if the input structure contains several
              record types with equal lengths or field counts. Non-printable
              characters can be escaped as \xnn where nn is the hexadecimal
              value of the character.

              A record definition can contain several ids; in that case all
              ids must match the input line (ids are combined with logical
              and).

	      In a multi-record	binary structure every	record	must  have  at
	      least one	id.
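
       As a sketch, a fixed length structure with two record types could
       identify each type by the character in the first byte. All names and
       lengths here are made up for illustration.

```
structure payroll {
    type fixed
    record header {
        # Lines with H in byte position 1 are header records.
        id 1 H
        field RecType 1
        field Date 10
    }
    record employee {
        # Lines with E in byte position 1 are employee records.
        id 1 E
        field RecType 1
        field FirstName 9
        field LastName 13
    }
}
```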

       field name|FILLER|* [length]|* [lookup]|* [output]
	      Specifies	 one field in a	text input structure. length is	manda-
	      tory for fixed length input structure except for the last	field.
	      If the last field	of a fixed length input	structure has a	 *  in
	      place of length then the last field can have arbitrary length.

              Length is also used for printing fields in fixed length format
              using the %D or %C directive. The order of fields in the
              configuration file is significant: it specifies the field order
              in a record.

              If '*' is given instead of the name, then the 'name' will be
              the ordinal number of the field, or if the 'header' option has
              value 'first' or 'all', then the name of the field will be
              taken from the header line (first line of the input).

              If lookup is given, the field's contents are used to make a
              lookup in lookup table lookup. If length is not needed
              (separated format) but lookup is needed, use an asterisk (*) in
              place of the length definition.

              If output is given, the field is printed using output output.
              Use an asterisk in place of lookup if lookup is not needed.

              Naming the field FILLER causes the field not to be printed in
              output.
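
       The following sketch shows how the positional parameters of field
       might be used in fixed and separated structures. The structure names
       are hypothetical, and a lookup table named Type is assumed to be
       defined elsewhere in the configuration file.

```
structure staff_fixed {
    type fixed
    record employee {
        # Contents of EmpType are looked up in lookup table Type.
        field EmpType   1 Type
        field FirstName 9
        # FILLER is read from the input but not printed.
        field FILLER    3
        # The last field of a fixed length record may have arbitrary length.
        field Notes     *
    }
}

structure staff_sep {
    type separated
    record employee {
        # No length in separated input, so * stands in place of length.
        field EmpType * Type
        field FirstName
    }
}
```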

       field name|FILLER|* [length]|type [lookup]|* [output]
	      Specifies	one field in a binary input structure. All other  fea-
	      tures are	same as	for the	text structure except the type parame-
	      ter.  type specifies field data type and length and can have the
	      following	values:

	      char Printable character.

	      short Short integer having current system	length and byte	order.

	      int Integer having current system	length and byte	order.

	      long Long	integer	having current system length and byte order.

	      llong  Long  long	 integer having	current	system length and byte
	      order.

	      ushort Unsigned short integer having current system  length  and
	      byte order.

	      uint  Unsigned integer having current system length and byte or-
	      der.

	      ulong Unsigned long integer having  current  system  length  and
	      byte order.

	      ullong  Unsigned	long long integer having current system	length
	      and byte order.

	      int8 8 bit integer.

	      int16_be Big endian 16 bit integer.

	      int32_be Big endian 32 bit integer.

	      int64_be Big endian 64 bit integer.

	      int16_le Little endian 16	bit integer.

	      int32_le Little endian 32	bit integer.

	      int64_le Little endian 64	bit integer.

	      uint8 Unsigned 8 bit integer.

	      uint16_be	Unsigned big endian 16 bit integer.

	      uint32_be	Unsigned big endian 32 bit integer.

	      uint64_be	Unsigned big endian 64 bit integer.

	      uint16_le	Unsigned little	endian 16 bit integer.

	      uint32_le	Unsigned little	endian 32 bit integer.

	      uint64_le	Unsigned little	endian 64 bit integer.

	      float Float having current system	length and byte	order.

	      float_be Float having current system length and big endian  byte
	      order.

	      float_le	Float  having  current system length and little	endian
	      byte order.

	      double Double having current system length and byte order.

	      double_be	Double having current system  length  and  big	endian
	      byte order.

	      double_le	 Double	having current system length and little	endian
	      byte order.

	      bcd_be_len Bcd number having length len and nybbles in  big  en-
	      dian order.

	      bcd_le_len  Bcd  number  having length len and nybbles in	little
	      endian order.

	      hex_be_len Hexadecimal data in big endian	 order	having	length
	      len.

	      hex_le_len Hexadecimal data in little endian order having	length
	      len.

              If length is given instead of the type, then the field is
              assumed to be a printable string having length length. The
              string is printed until length characters are printed or a NUL
              character is found.

	      Bcd  number  (bcd_be_len	and  bcd_le_len)  is printed until len
	      bytes are	read or	a nybble having	hexadecimal value f is	found.
	      Bcd  number  having  big	endian order is	printed	in order: most
	      significant nybble first and least significant nybble second and
	      bcd number having	little endian order is printed in order: least
	      significant nybble first and  most  significant  nybble  second.
	      Bytes are	always read in big endian order.

	      Hexadecimal data (hex_be_len and hex_le_len) is printed as hexa-
	      decimal  values.	Big  endian  data is printed starting from the
	      lower address and	little endian data starting from the upper ad-
	      dress.
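
       A binary record mixing several of the types above could be sketched as
       follows. The structure, record and field names are invented for this
       example, and the layout does not describe any real file format.

```
structure sensor_log {
    type binary
    record sample {
        # Printable string of 4 bytes.
        field Tag 4
        # Unsigned little endian 32 bit integer.
        field Sequence uint32_le
        # Float with the current system length and byte order.
        field Reading float
        # Bcd number of 3 bytes, nybbles in big endian order.
        field Serial bcd_be_3
        # 2 bytes printed as hexadecimal, starting from the lower address.
        field Checksum hex_be_2
    }
}
```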

       field-count number
	      Same effect as having field * number times.  Because  length  is
	      not specified, this works	only with separated structure.

       fields-from record
	      Fields for this record are the same as for record	record.

       output name
              This record is printed according to output format name. The
              default is to use the output format specified in the structure.

       level number [element_name|*] [group_name]
              Level can be used if the contents of a file should be printed
              as a hierarchical, multi-level nested document. number is the
              level of the record, starting from one (the highest level).
              element_name is the name for the record; use * instead of the
              element name if it is not needed. group_name is used to group
              records in the same and lower levels. Only number is mandatory.

       record-length strict|minimum

              strict Input record length and field count must match the record
              definition for the record to be processed. This is the default
              value.

              minimum Input record length and field count can be the same as
              or longer than defined for the record. The rest of the input
              line is ignored.

   Output definitions
       There can be several output definitions in the configuration file. The
       format can be selected with the '-p' option. The default format is
       named 'default'.

       output name|default {options...}
	      Defines  one  output  format.  Output named as 'default' will be
	      used if none is given for	structure or record, or	none is	 given
	      with option '-p'.

              There are two predefined output formats, no and raw. no
              suppresses all printing and raw prints the original input data.

   Output options
       Pictures	in output definition can contain printf-style %-directives:

       %f     Name of the input	file.

       %s     Name of the current structure.

       %r     Name of the current record.

       %o     Input record number in current file.

       %O     Input record number starting from	the first file.

       %i     Byte  offset  of	the current record in the current file.	Starts
	      from zero.

       %I     Byte offset of the current record	starting from the first	 file.
	      Starts from zero.

       %n     Field name.

       %t     Field contents, without leading and trailing whitespaces.

       %d     Field  contents.	Binary	integer	is printed as a	decimal	value.
	      Floating point number is printed in the style [-]ddd.ddd,	 where
	      the number of digits after the decimal-point character is	6. Bcd
	      number  is  printed  as a	decimal	number and hexadecimal data as
	      consecutive hexadecimal values.

       %D     Field contents, right  padded  to	 the  field  length  (requires
	      length definition	for the	field).

       %C     Field contents, right padded to the field length (requires
              length definition for the field). Output field is cut if the
              input field is longer than the field length.

       %x     Unsigned hexadecimal value of a binary integer. Other fields are
	      printed using directive %d.

       %l     Value from lookup.

       %L     Value  from  lookup,  right padded to the	field length (requires
	      length definition	for the	field).

       %e     Prints nothing, but still causes the "field empty" check to be
              performed. Can be used when only the names of non-empty fields
              should be printed.

       %p     Field's start position in a record. For fixed structure this is
              the field's byte position in the input line and for separated
              structure this is the ordinal number of the field. Starts from
              one.

       %h     Hexadecimal dump of a field. Byte	values are printed as consecu-
	      tive xnn values, where the nn is	the  hexadecimal  value	 of  a
	      byte. Data is printed before any endian conversion.

       %g     Group name given by the keyword group_name in record definition.

       %m     Element name given by the	keyword	element_name in	record defini-
	      tion.

       %%     Percent sign.

       file_header picture
	      Picture is printed once before file contents.

       file_trailer picture
	      Picture is printed once after file contents.

       header picture
              If specified, then the header line describing the field names is
              printed before records. Every field name is printed according
              to the picture using the same separator and field lengths as
              defined for the fields. Picture can contain only the %n
              directive.

       data picture
              Field contents are printed according to picture.

       lookup picture
              If a field is mapped to a lookup table, this picture is used
              instead of the picture from the data option. If not given, the
              picture from data is used.

       separator string
	      All  fields  are	terminated by string, except the last field of
	      the record. Default is not to print separator.

       record_header picture
	      picture is printed before	the record content. Default is not  to
	      print header.

       record_trailer picture
	      picture is printed after the record content. Default is newline.

       justify left|right|char
              Fields are left or right justified. char justifies output
              according to the first occurrence of char in the data picture.
              Default is left.

       indent string
              Record contents are indented by string. Field contents are
              indented by two times the string. Default is not to indent.

       field-list name1,name2,...
              Only fields or constants named as name1,name2,... are printed;
              this has the same effect as the '-f' option. Default is to
              print all the fields. Fields are also printed in the same order
              as they are listed.

       no-data-print yes|no
	      When  set	 as no and field-list is given,	suppresses printing of
	      record_header and	record_trailer in case	where  current	record
	      contains none of the fields specified in field-list.

       field-empty-print yes|no
	      When  set	as no, nothing is printed for fields which consist en-
	      tirely of	characters from	empty-chars. If	none of	the fields  of
	      a	record are printed then	the printing of	record_trailer is also
	      suppressed. Default is yes.

       empty-chars string
	      string  specifies	 a  set	 of characters which define an "empty"
	      field. Default is	" \f\n\r\t\v" (space, form-feed, newline, car-
	      riage return, horizontal tab and vertical	tab).

       output-file file
	      Output is	written	to file	instead	of the default output. If - is
	      given the	standard output	is used.

       group_header string
              If a record has a level and group name defined, string is
              printed before the first record in the same group or if the
              group name has changed in the same level.

       group_trailer string
	      If a record has a	 level	and  group  name  defined,  string  is
	      printed  after  the records in lower levels or if	the group name
	      has changed in the same level or if a  higher  level  record  is
	      found.

       element_header string
              If a record has a level and element name defined, string is
              printed before the record's contents.

       element_trailer string
              If a record has a level and element name defined, string is
              printed after the record's contents.

       hex-caps	yes|no
	      Print hexadecimal	numbers	in capital letters. Default is no.
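
       The output options above might be combined as in the following
       sketches: a comma separated format, and an outline format for records
       that define a level. All structure, record, field and output names
       are hypothetical, and the exact layout of the printed result is not
       shown because it depends on the input data.

```
# Comma separated output: fields are trimmed and joined with commas,
# and the default record_trailer ends each record with a newline.
output csv {
    data "%t"
    separator ","
}

# Two record types arranged in two levels.
structure family {
    type separated
    output outline
    record parent {
        id 1 P
        field FILLER
        field Name
        level 1 parent
    }
    record child {
        id 1 C
        field FILLER
        field Name
        level 2 child children
    }
}

# %g prints the group name and %m the element name given with level.
output outline {
    group_header "[%g]\n"
    element_header "%m:\n"
    data "%n=%t\n"
    indent "  "
}
```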

   Lookup definitions
       lookup name {options...}
	      Defines one lookup table.

   Lookup options:
       search exact|longest
	      The search type for lookup table.

       default-value value
              value is printed if the lookup is not successful.

       pair key	value
	      One key/value pair for the lookup	table.

       file name [separator]
	      Key/value	 pairs	are read from file name. Every line is consid-
	      ered as a	key/value pair separated by separator. Default separa-
	      tor is semicolon.
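
       Two lookup tables could be sketched as follows, one with pairs given
       in the configuration file and one read from a file. The table names,
       keys, values and the file path are assumptions for illustration only.

```
lookup Type {
    search exact
    pair H Header
    pair E Employee
    pair M Manager
    default-value "Unknown record type"
}

# Pairs are read from a file (hypothetical path), one key;value pair
# per line, using the default semicolon separator.
lookup Country {
    search exact
    file /usr/local/share/ffe/countries.txt
    default-value "Unknown"
}
```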

   Constants
       In addition to input fields, constant values can be printed using the
       option -f,--field-list or the output option field-list. A constant is
       printed using the data output option.

       Constants are specified as

       const name value
              When the name appears in a field list, value is printed for
              every record as if the name were one of the input fields.
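
       As a sketch, a constant could be printed next to two input fields as
       shown below. The constant, output and field names are hypothetical,
       and the fields FirstName and LastName are assumed to exist in the
       structure being processed.

```
# Printed for every record as if Source were an input field.
const Source "payroll system"

output listing {
    data "%n=%t\n"
    # Fields and the constant are printed in the listed order.
    field-list FirstName,LastName,Source
}
```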

   Input Preprocessor
       It is possible to define an input preprocessor for ffe. An input
       preprocessor is simply an executable program which writes the contents
       of the input file to standard output, which is then read by ffe. If
       the input preprocessor does not write any characters on its standard
       output, then ffe uses the original file.

       To set up an input preprocessor,	set the	FFEOPEN	 environment  variable
       to  a command line which	will invoke your input preprocessor. This com-
       mand line should	include	one occurrence of the string %s, which will be
       replaced	by the input filename when the input preprocessor  command  is
       invoked.

       The input preprocessor is not used if ffe is reading standard input.

EXAMPLES
       Example of a fixed length flat file containing the fields 'FirstName',
       'LastName' and 'Age':

       John     Ripper       23
       Scott    Tiger        45
       Mary     Moore        41

       This file can be	printed	in XML with the	following configuration:

       structure personnel {
	   type	fixed
	   output XML
	   record person {
	       field FirstName 9
	       field LastName  13
	       field Age 2
	   }
       }

       output XML {
	   file_header "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
	   data	"<%n>%d</%n>\n"
	   record_header "<%r>\n"
	   record_trailer "</%r>\n"
	   indent " "
       }

SEE ALSO
       More examples are in the Texinfo manual. If info and ffe are properly
       installed, the command

	      info ffe

       should give more	information.

AUTHOR
       Timo Savinen <tjsa@iki.fi>

Timo Savinen			  2011-04-06				FFE(1)
