MH-FORMAT(5)		      File Formats Manual		  MH-FORMAT(5)

NAME
       mh-format - formatting language for nmh message system

DESCRIPTION
       Several	nmh  commands  utilize either a	format string or a format file
       during their execution.	For example, scan uses a format	string to gen-
       erate its listing of messages; repl uses	a format file to generate mes-
       sage replies, and so on.

       There are  a  number  of	 scan  listing	formats	 available,  including
       nmh/etc/scan.time, nmh/etc/scan.size, and nmh/etc/scan.timely.  Look in
       /usr/local/etc/nmh  for other scan and repl format files	which may have
       been written at your site.

       You can have your local nmh expert write	new format commands or	modify
       existing	 ones,	or  you	can try	your hand at it	yourself.  This	manual
       section explains	how to do that.	 Note: some  familiarity  with	the  C
       printf routine is assumed.

       A format	string consists	of ordinary text combined with special,	multi-
       character,  escape  sequences  which begin with `%'.  When specifying a
       format string, the usual	C  backslash  characters  are  honored:	 `\b',
       `\f',  `\n',  `\r',  and	 `\t'.	Continuation lines in format files end
       with `\'	followed by the	newline	character.  A literal `%' can  be  in-
       serted into a format file by using the sequence `%%'.

   SYNTAX
       Format  strings	are  built  around  escape sequences.  There are three
       types of	escape sequence: header	components,  built-in  functions,  and
       flow control.  Comments may be inserted in most places where a function
       argument	 is  not expected.  A comment begins with `%;' and ends	with a
       (non-escaped) newline.

   Component escapes
       A component escape is specified as `%{component}', and exists for  each
       header  in  the message being processed.	 For example, `%{date}'	refers
       to the "Date:" field of the message.   All  component  escapes  have  a
       string  value.	Such  values  are usually compressed by	converting any
       control characters (tab and newline included) to	spaces,	 then  eliding
       any  leading or multiple	spaces.	 Some commands,	however, may interpret
       some component escapes differently; be sure to refer to each  command's
       manual entry for	details.  Some commands	(such as ap(8) and mhl(1)) use
       a special component `%{text}' to	refer to the text being	processed; see
       their respective	man pages for details and examples.
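
       The usual compression of component values can be sketched in Python
       as follows.  This is only a model of the rule described above, not
       nmh's own code; the function name compress is hypothetical.

```python
import re


def compress(value: str) -> str:
    """Model of the usual component-value compression (not nmh's own code)."""
    # Convert every control character, tab and newline included, to a space.
    spaced = re.sub(r"[\x00-\x1f\x7f]", " ", value)
    # Collapse runs of spaces into one, then drop any leading space.
    return re.sub(r" {2,}", " ", spaced).lstrip(" ")


print(compress("  Re:\tmeeting\n  notes"))
```

       Commands that interpret a component differently would not apply this
       step, so treat the sketch as the common case only.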

   Function escapes
       A  function  escape  is	specified as `%(function)'.  All functions are
       built-in, and most have a string	or integer value.  A  function	escape
       may  take  an  argument.	 The argument follows the function escape (and
       any separating whitespace is discarded) as in the following example:

	    %(function argument)

       In addition to literal numbers or strings, the argument to  a  function
       escape  can  be	another	function, or a component, or a control escape.
       When the	argument is a function or a component, the argument is	speci-
       fied  without a leading `%'.  When the argument is a control escape, it
       is specified with a leading `%'.

   Control escapes
       A control escape	is one of: `%<', `%?', `%|', or	`%>'.  These are  com-
       bined into the conditional execution construct:

	    %< condition format-text
	    %? condition format-text
		...
	    %| format-text
	    %>

       (Extra  white space is shown here only for clarity.)  These constructs,
       which may be nested without ambiguity, form a  general  if-elseif-else-
       endif  block  where  only  one  of the format-texts is interpreted.  In
       other words, `%<' is like the "if", `%?'	is like	the "elseif", `%|'  is
       like "else", and	`%>' is	like "endif".

       A  `%<'	or  `%?'  control escape causes	its condition to be evaluated.
       This condition is a component or	function.  For	components  and	 func-
       tions  whose  value  is an integer, the condition is true if it is non-
       zero, and false if zero.	 For components	and functions whose value is a
       string, the condition is true if it is a non-empty string, and false if an
       empty string.

       The `%?'	control	escape is optional, and	can be used multiple times  in
       a conditional block.  The `%|' control escape is	also optional, but may
       only be used once.
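
       The truth rules and the if-elseif-else-endif selection can be modelled
       with a short Python sketch.  The helper names is_true, conditional and
       flag are hypothetical, and header values are assumed to be plain
       strings, with an absent header represented by the empty string.

```python
def is_true(value):
    """Integers are true when non-zero; strings are true when non-empty."""
    if isinstance(value, int):
        return value != 0
    return value != ""


def conditional(branches, else_text=""):
    """branches holds (condition value, format-text) for %< and each %?."""
    for cond, text in branches:
        if is_true(cond):
            return text          # only one format-text is interpreted
    return else_text             # the optional %| branch


def flag(headers):
    # Models %<{replied}-%?{encrypted}E%| %>
    return conditional(
        [(headers.get("replied", ""), "-"),
         (headers.get("encrypted", ""), "E")],
        else_text=" ",
    )


print(repr(flag({"encrypted": "yes"})))
```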

   Function escapes
       Functions expecting an argument generally require an argument of	a par-
       ticular	type.	In addition to the integer and string types, these in-
       clude:

	    Argument Description	    Example Syntax
	    literal  A literal number	    %(func 1234)
		     or	string		    %(func text	string)
	    comp     Any component	    %(func{in-reply-to})
	    date     A date component	    %(func{date})
	    addr     An	address	component   %(func{from})
	    expr     Nothing		    %(func)
		     or	a subexpression	    %(func(func2))
		     or	control	escape	    %(func %<{reply-to}%|%{from}%>)

       The date and addr types have the same syntax as the component type,
       comp, but require a header component that holds a date string or an
       address string, respectively.

       Most arguments not of type expr are required.  When escapes are	nested
       (via  expr  arguments), evaluation is done from innermost to outermost.
       As noted	above, for the expr argument type,  functions  and  components
       are written without a leading `%'.  Control escape arguments must use a
       leading `%', preceded by	a space.

       For example,

	    %<(mymbox{from}) To: %{to}%>

       writes  the  value of the header	component "From:" to the internal reg-
       ister  named  str; then (mymbox)	reads str and writes its result	to the
       internal	register named num; then the control escape,  `%<',  evaluates
       num.   If  num is non-zero, the string "To:" is printed followed	by the
       value of	the header component "To:".

   Evaluation
       The evaluation of format	strings	is performed by	a  small  virtual  ma-
       chine.  The machine is capable of evaluating nested expressions (as de-
       scribed	above)	and,  in  addition, has	an integer register num, and a
       text string register str.  When a function escape that accepts  an  op-
       tional argument is processed, and the argument is not present, the cur-
       rent  value  of	either	num or str is substituted as the argument: the
       register	used depends on	the function, as listed	below.

       Component escapes write the value  of  their  message  header  in  str.
       Function	 escapes write their return value in num for functions return-
       ing integer or boolean values,  and  in	str  for  functions  returning
       string  values.	 (The boolean type is a	subset of integers, with usual
       values 0=false and 1=true.)  Control escapes return  a  boolean	value,
       setting	num to 1 if the	last explicit condition	evaluated by a `%<' or
       `%?' control escape succeeded, and 0 otherwise.

       All component escapes, and those	function escapes which return an inte-
       ger or string value, evaluate to	their value as well as setting str  or
       num.   Outermost	 escape	 expressions  in  these	forms will print their
       value, but outermost escapes which return a boolean value do not	result
       in printed output.
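
       The register behavior described above can be pictured with a toy
       model in Python.  The class Machine and its methods are hypothetical
       and cover only one component escape and two functions; the sketch
       illustrates the rules and is not a description of the real virtual
       machine.

```python
class Machine:
    """Toy model of the num and str registers (not nmh's implementation)."""

    def __init__(self, headers):
        self.headers = headers
        self.num = 0
        self.str = ""
        self.out = []

    def comp(self, name, outermost=False):
        # A component escape writes its header value to str.
        self.str = self.headers.get(name, "")
        if outermost:
            self.out.append(self.str)
        return self.str

    def strlen(self, outermost=False):
        # An integer function writes its return value to num.
        self.num = len(self.str)
        if outermost:
            self.out.append(str(self.num))
        return self.num

    def nonnull(self):
        # A boolean function sets num but never prints, even when outermost.
        self.num = int(self.str != "")
        return self.num


m = Machine({"subject": "hello"})
m.comp("subject", outermost=True)   # like %{subject}
m.strlen(outermost=True)            # like %(strlen)
m.nonnull()                         # like %(nonnull)
print("".join(m.out))
```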

   Functions
       The function escapes may	be roughly grouped into	a few categories.

	    Function	Argument Return	  Description
	    msg			 integer  message number
	    cur			 integer  message is current (0	or 1)
	    unseen		 integer  message is unseen (0 or 1)
	    size		 integer  size of message
	    strlen		 integer  length of str
	    width		 integer  column width of terminal
	    charleft		 integer  bytes	left in	output buffer
	    timenow		 integer  seconds since	the Unix epoch
	    me			 string	  the user's mailbox (username)
	    myhost		 string	  the user's local hostname
	    myname		 string	  the user's name
	    localmbox		 string	  the complete local mailbox
	    eq		literal	 boolean  num == arg
	    ne		literal	 boolean  num != arg
	    gt		literal	 boolean  num >	arg
	    match	literal	 boolean  str contains arg
	    amatch	literal	 boolean  str starts with arg
	    plus	literal	 integer  arg plus num
	    minus	literal	 integer  arg minus num
	    multiply	literal	 integer  num multiplied by arg
	    divide	literal	 integer  num divided by arg
	    modulo	literal	 integer  num modulo arg
	    num		literal	 integer  Set num to arg.
	    num			 integer  Set num to zero.
	    lit		literal	 string	  Set str to arg.
	    lit			 string	  Clear	str.
	    getenv	literal	 string	  Set str to environment value of arg
	    profile	literal	 string	  Set str to profile or	context
					  component arg	value
	    nonzero	expr	 boolean  num is non-zero
	    zero	expr	 boolean  num is zero
	    null	expr	 boolean  str is empty
	    nonnull	expr	 boolean  str is non-empty
	    void	expr		  Set str or num
	    comp	comp	 string	  Set str to component text
	    compval	comp	 integer  Set num to "atoi(comp)"
	    decode	expr	 string	  decode str as	RFC 2047 (MIME-encoded)
					  component
	    unquote	expr	 string	  remove RFC 2822 quotes from str
	    trim	expr		  trim trailing	whitespace from	str
	    trimr	expr	 string	  Like %(trim),	also returns string
	    kilo	expr	 string	  express in SI	units: 15.9K, 2.3M, etc.
					  %(kilo) scales by factors of 1000.
	    kibi	expr	 string	  express in IEC units:	15.5Ki,	2.2Mi.
					  %(kibi) scales by factors of 1024.
	    ordinal	expr	 string	  Output ordinal suffix	based on value
					  of num (st, nd, rd, th)
	    putstr	expr		  print	str
	    putstrf	expr		  print	str in a fixed width
	    putnum	expr		  print	num
	    putnumf	expr		  print	num in a fixed width
	    putlit	expr		  print	str without space compression
	    zputlit	expr		  print	str without space compression;
					  str must occupy no width on display
	    bold		 string	  set terminal bold mode
	    underline		 string	  set terminal underlined mode
	    standout		 string	  set terminal standout	mode
	    resetterm		 string	  reset	all terminal attributes
	    hascolor		 boolean  terminal supports color
	    fgcolor	literal	 string	  set terminal foreground color
	    bgcolor	literal	 string	  set terminal background color
	    formataddr	expr		  append arg to	str as a
					  (comma separated) address list
	    concataddr	expr		  append arg to	str as a
					  (comma separated) address list,
					  including duplicates,
					  see Special Handling
	    putaddr	literal		  print	str address list with
					  arg as optional label;
					  get line width from num

       The (me) function returns the username of the current user.  The
       (myhost) function returns the localname entry in mts.conf, or the
       local hostname if localname is not configured.  The (myname) function
       will return the value of the SIGNATURE environment variable if set,
       otherwise it will return the passwd GECOS field (truncated at the
       first comma if it contains one) for the current user.  The (localmbox)
       function will return the complete form of the local mailbox, suitable
       for use in a "From" header.  It will return the "Local-Mailbox"
       profile entry if there is one; if not, it will be equivalent to:

	    %(myname) <%(me)@%(myhost)>
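
       The fallback order for (myname) and (localmbox) can be sketched in
       Python.  The function names, the dictionary arguments, and the sample
       values are hypothetical; an empty SIGNATURE is assumed to count as
       unset.

```python
def my_name(environ, gecos):
    """SIGNATURE if set, else the GECOS field up to the first comma."""
    signature = environ.get("SIGNATURE")
    if signature:
        return signature
    return gecos.split(",", 1)[0]


def local_mbox(profile, name, user, host):
    """The Local-Mailbox profile entry if present, else name <user@host>."""
    entry = profile.get("Local-Mailbox")
    if entry:
        return entry
    return f"{name} <{user}@{host}>"


name = my_name({}, "Jane Doe,Room 101,555-0100")
print(local_mbox({}, name, "jane", "mail.example.com"))
```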

       The following functions require a date component	as an argument:

	    Function	Argument Return	  Description
	    sec		date	 integer  seconds of the minute
	    min		date	 integer  minutes of the hour
	    hour	date	 integer  hours	of the day (0-23)
	    wday	date	 integer  day of the week (Sun=0)
	    day		date	 string	  day of the week (abbrev.)
	    weekday	date	 string	  day of the week
	    sday	date	 integer  day of the week known?
					  (1=explicit,0=implicit,-1=unknown)
	    mday	date	 integer  day of the month
	    yday	date	 integer  day of the year
	    mon		date	 integer  month	of the year
	    month	date	 string	  month	of the year (abbrev.)
	    lmonth	date	 string	  month	of the year
	    year	date	 integer  year (may be > 100)
	    zone	date	 integer  timezone in minutes
	    tzone	date	 string	  timezone string
	    szone	date	 integer  timezone explicit?
					  (1=explicit,0=implicit,-1=unknown)
	    date2local	date		  coerce date to local timezone
	    date2gmt	date		  coerce date to GMT
	    dst		date	 integer  daylight savings in effect? (0 or 1)
	    clock	date	 integer  seconds since	the Unix epoch
	    rclock	date	 integer  seconds prior	to current time
	    tws		date	 string	  official RFC 822 rendering
	    pretty	date	 string	  user-friendly	rendering
	    nodate	date	 integer  returns 1 if date is invalid
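
       As a rough cross-check of what several of these functions return, the
       same fields can be computed with Python's standard library.  This is
       an approximation for a well-formed "Date:" value with an explicit
       timezone; the function name date_fields and the sample date are
       assumptions, and nmh's own parser may treat malformed dates
       differently.

```python
from email.utils import parsedate_to_datetime


def date_fields(header_value: str) -> dict:
    """Approximate a few of the date functions for one "Date:" value."""
    dt = parsedate_to_datetime(header_value)
    return {
        "sec": dt.second,
        "min": dt.minute,
        "hour": dt.hour,
        # Python counts Monday as 0; the table above counts Sunday as 0.
        "wday": (dt.weekday() + 1) % 7,
        "mday": dt.day,
        "mon": dt.month,
        "year": dt.year,
        # Timezone offset in minutes (assumes the value carries a zone).
        "zone": int(dt.utcoffset().total_seconds() // 60),
        # Seconds since the Unix epoch.
        "clock": int(dt.timestamp()),
    }


fields = date_fields("Sat, 10 Jan 2015 14:30:05 -0500")
print(fields["wday"], fields["zone"], fields["clock"])
```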

       The  following  functions  require an address component as an argument.
       The return value	of functions noted with	`*' is computed	from the first
       address present in the header component.

	    Function	Argument Return	  Description
	    proper	addr	 string	  official RFC 822 rendering
	    friendly	addr	 string	  user-friendly	rendering
	    addr	addr	 string	  mbox@host or host!mbox rendering*
	    pers	addr	 string	  the personal name*
	    note	addr	 string	  commentary text*
	    mbox	addr	 string	  the local mailbox*
	    mymbox	addr	 integer  list has the user's address? (0 or 1)
	    getmymbox	addr	 string	  the user's (first) address,
					  with personal	name
	    getmyaddr	addr	 string	  the user's (first) address,
					  without personal name
	    host	addr	 string	  the host domain*
	    nohost	addr	 integer  no host was present (0 or 1)*
	    type	addr	 integer  host type* (0=local,1=network,
					  -1=uucp,2=unknown)
	    path	addr	 string	  any leading host route*
	    ingrp	addr	 integer  address was inside a group (0	or 1)*
	    gname	addr	 string	  name of group*

       (A clarification	on (mymbox{comp}) is in	order.	This  function	checks
       each of the addresses in	the header component "comp" against the	user's
       mailbox name and	any "Alternate-Mailboxes".  It returns true if any ad-
       dress  matches.	However,  it also returns true if the "comp" header is
       not present in the message.  If needed, the (null) function can be used
       to explicitly test for this case.)
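
       The (mymbox) rule, including the absent-header case, can be sketched
       in Python.  The sketch assumes each header is already split into a
       list of bare addresses and compares them exactly; real matching
       against "Alternate-Mailboxes" may be more flexible.  The function
       name and arguments are hypothetical.

```python
def mymbox(headers, comp, my_addresses):
    """Model of (mymbox{comp}) returning 1 (true) or 0 (false)."""
    if comp not in headers:
        return 1                      # an absent header also counts as true
    return int(any(addr in my_addresses for addr in headers[comp]))


mine = {"jane@example.com", "jdoe@example.org"}
print(mymbox({"from": ["jane@example.com"]}, "from", mine))
print(mymbox({"from": ["bob@example.net"]}, "from", mine))
print(mymbox({}, "from", mine))
```

       Because the last call also yields 1, a format that must distinguish
       "from me" from "no From: header" needs the separate (null) test
       mentioned above.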

       The (friendly{comp}) call returns the double-quoted "personal name"
       (that is, anything before the <>) if there is one.  If there is no
       personal name but there is a "note" (a comment string after an email
       address), it returns the note.  If there is neither, it returns just
       the bare email address.
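
       The fallback order used by (friendly) can be written as a small
       Python sketch.  It assumes the address has already been parsed into
       its personal name, note, and bare address; the function name and
       parameters are hypothetical.

```python
def friendly(personal: str, note: str, address: str) -> str:
    """Personal name first, then the note, then the bare address."""
    if personal:
        return personal
    if note:
        return note
    return address


print(friendly("Jane Doe", "", "jane@example.com"))
print(friendly("", "work account", "jane@example.com"))
print(friendly("", "", "jane@example.com"))
```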

   Formatting
       When  a function	or component escape is interpreted and the result will
       be printed immediately, an optional field width	can  be	 specified  to
       print  the field	in exactly a given number of characters.  For example,
       a numeric escape	like %4(size) will print at most 4 digits of the  mes-
       sage  size;  overflow  will be indicated	by a `?' in the	first position
       (like `?234').  A string	escape like %4(me)  will  print	 the  first  4
       characters  and	truncate  at  the end.	Short fields are padded	at the
       right with the fill character (normally,	a blank).  If the field	 width
       argument	 begins	with a leading zero, then the fill character is	set to
       a zero.
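
       A Python sketch of the field-width rules may help.  It rests on two
       assumptions not spelled out above: that short numeric values are
       filled on the left (as the zero-filled `%02(mon{date})' example later
       in this page suggests), and that an overflowing number keeps its
       low-order digits after the `?'.  Only non-negative numbers are
       handled, and the function names are hypothetical.

```python
def put_str(value: str, width: int) -> str:
    """Truncate a string at the end, or pad it on the right with blanks."""
    return value[:width].ljust(width)


def put_num(value: int, width: int, zero_fill: bool = False) -> str:
    """Print a non-negative number in exactly width characters."""
    digits = str(value)
    if len(digits) > width:
        # Overflow: '?' in the first position, then the low-order digits
        # (which digits survive is an assumption of this sketch).
        return "?" + digits[len(digits) - (width - 1):]
    # Assumed: numbers are filled on the left with the fill character.
    return digits.rjust(width, "0" if zero_fill else " ")


print(put_num(51234, 4))
print(put_num(1, 2, zero_fill=True))
print(repr(put_str("alexander", 4)), repr(put_str("al", 4)))
```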

       The functions (putnumf) and (putstrf) print their result	in exactly the
       number of characters specified by their leading field  width  argument.
       For  example, %06(putnumf(size))	will print the message size in a field
       six characters wide filled with leading zeros; %14(putstrf{from})  will
       print the "From:" header	component in fourteen characters with trailing
       spaces  added  as  needed.   Using a negative value for the field width
       causes right-justification within the field, with padding on  the  left
       up to the field width.  Padding is with spaces except for a left-padded
       putnumf	when  the  width starts	with zero.  The	functions (putnum) and
       (putstr)	are somewhat special: they print their result in  the  minimum
       number of characters required, and ignore any leading field width argu-
       ment.  The (putlit) function outputs the	exact contents of the str reg-
       ister  without  any  changes such as duplicate space removal or control
       character conversion.  Similarly, the (zputlit)	function  outputs  the
       exact  contents	of  the	str register, but requires that	those contents
       not occupy any output width.  It	can therefore be used  for  outputting
       terminal	escape sequences.

       There are a limited number of function escapes to output terminal
       escape sequences.  These sequences are retrieved from the terminfo(5)
       database according to the current terminal setting.  The (bold),
       (underline), and (standout) escapes set bold mode, underline mode, and
       standout mode respectively.  (hascolor) can be used to determine if
       the current terminal supports color.  (fgcolor) and (bgcolor) set the
       foreground and background colors respectively.  Both of these escapes
       take one literal argument, the color name, which can be one of: black,
       red, green, yellow, blue, magenta, cyan, white.  (resetterm) resets
       all terminal attributes to their default setting.  These terminal
       escapes should be used in conjunction with (zputlit) (preferred) or
       (putlit), as the normal (putstr) function will strip out control
       characters.

       The  available output width is kept in an internal register; any	output
       exceeding this width will be truncated.	The one	exception to  this  is
       that  (zputlit)	functions  will	 still be executed if a	terminal reset
       code is being placed at the end of a line.

   Special Handling
       Some functions have different behavior depending	on  the	 command  they
       are invoked from.

       In repl the (formataddr) function stores all email addresses
       encountered into an internal cache and will use this cache to suppress
       duplicate addresses.  If you need to create an address list that
       includes previously-seen addresses you may use the (concataddr)
       function, which is identical to (formataddr) in all other respects.
       Note that (concataddr) does not add addresses to the
       duplicate-suppression cache.
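
       The difference between the two functions in repl can be modelled in
       Python.  The class below is a sketch under simplifying assumptions:
       addresses are compared as exact strings, and the list is joined with
       a comma and a space.  The class name AddressList is hypothetical.

```python
class AddressList:
    """Toy model of (formataddr) and (concataddr) as used by repl."""

    def __init__(self):
        self.seen = set()            # the duplicate-suppression cache

    def formataddr(self, current, addresses):
        parts = [current] if current else []
        for addr in addresses:
            if addr in self.seen:
                continue             # suppress a previously seen address
            self.seen.add(addr)
            parts.append(addr)
        return ", ".join(parts)

    def concataddr(self, current, addresses):
        # Appends everything, and does not touch the cache.
        parts = ([current] if current else []) + list(addresses)
        return ", ".join(parts)


al = AddressList()
to = al.formataddr("", ["a@example.com", "b@example.com"])
cc = al.formataddr("", ["a@example.com", "c@example.com"])
print(to)
print(cc)
print(al.concataddr("", ["a@example.com"]))
```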

   Other Hints and Tips
       Sometimes,  the	writer of a format function is confused	because	output
       is duplicated.  The general rule	to remember is simple: If  a  function
       or  component  escape  begins  with a `%', it will generate text	in the
       output file.  Otherwise,	it will	not.

       A good example is a simple attempt to generate a	To:  header  based  on
       the From: and Reply-To: headers:

	    %(formataddr %<{reply-to}%|%{from}%>)%(putaddr To: )

       Unfortunately,  if the Reply-to:	header is not present, the output line
       will be something like:

	    My From User <from@example.com>To: My From User <from@example.com>

       What went wrong?	 When performing the test for the if clause (%<),  the
       component  is not output	because	it is considered an argument to	the if
       statement (so the rule about not	starting with  %  applies).   But  the
       component  escape  in our else statement	(everything after the `%|') is
       not an argument to anything; it begins with a %,	and thus the value  of
       that component is output.  This also has	the side effect	of setting the
       str register, which is later picked up by the (formataddr) function and
       then  output by (putaddr).  The example format string above has another
       bug: there should always	be a valid width value	in  the	 num  register
       when (putaddr) is called, otherwise bad formatting can take place.

       The solution is to use the (void) function; this	will prevent the func-
       tion  or	 component  from outputting any	text.  With this in place (and
       using (width) to	set the	num register for the width) a better implemen-
       tation would look like:

	  %(formataddr %<{reply-to}%|%(void{from})%>)%(void(width))%(putaddr To: )

       It should be noted here that the	side effects of	function and component
       escapes are still in force and, as a result, each component test	in the
       if-elseif-else-endif clause sets	the str	register.

       As an additional	note, the (formataddr) and (concataddr)	functions have
       special behavior	when it	comes to the str register.  The	starting point
       of the register is saved	and is used to build up	entries	in the address
       list.

       You will	find the fmttest(1) utility invaluable when debugging problems
       with format strings.

   Examples
       With all	the above in mind, here	is a breakdown of the  default	format
       string for scan.	 The first part	is:

	      %4(msg)%<(cur)+%| %>%<{replied}-%?{encrypted}E%| %>

       which  says  that  the message number should be printed in four digits.
       If the message is the current message then a `+', else a	space,	should
       be  printed;  if	 a  "Replied:" field is	present	then a `-', else if an
       "Encrypted:" field is present then an `E', otherwise a space, should be
       printed.	 Next:

	      %02(mon{date})/%02(mday{date})

       the month and date are printed in two digits (zero filled) separated by
       a slash.	 Next,

	    %<{date} %|*%>

       If a "Date:" field is present, a space is printed; otherwise a `*' is
       printed.  Next,

	    %<(mymbox{from})%<{to}To:%14(decode(friendly{to}))%>%>

       if  the	message	 is  from me, and there	is a "To:" header, print "To:"
       followed	by a "user-friendly" rendering of the  first  address  in  the
       "To:"  field;  any  MIME-encoded	characters are decoded into the	actual
       characters.  Continuing,

	    %<(zero)%17(decode(friendly{from}))%>

       if either of the	above two tests	failed,	then the  "From:"  address  is
       printed in a mime-decoded, "user-friendly" format.  And finally,

	    %(decode{subject})%<{body}<<%{body}>>%>

       the mime-decoded	subject	and initial body (if any) are printed.

       For  a  more  complicated example, consider a possible replcomps	format
       file.

	    %(lit)%(formataddr %<{reply-to}

       This clears str and formats the "Reply-To:" header if present.  If  not
       present,	the else-if clause is executed.

	    %?{from}%?{sender}%?{return-path}%>)\

       This  formats  the "From:", "Sender:" and "Return-Path:"	headers, stop-
       ping as soon as one of them is present.	Next:

	    %<(nonnull)%(void(width))%(putaddr To: )\n%>\

       If the formataddr result	is non-null, it	is printed as an address (with
       line folding if needed) in a field width	wide, with a leading label  of
       "To:".

	    %(lit)%(formataddr{to})%(formataddr{cc})%(formataddr(me))\

       str  is cleared,	and the	"To:" and "Cc:"	headers, along with the	user's
       address (depending on what was specified	with the "-cc" switch to repl)
       are formatted.

	    %<(nonnull)%(void(width))%(putaddr cc: )\n%>\

       If the result is	non-null, it is	printed	as above with a	leading	 label
       of "cc:".

	    %<{fcc}Fcc: %{fcc}\n%>\

       If a -fcc folder	switch was given to repl (see repl(1) for more details
       about %{fcc}), an "Fcc:"	header is output.

	    %<{subject}Subject: Re: %{subject}\n%>\

       If a subject component was present, a suitable reply subject is output.

	    %<{message-id}In-Reply-To: %{message-id}\n%>\
	    %<{message-id}References: %<{references} %{references}%>\
	    %{message-id}\n%>
	    --------

       If a message-id component was present, an "In-Reply-To:"	header is out-
       put  including  the message-id, followed	by a "References:" header with
       references, if present, and the message-id.  As with all plain text,
       the row of dashes is output as-is.

       This last part is a good	example	for a little more elaboration.	Here's
       that part again in pseudo-code:

	    if (comp_exists(message-id)) then
		 print ("In-Reply-To: ")
		 print (message-id.value)
		 print ("\n")
	    endif
	    if (comp_exists(message-id)) then
		 print ("References: ")
		 if (comp_exists(references)) then
		       print (references.value)
		 endif
		 print (message-id.value)
		 print ("\n")
	    endif

       One more example: nmh supports very large message numbers, and it is
       not uncommon for a folder to have far more than 10000 messages.
       Nonetheless, the various scan format strings, inherited from older MH
       versions, are generally hard-coded to 4 digits for the message number
       (as in the `%4(msg)' example above), so formatting problems occur once
       message numbers exceed 9999.  The nmh format strings can be modified
       to behave more sensibly with larger message numbers:

	      %(void(msg))%<(gt 9999)%(msg)%|%4(msg)%>

       The  current  message  number  is placed	in num.	 (Note that (msg) is a
       function escape which returns an integer; it is not a component.)  The
       (gt)  conditional  is  used to test whether the message number has 5 or
       more digits.  If	so, it is printed at full width, otherwise at  4  dig-
       its.
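
       In the spirit of the pseudo-code above, the same decision can be
       written as a runnable Python sketch.  The function name msg_field is
       hypothetical, and the left fill of the four-character field is an
       assumption about how `%4(msg)' renders short numbers.

```python
def msg_field(msg: int) -> str:
    """Model of %(void(msg))%<(gt 9999)%(msg)%|%4(msg)%>."""
    num = msg                        # %(void(msg)) loads num without printing
    if num > 9999:                   # %<(gt 9999)
        return str(msg)              # %(msg): as many characters as needed
    return str(msg).rjust(4)         # %4(msg): four-character field (assumed left fill)


for n in (7, 9999, 12345):
    print(repr(msg_field(n)))
```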

SEE ALSO
       scan(1),	repl(1), fmttest(1)

CONTEXT
       None

nmh-1.8+dev			  2015-01-10			  MH-FORMAT(5)
