bt_format_names(3)		    btparse		    bt_format_names(3)

       bt_format_names - formatting BibTeX names for consistent	output

	  bt_name_format * bt_create_name_format (char * parts,
						  boolean abbrev_first);
	  void bt_free_name_format (bt_name_format * format);
	  void bt_set_format_text (bt_name_format * format,
				   bt_namepart part,
				   char	* pre_part,
				   char	* post_part,
				   char	* pre_token,
				   char	* post_token);
	  void bt_set_format_options (bt_name_format * format,
				      bt_namepart part,
				      boolean abbrev,
				      bt_joinmethod join_tokens,
				      bt_joinmethod join_part);
	  char * bt_format_name	(bt_name * name, bt_name_format	* format);

       After splitting a name into its component parts (represented as a
       "bt_name" structure), you often want to put it back together again as a
       single string in	a consistent way.  btparse provides a very flexible
       way to do this, generally in two	stages:	first, you create a "name for-
       mat" which describes how	to put the tokens and parts of any name	back
       together, and then you apply the	format to a particular name.

       The "name format" is encapsulated in a "bt_name_format" structure,
       which is	created	with "bt_create_name_format()".	 This function in-
       cludes some clever trickery that	means you can usually get away with
       calling it alone, and not need to do any	customization of the format.
       If you do need to customize the format, though, "bt_set_format_text()"
       and "bt_set_format_options()" provide that capability.

       The format controls the following:

       o   which name parts are	printed, and in	what order (e.g. "first	von
	   last	jr", or	"von last jr first")

       o   the text that precedes and follows each part	(e.g. if the first
	   name	follows	the last name, you probably want a comma before	the
	   `first' part: "Smith, John" rather than "Smith John")

       o   the text that precedes and follows each token (e.g. if the first
	   name	is abbreviated,	you may	want a period after each token:	"J. R.
	   Smith" rather than "J R Smith")

       o   the method used to join the tokens of each part together

       o   the method used to join each	part to	the following part

       All of these except the list of parts to	format are kept	in arrays in-
       dexed by	name part: for example,	the structure has a field

	  char * post_token[BT_MAX_NAMEPARTS]

       and "post_token[BTN_FIRST]" ("BTN_FIRST"	is from	the "bt_namepart"
       "enum") is the string to	be added after each token in the first
       name---for example, "." if the first name is to be abbreviated in the
       conventional way.

       Yet another "enum", "bt_joinmethod", describes the available methods
       for joining tokens together.  Note that there are two sets of join
       methods in a name format: between tokens	within a single	part, and be-
       tween the tokens	of two different parts.	 The first allows you, for ex-
       ample, to change	"J R Smith" (first name	abbreviated with no post-token
       text but	tokens joined by a space) to "JR Smith"	(the same, but first-
       name tokens jammed together).  The second is mainly used	to ensure that
       "von" and "last"	name-parts may be joined with a	tie: "de~Roche"	rather
       than "de	Roche".

       The token join methods are:

       "BTJ_MAYTIE"
           Insert a "discretionary tie" between tokens.  That is, either a
           space or a "tie" is inserted, depending on context.  (A "tie,"
           otherwise known as an unbreakable space, is currently hard-coded
           as "~"---from TeX.)

       "BTJ_SPACE"
           Always insert a space between tokens.

       "BTJ_FORCETIE"
           Always insert a "tie" ("~") between tokens.

       "BTJ_NOTHING"
           Insert nothing between tokens---just jam them together.

       The format is then applied to a particular name by
       "bt_format_name()", which returns a new string.

       Tokens are joined together, and thus the	choice of whether to insert a
       "discretionary tie" is made, at two places: within a part and between
       two parts.  Naturally, this only	applies	when "BTJ_MAYTIE" was supplied
       as the token-join method; "BTJ_SPACE" and "BTJ_FORCETIE"	always insert
       either a	space or tie, and "BTJ_NOTHING"	always adds nothing between
       tokens.  Within a part, a tie is added after the first token if it is
       less than three characters long, and before the last token.  Between
       parts, a tie is added only if the preceding part consisted of a single
       token that was less than three characters long.  In all other cases,
       spaces are inserted.  (This implementation slavishly follows BibTeX.)
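
       The within-part rule can be sketched in a few lines of C.  This is
       only an illustration of the rule as stated above, not btparse's
       actual code: "join_maytie()" is a hypothetical helper, and the
       caller-supplied output buffer is an assumption made to keep the
       example self-contained.

```c
#include <string.h>

/*
 * Illustrative only: join the tokens of ONE name part using the
 * discretionary-tie rule described in the text.  join_maytie() is a
 * hypothetical helper, not a btparse function.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int join_maytie(const char *const *tokens, size_t ntokens,
                       char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < ntokens; i++) {
        size_t len = strlen(tokens[i]);

        if (i > 0) {
            /* Tie after a short first token, or before the last token;
             * a plain space everywhere else. */
            int tie = (i == 1 && strlen(tokens[0]) < 3)
                   || (i == ntokens - 1);

            if (used + 1 >= outsz)
                return -1;
            out[used++] = tie ? '~' : ' ';
        }
        if (used + len >= outsz)
            return -1;
        memcpy(out + used, tokens[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

       Under this sketch, the tokens "Jean", "Paul", "Sartre" come out as
       "Jean Paul~Sartre", while "de", "la", "Roche" come out as
       "de~la~Roche".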

	      bt_name_format * bt_create_name_format (char * parts,
						      boolean abbrev_first)

	   Creates a name format for a given set of parts, with	variations for
	   the most common forms of customization---the	order of parts and
	   whether to abbreviate the first name.

	   The "parts" parameter specifies which parts to include in a format-
	   ted name, as	well as	the order in which to format them.  "parts"
	   must	be a string of four or fewer characters, each of which denotes
	   one of the four name	parts: for instance, "vljf" means to format
	   all four parts in "von last jr first" order.	 No characters outside
	   of the set "fvlj" are allowed, and no characters may	be repeated.
	   "abbrev_first" controls whether the `first' part will be abbrevi-
	   ated	(i.e., only the	first letter from each token will be printed).
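
           If the "parts" string comes from user input, it may be worth
           checking it against these constraints before creating a format.
           The following is a minimal sketch; "parts_string_ok()" is a
           hypothetical helper, not part of btparse, and it reads "four or
           fewer characters" literally, so it accepts the empty string
           (whether the library does is not stated here).

```c
#include <string.h>

/*
 * Hypothetical helper, not part of btparse: returns 1 if parts obeys
 * the constraints given in the text (at most four characters, each one
 * of "fvlj", none repeated), and 0 otherwise.
 */
static int parts_string_ok(const char *parts)
{
    static const char allowed[] = "fvlj";
    int seen[4] = {0, 0, 0, 0};

    if (parts == NULL || strlen(parts) > 4)
        return 0;

    for (const char *p = parts; *p != '\0'; p++) {
        const char *hit = strchr(allowed, *p);

        if (hit == NULL)            /* character outside "fvlj" */
            return 0;
        if (seen[hit - allowed])    /* repeated part */
            return 0;
        seen[hit - allowed] = 1;
    }
    return 1;
}
```

           With this check, "vljf" and "lf" pass, while "ff" and "fxl" are
           rejected.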

	   In addition to simply setting the list of parts to format and the
	   "abbreviate"	flag for the first name, "bt_create_name_format()"
	   initializes the entire format structure so as to minimize the need
	   for further customizations:

           *   The "token join method"---what to insert between tokens of the
               same part---is set to "BTJ_MAYTIE" (discretionary tie) for all
               parts.

	   *   The "part join method"---what to	insert after the final token
	       of a particular part, assuming there are	more parts to
	       come---is set to	"BTJ_SPACE" for	the `first', `last', and `jr'
	       parts.  If the `von' part is present and	immediately precedes
	       the `last' part (which will almost always be the	case),
	       "BTJ_MAYTIE" is used to join `von' to `last'; otherwise,	`von'
	       also gets "BTJ_SPACE" for the inter-part	join method.

	   *   The abbreviation	flag is	set to "FALSE" for the `von', `last',
	       and `jr'	parts; for `first', the	abbreviation flag is set to
	       whatever	you pass in as "abbrev_first".

	   *   Initially, all "surrounding text" (pre-part, post-part, pre-to-
	       ken, and	post-token) for	all parts is set to the	empty string.
	       Then a few tweaks are done, depending on	the "abbrev_first"
               flag and the order of tokens.  First, if "abbrev_first" is
               "TRUE", the post-token text for the first name is set to "."---this
	       changes "J R Smith" to "J. R. Smith", which is usually the de-
	       sired form.  (If	you don't want the periods, you'll have	to set
	       the post-token text yourself with "bt_set_format_text()".)

	       Then, if	`jr' is	present	and immediately	after `last' (almost
	       always the case), the pre-part text for `jr' is set to ", ",
	       and the inter-part join method for `last' is set	to "BTJ_NOTH-
	       ING".  This changes "John Smith Jr" (where the space following
	       "Smith" comes from formatting the last name with	a "BTJ_SPACE"
	       inter-part join method) to "John	Smith, Jr" (where the ", " is
	       now associated with "Jr"---that way, if there is	no `jr'	part,
	       the ", "	will not be printed.)

               Finally, if `first' is present and immediately follows either
               `jr' or `last' (which will usually be the case in "last-name
               first" formats), the same sort of trickery is applied: the
               pre-part text for `first' is set to ", ", and the part join
               method for the preceding part (either `jr' or `last') is set
               to "BTJ_NOTHING".
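
           The initialization rules above can be summarized in code.  The
           sketch below uses hypothetical stand-in types ("demo_format",
           "demo_part", "demo_join") rather than the real btparse
           structures, whose layout is not shown here.  It also assumes
           that the final rule sets the preceding part's join method to
           the equivalent of "BTJ_NOTHING", by analogy with the `jr' rule.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the btparse types. */
enum demo_part { DEMO_FIRST, DEMO_VON, DEMO_LAST, DEMO_JR, DEMO_NPARTS };
enum demo_join { DEMO_MAYTIE, DEMO_SPACE, DEMO_FORCETIE, DEMO_NOTHING };

struct demo_format {
    const char *pre_part[DEMO_NPARTS];
    const char *post_token[DEMO_NPARTS];
    enum demo_join join_tokens[DEMO_NPARTS];
    enum demo_join join_part[DEMO_NPARTS];
    int abbrev[DEMO_NPARTS];
};

/* Fill in defaults for an already validated parts string, e.g. "vljf". */
static void demo_defaults(struct demo_format *f, const char *parts,
                          int abbrev_first)
{
    for (int p = 0; p < DEMO_NPARTS; p++) {
        f->pre_part[p] = "";
        f->post_token[p] = "";
        f->join_tokens[p] = DEMO_MAYTIE;  /* discretionary tie in a part */
        f->join_part[p] = DEMO_SPACE;     /* space between parts */
        f->abbrev[p] = 0;
    }
    f->abbrev[DEMO_FIRST] = abbrev_first;
    if (abbrev_first)
        f->post_token[DEMO_FIRST] = ".";  /* "J R" becomes "J. R." */

    /* Walk adjacent pairs (prev, cur) of the requested parts. */
    for (size_t i = 1; parts[i - 1] != '\0' && parts[i] != '\0'; i++) {
        char prev = parts[i - 1], cur = parts[i];

        if (prev == 'v' && cur == 'l')    /* "de~Roche" */
            f->join_part[DEMO_VON] = DEMO_MAYTIE;
        if (prev == 'l' && cur == 'j') {  /* "Smith, Jr" */
            f->pre_part[DEMO_JR] = ", ";
            f->join_part[DEMO_LAST] = DEMO_NOTHING;
        }
        if (cur == 'f' && (prev == 'l' || prev == 'j')) {
            /* "Smith, John"; the NOTHING join here is an assumption. */
            f->pre_part[DEMO_FIRST] = ", ";
            f->join_part[prev == 'l' ? DEMO_LAST : DEMO_JR] = DEMO_NOTHING;
        }
    }
}
```

           Keeping the ", " in the pre-part text of the following part,
           rather than in the post-part text of the preceding one, is what
           lets the comma disappear when that following part is absent
           from a particular name.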

	   While all these rules are rather complicated, they mean that	you
	   are usually freed from having to do any customization of the	name
	   format.  Certainly this is the case if you only need	"fvlj" and
	   "vljf" part orders, only want to abbreviate the first name, want
	   periods after abbreviated tokens, non-breaking spaces in the
	   "right" places, and commas in the conventional places.

           If you want something out of the ordinary---for instance,
           abbreviated tokens jammed together with no punctuation, or
           abbreviated last names---you'll need to customize the name format
           a bit with "bt_set_format_text()" and "bt_set_format_options()".

	      void bt_free_name_format (bt_name_format * format)

	   Frees a name	format created by "bt_create_name_format()".

	      void bt_set_format_text (bt_name_format *	format,
				       bt_namepart part,
				       char * pre_part,
				       char * post_part,
				       char * pre_token,
				       char * post_token)

	   Allows you to customize some	or all of the surrounding text for a
	   single name part.  Supply "NULL" for	any chunk of text that you
	   don't want to change.

	   For instance, say you want a	name format that will abbreviate first
	   names, but without any punctuation after the	abbreviated tokens.
	   You could create and	customize the format as	follows:

              bt_name_format * format;

              format = bt_create_name_format ("fvlj", TRUE);
              bt_set_format_text (format,
                                  BTN_FIRST,       /* name-part to customize */
                                  NULL, NULL,      /* pre- and post-part text */
                                  NULL, "");       /* empty string for post-token */

	   Without the "bt_set_format_text()" call, "format" would result in
	   names formatted like	"J. R. Smith".	After setting the post-token
	   text	for first names	to "", this name would become "J R Smith".

	      void bt_set_format_options (bt_name_format * format,
					  bt_namepart part,
					  boolean abbrev,
					  bt_joinmethod	join_tokens,
					  bt_joinmethod	join_part)

	   Allows further customization	of a name format: you can set the ab-
	   breviation flag and the two token-join methods.  Alas, there	is no
	   mechanism for leaving a value unchanged; you	must set everything
	   with	"bt_set_format_options()".

	   For example,	let's say that just dropping periods from abbreviated
	   tokens in the first name isn't enough; you really want to save
	   space by jamming the	abbreviated tokens together: "JR Smith"	rather
           than "J R Smith".  Assuming the two calls in the above example have
	   been	done, the following will finish	the job:

	      bt_set_format_options (format, BTN_FIRST,
				     TRUE,	   /* keep same	value for abbrev flag */
				     BTJ_NOTHING,  /* jam tokens together */
				     BTJ_SPACE);   /* space after final	token of part */

	   Note	that we	unfortunately had to know (and supply) the current
	   values for the abbreviation flag and	post-part join method, even
	   though we were only setting the intra-part join method.
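
           To see how the post-token text and the token join method
           interact for an abbreviated part, here is a small standalone
           sketch.  "abbrev_tokens()" is a hypothetical helper, not a
           btparse function; it takes the joiner as a literal string
           (standing in for "BTJ_SPACE" or "BTJ_NOTHING"), ignores
           discretionary ties, and treats the first byte of each non-empty
           token as its first letter.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of btparse: abbreviate each token to
 * its first byte, append post_token, and separate tokens with joiner.
 * Tokens are assumed to be non-empty.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int abbrev_tokens(const char *const *tokens, size_t ntokens,
                         const char *post_token, const char *joiner,
                         char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < ntokens; i++) {
        int n = snprintf(out + used, outsz - used, "%s%c%s",
                         i > 0 ? joiner : "", tokens[i][0], post_token);

        if (n < 0 || (size_t)n >= outsz - used)
            return -1;              /* output truncated or error */
        used += (size_t)n;
    }
    return 0;
}
```

           For the tokens "John" and "Ronald", a post-token text of "."
           with a space joiner gives "J. R."; an empty post-token text
           gives "J R"; and an empty post-token text with an empty joiner
           gives "JR".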

	      char * bt_format_name (bt_name * name, bt_name_format * format)

	   Once	a name format has been created and customized to your heart's
	   content, you	can use	it to format any number	of names that have
	   been	split with "bt_split_name" (see	bt_split_names).  Simply pass
	   the name structure and name format structure, and a newly-allocated
	   string containing the formatted name	will be	returned to you.  It
	   is your responsibility to "free()" this string.

       btparse,	bt_split_names

       Greg Ward

btparse, version 0.34		  2003-10-25		    bt_format_names(3)

