mdbFLT(5)		       The m17n	Library			     mdbFLT(5)

NAME
       mdbFLT -	Font Layout Table

DESCRIPTION
       For simple scripts, the rendering engine converts character codes into
       glyph codes one by one by consulting the encoding of each selected
       font. To render text that requires complicated layout (e.g. Thai
       and Indic scripts), however, one-to-one conversion is not sufficient.
       A sequence of characters may have to be drawn as a single ligature,
       and some glyphs may have to be drawn at positions shifted in two
       dimensions.

       To handle those complicated scripts, the	m17n library uses Font Layout
       Tables (FLTs for	short).	The FLT	driver interprets an FLT and converts
       a character sequence into a glyph sequence that is ready	to be passed
       to the rendering	engine.

       An FLT can contain information to extract a grapheme cluster from a
       character sequence and to reorder the characters	in the cluster,	in
       addition	to information found in	OpenType Layout	Tables (CMAP, GSUB,
       and GPOS).

       An FLT is a cascade of one or more conversion stages. In	each stage, a
       sequence	is converted into another sequence to be read in the next
       stage. The length of sequences may differ from stage to stage. Each
       element in a sequence has the following integer attributes.

        code
       In  the	first  conversion  stage,  this	 is  the character code	in the
       original	character sequence. In the last	stage, it is  the  glyph  code
       passed  to  the rendering engine. In other cases, it is an intermediate
       glyph code.
        category
       The category code defined in the CATEGORY-TABLE of the current stage,
       or defined in one of the earlier stages and not overwritten by a later
       stage.
        combining-spec
       If  nonzero, it specifies how to	combine	this (intermediate) glyph with
       the previous one.
        left-padding-flag
       If nonzero, it instructs	the rendering function	to  insert  a  padding
       space  before  this  (intermediate)  glyph  so  that the	glyph does not
       overlap with the	previous one.
        right-padding-flag
       If nonzero, it instructs	the rendering function	to  insert  a  padding
       space  after  this  (intermediate)  glyph  so  that  the	glyph does not
       overlap with the	next one.
       When the layout engine draws text, it first determines a font and an
       FLT for each character in the text. For each subsequence of characters
       that use the same font and FLT, the layout engine generates a
       corresponding intermediate glyph sequence. The code attribute of each
       element in the intermediate glyph sequence is its character code, and
       all other attributes are zero. This sequence is processed in the first
       stage of the FLT as the current run (substring).
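The element attributes and the initial sequence described above can be modelled in a few lines. This is only an illustrative Python sketch, not the m17n library's C data structure: the names Glyph and initial_run are hypothetical, and using 0 to mean "no category yet" is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    """One element of a sequence, with the five integer attributes (hypothetical model)."""
    code: int                    # character code, intermediate code, or final glyph code
    category: int = 0            # 0 is used here to mean "no category yet" (assumption)
    combining_spec: int = 0
    left_padding_flag: int = 0
    right_padding_flag: int = 0


def initial_run(text: str) -> List[Glyph]:
    """Build the sequence given to the first stage: code = character code, rest zero."""
    return [Glyph(code=ord(ch)) for ch in text]


# THAI CHARACTER KO KAI followed by THAI CHARACTER SARA I
run = initial_run("\u0e01\u0e34")
```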
       Each stage works as follows.
       First, if the stage has a CATEGORY-TABLE, the category of each glyph
       in the current run is updated. If there is a glyph that has no
       category, the current run ends before that glyph.
       Then, the default values of code-offset, combining-spec, and
       left-padding-flag of this stage are initialized to zero.
       Next, the initial conversion rule of the stage is applied to the
       current run.
       Finally, the current run is replaced with the newly produced
       (intermediate) glyph sequence.
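The first step, in which the current run ends before the first glyph without a category, can be sketched as follows. This is a simplified Python model for a first stage, where no category can be inherited from an earlier stage. The function name and the dictionary-based table are hypothetical.

```python
def categorize_run(codes, category_of):
    """Return (category_string, run_length) for a first-stage run.

    codes        -- list of integer codes in the candidate run
    category_of  -- dict mapping a code to a one-letter category (stand-in
                    for the CATEGORY-TABLE of the stage)
    The run ends before the first code that has no category.
    """
    categories = []
    for code in codes:
        category = category_of.get(code)
        if category is None:
            break                      # the current run ends before this glyph
        categories.append(category)
    return "".join(categories), len(categories)


table = {0x0E01: "C", 0x0E34: "V"}
cats, length = categorize_run([0x0E01, 0x0E34, 0x41, 0x0E01], table)
```

In this example the third code (0x41) has no category, so only the first two glyphs form the current run.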
SYNTAX and SEMANTICS
       The m17n library loads an FLT from the m17n database using the tag
       <font, layouter, FLT-NAME>. The data format of an FLT is as follows:
       FONT-LAYOUT-TABLE ::= FLT-DECLARATION ? STAGE0 STAGE *

       FLT-DECLARATION ::= '(' 'font' 'layouter' FLT-NAME nil PROP * ')'
       FLT-NAME	::= SYMBOL
       PROP ::= VERSION | FONT
       VERSION ::= '(' 'version' MTEXT ')'
       FONT ::=	'(' 'font' FONT-SPEC ')'
       FONT-SPEC ::=
	    '('	[[ FOUNDRY FAMILY
		  [ WEIGHT [ STYLE [ STRETCH [ ADSTYLE ]]]]]
		REGISTRY ]
	    [ OTF-SPEC ] [ LANG-SPEC ] ')'

       STAGE0 ::= CATEGORY-TABLE GENERATOR

       STAGE ::= CATEGORY-TABLE	? GENERATOR

       CATEGORY-TABLE ::= '(' 'category' CATEGORY-SPEC + ')'

       CATEGORY-SPEC ::= '(' CODE CATEGORY ')'
			 | '(' CODE CODE CATEGORY ')'

       CODE ::=	INTEGER

       CATEGORY	::= INTEGER
       In the definition of CATEGORY-SPEC, CODE is a glyph code, and CATEGORY
       is the ASCII code of an uppercase or lowercase letter, i.e. one of
       'A', ..., 'Z', 'a', ..., 'z'.
       The  first form of CATEGORY-SPEC	assigns	CATEGORY to a glyph whose code
       is CODE.	The second form	assigns	CATEGORY to glyphs  whose  code	 falls
       between the two CODEs.
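The two forms of CATEGORY-SPEC can be illustrated with a small lookup. This is a Python sketch under two assumptions that the text above does not settle: the range is treated as inclusive at both ends, and when several specs cover the same code the first one listed wins. The function name is hypothetical.

```python
def build_category_lookup(specs):
    """Build a lookup function from a list of CATEGORY-SPEC tuples.

    Each spec is (CODE, CATEGORY) or (CODE, CODE, CATEGORY), where CATEGORY
    is the ASCII code of a letter.
    """
    def lookup(code):
        for spec in specs:
            if len(spec) == 2:
                low = high = spec[0]           # first form: a single code
            else:
                low, high = spec[0], spec[1]   # second form: a range of codes
            if low <= code <= high:            # inclusive bounds (assumption)
                return spec[-1]
        return None                            # the glyph has no category
    return lookup


lookup = build_category_lookup([
    (0x0E01, 0x0E2E, ord("C")),   # a range of codes, all given category 'C'
    (0x0E34, ord("V")),           # a single code given category 'V'
])
```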
       GENERATOR ::= '(' 'generator' RULE MACRO-DEF * ')'

       RULE ::= REGEXP-BLOCK | MATCH-BLOCK | SUBST-BLOCK | COND-BLOCK
                | FONT-FACILITY-BLOCK | DIRECT-CODE | COMBINING-SPEC | OTF-SPEC
                | PREDEFINED-RULE | MACRO-NAME

       MACRO-DEF ::= '(' MACRO-NAME RULE + ')'
       Each RULE specifies glyphs to be consumed and glyphs to be produced.
       When glyphs are consumed, they are removed from the current run. A
       rule may fail under certain conditions. Unless a rule is explicitly
       described as failing, it is regarded as succeeding.
       DIRECT-CODE ::= INTEGER
       This rule consumes  no  glyph  and  produces  a	glyph  which  has  the
       following attributes:
        code :	INTEGER	plus the default code-offset
        combining-spec	: default value
        left-padding-flag : default value
        right-padding-flag : zero
       After   having	produced   the	 glyph,	  the	default	  code-offset,
       combining-spec, and left-padding-flag are all reset to zero.
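The way DIRECT-CODE uses and then resets the stage defaults can be sketched as below. This is an illustrative Python model, not the FLT driver's code. The class and function names are hypothetical, and glyphs are represented as plain dictionaries.

```python
class StageDefaults:
    """The three per-stage default values, all zero at the start of a stage."""
    def __init__(self):
        self.code_offset = 0
        self.combining_spec = 0
        self.left_padding_flag = 0


def direct_code(integer, defaults, output):
    """Apply a DIRECT-CODE rule: consume nothing, produce one glyph."""
    output.append({
        "code": integer + defaults.code_offset,
        "combining-spec": defaults.combining_spec,
        "left-padding-flag": defaults.left_padding_flag,
        "right-padding-flag": 0,
    })
    # After producing the glyph, all three defaults are reset to zero.
    defaults.code_offset = 0
    defaults.combining_spec = 0
    defaults.left_padding_flag = 0


defaults = StageDefaults()
defaults.code_offset = 2          # e.g. set earlier by a range SUBST-BLOCK
defaults.left_padding_flag = 1    # e.g. set earlier by the '[' rule
produced = []
direct_code(0x41, defaults, produced)   # uses the defaults, then resets them
direct_code(0x41, defaults, produced)   # sees only zeros
```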
       PREDEFINED-RULE ::= '=' | '*' | '<' | '>' | '|' | '[' | ']'
       They perform actions as follows.
        =
       This rule consumes the first glyph in the current run and produces  the
       same glyph. It fails if the current run is empty.
        *
       This  rule  repeatedly executes the previous rule. If the previous rule
       fails, this rule	does nothing and fails.
        <
       This rule specifies the start of	a grapheme cluster.
        >
       This rule specifies the end of a	grapheme cluster.
        [
       This rule  sets	the  default  left-padding-flag	 to  1.	 No  glyph  is
       consumed. No glyph is produced.
        ]
       This rule changes the right-padding-flag of the most recently
       generated glyph to 1. No glyph is consumed. No glyph is produced.
        |
       This rule consumes no glyph and produces	a special glyph	whose category
       is ' ' and other	attributes are	zero.  This  is	 the  only  rule  that
       produces	that special glyph.
       REGEXP-BLOCK ::=	'(' REGEXP RULE	* ')'

       REGEXP ::= MTEXT
       MTEXT  is  a  regular  expression  that	should	match  the sequence of
       categories of the current run. If a match is found, this	rule  executes
       RULEs  temporarily  limiting  the  current run to the matched part. The
       matched part is consumed	by this	rule.
       Parenthesized subexpressions, if	 any,  are  recorded  to  be  used  in
       MATCH-BLOCK that	may appear in one of RULEs.
       If no match is found, this rule fails.
       MATCH-BLOCK ::= '(' MATCH-INDEX RULE * ')'

       MATCH-INDEX ::= INTEGER
       MATCH-INDEX  is	an  integer  specifying	 a parenthesized subexpression
       recorded	by the previous	REGEXP-BLOCK.  If  such	 a  subexpression  was
       found  by  the previous regular expression matching, this rule executes
       RULEs temporarily limiting the current run to the matched part  of  the
       subexpression. The matched part is consumed by this rule.
       If no match was found, this rule	fails.
       If  this	 is the	first rule of the stage, MATCH-INDEX must be 0,	and it
       matches the whole current run.
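The relationship between a REGEXP-BLOCK and the MATCH-BLOCKs inside it can be illustrated with Python's re module standing in for the regular expression engine used by the library. The exact regular expression dialect is not stated above, so the patterns here stick to basic constructs. Anchoring the match at the start of the run is also an assumption, and the function name is hypothetical.

```python
import re


def regexp_block(regexp, categories):
    """Match REGEXP against the category string of the current run.

    Returns a match object (the rule succeeds) or None (the rule fails).
    The match is anchored at the start of the run (assumption).
    """
    return re.match(regexp, categories)


# Category string of a run: consonant, vowel, tone mark.
m = regexp_block("(C)(V)?(T)?", "CVT")
whole_part = m.span(0)     # what a MATCH-BLOCK with index 0 would be limited to
vowel_part = m.span(2)     # what a MATCH-BLOCK with index 2 would be limited to

# With no vowel in the run, subexpression 2 records nothing,
# so a MATCH-BLOCK with index 2 would fail.
m2 = regexp_block("(C)(V)?(T)?", "CT")
vowel_found = m2.group(2) is not None
```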
       SUBST-BLOCK ::= '(' SOURCE-PATTERN RULE * ')'

       SOURCE-PATTERN ::= '(' CODE + ')'
                          | '(' 'range' CODE CODE ')'
       If the sequence of codes	of the	current	 run  matches  SOURCE-PATTERN,
       this  rule  executes  RULEs temporarily limiting	the current run	to the
       matched part. The matched part is consumed.
       The first form of SOURCE-PATTERN	specifies a sequence of	glyph codes to
       be matched. In this case, this rule resets the default  code-offset  to
       zero.
       The  second form	specifies a range of codes that	should match the first
       glyph code of the code sequence.	In  this  case,	 this  rule  sets  the
       default	code-offset  to	 the  first  glyph  code  minus	the first CODE
       specifying the range.
       If no match is found, this rule fails.
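The code-offset arithmetic of the range form is easiest to see with concrete numbers. The Python sketch below models only the matching step. It assumes that the pattern is matched at the start of the current run, that the range is inclusive, and that the range form consumes exactly one glyph. The function name and the tuple encoding of patterns are hypothetical.

```python
def match_source_pattern(pattern, codes):
    """Return (matched_length, code_offset), or None when the rule fails.

    pattern -- ("seq", [CODE, ...]) for the first form,
               ("range", CODE, CODE) for the second form
    codes   -- list of glyph codes in the current run
    """
    if pattern[0] == "seq":
        sequence = pattern[1]
        if codes[:len(sequence)] == sequence:
            return len(sequence), 0          # first form resets the offset to zero
        return None
    _, low, high = pattern
    if codes and low <= codes[0] <= high:
        return 1, codes[0] - low             # first glyph code minus first CODE
    return None


# A run starting with 0x63 matched against the range 0x61..0x7A
# gives a code-offset of 2.
length, offset = match_source_pattern(("range", 0x61, 0x7A), [0x63, 0x64])
```

With this offset in effect, a following DIRECT-CODE of 0x41 would produce the code 0x43, which is how a range of codes can be mapped onto another range of the same shape.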
       FONT-FACILITY-BLOCK ::= '(' FONT-FACILITY RULE *	')'
       FONT-FACILITY ::= '(' 'font-facility' CODE * ')'
		   | '(' 'font-facility' FONT-SPEC ')'
       If the current font has glyphs for CODEs	 or  matches  with  FONT-SPEC,
       this rule succeeds and RULEs are	executed. Otherwise, this rule fails.
       COND-BLOCK ::= '(' 'cond' RULE +	')'
       This  rule  sequentially	 executes RULEs	until one succeeds. If no rule
       succeeds, this rule fails. Otherwise, it	succeeds.
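The first-success behaviour of COND-BLOCK can be sketched as follows. In this simplified Python model a rule is a function that takes the current run (a list of codes), removes what it consumes, and returns the produced codes, or returns None to signal failure. All names are hypothetical.

```python
def copy_first(run):
    """The '=' rule: consume the first glyph and produce it; fail on an empty run."""
    if not run:
        return None
    return [run.pop(0)]


def make_subst(source, produced):
    """A simplified SUBST-BLOCK: replace a leading code sequence with fixed codes."""
    def rule(run):
        if run[:len(source)] == source:
            del run[:len(source)]
            return list(produced)
        return None
    return rule


def cond_block(rules, run):
    """Execute the rules in order until one succeeds; fail if none does."""
    for rule in rules:
        produced = rule(run)
        if produced is not None:
            return produced
    return None


rules = [make_subst([1, 2], [99]), copy_first]
run_a = [1, 2, 3]
out_a = cond_block(rules, run_a)   # the first rule succeeds
run_b = [7, 8]
out_b = cond_block(rules, run_b)   # the first rule fails, '=' succeeds
```

Placing '=' last is a common way to give a cond block a fallback that always succeeds on a non-empty run.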
       OTF-SPEC	::= SYMBOL
       OTF-SPEC	is a symbol whose name specifies an  instruction  to  the  OTF
       driver. The name	has the	following syntax.
	 OTF-SPEC-NAME ::= ':otf=' SCRIPT LANGSYS ? GSUB-FEATURES ? GPOS-FEATURES ?

	 SCRIPT	::= SYMBOL

	 LANGSYS ::= '/' SYMBOL

	 GSUB-FEATURES ::= '=' FEATURE-LIST ?

	 GPOS-FEATURES ::= '+' FEATURE-LIST ?

	 FEATURE-LIST ::= ( SYMBOL ',' ) * [ SYMBOL | '*' ]
       Each SYMBOL specifies a tag name defined in the OpenType specification.
       For SCRIPT, SYMBOL specifies a Script tag name (e.g. deva for
       Devanagari).
       For LANGSYS, SYMBOL specifies a Language System tag name. If LANGSYS
       is omitted, the Default Language System table is used.
       For GSUB-FEATURES, each SYMBOL in FEATURE-LIST specifies a GSUB
       Feature tag name to apply. '*' is allowed as the last item to specify
       all remaining features. If SYMBOL is preceded by '~' and the last item
       is '*', SYMBOL is excluded from the features to apply. If no SYMBOL is
       specified, no GSUB feature is applied. If GSUB-FEATURES itself is
       omitted, all GSUB features are applied.
       When OTF-SPEC appears in a FONT-SPEC, FEATURE-LIST specifies features
       that the font must have (or must not have if preceded by '~'), and a
       trailing '*', if present, has no meaning.
       The specification of GPOS-FEATURES is analogous to that of
       GSUB-FEATURES.
       Please note that all the tags above must be 4 ASCII printable
       characters.
       See the following page for the OpenType specification.
	http://www.microsoft.com/typography/otspec/default.htm
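The difference between an omitted feature list and an empty one is easy to get wrong, so here is a small Python sketch that splits an OTF-SPEC name according to the grammar above. It is not the m17n parser. The function names are hypothetical, and the sketch assumes that tags do not themselves contain '/', '=', '+', or ','. In the result, None means the part was omitted and an empty list means the part was given with no SYMBOL.

```python
def _check_tag(tag):
    """Tags must be exactly 4 printable ASCII characters."""
    if len(tag) != 4 or not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError("not a 4-character ASCII tag: %r" % tag)
    return tag


def _parse_feature_list(text):
    items = [item for item in text.split(",") if item]
    for item in items:
        if item != "*":
            _check_tag(item[1:] if item.startswith("~") else item)
    return items


def parse_otf_spec(name):
    """Split ':otf=SCRIPT[/LANGSYS][=GSUB-LIST][+GPOS-LIST]' into its parts."""
    prefix = ":otf="
    if not name.startswith(prefix):
        raise ValueError("not an OTF-SPEC name: %r" % name)
    rest = name[len(prefix):]
    gsub = gpos = None                       # None: part omitted, all features apply
    if "+" in rest:
        rest, gpos_text = rest.split("+", 1)
        gpos = _parse_feature_list(gpos_text)
    if "=" in rest:
        rest, gsub_text = rest.split("=", 1)
        gsub = _parse_feature_list(gsub_text)
    script, _, langsys = rest.partition("/")
    return {
        "script": _check_tag(script),
        "langsys": _check_tag(langsys) if langsys else None,
        "gsub": gsub,
        "gpos": gpos,
    }


# Two listed GSUB features; '+' with an empty list means no GPOS feature.
spec = parse_otf_spec(":otf=deva=nukt,akhn+")
```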
       COMBINING-SPEC ::= SYMBOL
       COMBINING-SPEC is a symbol whose name specifies how to combine the next
       glyph with the previous one. This rule sets the default	combining-spec
       to  an integer code that	is unique to the symbol	name. The name has the
       following syntax.
	 COMBINING-NAME	::= VPOS HPOS OFFSET VPOS HPOS

	 VPOS ::= 't' |	'c' | 'b' | 'B'

	 HPOS ::= 'l' |	'c' | 'r'

	 OFFSET ::= '.' | XOFF | YOFF XOFF ?

	 XOFF ::= ('<' | '>') INTEGER ?

	 YOFF ::= ('+' | '-') INTEGER ?
       VPOS  and  HPOS	specify	 the  vertical	and  horizontal	 positions  as
       described below.
				       POINT VPOS HPOS
				       ----- ---- ----
	   0----1----2 <---- top       0     t	  l
	   |	     |		       1     t	  c
	   |	     |		       2     t	  r
	   |	     |		       3     B	  l
	   9   10   11 <---- center    4     B	  c
	   |	     |		       5     B	  r
	 --3----4----5-- <-- baseline  6     b	  l
	   |	     |		       7     b	  c
	   6----7----8 <---- bottom    8     b	  r
				       9     c	  l
	   |	|    |		      10     c	  c
	 left center right	      11     c	  r
       The left figure shows 12 reference points of a glyph by numbers 0 to
       11. The rectangle 0-6-8-2 is the bounding box of the glyph, the
       positions 3, 4, and 5 are on the baseline, 9-11 are on the vertical
       center of the box, 0-2 and 6-8 are on the top and on the bottom
       respectively. 1, 10, 4, and 7 are on the horizontal center of the
       box.
       The right table shows how those reference points	 are  specified	 by  a
       pair of VPOS and	HPOS.
       The first VPOS and HPOS in the definition of COMBINING-NAME specify the
       reference  point	 of  the  previous glyph, and the second VPOS and HPOS
       specify that of the next	glyph. The next	glyph is drawn so  that	 these
       two reference points align.
       OFFSET  specifies  the  way  of	alignment in detail. If	it is '.', the
       reference points	are on the same	position.
       XOFF specifies how much the X position of the reference	point  of  the
       next  glyph should be shifted to	the left ('<') or right	('>') from the
       previous	reference point.
       YOFF specifies how much the Y position of the reference point of the
       next glyph should be shifted upward ('+') or downward ('-') from the
       previous reference point.
       In both cases, INTEGER is the amount of shift expressed as a percentage
       of the font size, i.e., if INTEGER is 10, it means 10%  (1/10)  of  the
       font size. If INTEGER is	omitted, it is assumed that 5 is specified.
       Once the	next glyph is combined with the	previous one, they are treated
       as a single combined glyph.
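As an illustration of the COMBINING-NAME syntax, the following Python sketch decodes a name into the two reference point numbers from the table above and the two shifts. It is not the library's parser. The function name and the returned dictionary are hypothetical, and expressing shifts as signed percentages (rightward and upward positive) is a convention chosen for this example.

```python
import re

_COMBINING = re.compile(
    r"([tcbB])([lcr])"                                  # point on the previous glyph
    r"(?:\.|([+-])(\d*)(?:([<>])(\d*))?|([<>])(\d*))"   # OFFSET
    r"([tcbB])([lcr])"                                  # point on the next glyph
)
_VPOS_BASE = {"t": 0, "B": 3, "b": 6, "c": 9}
_HPOS_INDEX = {"l": 0, "c": 1, "r": 2}


def _amount(sign, digits, positive_sign):
    if sign is None:
        return 0
    value = int(digits) if digits else 5     # an omitted INTEGER means 5
    return value if sign == positive_sign else -value


def parse_combining(name):
    """Decode a COMBINING-NAME such as 'tc.bc' or 'tc-20<bc'."""
    m = _COMBINING.fullmatch(name)
    if m is None:
        raise ValueError("not a COMBINING-NAME: %r" % name)
    v0, h0, ysign, ydigits, xsign_a, xdigits_a, xsign_b, xdigits_b, v1, h1 = m.groups()
    # XOFF is captured by one of two groups, depending on whether YOFF came first.
    xsign, xdigits = (xsign_a, xdigits_a) if xsign_a else (xsign_b, xdigits_b)
    return {
        "prev_point": _VPOS_BASE[v0] + _HPOS_INDEX[h0],
        "next_point": _VPOS_BASE[v1] + _HPOS_INDEX[h1],
        "x_shift": _amount(xsign, xdigits, ">"),   # percent of font size
        "y_shift": _amount(ysign, ydigits, "+"),   # percent of font size
    }


# Top center of the previous glyph aligned with bottom center of the next one.
stacked = parse_combining("tc.bc")
```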
       MACRO-NAME ::= SYMBOL
       MACRO-NAME is a symbol defined by one of the MACRO-DEFs. It is
       expanded to the sequence of the corresponding RULEs.
CONTEXT	DEPENDENT BEHAVIOR
       So far, it has been assumed that	each sequence, which is	drawn  with  a
       specific	 font,	is  context  free,  i.e.  not  affected	 by the	glyphs
       preceding or following that sequence. This is true when sequence	S1  is
       drawn  with  font  F1  while  the preceding sequence S0 unconditionally
       requires	font F0.
	 sequence			       S0      S1
	 currently used	font		       F0      F1
	 usable	font(s)			       F0      F1
       Sometimes, however, a clear separation of sequences  is	not  possible.
       Suppose	that  the  preceding sequence S0 can be	drawn not only with F0
       but also	with F1.
	 sequence			       S0      S1
	 currently used	font		       F0      F1
	 usable	font(s)			       F0,F1   F1
       In this case, glyphs used to draw the preceding	S0  may	 affect	 glyph
       generation of S1. Therefore it is necessary to access information about
       S0,  which  has	already	been processed,	when processing	S1. Generation
       rules in	the first stage	(only in the first  stage)  accept  a  special
       regular expression to access already processed parts.
	 "RE0 RE1"
       RE0  and	 RE1 are regular expressions that match	the preceding sequence
       S0 and the following sequence S1, respectively.
       Pay attention to	the space between  the	two  regular  expressions.  It
       represents  the special category	' ' (see above). Note that the regular
       expression above	belongs	to  glyph  generation  rules  using  font  F1,
       therefore  not  only  RE1  but  also  RE0  must	be  expressed with the
       categories for F1. This means when the preceding	sequence S0 cannot  be
       expressed  with	the  categories	for F1 (as in the first	example	above)
       generation rules	having these patterns never match.
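The role of the space in such a pattern can be illustrated with Python's re module as a stand-in for the library's regular expression engine. In this sketch the category strings of S0 and S1 are joined with the separator category ' ' before matching. Requiring the pattern to cover the whole joined string is an assumption, and the function name is hypothetical.

```python
import re

SEPARATOR = " "   # the special category of the glyph produced by the '|' rule


def matches_with_context(regexp, preceding_categories, current_categories):
    """Match 'RE0 RE1' against the categories of S0 and S1 joined by ' '.

    Both category strings must be expressed with the categories of the
    font used for S1.
    """
    subject = preceding_categories + SEPARATOR + current_categories
    return re.fullmatch(regexp, subject) is not None


# S0 ends in a vowel (categories "CV"), S1 starts with a consonant.
with_context = matches_with_context("C?V C.*", "CV", "CT")
# No usable categories for S0: the RE0 part cannot match.
without_context = matches_with_context("C?V C.*", "", "CT")
```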
SEE ALSO
       mdbGeneral(5), FLTs provided by the m17n	database
COPYRIGHT
       Copyright (C) 2001 Information-technology Promotion Agency (IPA)
       Copyright (C)  2001-2011	 National  Institute  of  Advanced  Industrial
       Science and Technology (AIST)
       Permission  is  granted to copy,	distribute and/or modify this document
       under   the   terms   of	  the	GNU   Free    Documentation    License
       <http://www.gnu.org/licenses/fdl.html>.

Version	1.8.4			Mon Sep	25 2023			     mdbFLT(5)
