Locale::Po4a::TransTractor(3pm)	  Po4a Tools   Locale::Po4a::TransTractor(3pm)

NAME
       Locale::Po4a::TransTractor - generic trans(lator ex)tractor.

DESCRIPTION
       The po4a (PO for anything) project goal is to ease translations (and
       more interestingly, the maintenance of translations) using gettext
       tools in areas where they were not expected, like documentation.

       This class is the ancestor of every po4a parser used to parse a
       document, search for translatable strings, extract them to a PO file
       and replace them with their translations in the output document.

       More formally, it takes the following arguments as input:

       - a document to translate;

       - a PO file containing the translations to use.

       As output, it produces:

       - another PO file, resulting from the extraction of translatable
         strings from the input document;

       - a translated document, with the same structure as the input
         document, but with all translatable strings replaced with the
         translations found in the PO file provided as input.

       Here is a graphical representation of this:

	  Input	document --\				 /---> Output document
			    \				/	(translated)
			     +-> parse() function -----+
			    /				\
	  Input	PO --------/				 \---> Output PO

FUNCTIONS YOUR PARSER SHOULD OVERRIDE
       parse()
           This is where all the work takes place: the parsing of input
           documents, the generation of output, and the extraction of the
           translatable strings. This is pretty simple using the provided
           functions presented in the section INTERNAL FUNCTIONS below. See
           also the SYNOPSIS, which presents an example.

	   This	function is called by the process() function below, but	if you
	   choose to use the new() function, and to add	content	manually to
	   your	document, you will have	to call	this function yourself.

       docheader()
           This function returns the header we should add to the produced
           document, quoted properly to be a comment in the target language.
           See the section Educating developers about translations, from
           po4a(7), for what it is good for.

SYNOPSIS
       The following example parses a list of paragraphs beginning with "<p>".
       For the sake of simplicity, we assume that the document is well
       formatted, i.e. that "<p>" tags are the only tags present, and that
       this tag is at the very beginning of each paragraph.

        sub parse {
          my $self = shift;

          PARAGRAPH: while (1) {
              my ($paragraph,$pararef)=("","");
              my $first=1;
              my ($line,$lref)=$self->shiftline();
              while (defined($line)) {
                  if ($line =~ m/<p>/ && !$first--) {
                      # Not the first time we see <p>.
                      # Reput the current line in input,
                      #  and put the built paragraph to output
                      $self->unshiftline($line,$lref);

                      # Now that the document is formed, translate it:
                      #   - Remove the leading tag
                      $paragraph =~ s/^<p>//s;

                      #   - push to output the leading tag (untranslated) and the
                      #     rest of the paragraph (translated); "paragraph" is
                      #     the type of the string
                      $self->pushline(  "<p>"
                                      . $self->translate($paragraph,$pararef,"paragraph")
                                      );

                      next PARAGRAPH;
                  } else {
                      # Append to the paragraph
                      $paragraph .= $line;
                      $pararef = $lref unless(length($pararef));
                  }

                  # Reinit the loop
                  ($line,$lref)=$self->shiftline();
              }
              # Did not get a defined line? End of input file.
              # Flush the last paragraph, if any, before returning.
              if (length($paragraph)) {
                  $paragraph =~ s/^<p>//s;
                  $self->pushline(  "<p>"
                                  . $self->translate($paragraph,$pararef,"paragraph")
                                  );
              }
              return;
          }
        }

       Once you've implemented the parse function, you can use your document
       class, using the	public interface presented in the next section.

PUBLIC INTERFACE for scripts using your	parser
   Constructor
       process(%)
           This function can do all you need to do with a po4a document in one
           invocation. Its arguments must be packed as a hash.

           ACTIONS:

	   a. Reads all	the PO files specified in po_in_name

	   b. Reads all	original documents specified in	file_in_name

	   c. Parses the document

	   d. Reads and	applies	all the	addenda	specified

	   e. Writes the translated document to	file_out_name (if given)

	   f. Writes the extracted PO file to po_out_name (if given)

           ARGUMENTS, besides the ones accepted by new() (with expected type):

	   file_in_name	(@)
	       List of filenames where we should read the input	document.

	   file_in_charset ($)
	       Charset used in the input document (if it isn't specified, it
	       will try	to detect it from the input document).

	   file_out_name ($)
	       Filename	where we should	write the output document.

	   file_out_charset ($)
	       Charset used in the output document (if it isn't	specified, it
	       will use	the PO file charset).

           po_in_name (@)
               List of filenames where we should read the input PO files from,
               containing the translations which will be used to translate the
               document.

	   po_out_name ($)
	       Filename	where we should	write the output PO file, containing
	       the strings extracted from the input document.

	   addendum (@)
	       List of filenames where we should read the addenda from.

	   addendum_charset ($)
	       Charset for the addenda.

       new(%)
           Create a new po4a document. Accepted options (must be packed as a
           hash):

	   verbose ($)
	       Sets the	verbosity.

	   debug ($)
	       Sets the	debugging.

   Manipulating	document files
       read($)
           Add another input document at the end of the existing one. The
           argument is the filename to read.

           Please note that it does not parse anything. You should use the
           parse() function when you're done with packing input files into the
           document.

       write($)
           Write the translated document to the given filename.

   Manipulating	PO files
       readpo($)
           Add the content of a file (whose name is passed as argument) to the
           existing input PO. The old content is not discarded.

       writepo($)
           Write the extracted PO file to the given filename.

       stats()
           Returns some statistics about the translation done so far. Please
           note that these are not the same statistics as those printed by
           msgfmt --statistics. Here, the statistics describe recent usage of
           the PO file, while msgfmt reports the status of the file. It is a
           wrapper around the Locale::Po4a::Po::stats_get function applied to
           the input PO file. Example of use:

	       [normal use of the po4a document...]

	       ($percent,$hit,$queries)	= $document->stats();
	       print "We found translations for	$percent\%  ($hit from $queries) of strings.\n";

   Manipulating	addenda
       addendum($)
           Please refer to po4a(7) for more information on what addenda are,
           and how translators should write them. To apply an addendum to the
           translated document, simply pass its filename to this function and
           you are done ;)

           This function returns a non-null integer on error.

INTERNAL FUNCTIONS used to write derived parsers
   Getting input, providing output
       Four functions are provided to get input	and return output. They	are
       very similar to shift/unshift and push/pop. The first pair is about
       input, while the	second is about	output.	Mnemonic: in input, you	are
       interested in the first line, what shift	gives, and in output you want
       to add your result at the end, like push	does.

       shiftline()
           This function returns the next line of the doc_in to be parsed and
           its reference (packed as an array).

       unshiftline($$)
           Unshifts a line of the input document and its reference.

       pushline($)
           Push a new line to the doc_out.

       popline()
           Pop the last pushed line from the doc_out.
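
       The shift/unshift and push/pop analogy can be made concrete with plain
       Perl arrays. The following is a minimal sketch and not po4a's
       implementation: the ToyDoc package, its constructor and its internal
       fields are hypothetical; only the four method names mirror the ones
       TransTractor provides.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a TransTractor-like document.
# doc_in holds a flat list (line, reference, line, reference, ...);
# doc_out holds the output lines.
package ToyDoc;

sub new {
    my ($class, @lines) = @_;
    my @in;
    my $n = 0;
    for my $l (@lines) {
        push @in, $l, "input:" . ++$n;
    }
    return bless { doc_in => \@in, doc_out => [] }, $class;
}

# Take the next line and its reference from the front of the input.
sub shiftline {
    my $self = shift;
    my $line = shift @{ $self->{doc_in} };
    my $ref  = shift @{ $self->{doc_in} };
    return ($line, $ref);
}

# Put a line and its reference back at the front of the input.
sub unshiftline {
    my ($self, $line, $ref) = @_;
    unshift @{ $self->{doc_in} }, $line, $ref;
    return;
}

# Append a line to the end of the output.
sub pushline {
    my ($self, $line) = @_;
    push @{ $self->{doc_out} }, $line;
    return;
}

# Remove and return the most recently pushed output line.
sub popline {
    my $self = shift;
    return pop @{ $self->{doc_out} };
}

package main;

my $doc = ToyDoc->new("<p>one\n", "<p>two\n");
my ($line, $ref) = $doc->shiftline();   # read the first line
$doc->unshiftline($line, $ref);         # changed our mind: put it back
($line, $ref) = $doc->shiftline();      # the same line comes out again
$doc->pushline(uc $line);               # emit a transformed copy
```

       Reading always happens at the front of the input and writing at the
       end of the output, which is why a parser can look one line ahead,
       decide the line belongs to the next unit, and hand it back untouched.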

   Marking strings as translatable
       One function is provided	to handle the text which should	be translated.

       translate($$$)
           Mandatory arguments:

           - A string to translate

           - The reference of this string (i.e. its position in the input
             file)

           - The type of this string (i.e. the textual description of its
             structural role; used in Locale::Po4a::Po::gettextization(); see
             also po4a(7), section Gettextization: how does it work?)

           This function can also take some extra arguments. They must be
           organized as a hash. For example:

             $self->translate("string","ref","type",
                              'wrap' => 1);

           wrap
               boolean indicating whether whitespace in the string can be
               considered unimportant. If so, the function canonizes the
               string before looking for a translation or extracting it, and
               wraps the translation.

           wrapcol
               the column at which we should wrap (default: 76).

           comment
               an extra comment to add to the entry.


           Actions:

           - Pushes the string, reference and type to po_out.

	   - Returns the translation of	the string (as found in	po_in) so that
	     the parser	can build the doc_out.

	   - Handles the charsets to recode the	strings	before sending them to
	     po_out and	before returning the translations.
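
           To illustrate what treating whitespace as unimportant and then
           wrapping at a given column can look like, here is a minimal sketch
           built on the core Text::Wrap module. It is an assumption-laden
           illustration, not po4a's own canonization code: the helper name
           canonize_and_wrap is hypothetical, and po4a may normalize and wrap
           differently.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: collapse every run of whitespace to one space,
# trim both ends, then re-wrap the text below the given column.
sub canonize_and_wrap {
    my ($text, $wrapcol) = @_;
    $wrapcol = 76 unless defined $wrapcol;   # documented default column

    (my $canon = $text) =~ s/\s+/ /g;        # whitespace is not significant
    $canon =~ s/^ | $//g;                    # drop leading/trailing space

    local $Text::Wrap::columns  = $wrapcol;  # lines stay shorter than this
    local $Text::Wrap::unexpand = 0;         # never turn spaces into tabs
    return wrap('', '', $canon);
}

print canonize_and_wrap("Some   text\nwith\todd   spacing.", 20), "\n";
```

           Because the string is canonized before lookup, two source
           paragraphs that differ only in line breaks or indentation map to
           the same PO entry.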

   Misc	functions
       verbose()
           Returns whether the verbose option was passed during the creation
           of the TransTractor.

       debug()
           Returns whether the debug option was passed during the creation of
           the TransTractor.

       detected_charset($)
           This tells TransTractor that a new charset (the first argument) has
           been detected from the input document. It can usually be read from
           the document header. Only the first charset will remain, coming
           either from the process() arguments or detected from the document.

       get_out_charset()
           This function will return the charset that should be used in the
           output document (usually useful to substitute the input document's
           detected charset where it has been found).

           It will use the output charset specified in the command line. If it
           wasn't specified, it will use the input PO's charset, and if the
           input PO has the default "CHARSET", it will return the input
           document's charset, so that no encoding is performed.
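
           The fallback order described above can be written as a small
           function. This is a sketch of the documented precedence only:
           choose_out_charset and its three parameters are hypothetical names
           and are not part of TransTractor.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented precedence:
#   1. the explicitly requested output charset,
#   2. the input PO's charset, unless it is the "CHARSET" placeholder,
#   3. the input document's charset (so no re-encoding happens).
sub choose_out_charset {
    my ($requested, $po_charset, $doc_charset) = @_;

    return $requested
        if defined $requested && length $requested;

    return $po_charset
        if defined $po_charset
        && length $po_charset
        && $po_charset ne 'CHARSET';

    return $doc_charset;
}

# No explicit charset and an uninitialized PO header:
# fall back to the document's own charset.
print choose_out_charset(undef, 'CHARSET', 'ISO-8859-1'), "\n";  # prints ISO-8859-1
```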

       recode_skipped_text($)
           This function recodes the text passed as argument from the input
           document's charset to the output document's charset and returns
           it. This isn't needed when translating a string (translate()
           recodes everything itself), but it is when you skip a string from
           the input document and you want the output document to be
           consistent with the global encoding.

FUTURE DIRECTIONS
       One shortcoming of the current TransTractor is that it can't handle
       translated documents containing all languages, like debconf templates
       or .desktop files.

       To address this problem,	the only interface changes needed are:

       - take a	hash as	po_in_name (a list per language)

       - add an	argument to translate to indicate the target language

       - make a pushline_all function, which would call pushline on its
         content for every language, using a map-like syntax:

             $self->pushline_all({ "Description[".$langcode."]=".
                                   $self->translate($line,$ref,$langcode)
                                 });

       Will see	if it's	enough ;)

AUTHORS
        Denis Barbier
        Martin Quinson
        Jordi Vilalta

Po4a Tools			  2021-02-28   Locale::Po4a::TransTractor(3pm)

