PDF2DJVU(1)			pdf2djvu manual			   PDF2DJVU(1)

NAME
       pdf2djvu	- creates DjVu files from PDF files

SYNOPSIS

       pdf2djvu	[{-o | --output} output-djvu-file] [option...] pdf-file...

       pdf2djvu	{-i | --indirect} index-djvu-file  [option...] pdf-file...

       pdf2djvu	{--version | --help | -h}

DESCRIPTION
       This program creates a DjVu file	from one or more Portable Document
       Format files.

OPTIONS
       pdf2djvu	accepts	the following options:

   Document type, file names
       -o, --output=output-djvu-file
	   Generate a bundled multi-page document. Write the file into
	   output-djvu-file instead of standard	output.

       -i, --indirect=index-djvu-file
	   Generate an indirect	multi-page document. Use index-djvu-file as
	   the index file name;	put the	component files	into the same
	   directory. The directory must exist and be writable.

       --page-id-template=template
	   Specifies the naming	scheme for page	identifiers. Consult the
	   "TEMPLATE LANGUAGE" section for the template	language description.

	   The default template	is "p{page:04*}.djvu".

	   For portability reasons, page identifiers:

	   •   must consist only of lowercase ASCII letters, digits, _, +, -
	       and dot,

	   •   cannot start with a +, - or a dot,

	   •   cannot contain two consecutive dots,

	   •   must end with the .djvu or the .djv extension.
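
	   The four rules above can be checked mechanically. The following
	   is a minimal Python sketch, not part of pdf2djvu; the function
	   name is_portable_page_id is hypothetical.

```python
import re

# Characters allowed anywhere in a page identifier:
# lowercase ASCII letters, digits, _, +, - and dot.
_ALLOWED = re.compile(r"[a-z0-9_+\-.]+")


def is_portable_page_id(page_id):
    """Return True if page_id satisfies the four portability rules."""
    if not _ALLOWED.fullmatch(page_id):
        return False  # empty, or contains a disallowed character
    if page_id[0] in "+-.":
        return False  # cannot start with +, - or a dot
    if ".." in page_id:
        return False  # cannot contain two consecutive dots
    # must end with the .djvu or the .djv extension
    return page_id.endswith((".djvu", ".djv"))
```

	   For instance, "p0001.djvu" passes, while "P0001.djvu" (uppercase
	   letter) and "p..djvu" (consecutive dots) do not.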

       --page-id-prefix=prefix
	   Equivalent to "--page-id-template=prefix{page:04*}.djvu".

       --page-title-template=template
	   Specifies the template for page titles. Consult the "TEMPLATE
	   LANGUAGE" section for the template language description.

	   The default template	is "{label}".

       --no-page-titles
	   Don't set page titles.

   Resolution, page size
       -d, --dpi=resolution
	   Render pages at resolution dots per inch. The default is 300 dpi.
	   The allowed range is: 72 <= resolution <= 6000.

       --media-box
	   Use MediaBox	to determine page size.	 CropBox is used by default.

       --page-size=widthxheight
	   Sets the preferred page size to width x height pixels. The actual
	   page size may be adjusted to preserve the aspect ratio and to
	   respect DjVu limitations on resolution. (This option takes
	   precedence over -d/--dpi.)

       --guess-dpi
	   Try to guess	native resolution by inspecting	embedded images. Use
	   with	care.

   Image quality
       --bg-slices=n+...+n, --bg-slices=n,...,n
	   Specifies the encoding quality of the IW44 background layer.	This
	   option is similar to	the -slice option of c44. Consult the c44(1)
	   manual page for details. The	default	is 72+11+10+10.

       --bg-subsample=n
	   Specifies the background subsampling	ratio. The default is 3. Valid
	   values are integers between 1 and 12, inclusive.

       --fg-colors=default
	   Try to preserve all the foreground layer colors. This is the
	   default.

       --fg-colors=web
	   Reduce foreground layer colors to the web palette (216 colors).
	   This	option is not recommended.

       --fg-colors=n
	   Use GraphicsMagick to reduce the number of distinct colors in the
	   foreground layer to n. Valid values are integers between 1 and
	   4080. This option is not recommended.

       --fg-colors=black
	   Discard any color information from the foreground layer.

       --monochrome
	   Render pages as monochrome bitmaps. With this option, the --bg-...
	   and --fg-... options are ignored.

       --loss-level=n
	   Specifies the aggressiveness	of the lossy compression. The default
	   is 0	(lossless). Valid values are integers between 0	and 200,
	   inclusive. This option is similar to	the -losslevel option of cjb2;
	   consult the cjb2(1) manual page for details.	This option can	be
	   used	only if	the --monochrome option	is also	enabled.

       --lossy
	   Synonym for --loss-level=100.

       --anti-alias
	   Enable font and vector anti-aliasing. This option is	not
	   recommended.

   Extraction
       --no-metadata
	   Don't extract the metadata.

	   By default:

	   •   The following entries of the document information dictionary
	       are extracted: Title, Author, Subject, Creator, Producer,
	       CreationDate, ModDate. Timestamps are formatted according to
	       RFC 3339[1], with date and time components separated by a
	       single space.

	   •   The XMP metadata is extracted (or created) and updated
	       accordingly.

	       Note
	       If multiple input documents are specified, only metadata	of the
	       first one is taken into account.
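
	   As an illustration of the timestamp format described above (an
	   RFC 3339 date and time joined by a single space instead of "T"),
	   here is a minimal Python sketch. The helper name format_timestamp
	   is hypothetical, and the exact output pdf2djvu produces (for
	   example, how it writes the UTC offset) is not confirmed here.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(moment):
    """Format an aware datetime as an RFC 3339 timestamp, using a single
    space (rather than "T") between the date and the time."""
    return moment.isoformat(sep=" ", timespec="seconds")


# Example value: 9 August 2022, 14:30:00, two hours ahead of UTC.
example = datetime(2022, 8, 9, 14, 30, 0, tzinfo=timezone(timedelta(hours=2)))
formatted = format_timestamp(example)
```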

       --verbatim-metadata
	   Keep	the original metadata intact.

       --no-outline
	   Don't extract the document outline.

       --hyperlinks=border-avis
	   Make	hyperlink borders always visible.

	   By default, a hyperlink border is visible only when the mouse is
	   over	the hyperlink.

       --hyperlinks=#RRGGBB
	   Force the specified border color for	hyperlinks.

       --no-hyperlinks,	--hyperlinks=none
	   Don't extract hyperlinks.

       --no-text
	   Don't extract the text.

       --words
	   Extract the text. Record the	location of every word.	This is	the
	   default.

       --lines
	   Extract the text. Record the location of every line, rather than
	   every word.

       --crop-text
	   Extract no text outside the page boundary.

       --no-nfkc
	   Do not apply	NFKC[2]	normalization on the text, except for
	   characters from the Alphabetic Presentation Forms block[3]
	   (U+FB00-U+FB4F), which are normalized unconditionally.

	   The default is to apply NFKC	normalization on all characters.
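
	   The difference between the two modes can be sketched in Python
	   with the standard unicodedata module. This is an illustration
	   only: the function name normalize_text is hypothetical, and
	   normalizing character by character in the --no-nfkc case is a
	   simplifying assumption, not necessarily what pdf2djvu does.

```python
import unicodedata


def normalize_text(text, nfkc=True):
    """Apply NFKC to the whole text, or (nfkc=False) only to characters
    from the Alphabetic Presentation Forms block, U+FB00-U+FB4F."""
    if nfkc:
        return unicodedata.normalize("NFKC", text)
    return "".join(
        unicodedata.normalize("NFKC", ch) if "\ufb00" <= ch <= "\ufb4f" else ch
        for ch in text
    )


# U+FB01 is the "fi" ligature; U+00B2 is superscript two.
sample = "\ufb01x\u00b2"
```

	   With the default, both the ligature and the superscript are
	   replaced by their compatibility equivalents; with nfkc=False only
	   the ligature is.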

       --filter-text=command-line
	   Filter the text through the command-line. The provided filter must
	   preserve whitespace,	control	characters and decimal digits.

	   This	option implies --no-nfkc.

       -p, --pages=page-range
	   Specifies pages to convert.	page-range is a	comma-separated	list
	   of sub-ranges. Each sub-range is either a single page (e.g. 17) or
	   a contiguous	range of pages (e.g. 37-42). Duplicate page numbers
	   are not allowed. Pages are numbered from 1.

	   The default is to convert all pages.
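
	   The page-range syntax can be illustrated with a small Python
	   parser. This is a sketch under assumptions: the function name
	   parse_page_range is hypothetical, and rejecting descending ranges
	   such as 42-37 is a choice made here, not documented behavior.

```python
def parse_page_range(spec):
    """Expand a page-range such as "17,37-42" into a list of page numbers.

    Pages are numbered from 1; duplicate page numbers raise ValueError.
    """
    pages = []
    seen = set()
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        low = int(first)
        high = int(last) if dash else low
        if low < 1 or high < low:
            raise ValueError(f"invalid sub-range: {part!r}")
        for page in range(low, high + 1):
            if page in seen:
                raise ValueError(f"duplicate page number: {page}")
            seen.add(page)
            pages.append(page)
    return pages
```

	   For example, "17,37-42" expands to pages 17, 37, 38, 39, 40, 41
	   and 42, while "1-3,2" is rejected because page 2 appears twice.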

   Performance
       -j, --jobs=n
	   Use n threads to perform conversion.	The default is to use one
	   thread.

       -j0, --jobs=0
	   Determine automatically how many threads to use to perform
	   conversion.

   Verbosity, help
       -v, --verbose
	   Display more	informational messages while converting	the file.

       -q, --quiet
	   Don't display informational messages	while converting the file.

       --version
	   Output version information and exit.

       -h, --help
	   Display help	and exit.
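
       As a usage illustration, a script could assemble a pdf2djvu command
       line from the options described above. The following Python sketch
       only builds the argument list; the helper name build_command and
       its parameters are hypothetical, and only flags documented in this
       manual page are used.

```python
def build_command(pdf_files, output, dpi=300, pages=None, jobs=1):
    """Build an argument list for a bundled multi-page conversion."""
    if not 72 <= dpi <= 6000:
        raise ValueError("resolution must satisfy 72 <= resolution <= 6000")
    argv = ["pdf2djvu", "-o", output, f"--dpi={dpi}", f"--jobs={jobs}"]
    if pages is not None:
        argv.append(f"--pages={pages}")
    # Input PDF files come last, as in the SYNOPSIS.
    return argv + list(pdf_files)
```

       On a system where pdf2djvu is installed, the resulting list could
       be passed to subprocess.run(argv, check=True).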

ENVIRONMENT
       The following environment variables affect pdf2djvu on Unix systems:

       OMP_*
	   Details of runtime behavior with respect to parallelism can be
	   controlled by several environment variables.	Please refer to	the
	   OpenMP API specification[4] for details.

       TMPDIR
	   pdf2djvu makes heavy	use of temporary files.	It will	store them in
	   a directory specified by this variable. The default is /tmp.

TEMPLATE LANGUAGE
   Template syntax
       The template language is	roughly	modeled	on the Python string
       formatting syntax[5].

       A template is a piece of	text which contains fields, surrounded by
       curly braces {}.	Fields are replaced with appropriately formatted
       values when the template	is evaluated. Moreover,	{{ is replaced with a
       single {	and }} is replaced with	a single }.

   Field syntax
       Each field consists of a	variable name, optionally followed by a	shift,
       optionally followed by a	format specification.

       The shift is a signed (i.e. starting with a + or	- character) integer.

       The format specification	consists of a colon, followed by a width
       specification.

       The width specification is a decimal integer defining the minimum field
       width. If not specified,	then the field width will be determined	by the
       content.	Preceding the width specification with a zero (0) character
       enables zero-padding.

       The width specification is optionally followed by an asterisk (*)
       character, which	increases the minimum field width to the width of the
       longest possible	content	of the variable.
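
       The field syntax can be illustrated with a small Python evaluator
       for numeric variables such as page and dpage. This is a sketch, not
       pdf2djvu's implementation: the name expand and its max_values
       parameter are hypothetical, and padding on the left, as well as
       applying the shift to the longest possible value, are assumptions.

```python
import re

_FIELD = re.compile(
    r"\{\{|\}\}|"                      # escaped braces
    r"\{(?P<name>[a-z]+)"              # variable name
    r"(?P<shift>[+-]\d+)?"             # optional signed shift
    r"(?::(?P<zero>0?)(?P<width>\d*)(?P<star>\*?))?\}"  # optional format
)


def expand(template, values, max_values):
    """Evaluate a template.

    values maps variable names to integers; max_values maps them to the
    largest value they can take (used by the asterisk).
    """
    def replace(match):
        if match.group(0) == "{{":
            return "{"
        if match.group(0) == "}}":
            return "}"
        name = match.group("name")
        shift = int(match.group("shift") or 0)
        text = str(values[name] + shift)
        width = int(match.group("width") or 0)
        if match.group("star"):
            # Widen the field to fit the longest possible content.
            width = max(width, len(str(max_values[name] + shift)))
        fill = "0" if match.group("zero") else " "
        return text.rjust(width, fill)

    return _FIELD.sub(replace, template)
```

       Under these assumptions, the template "p{page:04*}.djvu" yields
       "p0007.djvu" for page 7 of a 250-page document and "p00007.djvu"
       for page 7 of a 12345-page document.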

   Available variables
       dpage
	   Page	number in the DjVu document.

       page, spage
	   Page	number in the PDF document.

       label
	   Page	label (logical page number) in the PDF document.

	   This	variable is available only for page titles.

IMPLEMENTATION DETAILS
   Layer separation algorithm
       Unless the --monochrome option is on, pdf2djvu uses the following naive
       layer separation	algorithm:

	1. For each page, do the following:

	    1. Rasterize the page into a pixmap, in the	usual manner.

	    2. Rasterize the page into another pixmap, omitting	the following
	       page elements:

	       •   text,

	       •   1 bit-per-pixel raster images,

	       •   vector elements (except fills of large areas).

	    3. Compare both pixmaps, pixel by pixel:

		1. If their colors match, classify the pixel as	a part of the
		   background layer.

		2. Otherwise, classify the pixel as a part of the foreground
		   layer.
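
       The comparison in step 3 can be sketched in Python as follows. The
       two renderings are modeled as equally sized lists of rows of color
       values; the function name separate_layers and this data layout are
       assumptions for illustration, not pdf2djvu's internal structures.

```python
def separate_layers(full, reduced):
    """Compare two pixmaps pixel by pixel.

    full is the page rendered in the usual manner; reduced is the page
    rendered without text, 1 bit-per-pixel images and most vector
    elements. Returns a mask of the same shape in which True marks a
    foreground pixel and False marks a background pixel.
    """
    return [
        [full_px != reduced_px for full_px, reduced_px in zip(full_row, reduced_row)]
        for full_row, reduced_row in zip(full, reduced)
    ]


WHITE = (255, 255, 255)
BLACK = (0, 0, 0)

# A 2x2 page with one black "text" pixel that the reduced rendering omits.
full_page = [[WHITE, BLACK], [WHITE, WHITE]]
reduced_page = [[WHITE, WHITE], [WHITE, WHITE]]
mask = separate_layers(full_page, reduced_page)
```

       Only the pixel whose color differs between the two renderings is
       classified as foreground.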

BUG REPORTS
       If you find a bug in pdf2djvu, please report it at the issue tracker[6]
       or to the mailing list[7].

SEE ALSO
       djvu(1),	djvudigital(1),	csepdjvu(1)

NOTES
	1. RFC 3339
	   https://www.ietf.org/rfc/rfc3339

	2. NFKC
	   https://unicode.org/reports/tr15/

	3. Alphabetic Presentation Forms block
	   https://unicode.org/charts/PDF/UFB00.pdf

	4. OpenMP API specification
	   https://www.openmp.org/specifications/

	5. Python string formatting syntax
	   https://docs.python.org/2/library/string.html#format-string-syntax

	6. the issue tracker
	   https://github.com/jwilk/pdf2djvu/issues

	7. the mailing list
	   https://groups.io/g/pdf2djvu

pdf2djvu 0.9.19			  2022-08-09			   PDF2DJVU(1)
