GSCAN2PDF(1)	      User Contributed Perl Documentation	  GSCAN2PDF(1)

NAME
       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents

USAGE
       1. Scan one or several pages in with File/Scan
       2. Create PDF of	selected pages with File/Save

REQUIRED ARGUMENTS
       None

OPTIONS
       gscan2pdf has the following command-line	options:

       --device=device
           Specifies the device to use, instead of getting the list of devices
           via the SANE API.  This can be useful if the scanner is on a remote
           computer which is not broadcasting its existence.

       --help
	   Displays this help page and exits.

       --log=log-file
	   Specifies a file to store logging messages.

       --debug,	--info,	--warn,	--error, --fatal
	   Defines the log level.  If a	log file is specified,	this  defaults
	   to --debug, otherwise --error.

       --import=PDF|DjVu|images
	   Imports  the	 specified  file(s). If	the document has more than one
	   page, a window is displayed to select the required pages.

       --import-all=PDF|DjVu|images
           Imports all pages of the specified file(s).

       --version
           Displays the program version and exits.
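
       The interaction between --log and the log-level options described above
       can be sketched as follows. This is only an illustration of the
       documented defaults, not gscan2pdf's own code; the function name and
       its parameters are hypothetical.

```python
LEVELS = ("debug", "info", "warn", "error", "fatal")


def effective_log_level(log_file=None, level=None):
    """Return the log level implied by the command-line options.

    level    -- one of LEVELS if the user passed e.g. --info, else None
    log_file -- the value of --log, or None if it was not given
    """
    if level is not None:
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level}")
        return level
    # Documented default: --debug when a log file is given, else --error.
    return "debug" if log_file else "error"
```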

       Scanning	is handled with	SANE via scanimage.  PDF conversion is done by
       PDF::Builder.  TIFF export is handled by	libtiff	 (faster  and  smaller
       memory footprint	for multipage files).

DIAGNOSTICS
       To  diagnose  a	possible  error, start gscan2pdf from the command line
       with logging enabled:

       "gscan2pdf --log=file.log"

       and check file.log.

EXIT STATUS
       None

CONFIGURATION
       gscan2pdf creates a text	resource file  in  ~/.config/gscan2pdfrc.  The
       directory  can  be  changed  by	setting	the $XDG_CONFIG_HOME variable.
       Generally,   however,   preferences   should   be   changed   via   the
       Edit/Preferences	 menu,	or  are	 captured  automatically during	normal
       usage of	the program.
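
       As a rough sketch of the lookup described above, the location of the
       resource file could be computed like this. The helper name is
       hypothetical, and the fallback to ~/.config when $XDG_CONFIG_HOME is
       unset or empty is an assumption based on the usual XDG convention.

```python
import os


def gscan2pdf_rc_path(environ=None):
    """Return the expected path of the gscan2pdfrc resource file."""
    env = os.environ if environ is None else environ
    # Assumed fallback: ~/.config when XDG_CONFIG_HOME is unset or empty.
    base = env.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config"
    )
    return os.path.join(base, "gscan2pdfrc")
```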

INCOMPATIBILITIES
       None known.

BUGS AND LIMITATIONS
       Whilst it is possible to import PDFs, this is intended mainly to
       round-trip files created by gscan2pdf.

Download
       gscan2pdf is available on Sourceforge
       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).

   Debian-based
       If   you	  are	using	Debian,	  you	 should	   find	   that	   sid
       <https://www.debian.org/releases/sid/>  has  the	latest version already
       packaged.

       If you are using an Ubuntu-based system, you can automatically keep up
       to date with the latest version via the ppa:

       "sudo apt-add-repository	ppa:jeffreyratcliffe/ppa"

       If you are using Synaptic, then use menu Edit/Reload Package
       Information, search for gscan2pdf in the package list, and lo and
       behold, you can install the nice shiny new version.

       From the	command	line:

       "sudo apt update"

       "sudo apt install gscan2pdf"

   From	source
       The  source  is hosted in the files section of the gscan2pdf project on
       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).

   From	the repository
       gscan2pdf uses Git for its Revision Control System. You can browse  the
       tree at <https://sourceforge.net/p/gscan2pdf/code/>.

       Git users can clone the complete tree with:

       "git clone git://git.code.sf.net/p/gscan2pdf/code"

Building gscan2pdf from	source
       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary with:

       "tar xvfz gscan2pdf-x.x.x.tar.gz"

       "cd gscan2pdf-x.x.x"

       "perl Makefile.PL" will create the Makefile.

       "make  test"  should  run  several hundred tests	to confirm that	things
       will work properly on your system.

       You can install directly	from  the  source  with	 "make	install",  but
       building	 the  appropriate  package  for	your distribution should be as
       straightforward as "make	debdist" or "make rpmdist". However, you  will
       additionally  need the rpm, devscripts, fakeroot, debhelper and gettext
       packages.

Dependencies
       The list below looks daunting, but all packages are available from any
       reasonably up-to-date distribution. If you are using Synaptic, having
       installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-click
       it and you can install them under Recommends. Note also that the
       library names given below are the Debian/Ubuntu ones. Distributions
       using RPM typically use perl(module) where Debian has libmodule-perl.

       Required
           libgtk3-perl (>= 0.028)
               There is a bug in versions of libgtk3-perl before 0.028 that
               causes gscan2pdf to crash when saving. Whilst I could prevent
               gscan2pdf from crashing, it would still be impossible to save
               anything, rendering gscan2pdf rather useless.

	   libgtk3-simplelist-perl
	       A simple	interface to Gtk3's complex MVC	list widget

	   liblocale-gettext-perl (>= 1.05)
	       Using libc functions for	internationalisation in	Perl

	   libpdf-builder-perl
	       provides	the functions for creating PDF documents in Perl

	   libsane
	       API library for scanners

	   libimage-sane-perl
	       Perl bindings for libsane.

	   libset-intspan-perl
	       manages sets of integers

	   libtiff-tools
	       TIFF manipulation and conversion	tools

           imagemagick
	       Image manipulation programs

	   perlmagick
	       A perl interface	to the libMagick graphics routines

	   sane-utils
	       API library for scanners	-- utilities.

       Optional
	   sane
	       scanner	graphical  frontends.  Only  required  for the scanadf
	       frontend.

	   unpaper
               post-processing tool for scanned pages. See
               <https://www.flameeyes.eu/projects/unpaper>.

	   xdg-utils
               Desktop integration utilities from freedesktop.org. Required
               for Email as PDF. See
               <https://www.freedesktop.org/wiki/Software/xdg-utils/>

	   djvulibre-bin
               Utilities for the DjVu image format. See
               <http://djvu.sourceforge.net/>

	   gocr
	       A command line OCR. See <http://jocr.sourceforge.net/>.

	   tesseract
               A command line OCR. See
               <https://github.com/tesseract-ocr/tesseract>

	   cuneiform
	       A command line OCR. See <http://launchpad.net/cuneiform-linux>

Support
       There are two mailing lists for gscan2pdf:

       gscan2pdf-announce
           A low-traffic list for announcements, mostly of new releases. You
           can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>

       gscan2pdf-help
           General support, questions, etc. You can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>

Reporting bugs
       Before reporting	bugs, please read the "FAQs" section.

       Please  report  any  bugs  found,   preferably	against	  the	Debian
       package[1][2].	You  do	 not  need  to	be a Debian user, or set up an
       account to do this.  The	Debian tool "reportbug"	provides a  convenient
       GUI for doing so.

       1. https://packages.debian.org/sid/gscan2pdf
       2. https://www.debian.org/Bugs/

       Alternatively,  there  is  a  bug  tracker for the gscan2pdf project on
       Sourceforge
       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).

       Please include the log file created by "gscan2pdf --log=log"  with  any
       new bug report.

Translations
       gscan2pdf has already been partly translated into several languages.
       If you would like to contribute to an existing or new translation,
       please check out Rosetta:
       <https://translations.launchpad.net/gscan2pdf>

       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       contact the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and have a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.

       Alternatively, Ubuntu has its own translation project. For the 9.04
       release, the translations are available at
       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>

       If you have updated a ".po" file in the "po" directory of the
       gscan2pdf source tree and would like to test it, pick a test directory
       for the compiled locales, e.g. "./locale", and create the ".mo" files
       with:

       "perl Makefile.PL LOCALEDIR=./locale"

       If  the	updated	 locale	 is your standard one, then the	following will
       find the	updated	file:

       "perl -I	lib bin/gscan2pdf --log=log --locale=locale"

       If it is not your standard locale, you will need something like (for
       Russian):

       "LC_ALL=ru_RU.utf8 LC_MESSAGES=ru_RU.utf8 LC_CTYPE=ru_RU.utf8
       LANG=ru_RU.utf8 LANGUAGE=ru_RU.utf8 perl -I lib bin/gscan2pdf --log=log
       --locale=locale"

       or German:

       "LC_ALL=de_DE LC_MESSAGES=de_DE LC_CTYPE=de_DE LANG=de_DE
       LANGUAGE=de_DE perl -I lib bin/gscan2pdf --log=log --locale=locale"

       If the above doesn't work, make sure the locale is in the list produced
       by "locale -a", including any ".utf8" suffix. If necessary, generate
       new locales with "sudo dpkg-reconfigure locales".
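
       The check against "locale -a" can be sketched as below. The helper is
       hypothetical and simply looks for an exact line match, which is why the
       ".utf8" suffix matters; the sample output in the comments is
       illustrative, not taken from a real system.

```python
def locale_is_listed(name, locale_a_output):
    """Return True if `name` appears as a full line of `locale -a` output."""
    available = {line.strip() for line in locale_a_output.splitlines()}
    return name in available


# Illustrative output; a real list would come from running `locale -a`.
sample_output = "C\nC.utf8\nde_DE.utf8\nru_RU.utf8\n"
print(locale_is_listed("ru_RU.utf8", sample_output))
print(locale_is_listed("de_DE", sample_output))
```

       On a real system the text could be obtained with
       subprocess.run(["locale", "-a"], capture_output=True, text=True).stdout.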

DESCRIPTION
   File
       New

       Clears the page list.

       Open

       Opens  any  format  that	 imagemagick  supports.	 PDFs  will have their
       embedded	images extracted and imported one per page.

       Note that files	can  also  be  imported	 by  dragging  them  into  the
       thumbnail list from a program like nautilus or konqueror.

       Scan

       Sets options before scanning via	SANE.

       Device

       Chooses between available scanners.

       # Pages

       Selects the number of pages to scan, or all pages.

       Source document

       Selects between single-sided or double-sided pages.

       This affects the page numbering. Single-sided scans are numbered
       consecutively. Double-sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.

       Side to scan

       If double sided is selected above, assuming a non-duplex	scanner,  i.e.
       a  scanner  that	 cannot	 automatically scan both sides of a page, this
       determines whether the page number is incremented or decremented	by 2.

       To scan both sides of three pages, i.e. 6 sides:

       1. Select:
	   # Pages = 3 (or "all" if your scanner can detect when it is out  of
	   paper)

	   Double sided

	   Facing side

       2. Scans	sides 1, 3 & 5.
       3. Put pile back	with scanner ready to scan back	of last	page.
       4. Select:
	   #  Pages = 3	(or "all" if your scanner can detect when it is	out of
	   paper)

	   Double sided

	   Reverse side

       5. Scans	sides 6, 4 & 2.
       6. gscan2pdf automatically sorts	the pages so that they appear in the
       correct order.
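
       The numbering in the steps above can be reproduced with a short sketch.
       It is not gscan2pdf's implementation; the function name and the
       "facing"/"reverse" strings are hypothetical stand-ins for the two
       "Side to scan" choices.

```python
def side_numbers(n_pages, side):
    """Page numbers assigned to one pass over a pile of n_pages sheets."""
    if side == "facing":
        # Front sides in scan order: 1, 3, 5, ...
        return list(range(1, 2 * n_pages, 2))
    if side == "reverse":
        # The pile is turned over, so backs arrive last sheet first: 6, 4, 2
        return list(range(2 * n_pages, 0, -2))
    raise ValueError("side must be 'facing' or 'reverse'")


facing = side_numbers(3, "facing")    # [1, 3, 5]
reverse = side_numbers(3, "reverse")  # [6, 4, 2]
print(sorted(facing + reverse))       # [1, 2, 3, 4, 5, 6]
```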

       Device-dependent	options

       These, naturally, depend	on your	scanner.  They can include

       Page size.
       Mode (colour/black & white/greyscale)
       Resolution (in PPI)
       Batch-scan
	   Guarantees that a "no documents" condition will be  returned	 after
	   the	last  scanned  page,  to prevent endless flatbed scans after a
	   batch scan.

       Wait-for-button/Button-wait
	   After sending the scan  command,  wait  until  the  button  on  the
	   scanner is pressed before actually starting the scan	process.

       Source
	   Selects  the	document source.  Possible options can include Flatbed
	   or ADF.  On some scanners, this is the only way  of	generating  an
	   out-of-documents signal.

       Save

       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG,	PNM or
       GIF.

       Metadata

       Metadata is information that is not visible when viewing the PDF/DjVu,
       but is embedded in the file and so is searchable and can be examined,
       typically with the "Properties" option of the document viewer.

       The metadata is completely optional, but can also be used to generate
       the filename; see Preferences for details.

       The date	can be selected	with use of the	calendar widget. The displayed
       date  can  be  incremented  or  decremented with	use of the '+' and '-'
       keys.

       DjVu

       DjVu compresses both black and white and colour images better than
       PDF. See <http://www.djvuzone.org/> for more details.

       Email as	PDF

       Attaches	the selected or	all pages as a PDF to  a  blank	 email.	  This
       requires	 xdg-email, which is in	the xdg-utils package.	If this	is not
       present,	the option is ghosted out.

       Print

       Prints the selected or all pages.

       Compress	temporary files

       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful: it compresses all images as LZW-compressed TIFFs. These
       require much less space than the PNM files that are typically produced
       by SANE or by importing a PDF.

   Edit
       Delete

       Deletes the selected page.

       Renumber

       Renumbers the pages from	1..n.

       Note  that  the	page order can also be changed by drag and drop	in the
       thumbnail view.

       Select

       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision. Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.

       Properties

       When  an	image is scanned, gscan2pdf attempts to	extract	the resolution
       from the	scan options. This nearly always works without problem.

       Importing an image can be trickier, however. Some image formats such as
       PNM do not encode metadata for resolution. In other cases, the data is
       incorrect. Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final
       PDF or DjVu. The image itself is otherwise not changed - it is not
       down- or upscaled.
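
       To see why the resolution metadata changes the size of the output
       without touching the pixels, consider this small sketch. It assumes the
       usual PDF convention of 72 points per inch; the function name is
       hypothetical.

```python
def page_size_points(width_px, height_px, ppi):
    """Physical page size in PDF points (1/72 inch) for a given resolution."""
    return (width_px / ppi * 72.0, height_px / ppi * 72.0)


# The same 2480 x 3508 pixel image at two different resolution settings:
print(page_size_points(2480, 3508, 300))  # roughly A4
print(page_size_points(2480, 3508, 150))  # twice as wide and twice as tall
```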

       Preferences

       The  preferences	 menu item allows the control of the default behaviour
       of various functions. Most of these are self-explanatory.

       Frontends

       gscan2pdf initially supported two  frontends,  scanimage	 and  scanadf.
       scanadf	support	 was  added  when  it  was realised that scanadf works
       better than scanimage with  some	 scanners.  On	Debian-based  systems,
       scanadf	is in the sane package,	not, like scanimage, in	sane-utils. If
       scanadf is not present, the option is obviously ghosted out.

       In 0.9.27, Perl bindings	for SANE were  introduced.  These  are	called
       libsane-perl.

       Before  1.2.0,  options	available through CLI frontends	like scanimage
       were made visible as users asked	for them. In 1.2.0, all	options	can be
       shown or	hidden via Edit/Preferences, along with	the ability to specify
       which options trigger a reload.

       In 1.8.3, new Perl bindings for SANE were introduced. These are called
       libimage-sane-perl and are the preferred frontend.

       In 1.8.5, support for libsane-perl was removed.

       Device blacklist

       Ignore listed devices.

       Note  that  this	 is a device name regular expression, e.g. /dev/video,
       and  not	 the  name  as	listed	in  the	 scan  window,	 e.g.	Noname
       Integrated_Webcam_HD.
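
       A minimal sketch of how such a blacklist might be applied is shown
       below. The device names are illustrative examples of the SANE naming
       style, the helper is hypothetical, and matching the pattern anywhere in
       the name (rather than only at the start) is an assumption.

```python
import re


def filter_devices(device_names, blacklist_pattern):
    """Drop every device whose SANE device name matches the pattern."""
    if not blacklist_pattern:
        return list(device_names)
    pattern = re.compile(blacklist_pattern)
    return [name for name in device_names if not pattern.search(name)]


# Illustrative device names, not output from a real system.
devices = ["v4l:/dev/video0", "epson2:libusb:001:004"]
print(filter_devices(devices, "/dev/video"))
```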

       Default filename	for PDF	or DjVu	files

       All  strftime  codes  (e.g.  %Y	for the	current	year) are available as
       variables, with the following additions:

       %Da author

       %De filename extension

       %Dt title

       All document date codes use strftime codes with a leading D, e.g.:

       %DY document year

       %Dm document month

       %Dd document day
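
       The template codes above could be expanded roughly as follows. This is
       a sketch under assumptions, not the program's own logic: the function
       and parameter names are hypothetical, and it assumes that %Da, %De and
       %Dt take precedence over the document-date reading of the same letters.

```python
import re
from datetime import date, datetime


def expand_filename(template, author, title, extension, doc_date, now):
    """Expand strftime codes and the %D additions in a filename template."""
    extras = {"a": author, "t": title, "e": extension}

    def substitute(match):
        is_document_code, code = match.group(1), match.group(2)
        if is_document_code:
            if code in extras:
                return extras[code]
            # Any other %D<code> is a strftime code applied to the document date.
            return doc_date.strftime("%" + code)
        # Plain codes use the current time.
        return now.strftime("%" + code)

    # A single pass, so substituted text is never re-interpreted as a code.
    return re.sub(r"%(D?)(.)", substitute, template)


print(expand_filename("%DY-%Dm-%Dd %Dt.%De", "J. Doe", "Invoice", "pdf",
                      date(2024, 3, 5), datetime(2025, 1, 2, 9, 30)))
# prints "2024-03-05 Invoice.pdf"
```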

   View
       Zoom 100%

       Zooms to	1:1. How this appears depends on the desktop resolution.

       Zoom to fit

       Scales the view such that the whole page is visible.

       Zoom in

       Zoom out

       Rotate 90 clockwise

       The rotate options require the package imagemagick and, if this is  not
       present,	are ghosted out.

       Rotate 180

       Rotate 90 anticlockwise

   Tools
       Threshold

       Changes	all  pixels  darker  than the given value to black; all	others
       become white.
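
       The rule can be illustrated on a tiny greyscale image. The sketch
       assumes 8-bit values where lower numbers are darker; it is plain Python
       for illustration and not how gscan2pdf performs the operation.

```python
def threshold(image, value):
    """Map pixels darker than `value` to black (0), all others to white (255)."""
    return [[0 if pixel < value else 255 for pixel in row] for row in image]


image = [
    [10, 120, 200],
    [128, 127, 255],
]
print(threshold(image, 128))  # [[0, 0, 255], [255, 0, 255]]
```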

       Unsharp mask

       The unsharp option sharpens an image. The image	is  convolved  with  a
       Gaussian	 operator  of the given	radius and standard deviation (sigma).
       For reasonable results, radius should  be  larger  than	sigma.	Use  a
       radius of 0 to have the method select a suitable	radius.
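
       The idea behind unsharp masking can be sketched in one dimension: blur
       the signal with a Gaussian kernel, then add back the difference between
       the original and the blurred copy. The "amount" parameter and the
       edge clamping are assumptions made for this illustration, and the
       automatic radius selection for a radius of 0 is not modelled.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def unsharp_1d(signal, radius, sigma, amount=1.0):
    """Sharpen a 1-D signal: original + amount * (original - blurred)."""
    kernel = gaussian_kernel(radius, sigma)
    last = len(signal) - 1
    result = []
    for i, value in enumerate(signal):
        blurred = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp at the edges
            blurred += weight * signal[j]
        result.append(value + amount * (value - blurred))
    return result


# A step edge gains overshoot on both sides, which reads as "sharper".
print(unsharp_1d([0.0, 0.0, 0.0, 10.0, 10.0, 10.0], radius=2, sigma=1.0))
```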

       Crop

       unpaper

       unpaper	(see <https://www.flameeyes.eu/projects/unpaper>) is a utility
       for cleaning up a scan.

       OCR (Optical Character Recognition)

       The gocr, tesseract or cuneiform	utilities are  used  to	 produce  text
       from an image.

       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.

       In DjVu files, the OCR output buffer is embedded	 in  the  hidden  text
       layer.  Thus these can also be indexed by Beagle.

       There is an interesting review of OCR software at
       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
       An important conclusion was that 400ppi is necessary for decent
       results.

       Up  to  v2.04,  the  only way to	tell which languages were available to
       tesseract was to	look for  the  language	 files.	 Therefore,  gscan2pdf
       checks the path returned	by:

       "tesseract '' ''	-l ''"

       If  there  are  no language files in the	above location,	then gscan2pdf
       assumes that tesseract v1.0 is installed, which had no language files.

       Variables for user-defined tools

       The following variables are available:

       %i  input filename

       %o  output filename

       %r  resolution

       An image	can be modified	in-place by just specifying %i.
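
       As an illustration of how these variables might be filled in, the
       sketch below splits a command template into arguments and substitutes
       %i, %o and %r. The helper and the example command are hypothetical;
       gscan2pdf's actual substitution may differ.

```python
import re
import shlex


def build_tool_command(template, input_file, output_file, resolution):
    """Split a tool command template and substitute %i, %o and %r."""
    values = {"%i": input_file, "%o": output_file, "%r": str(resolution)}
    # Split first, substitute second, so filenames with spaces stay one argument.
    return [
        re.sub(r"%[ior]", lambda match: values[match.group(0)], token)
        for token in shlex.split(template)
    ]


print(build_tool_command("convert %i -density %r -negate %o",
                         "page 1.pnm", "out.pnm", 300))
```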

FAQs
   Why isn't option xyz	available in the scan window?
       Possibly	because	SANE or	your scanner doesn't support it.

       If an option listed in the output of "scanimage --help" that you	 would
       like  to	 use  isn't  available,	 send me the output and	I will look at
       implementing it.

   I've	only got an old	flatbed	scanner	with no	automatic sheetfeeder. How  do
       I scan a	multipage document?
       In Edit/Preferences, tick the box "Allow	batch scanning from flatbed".

       Some  Brother scanners report "out of documents", despite scanning from
       flatbed.	 This can be worked around by ticking the box "Force new  scan
       job between pages".

       If  you	are  lucky, you	have an	option like Wait-for-button or Button-
       wait, where the scanner will wait for you to press the scan  button  on
       the  device  before  it	starts the scan, allowing you to scan multiple
       pages without touching the computer.

       If you are quick, you might be able  to	change	the  document  on  the
       flatbed whilst the scan head is returning.

       Otherwise, you have to set the number of	pages to scan to 1 and hit the
       scan button on the scan window for each page.

   Why is option xyz ghosted out?
       Probably because the package required for that option is not installed.
       Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
       options require imagemagick.
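
       A check of this kind can be sketched with shutil.which. The mapping
       below is illustrative and only covers the two executables named in
       this manual; the function name and the "which" parameter are
       hypothetical.

```python
import shutil

# Menu item -> executable it needs (illustrative, not gscan2pdf's own table).
REQUIRED_TOOLS = {"Email as PDF": "xdg-email", "unpaper": "unpaper"}


def ghosted_items(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the menu items whose executable cannot be found on PATH."""
    return sorted(item for item, exe in required.items() if which(exe) is None)


print(ghosted_items())
```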

   Why can I not scan from the flatbed of my HP	scanner?
       Generally for HP	scanners with an ADF, to scan from  the	 flatbed,  you
       should set "# Pages" to "1", and	possibly "Batch	scan" to "No".

   When	I update gscan2pdf using the Update Manager in Ubuntu, why is the list
       of changes never	displayed?
       As  far	as  I can tell,	this is	pulled from changelogs.ubuntu.com, and
       therefore  only	the  changelogs	 from  official	 Ubuntu	  builds   are
       displayed.

   Why can gscan2pdf not find my scanner?
       If  your	 scanner is not	connected directly to the machine on which you
       are running gscan2pdf and you  have  not	 installed  the	 SANE  daemon,
       saned,  gscan2pdf  cannot  automatically	find it. In this case, you can
       specify the scanner device on the command line:

       "gscan2pdf --device <device>"

   How can I search for text in the OCR layer of the finished PDF or DjVu
       file?
       pdftotext or djvutxt can extract the text layer from PDF or DjVu files.
       See the respective man pages for details.

       Having opened a PDF or DjVu file in evince or Acrobat Reader, the
       search function will typically find the page with the requested text
       and highlight it.

       There are various tools for searching or indexing files, including PDF
       and DjVu:

          (meta) Tracker (<https://projects.gnome.org/tracker/>)

          plone (<http://plone.org/>)

          pdfgrep (<http://pdfgrep.sourceforge.net/>)

          swish-e (<http://www.swish-e.org/>)

          recoll (<http://www.lesbonscomptes.com/recoll/>)

          terrier (<http://terrier.org/>)

   How can I change the	colour of the selection	box in the image viewer?
       Create a	file called  "~/.config/gtk-3.0/gtk.css"  with	the  following
       content:

	.rubberband,
	rubberband,
	flowbox	rubberband,
	treeview.view rubberband,
	.content-view rubberband,
	.content-view .rubberband {
	  border: 1px solid #2a76c6;
	  background-color: rgba(42, 118, 198, 0.2); }

   How can I change the colour of the OCR output?
       Create  a  file	called	"~/.config/gtk-3.0/gtk.css" with the following
       content:

	#gscan2pdf-ocr-output {
	  color: black;
	}

See Also
       XSane (<http://xsane.org/>)

       Scan Tailor (<http://scantailor.org/>)

Author
       Jeffrey Ratcliffe (jffry	at posteo dot net)

Thanks to
          all the people  who	have  sent  patches,  translations,  bugs  and
	   feedback.

          the gtk+ project for	a most excellent graphics toolkit.

          the Gtk3-Perl project for their superb Perl bindings	for GTK3.

          The SANE project for	scanner	access

          Björn Lindqvist for the gtkimageview widget

          Sourceforge for hosting the project.

LICENSE	AND COPYRIGHT
       Copyright (C) 2006--2024	Jeffrey	Ratcliffe <jffry@posteo.net>

       This program is free software: you can redistribute it and/or modify it
       under  the  terms  of  the  version  3  GNU  General  Public License as
       published by the	Free Software Foundation.

       This program is distributed in the hope that it	will  be  useful,  but
       WITHOUT	 ANY   WARRANTY;   without   even   the	 implied  warranty  of
       MERCHANTABILITY or FITNESS FOR  A  PARTICULAR  PURPOSE.	 See  the  GNU
       General Public License for more details.

       You should have received	a copy of the GNU General Public License along
       with this program.  If not, see <https://www.gnu.org/licenses/>.

perl v5.36.3			  2025-05-13			  GSCAN2PDF(1)
