PULLNEWS(1)		  InterNetNews Documentation		   PULLNEWS(1)

NAME
       pullnews	- Pull news from multiple news servers and feed	it to another

SYNOPSIS
       pullnews	[-BhnOqRx] [-a hashfeed] [-b fraction] [-c config] [-C width]
       [-d level] [-f fraction]	[-F fakehop] [-g groups] [-G newsgroups] [-H
       headers]	[-k checkpt] [-l logfile] [-L size] [-m	header_pats] [-M num]
       [-N timeout] [-p	port] [-P hop_limit] [-Q level]	[-r file] [-s to-
       server[:port][_tlsmode]]	[-S max-run] [-t retries] [-T connect-pause]
       [-w num]	[-z article-pause] [-Z group-pause] [from-server ...]

REQUIREMENTS
       The "Net::NNTP" module must be installed.  This module is available as
       part of the libnet distribution and comes with recent versions of Perl.
       For older versions of Perl, you can download it from
       <http://www.cpan.org/>.

DESCRIPTION
       pullnews	reads a	config file named pullnews.marks, and connects to the
       upstream	servers	given there as a reader	client.	 This file is looked
       for in pathdb when pullnews is run as the user set in runasuser in
       inn.conf	(which is by default the "news"	user); otherwise, this file is
       looked for in the running user's	home directory.

       By default, pullnews connects to all servers listed in the
       configuration file.  To limit pullnews to specific servers, list them
       on the command line as whitespace-separated from-server arguments.
       For each server it connects to, pullnews pulls over articles and feeds
       them to the destination server via the IHAVE or POST commands.  This
       means that the system pullnews is run on must have feeding access to
       the destination news server.

       pullnews	is designed for	very small sites that do not want to bother
       setting up traditional peering and is not meant for handling large
       feeds.

       If you have running peers and do not want to propagate to them the
       articles you are pulling from upstream servers, add a fake hop with
       the -F flag to all the pulled articles, and put that same fake hop in
       the exclusion sub-field of every site in your newsfeeds file which
       should not receive these articles.  For example, when using
       "pullnews -F myserverimported", change "sitename:*:Tm:innfeed!" to
       "sitename/myserverimported:*:Tm:innfeed!" for every sitename in
       newsfeeds you don't want to feed the pulled articles to (like your
       outgoing peers and a possible "inpaths!" entry).  Entries like "ME",
       "controlchan!", "innfeed!" or "nocem!" do not need that exclusion.

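       As a rough illustration of the newsfeeds change described above, the
       following Python sketch adds a fake hop to the exclusion sub-field of
       one entry.  The helper name add_exclusion is hypothetical and not part
       of INN; the sketch assumes a single-line entry without continuation
       lines.

```python
def add_exclusion(entry: str, fakehop: str) -> str:
    """Add fakehop to the exclusion sub-field of a one-line newsfeeds entry."""
    # The first colon-separated field is "sitename" or "sitename/excl1,excl2".
    site, sep, rest = entry.partition(":")
    name, _, excludes = site.partition("/")
    hops = [hop for hop in excludes.split(",") if hop]
    if fakehop not in hops:
        hops.append(fakehop)
    return name + "/" + ",".join(hops) + sep + rest


print(add_exclusion("sitename:*:Tm:innfeed!", "myserverimported"))
```
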
OPTIONS
       -a hashfeed
	   This option is a deterministic way to control the flow of articles
	   and to split a feed.  The hashfeed parameter must be in the form
	   "value/mod" or "start-end/mod".  The Message-ID of each article is
	   hashed using MD5, which results in a 128-bit hash.  The lowest
	   32 bits are then taken by default as the hashfeed value (which is
	   an integer).  If the hashfeed value modulo "mod", plus one, equals
	   "value" or lies between "start" and "end" (inclusive), pullnews
	   will feed the article.  All these numbers must be integers.

	   For instance:

	       pullnews	-a 1/2	    Feeds about	50% of all articles.
	       pullnews	-a 2/2	    Feeds the other 50%	of all articles.

	   Another example:

	       pullnews	-a 1-3/10   Feeds about	30% of all articles.
	       pullnews	-a 4-5/10   Feeds about	20% of all articles.
	       pullnews	-a 6-10/10  Feeds about	50% of all articles.

	   You can use an extended syntax of the  form	"value/mod:offset"  or
	   "start-end/mod:offset"  (using an underscore	"_" instead of a colon
	   ":" is also recognized).  As	MD5 generates a	128-bit	return	value,
	   it is possible to specify from which	byte-offset the	32-bit integer
	   used	 by  hashfeed  starts.	The default value for "offset" is ":0"
	   and thirteen	overlapping values from	":0" to	 ":12"	can  be	 used.
	   Only	 up to four totally independent	values exist: ":0", ":4", ":8"
	   and ":12".

	   The offset therefore allows a second level of deterministic
	   distribution.  For instance, a feed already split with "1/2" can
	   be split further with "1-3/9:4".  Up to four levels of
	   deterministic distribution can be used.

	   The algorithm is compatible with the	one used by Diablo 5.1 and up.
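
	   The selection rule can be sketched in Python as follows.  This is
	   an illustration, not the pullnews code: the function names are
	   hypothetical, and two details are assumptions rather than facts
	   stated here, namely that the Message-ID is hashed as given
	   (angle brackets included) and that the digest is read as a
	   big-endian number, so that offset 0 selects its last four bytes.

```python
import hashlib


def hashfeed_value(message_id: str, offset: int = 0) -> int:
    """Return the 32-bit integer taken from the MD5 of a Message-ID."""
    if not 0 <= offset <= 12:
        raise ValueError("offset must be between 0 and 12")
    digest = hashlib.md5(message_id.encode("utf-8")).digest()  # 16 bytes
    # Assumption: the digest is one big-endian 128-bit number, so offset 0
    # selects its last four bytes (the lowest 32 bits).
    start = 12 - offset
    return int.from_bytes(digest[start:start + 4], "big")


def parse_hashfeed(spec: str):
    """Parse "value/mod[:offset]" or "start-end/mod[:offset]"."""
    spec = spec.replace("_", ":")  # "_" is accepted instead of ":"
    rng, _, tail = spec.partition("/")
    mod, _, offset = tail.partition(":")
    start, dash, end = rng.partition("-")
    low = int(start)
    high = int(end) if dash else low
    return low, high, int(mod), int(offset or 0)


def hashfeed_match(message_id: str, spec: str) -> bool:
    """True if an article with this Message-ID falls in the hashfeed slice."""
    low, high, mod, offset = parse_hashfeed(spec)
    bucket = hashfeed_value(message_id, offset) % mod + 1  # in 1..mod
    return low <= bucket <= high
```

	   With this sketch, every Message-ID matches exactly one of
	   "1-3/10", "4-5/10" and "6-10/10", which is what makes the split
	   deterministic.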

       -b fraction
	   Backtrack  on  server numbering reset.  Specify the proportion (0.0
	   to 1.0) of a	group's	articles to pull  when	the  server's  article
	   number is less than our high	for that group.	 When fraction is 1.0,
	   pull	all the	articles on a renumbered server.  The default is to do
	   nothing.

       -B  Feed	is header-only,	that is	to say pullnews	only feeds the headers
	   of  the  articles,  plus  one blank line.  It adds the Bytes	header
	   field if the	article	does not already have one, and keeps the  body
	   only	if the article is a control article.

       -c config
	   Normally,  the  config  file	 is stored in pullnews.marks in	pathdb
	   when	pullnews is run	as the news user, or otherwise in the  running
	   user's  home	directory.  If -c is given, config will	be used	as the
	   config file instead.	 This is useful	if you're running pullnews  as
	   a system user on an automated basis out of cron or as an individual
	   user, rather	than the news user.

	   See "CONFIG FILE" below for the format of this file.

       -C width
	   Use	width characters per line for the progress table.  The default
	   value is 50.

       -d level
	   Set the debugging level to  the  integer  level  (up	 to  4);  more
	   debugging  output  will  be	logged as this increases.  The default
	   value is 0.

       -f fraction
	   This	changes	the proportion of articles to get from each  group  to
	   fraction  and  should  be  in  the  range 0.0 to 1.0	(1.0 being the
	   default).

       -F fakehop
	   Prepend fakehop as a	host to	the Path header	field body of articles
	   fed.

       -g groups
	   Specify a collection	of  groups  to	get.   groups  is  a  list  of
	   newsgroups  separated  by  commas  (only  commas, no	spaces).  Each
	   group must be defined in the	config file, and only the remote hosts
	   that	carry those groups will	be contacted.  Note  that  this	 is  a
	   simple  list	of groups, not a wildmat expression, and wildcards are
	   not supported.

       -G newsgroups
	   Add the comma-separated list	of groups newsgroups to	each server in
	   the configuration file (see also -g and -w).

       -h  Print a usage message and exit.

       -H headers
	   Remove these	named header fields (colon-separated  list)  from  fed
	   articles.

       -k checkpt
	   Checkpoint  (save)  the config file every checkpt articles (default
	   is 0, that is to say	at the end of the session).

       -l logfile
	   Log progress/stats to logfile (default is "stdout").

       -L size
	   Specify the largest wanted article size in bytes.  The default is
	   to download all articles, whatever their size.  When this option is
	   used, pullnews will first retrieve overview data (if available) of
	   each newsgroup to process so as to obtain article sizes, before
	   deciding which articles to actually download.

       -m header_pats
	   Feed	 an article based on header field body matching.  The argument
	   is a	number of whitespace-separated	tuples	(each  tuple  being  a
	   colon-separated  header  field  name	 and regular expression).  For
	   instance:

	       -m "Hdr1:regexp1	!Hdr2:regexp2 #Hdr3:regexp3 !#Hdr4:regexp4"

	   specifies that the article will be passed only if the "Hdr1"	header
	   field body matches "regexp1"	and the	"Hdr2" header field body  does
	   not	match  "regexp2".   Besides,  if  the "Hdr3" header field body
	   matches "regexp3", that header is removed; and if the "Hdr4"	header
	   field body does not match "regexp4",	that header is removed.
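
	   The four tuple forms can be sketched in Python as follows.  The
	   function names are hypothetical, and the sketch simplifies an
	   article to a dictionary of header field names and bodies.  How a
	   missing header field is handled, and whether names and patterns
	   are case-sensitive, are assumptions here.

```python
import re


def parse_header_pats(spec: str):
    """Split the -m argument into (negate, strip, header, regex) tuples."""
    rules = []
    for item in spec.split():
        m = re.match(r"(!?)(#?)([^:]+):(.*)$", item)
        if m is None:
            raise ValueError("bad tuple: " + item)
        negate, strip, name, pattern = m.groups()
        rules.append((bool(negate), bool(strip), name, re.compile(pattern)))
    return rules


def apply_header_pats(headers: dict, rules):
    """Return the filtered headers, or None if the article is not passed."""
    kept = dict(headers)
    for negate, strip, name, regex in rules:
        # Assumption: a missing header field is treated as an empty body.
        matched = regex.search(headers.get(name, "")) is not None
        hit = matched != negate  # "!" inverts the sense of the match
        if strip:
            if hit:
                kept.pop(name, None)  # "#" rules remove the header field
        elif not hit:
            return None  # a plain rule failed: do not feed the article
    return kept
```
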

       -M num
	   Specify the maximum number of articles (per group) to process.  The
	   default is to process all new articles.  See	also -f.

       -n  Do nothing but read articles: do not feed articles downstream,
	   write no rnews file, and do not update the config file.

       -N timeout
	   Specify the timeout, in seconds, to use when establishing an NNTP
	   connection.

       -O  Use an optimized mode: pullnews checks whether the article  already
	   exists  on  the  downstream	server,	before downloading it.	It may
	   help	for huge articles or a slow link to upstream hosts.

       -p port
	   Connect to the destination news server on a	port  other  than  the
	   default  of	119.   This  option  does  not change the port used to
	   connect to the source news servers.

       -P hop_limit
	   Restrict feeding an article based on	the  number  of	 hops  it  has
	   already  made.   Count  the	hops  in  the  Path  header field body
	   (hop_count),	feeding	the article only when hop_limit	is "+num"  and
	   hop_count is	more than num; or hop_limit is "-num" and hop_count is
	   less	than num.
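
	   A minimal Python sketch of this decision follows.  The function
	   names are hypothetical, and counting every "!"-separated element
	   of the Path header field body as one hop is an assumption about
	   how hop_count is obtained.

```python
def count_hops(path_body: str) -> int:
    # Assumption: each "!"-separated element of the Path body is one hop.
    return len([hop for hop in path_body.strip().split("!") if hop])


def passes_hop_limit(path_body: str, hop_limit: str) -> bool:
    """Apply a "+num" or "-num" hop_limit to a Path header field body."""
    sign = hop_limit[:1]
    if sign not in ("+", "-"):
        raise ValueError("hop_limit must start with + or -")
    num = int(hop_limit[1:])
    hops = count_hops(path_body)
    if sign == "+":
        return hops > num  # feed only articles with more than num hops
    return hops < num      # feed only articles with fewer than num hops
```
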

       -q  Print out less status information while running.

       -Q level
	   Set the quietness level ("-Q	2" is equivalent to "-q").  The	higher
	   this	value, the less	gets logged.  The default is 0.

       -r file
	   Rather  than	 feeding  the  downloaded  articles  to	 a destination
	   server, instead create a batch file that can	 later	be  fed	 to  a
	   server  using  rnews.   See rnews(1)	for more information about the
	   batch file format.

       -R  Act as a reader (use MODE READER and POST commands) towards the
	   downstream server.  Some posts will then be rejected because of
	   unexpected injection header fields, obsolete or incorrectly
	   formatted header fields, or a date too far in the past.  You may
	   then want to set artcutoff to 0 in inn.conf, and use the -H flag
	   to strip unwanted header fields.  Even with that, a few articles
	   may still be rejected.

	   The default is to behave like a feeder and use the  IHAVE  command.
	   (You'll  have  to  allow  in	 incoming.conf	the  connections  from
	   pullnews so that it is recognized as	a feeder.)

       -s to-server[:port][_tlsmode]
	   Normally, pullnews will feed	the articles it	retrieves to the  news
	   server  running  on	localhost.   To	 connect  to a different host,
	   specify a server with the -s	flag.  You can also specify  the  port
	   with	this same flag or use -p.  Default port	is 119.

	   The	connection  is	by  default  unencrypted.   To negotiate a TLS
	   encryption layer, you can set tlsmode to  "TLS"  for	 implicit  TLS
	   (negotiated	immediately  upon  connection  on a dedicated port) or
	   "STARTTLS" for explicit TLS (the appropriate	command	will  be  sent
	   before authenticating or feeding messages).	Examples of use	are:

	       pullnews	-s news.server.com
	       pullnews	-s news.server.com_STARTTLS
	       pullnews	-s news.server.com:433_TLS

	   Note	that not all NNTP servers implement TLS	for feeding articles.
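
	   The to-server[:port][_tlsmode] syntax can be taken apart as in the
	   Python sketch below.  The regular expression and the function name
	   are hypothetical; the sketch assumes a host name without ":" or
	   "_" (so no IPv6 literals) and applies the documented default port
	   of 119 whenever no port is given.

```python
import re

# host, optional ":port", optional "_TLS" or "_STARTTLS"
SERVER_SPEC = re.compile(
    r"^(?P<host>[^:_\s]+)(?::(?P<port>\d+))?(?:_(?P<tls>TLS|STARTTLS))?$"
)


def parse_server(spec: str, default_port: int = 119):
    """Return (host, port, tlsmode); tlsmode is None when unencrypted."""
    m = SERVER_SPEC.match(spec)
    if m is None:
        raise ValueError("unrecognized server specification: " + spec)
    port = int(m.group("port")) if m.group("port") else default_port
    return m.group("host"), port, m.group("tls")


for example in ("news.server.com",
                "news.server.com_STARTTLS",
                "news.server.com:433_TLS"):
    print(example, "->", parse_server(example))
```
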

       -S max-run
	   Specify the maximum time max-run in seconds for pullnews to run.

       -t retries
	   The	maximum	number (retries) of attempts to	connect	to a server or
	   reconnect to	a server if the	socket	is  unexpectedly  closed  (see
	   also	-T).  The default is 0.

       -T connect-pause
	   Pause  connect-pause	 seconds  between connection retries (see also
	   -t).	 The default is	1.

       -w num
	   Set each group's high water mark (last received article number)  to
	   num.	  If  num is negative, calculate Current+num instead (i.e. get
	   the last num	articles).  Therefore, a num  of  0  will  re-get  all
	   articles  on	 the  server;  whereas	a  num of "-0" will get	no old
	   articles, setting the  water	 mark  to  Current  (the  most	recent
	   article on the server).
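
	   The difference between "0" and "-0" is easier to see in a short
	   Python sketch.  The function name is hypothetical; num is kept as
	   a string because the leading minus sign, not the numeric value,
	   selects the relative form.

```python
def new_high_water_mark(num: str, current: int) -> int:
    """Compute the high water mark that -w num would set for one group.

    current is the number of the most recent article on the server.
    """
    if num.startswith("-"):
        # Relative form: Current+num.  int("-0") is 0, so "-0" gives Current.
        return current + int(num)
    return int(num)  # absolute form: "0" means everything gets pulled again


for arg in ("0", "-0", "-10", "500"):
    print(arg, "->", new_high_water_mark(arg, 783))
```
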

       -x  If  the  -x	flag  is  used,	 an  Xref header field is added	to any
	   article that	lacks one.  It can be useful for instance if  articles
	   are fed to a	news server which has xrefslave	set in inn.conf.

       -z article-pause
	   Sleep article-pause seconds between articles.  The default is 0.

       -Z group-pause
	   Sleep group-pause seconds between groups.  The default is 0.

CONFIG FILE
       The config file for pullnews is divided into blocks, one	block for each
       remote  server to connect to.  A	block begins with the host line	(which
       must have no leading whitespace)	and contains just the hostname of  the
       remote  server with optional port and TLS mode (with the	same semantics
       as  the	-s  flag),  optionally	followed  by  authentication   details
       (username  and  password	 for  that  server).  Note that	authentication
       details can also	be provided for	the downstream server (a host line for
       "localhost" or the hostname specified with the -s flag could  be	 added
       for it in the configuration file, with no newsgroup to fetch).

       Following  the  host  line  should be one or more newsgroup lines which
       start with whitespace followed by the name of a newsgroup to  retrieve.
       Only one	newsgroup should be listed on each line.

       pullnews	 will update the config	file to	include	the time the group was
       last checked and	the highest numbered  article  successfully  retrieved
       and  transferred	to the destination server.  It uses this data to avoid
       doing duplicate work the	next time it runs.

       The full	syntax is:

	   <host>[:<port>][_<tlsmode>] [<username> <password>]
	       <group> [<time> <high>]
	       <group> [<time> <high>]

       where the <host>	line must not have leading whitespace and the  <group>
       lines must.

       A typical configuration file would be:

	   # Format: group date	high
	   data.pa.vix.com
	       rec.bicycles.racing 908086612 783
	       rec.humor.funny 908086613 18
	       comp.programming.threads
	   nnrp.vix.com	pull sekret
	       comp.std.lisp
	   news.server.com:563_TLS joe password
	       news.software.nntp

       Note that an earlier run of pullnews has filled in details about the
       last articles downloaded from the two rec.* groups.  The two comp.*
       groups and the news.* group were just added by the user and have not
       yet been checked.

       The nnrp.vix.com	server requires	authentication,	and pullnews will  use
       the  username  "pull" and the password "sekret" (without	any encryption
       layer).

       The connection to news.server.com will be encrypted with	 implicit  TLS
       on port 563.  Joe's password won't be sent in plaintext.
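
       The block structure of this file can be read with a few lines of
       Python, as sketched below on the typical configuration shown above.
       This is not how pullnews itself parses the file: the function name
       and the returned data layout are hypothetical, and skipping blank
       lines and "#" comment lines is an assumption based on the sample.

```python
SAMPLE = """\
# Format: group date high
data.pa.vix.com
    rec.bicycles.racing 908086612 783
    rec.humor.funny 908086613 18
    comp.programming.threads
nnrp.vix.com pull sekret
    comp.std.lisp
news.server.com:563_TLS joe password
    news.software.nntp
"""


def parse_marks(text: str):
    """Parse pullnews.marks-style text into a list of server blocks."""
    servers = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # assumption: blank lines and comments are skipped
        fields = line.split()
        if not line[0].isspace():
            # Host line: <host>[:<port>][_<tlsmode>] [<username> <password>]
            has_auth = len(fields) >= 3
            servers.append({
                "host": fields[0],
                "user": fields[1] if has_auth else None,
                "password": fields[2] if has_auth else None,
                "groups": {},
            })
        else:
            # Group line: <group> [<time> <high>]
            if not servers:
                raise ValueError("group line before any host line")
            marks = (int(fields[1]), int(fields[2])) if len(fields) >= 3 else None
            servers[-1]["groups"][fields[0]] = marks
    return servers


for server in parse_marks(SAMPLE):
    print(server["host"], sorted(server["groups"]))
```

       Groups that have not been checked yet map to None in this sketch,
       which mirrors the missing <time> and <high> fields.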

FILES
       pathbin/pullnews
	   The	Perl script itself used	to pull	news from upstream servers and
	   feed	it to another news server.

       pathdb/pullnews.marks or	~/pullnews.marks
	   The default config file.  It	is stored in pullnews.marks in	pathdb
	   when	 pullnews is run as the	news user, or otherwise	in the running
	   user's home directory.

HISTORY
       pullnews	was written by James Brister for INN.  The  documentation  was
       rewritten in POD	by Russ	Allbery	<eagle@eyrie.org>.

       Geraint A. Edwards greatly improved pullnews, adding no fewer than
       16 new recognized flags, fixing some bugs and integrating the
       backupfeed contrib script by Kai Henningsen, which added another
       6 flags.

SEE ALSO
       incoming.conf(5), rnews(1).

INN 2.8.0			  2024-01-27			   PULLNEWS(1)
