MARC2RIS(1)			 RefDB Manual			   MARC2RIS(1)

NAME
       marc2ris	- converts MARC	bibliographic data to the RIS format

SYNOPSIS

       marc2ris	[-e log-destination] [-h] [-l log-level] [-L log-file] [-m]
		[-o outfile] [-O outfile] [-t input_type] [-u t|f] file

DESCRIPTION
       marc2ris attempts to extract the information useful to RefDB from MARC
       datasets.  MARC (MAchine-Readable Cataloging) is a standard that
       originated in the 1960s and is widely used by libraries and
       bibliographic agencies. Most libraries that offer Z39.50 access can
       provide their records in at least one MARC format; as with most other
       "standards", there are several variants to choose from. Currently the
       following MARC dialects are supported:

       MARC21
	  This	is  an	attempt	 to consolidate	existing MARC variants (mainly
	  USMARC and CANMARC) and will most likely be the format supported  by
	  all  libraries  in  the  near	future.	The format is described	on the
	  [1]Library of	Congress MARC pages.

       UNIMARC
	  This is the European equivalent of a	standardization	 attempt.  The
	  specification	can be found [2]here.

       UKMARC
	  This format is fairly	close to the USMARC variant and	is mainly used
	  by  libraries	 in  the  United  Kingdom  and	in  Ireland. Libraries
	  supporting  this  format  may	 switch	 to  MARC21  in	 the   future.
	  Unfortunately	 there	is  no	online description of this format, but
	  this [3]PDF document describes the main differences  between	USMARC
	  and UKMARC.

OPTIONS
       By default marc2ris reads MARC21 data from stdin (see the -t option)
       and sends RIS data to stdout.

       -e log-destination
	  log-destination can have the values 0, 1, or 2,  or  the  equivalent
	  strings  stderr, syslog, or file, respectively. This value specifies
	  where	the log	information goes to.  0	(zero) means the messages  are
	  sent	to  stderr.  They  are immediately available on	the screen but
	  they may interfere with command output.  1 will send the  output  to
	  the  syslog facility.	Keep in	mind that syslog must be configured to
	  accept log messages from user	programs, see the syslog(8)  man  page
	  for  further	information.  Unix-like	 systems  usually  save	 these
	  messages in /var/log/user.log.  2 will send the messages to a	custom
	  log file which can be	specified with the -L option.

       -h Displays help	and usage screen, then exits.

       -l log-level
	  Specify the priority up to which events are logged. This is either a
	  number between 0 and 7 or one	of the	strings	 emerg,	 alert,	 crit,
	  err,	warning, notice, info, debug, respectively (see	also Log level
	  definitions).	 -1 disables logging completely. A low log level  like
	  0  means  that  only the most	critical messages are logged. A	higher
	  log level means that less critical events are	 logged	 as  well.   7
	  will include debug messages. The latter can be verbose and abundant,
	  so  you  want	 to avoid this log level unless	you need to track down
	  problems.
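
       The correspondence between the level names and the numbers 0 to 7
       listed above can be sketched as a small shell helper. The function
       name loglevel_to_number is hypothetical and not part of marc2ris; it
       only restates the mapping given in this section.

```shell
# Hypothetical helper: map a log level name to the number accepted by -l.
# The name/number pairs follow the order given for the -l option.
loglevel_to_number() {
    case "$1" in
        emerg)   echo 0 ;;
        alert)   echo 1 ;;
        crit)    echo 2 ;;
        err)     echo 3 ;;
        warning) echo 4 ;;
        notice)  echo 5 ;;
        info)    echo 6 ;;
        debug)   echo 7 ;;
        *)       echo "unknown log level: $1" >&2; return 1 ;;
    esac
}

loglevel_to_number warning
```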

       -L log-file
	  Specify the full path	to a  log  file	 that  will  receive  the  log
	  messages. Typically this would be /var/log/refdba.

       -m Switch on additional MARC output. The output will be the RIS data
          interspersed with the source MARC data used to generate it. This is
          useful for fixing conversion errors manually.

       -o file
	  Send	 output	 to  file.  If	file  exists,  its  contents  will  be
	  overwritten.

       -O file
	  Send output to file. If file exists, the output will be appended.

       -t input_type
	  Specify the MARC input type. The default is MARC21. Other  available
	  types	are UNIMARC and	UKMARC.

       -u t|f
	  Request Unicode output if set	to "t" (this is	the default). marc2ris
	  attempts  to convert the input data into Unicode (unless the dataset
	  explicitly states that it already uses Unicode). If  the  conversion
	  does	not seem to work, set this to "f" as some MARC variants	do not
	  state	the character encoding explicitly.
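
       As an illustration of how the options above combine, the following
       sketch assembles a command line for converting a UNIMARC file. The
       file names records.mrc and records.ris are made up for this example;
       the command is only executed if marc2ris is installed and the input
       file exists, otherwise it is just printed.

```shell
# Hypothetical file names; adjust them to your own data.
infile="records.mrc"
outfile="records.ris"

# UNIMARC input, Unicode output, log to stderr at level "info",
# overwrite the output file if it already exists.
cmd="marc2ris -t UNIMARC -u t -e stderr -l info -o $outfile $infile"

if command -v marc2ris >/dev/null 2>&1 && [ -r "$infile" ]; then
    $cmd
else
    echo "would run: $cmd"
fi
```

       Replacing -o with -O would append to records.ris instead of
       overwriting it.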

CONFIGURATION
       marc2ris	evaluates the file marc2risrc to initialize itself.

       Table 1. marc2risrc
       +-----------+----------------------+---------------------+
       | Variable  | Default		  | Comment		|
       +-----------+----------------------+---------------------+
       | outfile   | (none)		  | The	default	 output	|
       |	   |			  | file name.		|
       +-----------+----------------------+---------------------+
       | outappend | t			  | Determines	whether	|
       |	   |			  | output is  appended	|
       |	   |			  | (t)	 to an existing	|
       |	   |			  | file or  overwrites	|
       |	   |			  | (f)	  an   existing	|
       |	   |			  | file.		|
       +-----------+----------------------+---------------------+
       | unmapped  | t			  | If	 set   to    t,	|
       |	   |			  | unknown tags in the	|
       |	   |			  | input  data	will be	|
       |	   |			  | output following  a	|
       |	   |			  | <unmapped> tag; the	|
       |	   |			  | resulting  data can	|
       |	   |			  | be	inspected   and	|
       |	   |			  | then     be	   sent	|
       |	   |			  | through   sed    to	|
       |	   |			  | strip   off	  these	|
       |	   |			  | additional	 lines.	|
       |	   |			  | If	  set	to   f,	|
       |	   |			  | unknown  tags  will	|
       |	   |			  | be	     gracefully	|
       |	   |			  | ignored.		|
       +-----------+----------------------+---------------------+
       | logfile   | /var/log/med2ris.log | The	full path of  a	|
       |	   |			  | custom   log  file.	|
       |	   |			  | This is  used  only	|
       |	   |			  | if	logdest	 is set	|
       |	   |			  | appropriately.	|
       +-----------+----------------------+---------------------+
       | logdest   | 1			  | The	destination  of	|
       |	   |			  | the		    log	|
       |	   |			  | information.  0   =	|
       |	   |			  | print  to stderr; 1	|
       |	   |			  | =  use  the	 syslog	|
       |	   |			  | facility; 2	= use a	|
       |	   |			  | custom logfile. The	|
       |	   |			  | latter    needs   a	|
       |	   |			  | proper  setting  of	|
       |	   |			  | logfile.		|
       +-----------+----------------------+---------------------+
       | loglevel  | 6			  | The	log level up to	|
       |	   |			  | which messages will	|
       |	   |			  | be	 sent.	 A  low	|
       |	   |			  | setting (0)	 allows	|
       |	   |			  | only    the	   most	|
       |	   |			  | important messages,	|
       |	   |			  | a high setting  (7)	|
       |	   |			  | allows all messages	|
       |	   |			  | including	  debug	|
       |	   |			  | messages. -1  means	|
       |	   |			  | nothing   will   be	|
       |	   |			  | logged.		|
       +-----------+----------------------+---------------------+
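
       A configuration using the variables from Table 1 might look like the
       sketch below. The line syntax (one "variable value" pair per line,
       with "#" starting a comment) is an assumption and is not stated in
       this page; the paths are examples only. The sample is written to a
       scratch file so that no existing configuration is touched.

```shell
# Scratch location for the sample; copy it to $HOME/.marc2risrc only after
# checking the syntax against your installed marc2risrc.
rcfile="${TMPDIR:-/tmp}/marc2risrc.example"

cat > "$rcfile" <<'EOF'
# Assumed syntax: variable name, whitespace, value.
outfile /tmp/marc2ris-output.ris
outappend f
unmapped t
logdest 2
logfile /tmp/marc2ris.log
loglevel 6
EOF

cat "$rcfile"
```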

DATA PROCESSING
       The purpose of the MARC format is entirely different from the purpose
       of the RIS format, so you shouldn't be too surprised that the import of
       MARC data is somewhat rough around the edges. The filter handles quite
       a lot of datasets well, but the following shortcomings are known (and
       more are likely to be discovered by the interested reader):

       o  Some fields, like 846, are currently ignored completely. This, of
          course, is bound to change.

       o  Author names specified in the natural order, i.e. something like
          First Middle Last, are not normalized due to the problems with
          multiple middle or last names. Author names in the inverse order,
          i.e. something like Last, First Middle, are normalized correctly in
          most cases. Handling of non-European names is a matter of trial and
          error.

       o  Character set handling is somewhat limited. Only the unaltered input
          character encoding or UTF-8 are available for the output data.

       That  said,  there  is  still  some  hope.  The	-m command line	option
       switches	on additional MARC output. That	is, the	generated output  will
       contain	interspersed lines that	show the contents of the original MARC
       fields used to generate the following RIS line or lines.	 For  example,
       the  following  output  snippet shows how marc2ris generated the	author
       lines from the MARC input:

	  <marc>empty author field (100)
	  <marc>:Author(Ind1): 1
	  <marc>:Author($a): Ershov, A.	P.
	  <marc>:Author($b):
	  <marc>:Author($c):
	  <marc>:Author(Ind1): 1
	  <marc>:Author($a): Knuth, Donald Ervin,
	  <marc>:Author($b):
	  <marc>:Author($c):
	  AU  -	Ershov,A.P.
	  AU  -	Knuth,Donald Ervin

       If you feel marc2ris does not translate your  data  appropriately,  the
       easiest	way might be to	use the	-m switch and redirect the output into
       a file. Then you	can analyze the	situation and fix the RIS lines	as you
       see fit.	Finally	you can	strip the MARC lines off with a	command	like:

	  ~$ grep -v "<marc>" < withmarc.ris > womarc.ris
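
       The same clean-up can be done with sed, which also makes it easy to
       drop <unmapped> lines in one pass. The sketch below runs on made-up
       sample data in a temporary directory; the exact layout of <unmapped>
       lines is an assumption here (tag and data on the same line, as with
       <marc> lines).

```shell
workdir=$(mktemp -d)

# Made-up sample resembling marc2ris -m output with unmapped = t.
printf '%s\n' \
    'TY  - BOOK' \
    '<marc>:Author($a): Knuth, Donald Ervin,' \
    'AU  - Knuth,Donald Ervin' \
    '<unmapped>:example of an unknown field' \
    > "$workdir/withmarc.ris"

# Delete every line containing either tag; keep the RIS lines.
sed -e '/<marc>/d' -e '/<unmapped>/d' \
    "$workdir/withmarc.ris" > "$workdir/womarc.ris"

cat "$workdir/womarc.ris"
```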

FILES
       /usr/local/etc/refdb/marc2risrc
	  The global configuration file	of marc2ris.

       $HOME/.marc2risrc
	  The user configuration file of marc2ris.

SEE ALSO
       RefDB (7), bib2ris (1), db2ris (1), en2ris (1), med2ris (1).

       RefDB manual (local copy)
       <prefix>/share/doc/refdb-<version>/refdb-manual/index.html

       RefDB manual (web) <[4]http://refdb.sourceforge.net/manual/index.html>

       RefDB on	the web	<[5]http://refdb.sourceforge.net/>

AUTHOR
       marc2ris	was written by Markus Hoenicka <markus@mhoenicka.de>.

REFERENCES
       1. Library of Congress MARC pages
	  http://www.loc.gov/marc/

       2. here
	  http://www.ifla.org/VI/3/p1996-1/sec-uni.htm

       3. PDF document
	  www.bl.uk/services/bibliographic/marcchange.pdf

       4. http://refdb.sourceforge.net/manual/index.html
	  http://refdb.sourceforge.net/manual/index.html

       5. http://refdb.sourceforge.net/
	  http://refdb.sourceforge.net/

2005-10-16			  2005-10-16			   MARC2RIS(1)
