Net::OAI::Record::NamespaceFilter(3)  User Contributed Perl Documentation  Net::OAI::Record::NamespaceFilter(3)

       Net::OAI::Record::NamespaceFilter - general filter class	based on
       namespace URIs

	$plug =	Net::OAI::Record::NamespaceFilter->new(); # Noop

	$multihandler = Net::OAI::Record::NamespaceFilter->new(
	   '' => 'Net::OAI::Record::OAI_DC',
	   '' => 'MySAX::ProvenanceHandler'
	);

	$saxfilter = SOME_SAX_Filter->new();
	$filter = Net::OAI::Record::NamespaceFilter->new(
	   '*' => $saxfilter, # '*' for any namespace
	);

	$filter = Net::OAI::Record::NamespaceFilter->new(
	  '*' => sub { my $x = "";
		       return XML::SAX::Writer->new(Output => \$x);
		     },
	);

       It will forward any element belonging to	a namespace from this list to
       the associated SAX filter and all of the	element's children (regardless
       of their	respective namespace) to the same one. It can be used either
       as a "metadataHandler" or "recordHandler".
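
       The forwarding rule can be illustrated with a small, self-contained
       sketch. It is not the module's implementation: the event list, the
       "urn:example:..." namespace URIs and the variable names are all
       invented for illustration. It only shows that once an element in a
       listed namespace has been seen, its whole subtree is forwarded,
       whatever the namespaces of the children.

```perl
use strict;
use warnings;

# Simplified event stream: [ 'start' | 'end', namespace URI, local name ].
# All URIs and element names are made up for this illustration.
my @events = (
    [ start => 'urn:example:outer', 'record' ],
    [ start => 'urn:example:dc',    'dc'     ],
    [ start => 'urn:example:other', 'title'  ],   # child in a foreign namespace
    [ end   => 'urn:example:other', 'title'  ],
    [ end   => 'urn:example:dc',    'dc'     ],
    [ start => 'urn:example:skip',  'about'  ],
    [ end   => 'urn:example:skip',  'about'  ],
    [ end   => 'urn:example:outer', 'record' ],
);

my %wanted = ( 'urn:example:dc' => 1 );   # namespaces with a handler attached
my @forwarded;                            # what the handler would get to see
my $depth = 0;                            # > 0 while inside a matched element

for my $event (@events) {
    my ( $type, $ns, $name ) = @{$event};
    if ( $type eq 'start' ) {
        # Forward if already inside a matched subtree or if the namespace matches.
        if ( $depth > 0 || $wanted{$ns} ) {
            $depth++;
            push @forwarded, "start $name";
        }
    }
    elsif ( $depth > 0 ) {
        push @forwarded, "end $name";
        $depth--;
    }
}

print "$_\n" for @forwarded;
```

       In this sketch the "title" element is forwarded although its
       namespace is not listed, because it is a child of the matched "dc"
       element; "record" and "about" are never forwarded.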

       This SAX filter takes a hashref "namespaces" as argument, with
       namespace URIs for keys ('*' for "any namespace"). Each value is one
       of the following:

       undef
	   Matching elements and their subelements are suppressed.

	   If the list of namespaces is empty or "undefined" is connected to
	   the filter, it effectively acts as a plug to Net::OAI::Harvester.
	   This might come in handy if you are planning to get to the raw
	   result by other means, e.g. by tapping the user agent or accessing
	   the result's xml() method:

	    $plug = Net::OAI::Record::NamespaceFilter->new();
	    $harvester = Net::OAI::Harvester->new(
		baseURL => ...,
		);

	    $tapped_by_ua = "";
	    open ($TAP,	">", \$tapped_by_ua);
	    $harvester->userAgent()->add_handler(response_data => sub {
		   my($response, $ua, $h, $data) = @_;
		   print $TAP $data;
		   return 1; # a true value keeps the handler active for later chunks
		});

	    $list = $harvester->listRecords(
	       metadataPrefix  => 'a_strange_one',
	       recordHandler => $plug,
	       );

	    print $tapped_by_ua; # complete OAI	response
	    print $list->xml();	 # should be exactly the same

	   Comment: This is quite an efficient way of not processing the XML
	   content of OAI records received.

       a class name of a SAX filter
	   As usual, a new instance is created for each record element of the
	   OAI response.

	     # end_document() of instances of MyWriter returns something meaningful...
	     $consumer = Net::OAI::Record::NamespaceFilter->new('*'=> 'MyWriter');

	     $filter = Net::OAI::Record::NamespaceFilter->new(
		 '*' => $consumer
		 );

	     $list = $harvester->listAllRecords(
		metadataPrefix  => 'oai_dc',
		recordHandler => $filter,
		);

	     while( $r = $list->next() ) {
		next if $r->status() eq "deleted";
		$xmlstringref = $r->recorddata()->result('*');
	     }

	   Note: The handlers are instantiated for each	single OAI record in
	   the response	and will see one start_document() and end_document()
	   event in any	case (this behavior is different from that of handler
	   class names directly	specified as "metadataHandler" or
	   "recordHandler" for a request: instances from those constructions
	   will	never see such events).

       a code reference for a constructor
	   Must	return a SAX filter ready to accept a new document.

	   The following example returns a string serialization	for each
	   single record:

	    # end_document() events will return	\$x
	    $constructor = sub { my $x = "";
				 return XML::SAX::Writer->new(Output => \$x);
			       };
	    $filter = Net::OAI::Record::NamespaceFilter->new(
		 '*' => $constructor
		 );

	    $list = $harvester->listRecords(
		metadataPrefix  => 'oai_dc',
		recordHandler => $filter,
		);

	    while( $r = $list->next() ) {
		$xmlstringref = $r->recorddata()->result('*');
	    }

	   Comment: This example shows an approach to isolate the "true
	   contents" of individual response records without having to provide
	   a SAX handler class of one's own (the only additional prerequisite
	   is XML::SAX::Writer). But what you get is a serialized XML document
	   which then has to be parsed again for further processing ...

       an already instantiated SAX filter
	   As usual in this case, no "start_document()" and "end_document()"
	   events are forwarded to the filter.

	    open $fh, ">", $some_file or die "Cannot open $some_file: $!";
	    $builder = XML::SAX::Writer->new(Output => $fh);
	    $rootEL = { Name => 'collection',
		      LocalName => 'collection',
		   NamespaceURI => "",
			 Prefix => "",
		     Attributes => {}
		      };
	    $builder->start_element( $rootEL );

	    # filter for OAI-Namespace in records: forward all
	    $filter = Net::OAI::Record::NamespaceFilter->new(
		 '' => $builder);

	    $list = $harvester->listRecords(
		metadataPrefix  => 'a_strange_one',
		metadataHandler => $filter,
		);
	    # handle resumption	tokens if more than the	first
	    # chunk shall be stored into $fh ....

	    $builder->end_element( $rootEL );
	    # ... process contents of $some_file

	   In this example calling the "result()" method for individual
	   records in the response will	probably not be	of much	use.

       Caution: Depending on the namespaces specified, even handlers which
       are freshly instantiated for each OAI record might be fed with more
       than one top-level XML element.
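
       For the class-name case described above, a handler class could look
       roughly like the following sketch. The class body is hypothetical (the
       name MyWriter is borrowed from the example above) and it assumes that
       only the standard Perl SAX2 events shown here reach the handler; a
       real handler would probably inherit from XML::SAX::Base so that any
       other events are accepted silently. The events are fed in by hand here
       so that the sketch runs without a harvester.

```perl
use strict;
use warnings;

# Hypothetical handler class: collects the character data of one record
# and hands a reference to it back from end_document().
package MyWriter;

sub new {
    my ($class) = @_;
    return bless { text => '' }, $class;
}

# Called once per record: start with an empty buffer.
sub start_document { my ($self) = @_; $self->{text} = ''; return; }

# Element boundaries are ignored in this sketch.
sub start_element { return; }
sub end_element   { return; }

# Perl SAX2 passes character data as a hashref with a "Data" key.
sub characters {
    my ( $self, $chars ) = @_;
    $self->{text} .= $chars->{Data};
    return;
}

# Whatever is returned here is what result() is documented to hand back.
sub end_document { my ($self) = @_; return \$self->{text}; }

package main;

# Drive the handler by hand with the events of one small record.
my $title = {
    Name         => 'title',
    LocalName    => 'title',
    NamespaceURI => '',
    Prefix       => '',
    Attributes   => {},
};
my $handler = MyWriter->new();
$handler->start_document( {} );
$handler->start_element($title);
$handler->characters( { Data => 'An example title' } );
$handler->end_element($title);
my $result = $handler->end_document( {} );

print ${$result}, "\n";
```

       Returning a reference from end_document() mirrors the XML::SAX::Writer
       examples above, where the result is a reference to the output string.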

   new(	[%namespaces] )
       Creates a Handler suitable as recordHandler or metadataHandler.
       %namespaces has namespace URIs for keys and, for values, one of the
       four types described above.

   result ( [namespace]	)
       If called with a	namespace, it returns the result of the	handler, i.e.
       what "end_document()" returned for the record in	question.  Otherwise
       it returns a hashref for	all the	results	with the corresponding
       namespaces as keys.
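
       As a sketch of the form without an argument: assuming the handlers
       returned scalar references from "end_document()" (as in the
       XML::SAX::Writer examples above), the hashref could be walked as
       follows. The $results literal merely stands in for the value of
       "$r->recorddata()->result()"; its keys and XML strings are invented
       for illustration.

```perl
use strict;
use warnings;

# Stand-in for: my $results = $r->recorddata()->result();
# The namespace keys and XML strings are made up for this illustration.
my $dc_xml   = '<dc><title>Example</title></dc>';
my $prov_xml = '<provenance/>';
my $results  = {
    'urn:example:dc'         => \$dc_xml,
    'urn:example:provenance' => \$prov_xml,
};

my @report;
for my $ns ( sort keys %{$results} ) {
    my $value = $results->{$ns};
    next unless defined $value;    # defensive: skip entries without a result

    # Dereference scalar references, stringify anything else.
    my $text = ref $value eq 'SCALAR' ? ${$value} : "$value";
    push @report, sprintf( '%s: %d characters', $ns, length $text );
}

print "$_\n" for @report;
```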

       Thomas Berger <>

perl v5.32.1                  2016-01-                  Net::OAI::Record::NamespaceFilter(3)

