I've updated the cvs for MARC::File::XML with what I described below,
with one caveat.  The one difference from what I was planning is that,
because as_xml() is generated by MARC::Record, I can't give it new
parameters.  To test exporting to XML you'll need to set the record
format for export either in the use line for the module or using the
default_record_format() class method.  Just call that with 'UNIMARC'
as the parameter and then export your record as normal using as_xml()
on the MARC::Record object.

(new_from_xml() does not suffer from this as that method is defined in
MARC/File/XML.pm, so it takes both an encoding parameter and a format
parameter, as explained in the documentation.)
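
A minimal sketch of that workflow, for anyone who wants a starting point. It assumes MARC::Record and the CVS version of MARC::File::XML are installed; the record contents (the 100$a string and the title) are made-up placeholders, not real catalogue data, and error handling is left out.

```perl
use strict;
use warnings;
use MARC::Record;
use MARC::Field;

# Set the script-wide record format on the use line ...
use MARC::File::XML ( RecordFormat => 'UNIMARC' );

# ... or do the same thing at run time with the class method.
MARC::File::XML->default_record_format('UNIMARC');

# Build a tiny UNIMARC-style record (placeholder values for illustration).
my $record = MARC::Record->new();
$record->append_fields(
    # Positions 26-27 of 100$a hold the character set; '50' is the
    # Unicode code mentioned in the quoted message.
    MARC::Field->new( '100', ' ', ' ',
        a => '20060316d2006    u  y0frey50      ba' ),
    MARC::Field->new( '200', '1', ' ', a => 'Example title' ),
);

# Export: as_xml() takes no format argument, so it uses the default set above.
my $xml = $record->as_xml();

# Import: new_from_xml() accepts the encoding and the format explicitly.
my $copy = MARC::Record->new_from_xml( $xml, 'UTF-8', 'UNIMARC' );
print $copy->subfield( '200', 'a' ), "\n";
```

Setting the format both ways is redundant; either one alone should be enough, and the class method is handy when the format is only known at run time.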

Will some brave soul please test this with some UNIMARC records and
let me know how it goes?

-----------------------------------

CVS checkout instructions
  cvs -d:pserver:[EMAIL PROTECTED]:/cvsroot/marcpm login
  cvs -z3 -d:pserver:[EMAIL PROTECTED]:/cvsroot/marcpm co -P marc-xml

Then,
  cd marc-xml
  perl Makefile.PL
  make
  make test

And assuming 'make test' succeeds ...
  make install

-------------------------------

Thanks in advance,

--miker

On 3/16/06, Mike Rylander <[EMAIL PROTECTED]> wrote:
> I've been attempting to beat the MARC::File::XML stuff into a usable
> shape as of late, so I'm going to take a stab at fixing this.  There
> will be some limitations (at first) as to what encodings we'll accept
> for UNIMARC records, but I'll cover the cases that I know about (and
> understand).
>
> Here's the plan:
>
> I will add a use flag to set the script-wide default for record format
>
>   use MARC::File::XML ( RecordFormat => 'UNIMARC' );
>
> that will default to MARC21.  There will also be a class method to set
> this flag
>
>   MARC::File::XML->default_record_format( 'UNIMARC' );
>
> and, finally, a flag to both as_xml and new_from_xml to tell
> MARC::File::XML about individual records.  I don't think, at this
> point, we should autodetect based on the existence of a 200 tag, as
> I'd like to stay away from heuristics if it can be avoided.  If others
> disagree, please make the case!
>
> When processing a UNIMARC record, I'll look in 100$a for the encoding,
> and proceed if it's either 01 (iso646 -- nominally compatible with
> iso8859, though it requires interpretation) or 50 (UNICODE, which will
> always mean UTF8 in XML produced by MARC::File::XML).  If it's
> anything else an error will be thrown.  We can add support for other
> encodings as the direct need arises.
>
> For UNIMARC/UNICODE, the XML is obviously going to be UTF-8 encoded.
> For UNIMARC/ISO646, the XML will be marked as ISO-8859-1.  Yes, it's a
> bit of a fib, but most XML parsers don't support ISO646, and most do
> support LATIN1 (8859-1), and the bytes won't get mangled by the parser
> in that case.
>
> Comments?
>
> On 3/16/06, Zeno Tajoli <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > >PROBLEM :
> > >* in MARC21, the encoding is defined by position 9 of the leader.
> > >'a' means UTF-8
> > >* in UNIMARC, this is an empty position ! the encoding is in
> > >positions 26-27 and 28-29 of 100$a (<200 are all fixed coded fields
> > >in unimarc : http://bibliotheque.bgp-fr.com/Unimarc_abrege.pdf, page
> > >8 for 100$a)
> > >
> > >BIG PROBLEM :
> > >MARC::File::XML only checks for position 9, thinking the XML is
> > >necessarily a marc21 file.
> > >
> > >I think (& joshua agrees) we will have to hack MARC::File::XML to
> > >solve this problem.
> > >We have 2 solutions :
> > >* add a test to define whether we are UNIMARC or MARC21. In UNIMARC,
> > >title is in 200, while 200 is empty in MARC21.
> > >* add a parameter to ->new_as_xml($xml,'UTF-8','UNIMARC') to specify
> > >we are sending the parser an unimarc file.
> >
> > As a person who has written a UNIMARC -> MARC21 converter, I prefer
> > the second solution.
> >
> > Thanks for all
> > Bye
> >
> > Zeno Tajoli
> > CILEA - Segrate (MI)
> > tajoliAT_SPAM_no_prendiATcilea.it
> > (Address masked against spam; replace everything between the ATs with @)
> >
> >
>
>
> --
> Mike Rylander
> [EMAIL PROTECTED]
> GPLS -- PINES Development
> Database Developer
> http://open-ils.org
>


--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org
