RE: Early Confusion with MARC::Record

2003-11-14 Thread Bryan Baldus
First, an introduction: I am a cataloger/librarian. I have very limited
programming experience (2-3 classes--Pascal, C(++?) and database
introduction), and am attempting to teach myself Perl with Coriolis' Perl
Black Book.  I have only made it through parts of the book, enough to
understand the basics of the various files (modules) in MARC::Record.  

Morbus Iff wrote:
> Is the "title" of a book always considered the title AND author? Or is the
> "author" and "statement of responsibility" the .. . . same thing? Does $a
> always end in a slash if $b doesn't exist? What if $a, $h, $p, $b, and $c
> are all used? Is that order (per the LC UMB doc) the method they should be
> strung together, separated by slashes? Is the ending period, generated by
> MARC::Batch an AACR thingy, or just convention?


According to AACR2 rules, the title proper of the book consists of $a $n $p.
Next come $h (General Material Designation), $b (other title information),
and $c (statement of responsibility). $c is (as far as I know) always
preceded by /. A recent change to the MARC21 format allows $n or $p to
follow, as well as precede, $b. $h follows the title proper. A 245 field
always ends in a period.
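
As a rough illustration of how those subfields fit together, here is a minimal Perl sketch. The subfield split shown is an assumption (it is not copied from an actual record), and the code works on plain data rather than on MARC::Record objects. It relies on the punctuation already being stored in the subfield data, so the full statement is just the values in order, while the title proper is limited to $a, $n, and $p.

```perl
use strict;
use warnings;

# Hypothetical 245 subfields as [code, value] pairs. The split between $a
# and $c is an assumption; the punctuation lives in the data, not in code.
my @subfields = (
    [ a => 'ActivePerl with ASP and ADO /' ],
    [ c => 'Tobias Martinsson.' ],
);

# The full title statement is simply the subfield values in order.
my $statement = join ' ', map { $_->[1] } @subfields;

# The title proper is limited to $a, $n, and $p. Strip the trailing mark
# (/ : ; =) that introduces whatever subfield came next.
my $title_proper = join ' ',
    map  { $_->[1] }
    grep { $_->[0] =~ /^[anp]$/ } @subfields;
$title_proper =~ s{\s*[/:;=]\s*$}{};

print "$statement\n";     # ActivePerl with ASP and ADO / Tobias Martinsson.
print "$title_proper\n";  # ActivePerl with ASP and ADO
```

The trailing-punctuation cleanup is deliberately crude; real records have more cases than this handles.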

In another message, Morbus Iff wrote:
> And moving further on, Example 3 starts talking about "warnings", and
> follows up with "strict_off". This confused me, as I tried to associate
> their meaning with what I know about Perl's own warning and strict pragma's

This is something I was wondering about, in attempting to process a few
files of records. I turned "strict_off", to get through the entire file, but
the exported records have had "Invalid indicators forced to blanks." I would
like to get these records, and any others that generate errors, and save
them (as originally read) to a separate file, in order to correct the
errors.  Since I don't fully understand Perl, my solution was to modify
MARC::File, by adding a method, "skipget()", based on the existing "skip()",
but returning $rec : undef, instead of 1 : undef. My understanding is that
this should return the raw, unchanged marc string from the original file.  
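
For anyone wanting the same result without touching the module, the idea of keeping the raw record can be sketched in plain Perl. This is only a sketch under assumptions: it does not use MARC::File at all, the helper name split_raw_records is made up, and the leader-length check is merely a stand-in for "generates errors". It reads records delimited by the MARC end-of-record byte (0x1D) and sets aside, unmodified, any record whose declared length disagrees with its actual length.

```perl
use strict;
use warnings;

# Split a MARC transmission stream into raw records and set aside those
# whose leader length (bytes 0-4) disagrees with the actual record length.
# Assumption: this check is only a stand-in for "generates errors".
sub split_raw_records {
    my ($fh) = @_;
    local $/ = "\x1D";    # MARC end-of-record terminator
    my (@good, @bad);
    while (defined(my $raw = <$fh>)) {
        my $declared = substr($raw, 0, 5);
        if ($declared =~ /^\d{5}$/ && $declared == length $raw) {
            push @good, $raw;
        } else {
            push @bad, $raw;    # kept exactly as read
        }
    }
    return (\@good, \@bad);
}

# Two fabricated "records" for demonstration only; they are not valid MARC
# beyond the length prefix and the terminator.
my $ok     = "00010abcd\x1D";    # declares 10 bytes, is 10 bytes
my $broken = "00099abcd\x1D";    # declares 99 bytes, is 10 bytes
my $data   = $ok . $broken;

open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my ($good, $bad) = split_raw_records($fh);
close $fh;

printf "%d good, %d to fix by hand\n", scalar @$good, scalar @$bad;
```

With a real file you would open it in binary mode and write the contents of @$bad to a second file for correction.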

In another message, Morbus Iff wrote:

> Is the LC server the "definitive" Z39.50 database? If I suck down a record
> from there, send it to my database, add more information, etc., etc., how
> does it get back to the LC? Does it? Would the way I'd contribute be to
> simply run and promote my own Z39.50 server?

As far as I understand it, Z39.50 is simply a transmission protocol. LC's
catalog is a large source of freely accessible records, which greatly assists
in the cooperative cataloging process.  If you get a record from there and
change it, the changes will probably not make it back to LC. (Generally LC
maintains its own records, accepting change requests through the Cataloging
Policy and Support Office (CPSO) [1]) The easiest way (in this context) to
distribute your records would probably be to run and promote your own Z39.50
server. 

[1]CPSO Web site: http://www.loc.gov/catdir/cpso/

Hope this helps,

Bryan Baldus
Cataloger
Quality Books, Inc.
The Best of America's Independent Presses
1-800-323-4241x460
[EMAIL PROTECTED]



Re: Early Confusion with MARC::Record

2003-11-14 Thread Paul Hoffman
On Thursday, November 13, 2003, at 10:38  PM, Morbus Iff wrote:

> Next up, I've been using the camel.usmarc file as the "file.dat"
> equivalent in all the examples. When I ran the first example, I got:
>   ActivePerl with ASP and ADO / Tobias Martinsson.
>   ...
>   Cross-platform Perl / Eric F. Johnson.
> which confused me. Is the "title" of a book always considered the
> title AND author?

Warning: IANAC[ataloger], but...

The 245 field ("Title statement") *must* have an $a subfield ("title" 
or "title proper" without subtitles, according to my copy of OCLC's 
Bibliographic Formats and Standards, 1993, which is normative only for 
WorldCat I believe).  All other subfields, including $c ("remainder of 
title page transcription/statement of responsibility") are technically 
optional but should be used whenever applicable.

Basically, from my imperfect understanding, if you can find a statement 
of responsibility on the title page, then the $c subfield should be 
used.  (Right?)  This may be an editor, in which case you'll have something 
like "$c edited by ...".  Of course, it's not always obvious what 
should be considered the title page.

> Or is the "author" and "statement of responsibility" the .. . .
> same thing?

Not exactly, since the statement of responsibility may designate an 
editor.

If there is a person (or more than one) responsible for the creation of 
the work, then their name (or other identifying phrase, e.g., "Author 
of 'Let's have a revolution!'") should go in a 100 field ("Main 
entry--personal name").  This excludes editors and translators.  In 
practical terms, as I understand it, if their name is on the title 
page, then they also belong in 245 $c.

I suggest looking at MARC records for works that you own, comparing the 
MARC record with the title page etc.  That should help you get a better 
feel for practical MARC usage more quickly than just reading 
documentation.

Paul.

--
Paul Hoffman :: Taubman Medical Library :: Univ. of Michigan
[EMAIL PROTECTED] :: [EMAIL PROTECTED] :: http://www.nkuitse.com/


Re: Early Confusion with MARC::Record

2003-11-14 Thread Paul Hoffman
On Friday, November 14, 2003, at 09:40  AM, Bryan Baldus wrote:

> I turned "strict_off", to get through the entire file, but the exported
> records have had "Invalid indicators forced to blanks." I would like to
> get these records, and any others that generate errors, and save them (as
> originally read) to a separate file, in order to correct the errors.
> Since I don't fully understand Perl,

Who does?  :-)

> my solution was to modify MARC::File, by adding a method, "skipget()",
> based on the existing "skip()", but returning $rec : undef, instead of
> 1 : undef. My understanding is that this should return the raw, unchanged
> marc string from the original file.

Just FYI, you can modify MARC::File without modifying MARC/File.pm 
itself.  The trick is simply to use a fully qualified name when 
defining the new MARC::File method:

    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    use MARC::File;
    sub MARC::File::skipget {
        ... your code here ...
    }
    ... the rest of your script here ...

Better yet, since skipget() might end up in MARC::File some day, you 
can do this:

    if (UNIVERSAL::can('MARC::File', 'skipget')) {
        warn "Edit this script--MARC::File now has a skipget() method";
    } else {
        *MARC::File::skipget = sub {
            ... your code here ...
        };
    }

These tricks sometimes come in handy.
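
For a version that runs without MARC::File installed, here is a minimal sketch of the same trick applied to a stand-in class. My::Reader and its methods are hypothetical and only mimic the "skip() returns 1 or undef" shape described above; the body of skipget() is a guess at the intent, not the code that was elided.

```perl
use strict;
use warnings;

# Stand-in for a class defined elsewhere (hypothetical; not MARC::File).
package My::Reader;

sub new {
    my ($class, @records) = @_;
    return bless { records => [@records] }, $class;
}

# Discards the next record; returns 1 on success, undef at end of input.
sub skip {
    my $self = shift;
    my $rec  = shift @{ $self->{records} };
    return defined $rec ? 1 : undef;
}

package main;

if (My::Reader->can('skipget')) {
    warn "Edit this script--My::Reader now has a skipget() method";
} else {
    no warnings 'once';    # the glob is named only once in this script
    # Same as skip(), but hands back the record itself (or undef).
    *My::Reader::skipget = sub {
        my $self = shift;
        return shift @{ $self->{records} };
    };
}

my $reader = My::Reader->new('first', 'second');
$reader->skip;                   # drops 'first'
print $reader->skipget, "\n";    # second
```

Assigning a code reference to the typeglob at run time is what makes the conditional form possible; a plain named sub would be installed at compile time regardless of the check.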

Paul.



[patch] warn, not croak, on 010 non-tag access.

2003-11-14 Thread Morbus Iff

Problem:

  When you request an indicator or subfield for tags less than
  010, MARC::Field will croak(), causing the script to fail.


Discussion:

  Since MARC tags less than 010 cannot have indicators or subfields,
  not allowing those ::Field methods to be called on those tags makes sense.
  However, this should be a warn(), not a croak(), otherwise looping
  code will need to conditionally check tag numbers before continuing
  processing.

  For example, in MARC::Doc::Tutorial, one of the examples is:

    use MARC::Batch;
    my $batch = MARC::Batch->new('USMARC','file.dat');
    my $record = $batch->next();

    my @fields = $record->fields;

    foreach my $field (@fields) {
      print
        $field->tag(), " ",
        $field->indicator(1),
        $field->indicator(2) || undef, " ",
        $field->as_string, " ",
        "\n";
    }

  The purpose of the code is to print out each and every bit of the
  MARC record. If "file.dat" is the t/camel.usmarc file, the above
  code immediately fails:

    Fields below 010 do not have indicators at test.pl line 13

  For the code to work as intended, revisions like this would be needed:

    use MARC::Batch;
    my $batch = MARC::Batch->new('USMARC','file.dat');
    my $record = $batch->next();

    my @fields = $record->fields;

    foreach my $field (@fields) {
      print $field->tag(), " ";

      # Only fields 010 and above have indicators.
      if ($field->tag() >= 10) {
        print $field->indicator(1);
        print $field->indicator(2);
      }

      print $field->as_string, " \n";
    }

  There's certainly nothing wrong with the above code, but it does
  seem like extra mental work that should, and can, be prevented.
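
  The boundary in that check is easy to get wrong, since 010 itself is a
  data field. A minimal sketch of the test as a standalone helper follows;
  the name is_control_tag is hypothetical and is not part of MARC::Field.

```perl
use strict;
use warnings;

# True for control fields (001-009), which carry data but no indicators or
# subfields. Hypothetical helper name; not part of MARC::Field.
sub is_control_tag {
    my ($tag) = @_;
    return $tag =~ /^\d+$/ && $tag < 10 ? 1 : 0;
}

for my $tag (qw(001 008 010 245)) {
    printf "%s %s\n", $tag, is_control_tag($tag) ? 'control' : 'data';
}
# Prints:
#   001 control
#   008 control
#   010 data
#   245 data
```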


Possible Solution:

  Rather than croaking, warn(), with the ability to call
  warnings_off(). The attached patch to the latest CPAN
  MARC::Field implements this, changing the following croaks
  to $self->_warn:

    Field.pm: croak( "Fields below 010 do not have indicators" )
    Field.pm: croak( "Fields below 010 do not have subfields" )
    Field.pm: croak( "data() is only for tags less than 010" )

  The example recipe will also need to be revised to handle
  undef'd values - the relevant print statement is below.
  Taking care to check for undef's is extra code, but more
  in line with careful Perl programming than MARC rules.

    print
      $field->tag(), " ",
      defined $field->indicator(1) ? $field->indicator(1) : "",
      defined $field->indicator(2) ? $field->indicator(2) : "",
      $field->as_string, " ",
      "\n";
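
  To show the proposed behavior end to end without depending on the patch,
  here is a minimal sketch using a stand-in class. My::Field is
  hypothetical and is not MARC::Field; it only imitates the idea of
  recording a warning and returning undef where the current code croaks.

```perl
use strict;
use warnings;

# Stand-in field class (hypothetical; not MARC::Field) whose indicator()
# records a warning and returns undef instead of croaking.
package My::Field;

my $warnings_on = 1;
sub warnings_off { $warnings_on = 0 }

sub new {
    my ($class, $tag, $ind1, $ind2) = @_;
    return bless { tag => $tag, ind => [ $ind1, $ind2 ], warn => [] }, $class;
}

sub tag      { $_[0]{tag} }
sub warnings { @{ $_[0]{warn} } }

sub indicator {
    my ($self, $n) = @_;
    if ($self->{tag} < 10) {
        my $msg = "Fields below 010 do not have indicators";
        push @{ $self->{warn} }, $msg;    # always recorded
        warn "$msg\n" if $warnings_on;    # printed only if enabled
        return undef;
    }
    return $self->{ind}[ $n - 1 ];
}

package main;

My::Field->warnings_off;    # keep the demo quiet; warnings are still recorded

# The loop no longer needs to check tag numbers before asking for indicators.
for my $field (My::Field->new('001'), My::Field->new('245', '1', '0')) {
    my @ind = map { defined $_ ? $_ : '' }
              map { $field->indicator($_) } 1, 2;
    print $field->tag, ' ', @ind, "\n";
}
```

  The loop runs to completion for both fields; the caller only has to
  cope with undef, as described above.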


Other Notes:

  I believe there's an error in the ::Field POD documentation.
  In the indicator and subfield intros, we're told that
  MARC::Field::ERROR will be set at various instances. I don't believe
  this is the case - the only time ERROR is set is when the _gripe
  routine is called, and the _gripe routine is .. . never .. called.
  MARC::Record calls its own internal _gripe routine twice:

 return _gripe( $MARC::Field::ERROR );

  but would seem to always return undef, as ERROR is never set.

  The attached patch does not address ANY of the above, as I'm
  not sure if I'm missing something. Someone please confirm -
  I can do the leg work to correct the POD (or, fix the code,
  depending on how you think it should be fixed).

-- 
Morbus Iff ( small pieces of morbus loosely joined )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus



Field.patch
Description: Binary data


[patch] Revised MARC::Doc::Tutorial

2003-11-14 Thread Morbus Iff

Attached is a gzip'd patch to address a number of corrections
to the MARC::Doc::Tutorial documentation. In particular:

 * numerous grammatical, punctuation and spelling errors fixed (unfinished).
 * "listserv" was replaced with "mailing list". Listserv is a trademark.
 * MARC::Lint was added to the list of modules installed with package.
 * added my name and email to the contributors.

 Concerning the "READING" section:

  * there are no longer two example 5's.
  * minor tweaks to code to match styles (floating spaces in
some code, missing periods from end of comments, etc.).
  * all recipes now show sample output, based on camel.usmarc.
  * some expositions have been extended and/or revised to
reflect alternatives, other recipes, and caveats.

I have plans to attack the rest of the Tutorial as well, but that probably
won't happen for a while, unfortunately - my next few weeks are booked.

-- 
Morbus Iff ( evil is my sour flavor )



Tutorial.patch.gz
Description: Binary data


Re: [patch] Revised MARC::Doc::Tutorial

2003-11-14 Thread Ed Summers

Wow, great work Morbus. Why don't you become a SourceForge developer and we'll
get you set up with CVS access? But as you are doing already, let's discuss 
the changes on the list. 

_gripe() was used at one point but as you can tell it's not anymore. It
should be phased out.

//Ed


Re: [patch] Revised MARC::Doc::Tutorial

2003-11-14 Thread Morbus Iff
>Wow, great work Morbus. Why don't you become a SourceForge developer and we'll

Yup, I'm "morbus". (Did you get my other email on my SF work?)

>_gripe() was used at one point but as you can tell it's not anymore. It

Yeah, I checked the CHANGES and saw it mentioned back in
2002, seemingly being phased out September 10th, 2002.

-- 
Morbus Iff ( there is no morbus, there is only zuul! )


Re: Early Confusion with MARC::Record

2003-11-14 Thread Ed Summers
On Thu, Nov 13, 2003 at 10:38:59PM -0500, Morbus Iff wrote:

> Which brings me to my first question: why
> isn't MARC::File::XML installed with it?

It requires utf8, and consequently 5.8.0 at least.

//Ed


Re: [patch] Revised MARC::Doc::Tutorial

2003-11-14 Thread Ed Summers
On Fri, Nov 14, 2003 at 09:25:16PM -0500, Morbus Iff wrote:
> Yup, I'm "morbus". 

Ok, you've been added. Go gently :) The name of the game is that you add
a unit test when you add functionality. Good to have you on board!

//Ed