Here's a program I wrote for our tech services department to make
brief bibs from vendor spreadsheets.
Tech services get a spreadsheet from the vendor with information about
newly ordered material. Each vendor obviously has a different data
layout.
They copy the order spreadsheet into a new spreadsheet based on a
template. They've created a template for each of their vendors. The
first row has the desired MARC tags for the data in the column. The
second row has default values. Third - nth row is vendor data.
Then they export the data to a tab delimited file and run this
program. It makes MARC records that are loaded into III Millenium.
Feedback is welcome :)
Rich Ackerman
Nova Southeastern University
[EMAIL PROTECTED]
# perl script to create MARC records from a tab delimited textfile
# rich ackerman, april 25, 2007
#
# Updates: June 28, 2008 - generalize to new input data structure
#
##########################################################################
#
# This program makes MARC records from an input file, ready for
# uploading to NSU's Millenium system. If there are any problems
# with the data an error message is printed.
#
##########################################################################
#
# Input: The input is a tab delimited file. There are three types
# of input lines.
#
# 1. The first line has MARC tag definitions for each
column.
# Every column must have a MARC tag definition.
#
# There are three different kinds of MARC tag definitions.
#
# 1a. Full Tag Definition - always six characters long.
#
# A full tag defines a complete MARC tag with indicators and a
subfield.
# If you want to specify indicators do it in the first instance of
the tag.
#
# Position Value
#
# 1-3 Field Tag
# 4 First Indicator - space means
"no indicator"
# 5 Second Indicator - space means
"no indicator"
# 6 Subfield Code
#
# Examples: "1001 a", "260 b", "935 a", "24514a"
#
# 1b. Subfield Definition - always five characters long
#
# A subfield definition defines a tag with no
indicators OR adds a
# subfield to an existing tag. The program will
create the tag with
# blank indicators if it is not already defined.
#
# If the tag was defined in an earlier column, the subfield will be
added.
#
# Position Value
#
# 1-3 Field Tag
# 4 Pipe (|)
# 5 Subfield code
#
# Examples: 960|a, 961|b
#
# 1c. Skip Column Definition - always three characters
long
#
# A tag 000 means skip the column. This will be used to void
columns of
# data that are not wanted in the MARC record.
#
# 2. The second line specifies an optional default value for each
MARC tag
# in the first line.
#
# 3. Lines 3-n contain data. The default value from the second line
is used
# if there is no value given.
#
##########################################################################
#
# Output: Two files containing MARC data. One is a compiled binary
for loading into Millenium.
# The second is a text image for human inspection.
#
##########################################################################
#
# Calling sequence: p>perl bb.pl FILENAME [FILENAME2 ...]
#
##########################################################################
use MARC::Record;
use MARC::Field;
use strict;
my $i; # count number of lines of input
my @marc; # array to store MARC tags (line 1)
my @defaults; # array to store default values (line 2)
my @data; # array to store each input line (line 3 - n)
my $SKIPCOLUMN = '000'; # flag that says to ignore the whole column
# loop through multiple input files (even though we only expect one
I'm lazy)
foreach (@ARGV) {
# print "\nProcessing $_\n";
$i = 0; # simple counter to count number of record processed
my $inputfile = $_;
my ($name, $ext) = split (/\./, $inputfile);
my $outputfile = $name . '.mrc';
my $textdump = $name . '.mrc.txt';
# open the input stream
open (INPUT, $inputfile) or die "Cannot open input file $inputfile:
$!";
# open the output streams
open (OUTPUT, "> $outputfile") or die $!;
open (TEXTDUMP, "> $textdump") or die $!;
# read each line of the current input file and make a MARC record for
it
while ( <INPUT> ) {
chomp($_); # get rid of line ending
@data = split (/\t/);
# First line? Save MARC tag definitions
if ( $i == 0 ) {
@marc = @data;
print "Saving MARC definitions @marc\n";
# Second line? Save default values
} elsif ($i == 1) {
@defaults = @data;
print "Saving defaults @defaults\n";
# Data line? Process it
} else {
# Feedback is a good thing
print "Processing $data[0]\n";
# create new record for this input line
my $record = MARC::Record->new();
# Go through each column of data. Each column has a
MARC tag
# in @marc and a default value in @defaults. If the
MARC tag
# is the skip value then we just skip the column -
information from
# the vendor that we just don't need to bother with.
for ( my $j = 0; $j < @marc; $j++ ) {
next if $marc[$j] eq $SKIPCOLUMN;
# get the value we are going to set
my $value = $data[$j] ? $data[$j] :
$defaults[$j];
# Now figure out what tag we are updating
my ($tag, $ind1, $ind2, $subfield) = '';
if ( length($marc[$j]) == 6 ) {
# This is a full one, like 24514a or
156 3b
$tag = substr($marc[$j], 0, 3);
$ind1 = substr($marc[$j], 3, 1);
$ind2 = substr($marc[$j], 4, 1);
$subfield = substr($marc[$j], 5, 1);
} elsif (length($marc[$j]) == 5 ) {
# Just a subfield, like 960|a
$tag = substr($marc[$j], 0, 3);
$subfield = substr($marc[$j], 4, 1);
} else {
# This should never happen...
my $errlen = length ( $marc[$j] ) ;
print "Hey, this $marc[$j] is the wrong
length: $errlen!\n";
die;
}
my $field = $record->field($tag);
# If the field exists and there is data to be
stored in it,
# then add another subfield.
if ($field && $tag != 961) {
$field->add_subfields( $subfield =>
$value);
# print "Adding $value to $subfield\n";
} else {
# Need to create a new field
$field = MARC::Field->new (
$tag, $ind1, $ind2,
$subfield => $value);
$record->append_fields($field);
}
}
# Finally put the record in the output file
print OUTPUT $record->as_usmarc();
print TEXTDUMP $record->as_formatted();
print TEXTDUMP "\n\n";
}
++$i; # bump line counter
}
close(OUTPUT);
close(TEXTDUMP);
$i = $i - 2; # don't count the line with the default settings on it
or the MARC tags
print "\n\nPut $i records into $outputfile and $textdump \n\n";
}
On Nov 13, 2008, at 12:40 AM, Saiful Amin wrote:
On Wed, Nov 12, 2008 at 11:03 PM, Yan Liao
<[EMAIL PROTECTED]> wrote:
Does any one know how to trans data from Excel to MARC? There is a
rush
local project and I don't have time to review my PERL class notes.
The best method would be to convert the Excel file into a delimited
text
file (using 'Save As ..') and use MARC::Record to write the MARC
data by
hand. There is a marvelous tutorial available for this module [1]
with lot
of real examples. Once you get comfortable with this module you
probably
won't look anywhere else for MARC related jobs in future.
Having said that, I've just started working on a personal project of
creating a web-based utility to enable conversion of Excel data into
MARC21.
You can have look at the pre-alpha version in [2]. It is far from
complete,
e.g., the download option in iso2709 or marcxml is yet to be
written. Any
feedback would be welcome.
[1]
http://search.cpan.org/~mikery/MARC-Record-2.0.0/lib/MARC/Doc/Tutorial.pod
[2] http://122.166.0.252/cgi-bin/xl2marc.pl
--
Saiful Amin
Edutech India Pvt Ltd
Bangalore, India.
www.edutech.com