Eric,

> How can I figure out whether or not a MARC record contains ONLY characters 
> from the UTF-8 character set?

You can use a regex to check whether a string of bytes is valid UTF-8. There
are various examples floating around the internet; one is here:

   http://www.w3.org/International/questions/qa-forms-utf-8

You'll need to add the MARC control characters ^_ (0x1F, subfield delimiter),
^^ (0x1E, field terminator), and ^] (0x1D, record terminator) to the ASCII part
of the expression on the above page. (I think the W3C example is aimed at
XML 1.0, in which the MARC control characters are not allowed.)
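
As a rough sketch of that idea, here is the W3C-style byte pattern carried over
to Python with the three MARC delimiters added to the ASCII branch. This is my
own adaptation, not code from the W3C page, and the names UTF8_MARC and
is_utf8_marc are made up for illustration:

```python
import re

# Byte-level pattern modelled on the W3C example. The \x1D-\x1F range is the
# addition for the MARC delimiters; everything else follows the usual
# well-formed UTF-8 byte sequences. Names here are hypothetical.
UTF8_MARC = re.compile(rb"""
    (?:
        [\x09\x0A\x0D\x1D-\x1F\x20-\x7E]      # ASCII plus MARC delimiters 1D-1F
      | [\xC2-\xDF][\x80-\xBF]                # non-overlong 2-byte
      | \xE0[\xA0-\xBF][\x80-\xBF]            # 3-byte, excluding overlongs
      | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}     # straight 3-byte
      | \xED[\x80-\x9F][\x80-\xBF]            # 3-byte, excluding surrogates
      | \xF0[\x90-\xBF][\x80-\xBF]{2}         # planes 1-3
      | [\xF1-\xF3][\x80-\xBF]{3}             # planes 4-15
      | \xF4[\x80-\x8F][\x80-\xBF]{2}         # plane 16
    )*
""", re.VERBOSE)


def is_utf8_marc(raw: bytes) -> bool:
    """Return True if every byte of raw fits the pattern above."""
    return UTF8_MARC.fullmatch(raw) is not None
```

Run it on the raw record bytes as read from the file, before any decoding.
Note that a record containing only plain ASCII will also pass, whatever
encoding it claims to be in.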

Ashley.
--
Ashley Sanders a.sand...@manchester.ac.uk
http://copac.ac.uk -- A Mimas service funded by JISC at the University of 
Manchester
