> ...the ILS can be upgraded to a new version and
> people can start using Unicode, not only for Western
> European languages, but also for languages like Thai.

This is not really apropos of the discussion at hand, but since Thai was
mentioned I thought I would contribute my two cents on an issue that
perhaps not everyone is aware of...  

Although the ILS itself will be able to accommodate the full Unicode
repertoire, according to the MARC 21 specifications, the MARC 21
UCS/Unicode environment is simply the MARC-8 character repertoire
translated into the Unicode equivalent code points.  One of the things
that means is that characters in vernacular alphabets such as Thai are
*not* valid characters in MARC 21 records.  The rationale behind this
approach to implementing Unicode is based on the ability to translate
MARC data back and forth (i.e. "round trip") between the MARC-8 and
Unicode character sets [1].  Supported alphabets (and/or ideographs) are
Latin, Greek, Cyrillic, Arabic, Hebrew, and East Asian (CJK) [2].
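
To make the restriction concrete, here is a rough Python sketch that
flags letters falling outside the supported scripts listed above.  It
is only an approximation: it guesses a character's script from its
Unicode character name rather than consulting the actual MARC-8 to
Unicode mapping tables, and both the prefix list and the function name
chars_outside_repertoire are my own assumptions, not anything defined
by MARC 21 or by an ILS vendor.

```python
import unicodedata

# Assumption: script membership is approximated by the start of the
# Unicode character name. The real MARC 21 repertoire is defined by the
# MARC-8 mapping tables, which are narrower than these whole scripts.
ALLOWED_NAME_PREFIXES = (
    "LATIN", "GREEK", "CYRILLIC", "ARABIC", "HEBREW",
    "CJK", "HIRAGANA", "KATAKANA", "HANGUL",  # East Asian (CJK)
    "CANADIAN SYLLABICS",                     # the exception in note [2]
    "COMBINING",                              # combining diacritics
)


def chars_outside_repertoire(text):
    """Return (character, name) pairs for letters and marks whose
    script is not in the approximate supported list."""
    flagged = []
    for ch in text:
        if ch.isascii():
            continue  # plain ASCII is always in the repertoire
        # Only letters (L*) and marks (M*) are checked; punctuation and
        # symbols would need the real mapping tables.
        if unicodedata.category(ch)[0] not in ("L", "M"):
            continue
        name = unicodedata.name(ch, "")
        if not name.startswith(ALLOWED_NAME_PREFIXES):
            flagged.append((ch, name or "UNNAMED"))
    return flagged


if __name__ == "__main__":
    # "Thai" written in Thai script, using escapes to avoid encoding issues
    for ch, name in chars_outside_repertoire("\u0e44\u0e17\u0e22"):
        print(f"U+{ord(ch):04X} {name}")
```

Run against a title field, an empty result would suggest (but not
prove) that the text stays within the MARC-8 repertoire; any Thai
letter shows up in the flagged list.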

I think our ILS is fairly typical as to implementation of Unicode [3].
There is nothing stopping you from creating, storing, and displaying
MARC records in Thai (or any other vernacular language) -- other than an
institutional decision to adhere to the MARC 21 standard.  Of course,
the ILS software clients also have validation rules that can be turned
on (or off, since not everyone uses MARC 21).

At some point, when a large enough portion of the library world has
upgraded their systems to MARC Unicode, round tripping will no longer be
a constraint and the MARC 21 standard will be revised to include the
full range of Unicode characters, but that is liable to be a while.

[1] Coded Character Sets > A Technical Primer for Librarians > MARC
Unicode 
    http://rocky.uta.edu/doran/charsets/unicode.html

[2] An exception is the Unified Canadian Aboriginal Syllabic character
set, which is not defined in MARC-8 but is permitted in the MARC
UCS/Unicode environment. 

[3] Endeavor's Voyager - and we are scheduled for the Unicode version
upgrade on Monday

-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 cell
# [EMAIL PROTECTED]
# http://rocky.uta.edu/doran/ 
