Re: Displaying diacritics in a terminal vs. a browser

2004-07-09 Thread Ed Summers
On Thu, Jul 08, 2004 at 01:17:48PM -0400, Houghton,Andrew wrote: > Unicode specifies four normalization methods, NFC, NFD, NFKC, > and NFKD. While RDF could have just accepted characters in > unnormalized form, it decided to mandate that all data content > be provided in NFC normalization form. T

RE: Displaying diacritics in a terminal vs. a browser

2004-07-06 Thread Michael D Doran
> MARC-XML uses Unicode Normal form D, which means that the base > character is separate from the diacritic. I am not familiar with the MARC-XML specifications, so at the risk of embarrassing myself, would it be correct to posit that it may not be that MARC-XML uses Unicode Normal form D, so much as

RE: Displaying diacritics in a terminal vs. a browser

2004-07-06 Thread Michael D Doran
Hi Andy, > From: Houghton,Andrew [mailto:[EMAIL PROTECTED] > > It just so happens that I have recently been converting > MARC-XML to RDF. The RDF specification mandates Unicode > Normal form C, which means that the base character and the > diacritic are combined. That's rather unfortunate, s

Re: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Ed Summers
> A MARC-8 sequence places a combining diacritical mark BEFORE the letter > it's supposed to combine. Whereas Unicode syntax is to put it AFTER the > letter it's supposed to combine with. > > Hence for example the letter: Ẕ > is produced by the MARC-8 Sequence: > 75 5A (macron below + "Z")
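
[Illustration] The ordering difference quoted above can be sketched in a few lines. This is only a minimal sketch in Python (not the Perl used elsewhere in the thread), and it assumes the MARC-8 bytes have already been mapped to Unicode code points, so only the mark-before-letter ordering remains to be fixed. The helper name `reorder_marks` is hypothetical and is not part of MARC::Charset or any other library; the byte-to-code-point mapping itself is not shown.

```python
import unicodedata


def reorder_marks(chars: str) -> str:
    """Move each run of combining marks from before its base letter to after it.

    Assumption: `chars` holds Unicode code points that are still in
    MARC-8 order (diacritic first, base letter second).
    """
    out = []
    pending = []  # combining marks waiting for their base character
    for ch in chars:
        if unicodedata.combining(ch):
            # Nonzero canonical combining class: treat as a diacritic.
            pending.append(ch)
        else:
            out.append(ch)
            out.extend(pending)
            pending.clear()
    out.extend(pending)  # marks with no following base are kept at the end
    return "".join(out)


marc_order = "\u0331Z"  # COMBINING MACRON BELOW, then "Z"
unicode_order = reorder_marks(marc_order)  # "Z" followed by U+0331
composed = unicodedata.normalize("NFC", unicode_order)  # single code point U+1E94
decomposed = unicodedata.normalize("NFD", composed)  # back to "Z" + U+0331
```

After reordering, the string is in decomposed form (the base letter followed by its mark), which is what NFD yields for this letter. Normalizing to NFC composes the pair into one code point where Unicode defines a precomposed character, as it does here.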

RE: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Christopher Morgan
an to do that. Thanks again! - Chris _ From: Jacobs, Jane W [mailto:[EMAIL PROTECTED] Sent: Thursday, July 01, 2004 1:51 PM To: 'Christopher Morgan' Subject: RE: Displaying diacritics in a terminal vs. a browser Hi Chris, I hope my analysis is correct; I think that two pro

Re: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Ed Summers
On Thu, Jul 01, 2004 at 11:22:42AM -0400, Houghton,Andrew wrote: > I'm not sure what MARC::Charset does internally, but MARC-8 > defines the diacritic separate from the base character. So > even using binmode(STDOUT,":utf8") will produce two characters, > one for the base character followed by t

RE: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Houghton,Andrew
> From: Paul Hoffman [mailto:[EMAIL PROTECTED] > Sent: 01 July, 2004 11:57 > Subject: Re: Displaying diacritics in a terminal vs. a browser > > Unless I'm very much mistaken, Chris's code is outputting > UTF-8 to the terminal, not MARC-8. > >> From: Christ

Re: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Paul Hoffman
Unless I'm very much mistaken, Chris's code is outputting UTF-8 to the terminal, not MARC-8. The key is to find a terminal program that correctly displays UTF-8. I doubt you'll have any trouble finding one -- for example, there are at least two for Mac OS X alone (Terminal.app and iTerm). Depending

RE: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Christopher Morgan
Andy, Many thanks. I'd be interested in looking at your scripts if you do post them! -- Chris -----Original Message----- From: Houghton,Andrew [mailto:[EMAIL PROTECTED] Sent: Thursday, July 01, 2004 10:23 AM To: [EMAIL PROTECTED] Subject: RE: Displaying diacritics in a terminal vs. a br

RE: Displaying diacritics in a terminal vs. a browser

2004-07-01 Thread Houghton,Andrew
> From: Christopher Morgan [mailto:[EMAIL PROTECTED] > Sent: 01 July, 2004 10:50 > Subject: Displaying diacritics in a terminal vs. a browser > > I use the $cs->to_utf8 conversion from MARC::Charset to > display MARC Authority records in a browser, and the > diacritics display properly there. >