> A MARC-8 sequence places a combining diacritical mark BEFORE the letter 
> it's supposed to combine with, whereas Unicode puts it AFTER the 
> letter it's supposed to combine with.  
>  
> Hence for example the letter: Ẕ
> is produced by the MARC-8 Sequence: 
> 75 5A (macron below + "Z")
> but 
> 005A 0331  ("Z" + Combining Macron below) in Unicode.
>  
> I believe if you don't account for this in your UTF-8 transformation, you 
> will get either no combining or combining with the wrong character.

Just FYI in case anyone is curious about what MARC::Charset does, to_utf8() 
will take care of repositioning the diacritics from before to after the 
character that they modify. 
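
For anyone who wants to see the idea in isolation, here is a minimal
sketch of that reordering step. It is not MARC::Charset's code, and it is
in Python rather than Perl. It assumes each MARC-8 byte has already been
mapped to its Unicode code point but the marks are still in MARC-8 order
(mark first, then letter). The function name reorder_combining is made up
for this example.

```python
import unicodedata


def reorder_combining(text: str) -> str:
    """Move combining marks from before their base letter to after it.

    Assumes `text` is in MARC-8 order: every combining mark precedes the
    base character it modifies. Text that is already in Unicode order
    must not be passed through this function.
    """
    out = []
    pending = []  # combining marks waiting for their base character
    for ch in text:
        if unicodedata.combining(ch):
            # Nonzero combining class: hold the mark until the base arrives.
            pending.append(ch)
        else:
            # Base character: emit it, then the held marks in original order.
            out.append(ch)
            out.extend(pending)
            pending.clear()
    # Marks with no following base are kept at the end rather than dropped.
    out.extend(pending)
    return "".join(out)


if __name__ == "__main__":
    fixed = reorder_combining("\u0331Z")  # macron below, then "Z"
    print([f"{ord(c):04X}" for c in fixed])
```

Several marks in front of one letter keep their relative order when they
are moved behind it.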

//Ed
