Many thanks to Jon Gorman, who provided lots of tips on diagnosing my problem. The first of his debugging points turned out to be the issue -- I needed to correct Oracle's NLS_LANGUAGE setting in the registry to AMERICAN_AMERICA.US7ASCII. Once that was done, the program ran fine.

Anne L. Highsmith
Consortia Systems Coordinator
5000 TAMU Evans Library
Texas A&M University
College Station, TX 77843-5000
hism...@tamu.edu
979-862-4234
979-845-6238 (fax)

>>> Jon Gorman <jonathan.gor...@gmail.com> 6/25/2009 12:46 PM >>>
Hi Anne,

Off the top of my head, the \xBF is referring to the end of a byte order mark, which is causing issues. After all, if it were having issues with any Unicode character, that byte should change depending on the record.

UTF-8 doesn't really need the BOM, and from my understanding it can cause issues if present. (UTF-8 shouldn't have endian problems, but something seeing the byte order mark might try to "fix" a non-existent problem. Depending on the endianness of the machine, this will work or not.)

Look at the record that can be successfully created in some sort of hex editor to see if the first few bytes are EFBBBF or similar. If they are, you might just want to check for those bytes at the beginning and yank them.

Some debugging steps I might do...

1) If the database is an Oracle one, look at and compare the various NLS_LANGUAGE settings for both of the machines. Are they the same? Then look at the various driver configurations. After all, if it's not reading Unicode information correctly from the database, it could cause issues when trying to create the record.

2) Instead of trying to do both the read from the database and the creation of the MARC object in one program, split the responsibilities. Have one script just write the results of the database feeds to disk and check to see if the Unicode looks right. Compare the output as it appears on both machines. Again, a hex editor of some sort is a good choice here.

3) If the above looks identical and there's not a BOM, try taking the output from the new machine and running it on the old, and vice versa.

4) If all the above fail, try to get ActiveState and the modules to the same versions as on the working machine. Is it still broken? ActiveState and the various modules on it are not always as recent as their CPAN counterparts.

Jon Gorman
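
The BOM check suggested above (look at the first few bytes, and yank EFBBBF if it is there) could be sketched roughly as follows. This is only an illustration in Python rather than the Perl used in this thread; the function names `hex_prefix` and `strip_utf8_bom` and the sample bytes are hypothetical and are not part of any MARC or Oracle library.

```python
import codecs

# The UTF-8 byte order mark: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def hex_prefix(data: bytes, count: int = 8) -> str:
    """Return the first `count` bytes as uppercase hex, roughly what a hex editor shows."""
    return data[:count].hex().upper()


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM if one is present; otherwise return the data unchanged."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


if __name__ == "__main__":
    # Hypothetical sample: a BOM followed by a few bytes of record data.
    sample = UTF8_BOM + b"record data"
    print(hex_prefix(sample, 3))
    print(strip_utf8_bom(sample))
```

Working on raw bytes, before any decoding, keeps the check independent of whatever character set the database driver thinks it is delivering.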