OK, I think I got this figured out and working, finally.

Here is what I did. After the file upload, I opened the uploaded file,
decoded it, and re-encoded it, using my own auto-detect routine rather
than gluon's decoder routine. This worked!
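
For illustration only (this is not the actual routine, which isn't
shown in this post), a minimal sketch of the decode and re-encode step.
It assumes detection by byte-order mark with a UTF-8 fallback; the
names detect_encoding and reencode_to_utf8 are made up for the example.

```python
import codecs

# Hypothetical BOM table. The 4-byte BOMs come first so that a UTF-32 LE
# file (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf_32_be"),
    (codecs.BOM_UTF32_LE, "utf_32_le"),
    (codecs.BOM_UTF8, "utf_8"),
    (codecs.BOM_UTF16_BE, "utf_16_be"),
    (codecs.BOM_UTF16_LE, "utf_16_le"),
]


def detect_encoding(data, default="utf_8"):
    """Return (codec_name, bom_length) for the raw bytes in `data`."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # Assumption: no BOM means the upload is already in the default codec.
    return default, 0


def reencode_to_utf8(data):
    """Decode with the detected codec, drop the BOM, re-encode as UTF-8."""
    name, skip = detect_encoding(data)
    return data[skip:].decode(name).encode("utf_8")
```

In a setup like this, the bytes read back from the uploaded file would
be passed through reencode_to_utf8 before any further processing.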

I'm curious why gluon's autodetector doesn't work. On first
inspection, I saw this:

autodetect_dict = {  # bytepattern     : ("name",
    (0x00, 0x00, 0xFE, 0xFF): ("ucs4_be"),
    (0xFF, 0xFE, 0x00, 0x00): ("ucs4_le"),
    (0xFE, 0xFF, None, None): ("utf_16_be"),
    (0xFF, 0xFE, None, None): ("utf_16_le"),
    (0x00, 0x3C, 0x00, 0x3F): ("utf_16_be"),
    (0x3C, 0x00, 0x3F, 0x00): ("utf_16_le"),
    (0x3C, 0x3F, 0x78, 0x6D): ("utf_8"),
    (0x4C, 0x6F, 0xA7, 0x94): ("EBCDIC"),
}


It looks as if utf_16_be and utf_16_le are each listed twice, though
under different byte patterns. That can't be right, can it? Still,
since the keys themselves are distinct, the code above shouldn't be a
problem in and of itself.
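
A small sketch to check that point. The table is copied from above;
the lookup function (lookup_encoding, a made-up name) is only an
assumption about how such a table might be consulted, not gluon's
actual code: it tries the exact four bytes first, then falls back to
the first two bytes with None as a wildcard for the rest.

```python
autodetect_dict = {
    (0x00, 0x00, 0xFE, 0xFF): ("ucs4_be"),
    (0xFF, 0xFE, 0x00, 0x00): ("ucs4_le"),
    (0xFE, 0xFF, None, None): ("utf_16_be"),
    (0xFF, 0xFE, None, None): ("utf_16_le"),
    (0x00, 0x3C, 0x00, 0x3F): ("utf_16_be"),
    (0x3C, 0x00, 0x3F, 0x00): ("utf_16_le"),
    (0x3C, 0x3F, 0x78, 0x6D): ("utf_8"),
    (0x4C, 0x6F, 0xA7, 0x94): ("EBCDIC"),
}

# The names repeat, but all eight keys are distinct, so no entry is lost.
# Note that ("utf_8") is just the string "utf_8"; parentheses without a
# trailing comma do not create a tuple.


def lookup_encoding(data, table=autodetect_dict):
    """Return the table's name for the first four bytes of `data`, or None."""
    head = tuple(bytearray(data[:4]))  # byte values as ints
    if len(head) < 4:
        return None
    name = table.get(head)  # exact four-byte pattern
    if name is None:
        # Fallback: match only the first two bytes (the 2-byte BOM rows).
        name = table.get((head[0], head[1], None, None))
    return name


print(len(autodetect_dict))                  # 8
print(lookup_encoding(b"\xff\xfeh\x00"))     # utf_16_le (matched via BOM row)
print(lookup_encoding(b"\x3c\x00\x3f\x00"))  # utf_16_le (matched via "<?" row)
```

Both utf_16_le rows are reachable here, each through its own pattern,
which is consistent with the table not being the problem in itself.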

Anyway, I would be happy to contribute my autodetector to the web2py
codebase if it helps.
