Not all non-Unicode text streams can be reliably detected, because the
same byte sequence can be valid in more than one encoding.
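
To illustrate the ambiguity, here is a minimal Python sketch. It is not
taken from file(1) or the attachment; the sample string and variable names
are only illustrative. It encodes two characters as Big5 and shows that
the resulting bytes also decode cleanly as GB18030, just to different
text:

```python
# Illustrative sample only: two CJK characters encoded as Big5.
data = "中文".encode("big5")

# The very same bytes are also a valid GB18030 sequence, so neither
# decode raises an error; they simply yield different characters.
as_big5 = data.decode("big5")
as_gb18030 = data.decode("gb18030")

print(data.hex())
print("as Big5:   ", as_big5)
print("as GB18030:", as_gb18030)
```

Since both decodes succeed, a detector looking only at byte validity has
nothing to go on, and has to fall back on heuristics such as character
frequency.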

Also, the attached file is actually in UTF-8, not Big5, as confirmed by
file(1) and by reading it manually:

        $ file 111D2012841-01.txt
        111D2012841-01.txt: Unicode text, UTF-8 text
        $ iconv -f utf-8 -t big-5 111D2012841-01.txt | file -
        /dev/stdin: ISO-8859 text
        $ iconv -f utf-8 -t gb18030 111D2012841-01.txt | file -
        /dev/stdin: Non-ISO extended-ASCII text, with LF, NEL line terminators

That said, the result for the GB18030 conversion appears more correct
than the one for Big5.
