On 9/6/06, aiening <[EMAIL PROTECTED]> wrote:
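> text/html; lynx --dump %s | iconv -f cp936 -t utf-8; nametemplate=%s.html; copiousoutput
> --
> Wang Xu, my friend, I understand what your code is meant to do, but it is
> a bit troublesome in practice. For example, iconv complains about illegal
> input sequences, and if I add -c, part of the converted output comes out
> wrong. Do you think the problem is caused by iconv, or is the HTML itself
> malformed, with many characters in it that are not valid GB encoding?
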
How about using gbk or gb18030 instead of gb2312? Many web pages declare
GB2312 but are actually encoded in GBK or GB18030.

Also, w3m is a much better choice than lynx.

-- 
Best Regards
Carlos
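
P.S. A rough sketch of how both suggestions might look in mailcap, not a tested configuration. The first entry keeps the quoted lynx pipeline and only changes the source charset. It assumes the local iconv knows the name gb18030 and that lynx passes the page bytes through unconverted. The second entry assumes a w3m build with multilingual support, so that the -I (input charset) and -O (output charset) options are available. Use only one of the two, since the first matching text/html entry wins.

```
# Option 1: keep lynx, but decode as GB18030 instead of cp936.
# -c drops any bytes that are still invalid after the charset change.
text/html; lynx --dump %s | iconv -c -f gb18030 -t utf-8; nametemplate=%s.html; copiousoutput

# Option 2: let w3m decode and render the page itself (assumes an m17n-enabled w3m).
text/html; w3m -I gb18030 -O utf-8 -T text/html -dump %s; nametemplate=%s.html; copiousoutput
```

If iconv still reports illegal input sequences with gb18030, the page itself probably contains bytes that are invalid in any GB encoding, and the problem is not iconv.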