On 9/6/06, aiening <[EMAIL PROTECTED]> wrote:
>text/html; lynx --dump %s | iconv -f cp936 -t utf-8; nametemplate=%s.html; copiousoutput
>--
>Wang Xu
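Brother, I understand what your code is meant to do, but in practice it
is a bit troublesome. For example, iconv complains about illegal input
sequences, and if I add -c, part of the converted output comes out wrong.
Do you think the problem is caused by iconv, or is the HTML text itself
nonstandard, with many non-GB encoded characters in it?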

How about using GBK or GB18030 instead of GB2312? Many web pages
declare GB2312 but are actually encoded in GBK or GB18030.
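
To illustrate the point, here is a minimal Python sketch (not part of the
original mailcap setup). It assumes the page bytes are already in hand, and
the helper name decode_with_fallback is made up for this example. It tries
the strictest charset first and falls back to the wider ones, which shows
why a page labelled GB2312 can trigger "illegal input sequence" errors yet
decode cleanly as GBK:

```python
def decode_with_fallback(data, encodings=("gb2312", "gbk", "gb18030")):
    """Try each charset in order, return (text, charset_that_worked).

    Hypothetical helper for illustration only. The order goes from the
    narrowest charset (GB2312) to the widest (GB18030).
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            # Bytes outside this charset: try the next, wider one.
            continue
    # Last resort: keep what can be decoded, mark the rest as U+FFFD.
    return data.decode("gb18030", errors="replace"), "gb18030 (lossy)"


# The traditional character 龍 is not part of GB2312 but is part of GBK,
# so these bytes mimic a page that claims GB2312 but is really GBK.
sample = "中文龍".encode("gbk")
text, used = decode_with_fallback(sample)
print(used, text)
```

The iconv equivalent would be to pass gbk or gb18030 to -f, rather than
dropping the offending bytes with -c.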

w3m is a much better choice than lynx.

--
Best Regards
Carlos
