Dear Jaganadh,
I ran html2text.py directly on a single file:
{{{
$ python html2text.py index1.htm
Traceback (most recent call last):
  File "../aaronsw-html2text-d9bf7d6/html2text.py", line 488, in <module>
    data = data.decode(encoding)
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x88 in position 11366: invalid start byte
}}}
Here index1.htm is the page fetched from
http://www.hindu.com/nic/kye/index1.htm
whereas with index2.htm, fetched from
http://www.hindu.com/nic/kye/index2.htm
the command works fine.
I also tried importing it with "from html2text import *" and calling the
function directly, with the same result.
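
Byte 0x88 can never start a UTF-8 sequence, so index1.htm is probably saved in some other encoding. Below is a minimal sketch of a fallback decoder, assuming the page uses a Windows-1252 or Latin-1 style encoding (a guess, not something confirmed for this page). The names decode_html and encodings are hypothetical and are not part of html2text.

```python
def decode_html(raw, encodings=("utf-8", "cp1252")):
    """Decode a byte string of unknown encoding.

    Tries each candidate encoding in order and returns (text, encoding_used).
    The candidate list is an assumption; adjust it for the site in question.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    # Last resort: latin-1 maps every byte to a code point, so it never fails,
    # although the resulting characters may not be the intended ones.
    return raw.decode("latin-1"), "latin-1"


# Small demonstration with the offending byte from the traceback.
sample = b"abc \x88 def"
text, used = decode_html(sample)
print(used)
```

The file would need to be read in binary mode, e.g. open("index1.htm", "rb").read(), and the decoded text could then be passed on for conversion instead of the raw bytes.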
Thanks,
Nikunj
On Sun, Apr 17, 2011 at 9:21 PM, JAGANADH G <[email protected]> wrote:
>
>
> On Sun, Apr 17, 2011 at 9:13 PM, Nikunj Badjatya <[email protected]> wrote:
>
>>
>> I tried it with the change:
>> {{{
>> ...
>> ...
>> - myunistr = smart_str(fetch)
>>
>> + myunistr = smart_str(fetch.read())
>> ...
>> ...
>> }}}
>>
>> Output:
>>
>> {{{
>> Traceback (most recent call last):
>>   File "html2text.py", line 447, in <module>
>>     data = open(arg, 'r').read().decode(encoding)
>>   File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode byte 0x88 in position 11366: invalid start byte
>> }}}
>>
>> Same error as before.
>>
>>
>
>
> I think the error is coming from this line:
> os.system('python2.6 html2text.py main.html > main.txt')
>
> Instead of calling os.system, try importing the relevant function from
> html2text.py into your program.
>
>
>
> --
> **********************************
> JAGANADH G
> http://jaganadhg.freeflux.net/blog
> *ILUGCBE*
> http://ilugcbe.techstud.org
>
>
_______________________________________________
BangPypers mailing list
[email protected]
http://mail.python.org/mailman/listinfo/bangpypers