Perhaps a silly question, but what encoding is the page you are getting?
You can check this by loading the page in Firefox, opening the View menu,
and selecting "Character Encoding". That will tell you what Firefox thinks
the encoding is.
If it's not UTF-8, you'll probably have to convert it.
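
In case it helps, here is a rough sketch of what "convert it" could look like. It is written for Python 3 (the tracebacks in this thread are from Python 2.6, where the same idea applies to `str.decode`). The helper names `decode_html` and `to_utf8` and the candidate list are my own assumptions, not part of html2text, and guessing encodings this way is only a heuristic.

```python
# Hypothetical helpers (not part of html2text): guess an encoding for raw
# page bytes, then re-encode the text as UTF-8.

def decode_html(raw, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this cannot fail,
    # although the resulting text may not be what the author intended.
    return raw.decode("latin-1"), "latin-1"


def to_utf8(raw, candidates=("utf-8", "cp1252")):
    """Return the page bytes re-encoded as UTF-8."""
    text, _ = decode_html(raw, candidates)
    return text.encode("utf-8")


if __name__ == "__main__":
    # Assumed file name, taken from the traceback later in the thread.
    with open("index1.htm", "rb") as f:
        data = to_utf8(f.read())
    with open("index1.utf8.htm", "wb") as out:
        out.write(data)
```

After that, running html2text on the converted file should at least get past the UTF-8 decode step, assuming the encoding really was the problem.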
On Sun, Apr 17, 2011 at 11:17 PM, JAGANADH G wrote:
>
>
> On Sun, Apr 17, 2011 at 11:13 PM, Nikunj Badjatya <
> nikunjbadja...@gmail.com> wrote:
>
>> Hi,
>>
>> With stripogram it's working fine.
>> Thanks a lot. :) !!
>>
>> But I couldn't understand the reason behind the previous html2text
>> malfunc
On Sun, Apr 17, 2011 at 11:13 PM, Nikunj Badjatya
wrote:
> Hi,
>
> With stripogram it's working fine.
> Thanks a lot. :) !!
>
> But I couldn't understand the reason behind the previous html2text malfunction
> for that particular (index1.htm) link?!
>
>
because html2text encounters a problem with ut
Hi,
With stripogram it's working fine.
Thanks a lot. :) !!
But I couldn't understand the reason behind the previous html2text malfunction
for that particular (index1.htm) link?!
On Sun, Apr 17, 2011 at 10:28 PM, JAGANADH G wrote:
>
>>
> Hi
> Do the following things
>
> install the python li
Dear Jaganadh,
I have tried with separate individual execution as
{{{
$ python html2text.py index1.htm
Traceback (most recent call last):
File "../aaronsw-html2text-d9bf7d6/html2text.py", line 488, in
data = data.decode(encoding)
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in
On Sun, Apr 17, 2011 at 9:13 PM, Nikunj Badjatya
wrote:
>
> Tried with the change.
> {{{
> ...
> ...
> - myunistr = smart_str(fetch)
>
> + myunistr = smart_str(fetch.read())
> ...
> ...
> }}}
>
> Output:
>
> {{{
> Traceback (most recent call last):
> File "html2text.py", line 447, in
> dat
Tried with the change.
{{{
...
...
- myunistr = smart_str(fetch)
+ myunistr = smart_str(fetch.read())
...
...
}}}
Output:
{{{
Traceback (most recent call last):
File "html2text.py", line 447, in
data = open(arg, 'r').read().decode(encoding)
File "/usr/lib/python2.6/encodings/utf_8.py", l
On Sun, Apr 17, 2011 at 8:43 PM, Nikunj Badjatya
wrote:
> Thanks for the quick reply.
> I have never touched Django before.
>
> I tried as:
>
> {{{
>
> #!/bin/python
>
> import os
> import urllib
> + from django.utils.encoding import smart_str
>
> fetch = urllib.urlopen("some-web-link.htm")
>
> ma
Thanks for the quick reply.
I have never touched Django before.
I tried as:
{{{
#!/bin/python
import os
import urllib
+ from django.utils.encoding import smart_str
fetch = urllib.urlopen("some-web-link.htm")
mainfile = open ('main.html', 'w' )
+ myunistr = smart_str(fetch)
print myunistr
ma
On Sun, Apr 17, 2011 at 8:01 PM, Nikunj Badjatya
wrote:
> Hi All,
>
> I am working on a personal project that grabs certain URLs from the web,
> does some processing, and stores the final contents in a text/PDF file.
>
> I am also using html2text (
> https://github.com/aaronsw/html2text/archives/master )