Perhaps a silly question, but what encoding is the page you are getting?
You can check this by loading the page in Firefox, opening the View menu,
and selecting "Character Encoding".  That will tell you what Firefox thinks
the encoding is.

If it's not UTF-8, you'll probably have to convert it to UTF-8 before
handing it to html2text.
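
In case it helps, here is a rough sketch of what I mean by converting. It
is written for Python 3 and is not taken from html2text; the function names
sniff_charset and to_utf8 are made up for illustration, and the fallback to
latin-1 is my own assumption (it never fails to decode, but it may produce
wrong characters if the page is really in some other encoding).

```python
import re


def sniff_charset(raw, header_charset=None, default="utf-8"):
    """Guess the encoding of an HTML page held as bytes (hypothetical helper)."""
    # A charset from the HTTP Content-Type header wins if the caller has one.
    if header_charset:
        return header_charset
    # Otherwise look for charset=... in a <meta> tag near the top of the page.
    match = re.search(rb'charset=["\']?([A-Za-z0-9_.:-]+)', raw[:2048], re.I)
    if match:
        return match.group(1).decode("ascii")
    return default


def to_utf8(raw, header_charset=None):
    """Re-encode the page bytes as UTF-8 (hypothetical helper)."""
    charset = sniff_charset(raw, header_charset)
    try:
        text = raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Assumption: fall back to latin-1, which accepts any byte sequence.
        text = raw.decode("latin-1")
    return text.encode("utf-8")
```

If you fetch the page with urllib.request.urlopen, the header charset should
be available as response.headers.get_content_charset(), and the bytes
returned by to_utf8 can then be decoded as UTF-8 without surprises.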

-Sam

On Sun, Apr 17, 2011 at 10:55 AM, Nikunj Badjatya
<nikunjbadja...@gmail.com> wrote:

> On Sun, Apr 17, 2011 at 11:17 PM, JAGANADH G <jagana...@gmail.com> wrote:
>
> >
> >
> > On Sun, Apr 17, 2011 at 11:13 PM, Nikunj Badjatya <
> > nikunjbadja...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> With stripogram it's working fine.
> >> Thanks a lot. :)
> >>
> >> But I couldn't understand why the previous html2text
> >> failed for that particular (index1.htm) link.
> >>
> >>
> >
> > Because html2text runs into a problem with UTF-8 decoding. Nothing more.
> >
>
> So is there no alternative to UTF-8 encoding with the older html2text?
> Looking at the source code, at line no. 444 they have used
> encoding=utf-8. Can this be replaced by something else?
>
>
> Also, there is a small problem with this new html2text: it does not provide
> the href links which the older html2text was providing. It is plain text
> only now. Anyway, I will figure out a solution for it.
>
> Thanks
>
>
> > --
> > **********************************
> > JAGANADH G
> > http://jaganadhg.freeflux.net/blog
> > *ILUGCBE*
> > http://ilugcbe.techstud.org
> >
> >
> _______________________________________________
> BangPypers mailing list
> BangPypers@python.org
> http://mail.python.org/mailman/listinfo/bangpypers
>