Guys, I am sorry, I wrote these messages very late at night.
Naturally, what came before the dot is the two-letter language code
that is usual in Wikipedia URLs.
Something in my code is obviously gobbling that up. Thanks for pointing
that out, and my apologies again for not seeing this o
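
For anyone hitting the same thing: a host that starts with a dot has an empty first label, and as far as I can tell that is what the idna codec rejects. Below is a minimal sketch of a check one could run before calling urlopen. It is written for Python 3 (urllib.parse) rather than the Python 2.4 urllib2 used in this thread, and the helper name bad_host_labels and the "en" prefix in the usage lines are my own assumptions, not something from the original code.

```python
from urllib.parse import urlsplit


def bad_host_labels(url):
    """Return the host labels that are empty or 64+ characters long.

    Hypothetical helper: these are the label lengths that IDNA
    encoding does not accept.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # A single trailing dot is legal (fully qualified name), so ignore it.
    if labels and labels[-1] == "":
        labels = labels[:-1]
    return [label for label in labels if not 0 < len(label) < 64]


if __name__ == "__main__":
    # The language prefix was swallowed, so the first label is empty.
    print(bad_host_labels("http://.wikipedia.org/wiki/Ammonia"))
    # "en" is an assumed language code, used only for illustration.
    print(bad_host_labels("http://en.wikipedia.org/wiki/Ammonia"))
```

A non-empty result means the URL was already damaged before it reached urlopen, which points at the link-building code rather than at urllib.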
Flavio wrote:
> something like this, for instance:
> http://.wikipedia.org/wiki/Copper%28II%29_hydroxide
>
> but even URLs without any non-ASCII characters, such as this one,
>
> http://.wikipedia.org/wiki/Ammonia
>
> also fail when passed to urlopen:
> File "/usr/lib/python2.4/encodings/idna.py", line
something like this, for instance:
http://.wikipedia.org/wiki/Copper%28II%29_hydroxide
but even URLs without any non-ASCII characters, such as this one,
http://.wikipedia.org/wiki/Ammonia
also fail when passed to urlopen:
File "/usr/lib/python2.4/encodings/idna.py", line 72, in ToASCII
raise Unicode
Flavio wrote:
> What I am doing is very simple:
>
> I fetch a URL (an HTML page), parse it using BeautifulSoup, extract the
> links and try to open each of them, repeating the cycle.
>
> BeautifulSoup converts the HTML to unicode. That's why when I try to
> open the links extracted from the
What I am doing is very simple:
I fetch a URL (an HTML page), parse it using BeautifulSoup, extract the
links and try to open each of them, repeating the cycle.
BeautifulSoup converts the HTML to unicode. That's why when I try to
open the links extracted from the page I get this error.
This i
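
Since the links come back from BeautifulSoup as unicode, one way to sidestep the issue is to turn each link into a plain ASCII URL before opening it. A rough sketch, assuming Python 3 and only the standard library; to_ascii_url is a hypothetical name, the "en" host in the usage lines is an assumed example, and this version deliberately drops any user:password part of the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only version of a (possibly unicode) URL.

    Hypothetical helper: the host is IDNA-encoded, everything else is
    percent-encoded as UTF-8. Existing %XX escapes are left alone.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Raises UnicodeError if a host label is empty or too long.
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += ":%d" % parts.port
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    # Already-escaped ASCII URLs pass through unchanged.
    print(to_ascii_url("http://en.wikipedia.org/wiki/Copper%28II%29_hydroxide"))
    # Non-ASCII path characters are percent-encoded.
    print(to_ascii_url("http://en.wikipedia.org/wiki/Z\u00fcrich"))
```

Note that this would not rescue a URL whose host starts with a dot; the encode step would still raise, which is arguably the right outcome for a broken link.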
In <[EMAIL PROTECTED]>, Flavio wrote:
> Hi, I am having a problem with urllib2.urlopen.
>
> I get this error when I try to pass a unicode string to it.
>
> raise UnicodeError, "label too long"
>
> Is this problem avoidable? No browser or program such as wget seems to
> have a problem with these strings.