I have a snippet of Python code that makes an HTTP GET request to an
Apache web server (v 2.2.3) using urllib2. The server responds with an
HTTP 400 error presumably because of a malformed 'Host' header.
The snippet is quite simple: it creates a URL based on IPv6 string
literal syntax ...
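For reference, a minimal reproduction sketch (the bracketed address is an
RFC 3849 documentation placeholder, not the poster's host):

import urllib2

# Bracketed IPv6 literal per RFC 2732; 2001:db8::1 is a placeholder.
url = 'http://[2001:db8::1]/'
try:
    f = urllib2.urlopen(url)
    print f.read(100)
except urllib2.HTTPError, e:
    # Older httplib/urllib2 could mis-handle the bracketed host when
    # building the Host header, which Apache may reject with a 400.
    print 'Server returned:', e.code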
dorzey wrote:
> "geturl - this returns the real URL of the page fetched. This is
> useful because urlopen (or the opener object used) may have followed a
> redirect. The URL of the page fetched may not be the same as the URL
> requested." from
> http://www.voidspace.org.uk/python/articles/urllib2.shtml
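A quick illustration of geturl() (the URL below is a placeholder):

import urllib2

f = urllib2.urlopen('http://example.com/old-link')  # placeholder URL
print f.geturl()  # the URL actually fetched, after any redirects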
Yes Piet, you were right: this works. But it seems it does not work on
Google App Engine, since it appends its own agent info, as seen below:
'User-Agent': 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US;
rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13 AppEngine-Google;
(+http://code.google.com/appengine)'
Hi,
An HTML page contains 'anchor' elements with an 'href' attribute having
a semicolon in the URL; while fetching the page using urllib2.urlopen,
all such hrefs containing semicolons are truncated.
For example the href
http://travel.yahoo.com/p-travelgu
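One way to check this, as a sketch (placeholder URL; the regex is a rough
illustration, not a real HTML parser):

import re
import urllib2

html = urllib2.urlopen('http://example.com/').read()  # placeholder URL
# Semicolons are legal URL characters (often path-parameter delimiters),
# so they should survive in the raw HTML returned by urlopen.
for href in re.findall(r'href="([^"]+)"', html):
    if ';' in href:
        print href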
On Fri, 24 Apr 2009 04:25:20 -0700 (PDT), Lakshman wrote:
> I am trying to authenticate using urllib2. The basic authentication
> works if I hard code authheaders.
...
> except IOError, e:
>     print "Something wrong. This shouldnt happen"
First of all, don't catch an exception just to print a generic message;
print the exception itself (or let it propagate) so you can see what
actually failed.
I am trying to authenticate using urllib2. The basic authentication
works if I hard code authheaders.
def is_follows(follower, following):
    theurl = ('http://twitter.com/friendships/exists.json?'
              'user_a=' + follower + '&user_b=' + following)
    username = 'uname1'
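A minimal sketch of the usual alternative to hard-coded auth headers,
using a password manager (credentials are placeholders; at the time this
Twitter endpoint answered with 'true' or 'false'):

import urllib2

def is_follows(follower, following):
    theurl = ('http://twitter.com/friendships/exists.json?'
              'user_a=' + follower + '&user_b=' + following)
    mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, theurl, 'uname1', 'secret')  # placeholders
    opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(mgr))
    return opener.open(theurl).read().strip() == 'true'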
Are you sitting behind a proxy? If so, then you have to set a proxy for http.
On Fri, Dec 5, 2008 at 11:47 AM, svalbard colaco <[EMAIL PROTECTED]> wrote:
Hi all,
I have written a small code snippet to open a URL using urllib2 to open a
web page; my Python version is 2.4, but I get an urlopen error called
"connection timed out".
The following is the code snippet:
import urllib2
f = urllib2.urlopen('http://www.google.com/')
print f.read(100)
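A minimal sketch of the suggested fix, assuming an unauthenticated HTTP
proxy (host and port are placeholders):

import urllib2

proxy = urllib2.ProxyHandler({'http': 'http://proxy.example.com:8080'})
opener = urllib2.build_opener(proxy)
f = opener.open('http://www.google.com/')
print f.read(100)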
Hi!
> Can I do this in python?
No.
The "default page" is a property of the web server; it is not known
client-side.
Examples:
for Apache, it's index.html or index.htm; but if PHP is installed,
index.php is also possible.
for ASP, it's init.htm (among other possibilities).
etc.
@-s
Is there a way to find the name of a page you are retrieving using
Python? For example, if I get http://www.cnn.com/ I want to know that
the page is index.html. I can do this using wget, as seen in the code
below. Can I do this in Python?
Thanks,
$ wget cnn.com
--11:15:25-- http://cnn.com/
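What urllib2 can report is only the final URL, not the server's default
document; a sketch:

import urllib2

f = urllib2.urlopen('http://www.cnn.com/')
# geturl() shows the URL after redirects; the server never reveals that
# it served index.html (wget merely saves under that name by convention).
print f.geturl()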
Hi All,
I am trying to fetch HTML content from a website that has different
versions of pages for "logged" users and "guest" users. I need to fetch
the "logged" user pages. The problem is, even with the use of basic
authentication, I am getting the "guest" user page with urllib2.urlopen.
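If the site uses a login form rather than HTTP basic auth, the usual
urllib2 approach is a cookie-aware opener; a sketch with placeholder URL
and form field names:

import urllib
import urllib2
import cookielib

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Placeholder login URL and form fields.
data = urllib.urlencode({'username': 'user', 'password': 'secret'})
opener.open('http://example.com/login', data)   # server sets session cookie
page = opener.open('http://example.com/members').read()  # "logged" view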
I didn't spend a lot of time debugging that code -- I've been using
Beautiful Soup a lot at work lately and really pulled that out of
memory at about 2:00 AM a couple of days ago.
In the 5 minutes I spent on it, it appeared that the definitions were
set up like so:
Blah
Definition>
I was attempting ...
> I stumbled across this a while back:
> http://www.voidspace.org.uk/python/articles/urllib2.shtml
> It covers quite a bit. The urllib2 module is pretty ...
> ... which is why I am here! :].
> Okay, so basically I want to be able to submit a word to dictionary.com
> and ... urllib2 hasn't helped me. Anyway, how would you go about doing
> this? No, I did not post the html, but I mean if you want, right click
> in your browser and hit View Source on the Google homepage. Basically
> what I want to know is how to submit the values (the search term) and
> then search for ...
Maric Michaud wrote:
> ... but this is a valid example for the general case:
>
> In [207]: import urllib, urllib2
>
> You need to trick the server with an imaginary User-Agent.
>
> In [212]: res = google_search("python & co")
>
> Now you've got the whole HTML response; you'll have to parse it to
> recover the data. A quick & dirty try on the Google response page:
>
> In [213]: import re
> Out[229]:
> ['Python Gallery',
>  'Coffret Monty Python And Co 3 DVD : La Premi\xe8re folie des Monty ...',
>  'Re: os x, panther, python & co: msg#00041',
>  'Re: os x, panther, python & co: msg#00040',
>  'Cardiff Web Site Design, Professional web site design services ...',
>  'Python Properties',
>  'Frees < Programs < Python < Bin-Co',
>  'Torb: an interface between Tcl and CORBA',
>  'Royal Python Morphs',
>  'Python & Co']
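The google_search helper quoted above isn't shown in full anywhere in the
thread; a rough reconstruction under stated assumptions (URL, User-Agent
string, and regex are all illustrative):

import re
import urllib
import urllib2

def google_search(terms):
    # Spoof a browser-like User-Agent, as the post advises.
    query = urllib.urlencode({'q': terms})
    req = urllib2.Request('http://www.google.com/search?' + query,
                          headers={'User-Agent': 'Mozilla/5.0 (compatible)'})
    return urllib2.urlopen(req).read()

res = google_search("python & co")
# Quick & dirty, as in the thread: scrape anchor text from the raw HTML.
titles = re.findall(r'<a[^>]*>(.*?)</a>', res)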
On Friday 27 June 2008 10:43:06, Alexnb wrote:
> I have never used urllib or urllib2. I really have looked online
> for help on this issue, and on mailing lists, but I can't figure out my
> problem because people haven't been helping me, which is why I am here!
> :]. Okay, so basically ...
I know that all this does is print the source, but that's about all I
know. I know it may be a lot to ask to have someone show/help me, but I
would really appreciate it.
"Kushal Kumaran" <[EMAIL PROTECTED]> writes:
[...]
> If, at any time, an error response fails to reach your machine, the
> code will have to wait for a timeout. It should not have to wait
> forever.
[...]
...but it might have to wait a long time. Even if you use
socket.setdefaulttimeout(), DNS lookups, for instance, may not honor
that timeout ...
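A sketch of bounding the wait in that era (Python 2.4's urlopen has no
timeout argument, so the global socket default is the usual lever):

import socket
import urllib2

socket.setdefaulttimeout(10)   # seconds, applied to newly created sockets
try:
    f = urllib2.urlopen('http://www.heise.de/')
    print f.read(100)
except urllib2.URLError, e:
    print 'Failed:', e.reason   # a socket.timeout lands here after ~10s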
On Apr 2, 2:52 am, "ken" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> i have the following code to load a url.
> My question is what if I try to load an invalid url
> ("http://www.heise.de/"), will I get an IOError? or will it wait
> forever?
>
Depends on why the URL is invalid. If the URL refers to ...
Hi,
I have the following code to load a URL.
My question is: if I try to load an invalid URL
("http://www.heise.de/"), will I get an IOError, or will it wait forever?
Thanks for any help.
import urllib2, cookielib
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
I have the following code to load a URL (address).
When I have a URL like this,
http://www.testcom.co.uk/dev_12345/www.cnn.com
I get an error: "Failed to open http://www.testcom.co.uk/dev_12345/www.cnn.com".
Is there something wrong with my URL, or something wrong with my code?
Thank you for any help.
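A sketch that surfaces the underlying urllib2 error instead of a generic
"Failed to open" message, so the cause is visible:

import urllib2

url = 'http://www.testcom.co.uk/dev_12345/www.cnn.com'
try:
    print urllib2.urlopen(url).read(100)
except urllib2.HTTPError, e:
    print 'Server answered with an error:', e.code
except urllib2.URLError, e:
    print 'Could not reach the server:', e.reason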
I am fetching different web pages (never the same one) from a web
server. Does that make a difference with them trying to block me?
Also, if it was only that site blocking me, then why does the internet
not work in other programs when this happens in the script? It is
almost like something is seeing ...
I have a script that uses urllib2 to repeatedly look up web pages (in a
spider sort of way). It appears to function normally, but if it runs
too long I start to get 404 responses. If I try to use the internet
through any other programs (Outlook, Firefox, etc.) it will also fail.
If I stop the script ...
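If something on the path is throttling the script (one possible reading
of these symptoms), spacing out requests is the simplest mitigation; a
sketch with an illustrative delay:

import time
import urllib2

def polite_fetch(url, delay=2.0):
    # Sleep before each request so a long-running spider stays gentle.
    time.sleep(delay)
    return urllib2.urlopen(url).read()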
[EMAIL PROTECTED] writes:
[...]
> 1) >>> import urllib2, urllib, cookielib
> 2) >>> cj = cookielib.CookieJar()
> 3) >>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
> 4) >>> data = urllib.urlencode({"username": "user", "password": ""})
> 5) >>> fp = opener.open(
> "h
OK, got it. Thanks.
[EMAIL PROTECTED] wrote:
> can anybody explain, in the first case, why I need to do two attempts?
I would guess it's because redhat requires your browser to submit a session
cookie with the login form. In the urllib2 example, the first request you
make tries to submit login form data directly. Since ...
Hi,
I am using Python 2.4's "urllib2" and "cookielib".
In line "5" below I provide my credentials to
log into a web site. During the first attempt I "fail",
judging from the output of line "6".
I try again and the second time I succeed, judging
from the output of line "8".
Now using the "twill" module ...
Licheng Fang wrote:
> I use an HTTP proxy to connect to the Internet. When I run the urlopen
> command I get "HTTP Error 407: Proxy authorization required". Could
> anybody tell me how to resolve this? Thanks!
You can build and install an opener instance; then all urllib2 calls will
use it. Some information ...
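A sketch of such an opener for the 407 case, with placeholder proxy host,
port, and credentials:

import urllib2

proxy = urllib2.ProxyHandler(
    {'http': 'http://user:secret@proxy.example.com:8080'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)   # every urllib2 call now uses the proxy
print urllib2.urlopen('http://www.python.org/').read(100)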
I want to know that, too!
You may want to look at using PycURL (http://pycurl.sourceforge.net/).
It makes it easier to handle this type of stuff.
I've been using urllib2 to try and automate logging into the Google
AdSense page. I want to download the CSV report files so that I can do
some analysis of them. However, I don't really know how web forms work,
and the examples on the python.org/doc site aren't really helpful ...
import httplib
import base64
import os
import sys
import random

username = ''
password = ''
file = ''

# Get the length of the file from os.stat.
size = os.stat(file)[6]

# file contains the entire path; split off the name.
# WebSafe.
name = os.path.basename(file)

url = 'https://www.somedomain.com'
# Truncated in the archive; presumably the RFC 2617 basic-auth token:
auth_string = base64.encodestring('%s:%s' % (username, password)).strip()
Can somebody provide an example of how to retrieve an https url, given
a username and password? I don't find it in the standard documentation.
TIA,
Michele Simionato
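A urllib2-based sketch for the question (placeholder URL and credentials;
httplib with a hand-built Authorization header, as above, works too):

import urllib2

mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
mgr.add_password(None, 'https://www.somedomain.com/', 'user', 'secret')
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(mgr))
print opener.open('https://www.somedomain.com/').read()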
When I request a URL using urllib2, it appears that urllib2 always
makes the request using HTTP 1.0, and not HTTP 1.1. I'm trying to use
the "If-None-Match"/"ETag" HTTP headers to conserve bandwidth, but if
I'm not mistaken, these are HTTP 1.1 headers, so I can't ...
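Custom headers can still be attached even though urllib2 of this era
speaks HTTP/1.0 on the request line; a sketch with a placeholder URL and
ETag value:

import urllib2

req = urllib2.Request('http://example.com/feed.xml',
                      headers={'If-None-Match': '"placeholder-etag"'})
try:
    f = urllib2.urlopen(req)
    print 'Fresh copy fetched:', len(f.read()), 'bytes'
except urllib2.HTTPError, e:
    if e.code == 304:
        print 'Not modified; reuse the cached copy'
    else:
        raise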