I think he is making an example: Stef wants to demonstrate that web2py is returning 200 instead of 400. Is this a bug?
On Aug 21, 10:26, mdipierro <mdipie...@cs.depaul.edu> wrote:
> why are the urls in the first set truncated?
>
> On Aug 21, 8:07 am, Stef Mientki <stef.mien...@gmail.com> wrote:
> > On 21-08-2010 14:46, mdipierro wrote:
> > > what do you find that is strange?
> >
> > This is the result with the last letter removed, so all links should give
> > an error,
> > but they differ with the 2 methods,
> > and some of them produce 200, while they are definitely wrong
> > 404 500 http://127.0.0.1:8000/welcome/default/user/logi
> > 404 500 http://127.0.0.1:8000/welcome/default/user/registe
> > 404 500 http://127.0.0.1:8000/welcome/default/user/request_reset_passwor
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 400 500 http://127.0.0.1:8000/welcome/default/inde
> > 200 500 http://127.0.0.1:8000/admin/default/design/welcom
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/controllers/default.p
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/default/index.htm
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/layout.htm
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/static/base.cs
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/db.p
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/menu.p
> > 400 500 http://127.0.0.1:8000/welcome/appadmin/inde
> > 200 500 http://127.0.0.1:8000/admin/default/inde
> > 400 400 http://127.0.0.1:8000/examples/default/inde
> > 200 -1 http://web2py.co
> > 400 400 http://web2py.com/boo
> > 400 500 http://127.0.0.1:8000/welcome/default/inde
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/controllers/default.p
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/views/default/index.htm
> > 200 -1 http://www.web2py.co
> >
> > This is the normal result
> > 200 500 http://127.0.0.1:8000/welcome/default/user/login
> > 200 500 http://127.0.0.1:8000/welcome/default/user/register
> > 200 500 http://127.0.0.1:8000/welcome/default/user/request_reset_password
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 200 500 http://127.0.0.1:8000/welcome/default/index
> > 200 500 http://127.0.0.1:8000/admin/default/design/welcome
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/controllers/default.py
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/default/index....
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/layout.html
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/static/base.css
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/db.py
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/menu.py
> > 200 500 http://127.0.0.1:8000/welcome/appadmin/index
> > 200 500 http://127.0.0.1:8000/admin/default/index
> > 200 200 http://127.0.0.1:8000/examples/default/index
> > 200 200 http://web2py.com
> > 200 500 http://web2py.com/book
> > 200 500 http://127.0.0.1:8000/welcome/default/index
> > 400 500 http://127.0.0.1:8000/welcome/default/index#
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/controllers/default.py
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/views/default/index....
> > 200 200 http://www.web2py.com
> >
> > So when is a URL valid ?
> >
> > thanks,
> > Stef
> >
> > > On Aug 21, 7:32 am, Stef Mientki <stef.mien...@gmail.com> wrote:
> > >>> Graphical representation of links or pages that don't get linked to.
> > >> I tried to test the links (with 2 algorithms, code below) in a generated
> > >> webpage, but the results I get are very weird.
> > >> Probably one of you knows a better way ?
> > >>
> > >> cheers,
> > >> Stef
> > >>
> > >> from BeautifulSoup import BeautifulSoup
> > >> from urllib import urlopen
> > >> from httplib import HTTP
> > >> from urlparse import urlparse
> > >>
> > >> def Check_URL_1 ( URL ) :
> > >>     try:
> > >>         fh = urlopen ( URL )
> > >>         return fh.code == 200
> > >>     except :
> > >>         return False
> > >>
> > >> def Check_URL_2 ( URL ) :
> > >>     p = urlparse ( URL )
> > >>     h = HTTP ( p[1] )
> > >>     h.putrequest ( 'HEAD', p[2] )
> > >>     h.endheaders()
> > >>     if h.getreply()[0] == 200:
> > >>         return True
> > >>     else:
> > >>         return False
> > >>
> > >> def Verify_Links ( URL ) :
> > >>     Parts = URL.split('/')
> > >>     Site = '/'.join ( Parts [:3] )
> > >>     Current = '/'.join ( Parts [:-1] )
> > >>
> > >>     fh = urlopen ( URL )
> > >>     lines = fh.read ()
> > >>     fh.close()
> > >>
> > >>     Soup = BeautifulSoup ( lines )
> > >>     hrefs = lines = Soup.findAll ( 'a' )
> > >>
> > >>     for href in hrefs :
> > >>         href = href [ 'href' ] #[:-1] ## <== remove "#" to generate all errors
> > >>
> > >>         if href.startswith ( '/' ) :
> > >>             href = Site + href
> > >>         elif href.startswith ('#' ) :
> > >>             href = URL + href
> > >>         elif href.startswith ( 'http' ) :
> > >>             pass
> > >>         else :
> > >>             href = Current + href
> > >>
> > >>         try:
> > >>             fh = urllib.urlopen ( href )
> > >>         except :
> > >>             pass
> > >>         print Check_URL_1 ( href ), Check_URL_2 ( href ), href
> > >>
> > >> URL = 'http://127.0.0.1:8000/welcome/default/index'
> > >> fh = Verify_Links ( URL )
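
In case it helps the comparison, here is a rough Python 3 sketch of the same kind of check. It is my own illustration, not anything from web2py, and the names `resolve_href`, `status_code` and `report` are made up for this example. It differs from the quoted script in three ways: it builds absolute URLs with `urljoin` instead of gluing prefixes together, it drops the `#fragment` before requesting (a fragment is not meant to be sent to the server), and it returns the raw status code so that 404 and 500 can be told apart. In this sketch -1 simply means no HTTP response came back at all, which may or may not match what -1 meant in the output above.

```python
import urllib.error
import urllib.request
from urllib.parse import urldefrag, urljoin


def resolve_href(page_url, href):
    """Return the absolute URL for an href found on page_url, minus any fragment."""
    # urljoin copes with '/absolute', 'relative', '#fragment' and full 'http://...' hrefs.
    absolute = urljoin(page_url, href)
    # The part after '#' is handled by the browser, so it is removed before requesting.
    return urldefrag(absolute).url


def status_code(url, method="GET", timeout=10):
    """Return the HTTP status code for url, or -1 if no HTTP response was received."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx, but the code is exactly what we want to report.
        return error.code
    except (urllib.error.URLError, OSError):
        # Refused connection, DNS failure, timeout and similar: there is no status.
        return -1


def report(page_url, hrefs):
    """Print the GET status, the HEAD status and the resolved URL for each href."""
    for href in hrefs:
        url = resolve_href(page_url, href)
        print(status_code(url, "GET"), status_code(url, "HEAD"), url)
```

Calling something like `report('http://127.0.0.1:8000/welcome/default/index', ['/admin/default/index', 'user/login', '#top'])` would print one line per link in the same three-column layout as above. Keep in mind that `urlopen` follows redirects, so the code you see is that of the final page; a wrong URL could still look like a 200 if the server redirects it somewhere valid.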