Tim Chase wrote:
> When you get the second page, are you getting the same content
> back that you get if you do a search in your favorite browser?
>
> Using just
>
>   content = urllib.urlopen(url2).read()
>   'Error' in content # True
>   'Friedrich' in content # False
>
> However, when you browse to the page, those two should be inverted:
>
>   'Error' in content # False
>   'Friedrich' in content # True
>
> I've tried adding in the parameters correctly via post
>
>   params = urllib.urlencode([
>     ('params.forzaQuery', 'N'),
>     ...
>     ('layout', 'busquedaisbn'),
>     ])
>   content = urllib.urlopen(url2, params).read()
>
> However, this too fails because the underlying engine expects a
> session ID in the URL. I finally got it to work with the code below:
>
>   import urllib
>
>   data = [
>     ('params.forzaQuery', 'N'),
>     ('params.cdispo', 'A'),
>     ('params.cisbnExt', '8484031128'),
>     ('params.liConceptosExt[0].texto', ''),
>     ('params.orderByFormId', '1'),
>     ('action', 'Buscar'),
>     ('language', 'es'),
>     ('prev_layout', 'busquedaisbn'),
>     ('layout', 'busquedaisbn'),
>     ]
>
>   params = urllib.urlencode(data)
>
>   url = 'http://www.mcu.es/webISBN/tituloSimpleDispatch.do;jsessionid=5E8D9A11E4A28BDF0BA6B254D0118262'
>
>   fp = urllib.urlopen(url, params)
>   content = fp.read()
>   fp.close()
>
> but I had to hard-code the jsessionid parameter in the URL. This
> would have to be determined from the initial call & response of
> the initial URL (the initial URL returns a <FORM> element with
> the URL to POST to, including this magic jsessionid parameter).
>
> Hope this helps nudge you (the OP) in the right direction to get
> what you're looking for.
>
> -tkc
>
> --
> http://mail.python.org/mailman/listinfo/python-list

OK, Tim, I think you've got the point. The jsessionid changes in every
response from the initial URL, so I need to read it and keep using it
for the rest of the session. Now I have to figure out how to do that.

Thank you very much, and thanks to Chris as well.

Kind regards,
Toni
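
One possible way to pick up the jsessionid from the first response, as Tim
describes, is sketched below. It is only a sketch under several assumptions:
it uses the Python 3 module layout (urllib.parse and urllib.request) rather
than the Python 2 urllib calls quoted above; it assumes the first <FORM> on
the initial page has a quoted action attribute containing ";jsessionid=...";
and the names find_post_url, search_isbn and initial_url are made up for
illustration. The form fields are the ones from Tim's working example.

```python
import re
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Assumption: the form tag has a quoted action attribute.
FORM_ACTION_RE = re.compile(
    r'<form\b[^>]*?\baction\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
# Assumption: the session ID is alphanumeric, as in the URL quoted above.
JSESSIONID_RE = re.compile(r';jsessionid=([0-9A-Za-z]+)', re.IGNORECASE)


def find_post_url(html, base_url):
    """Return (absolute POST URL, jsessionid) from the first <form> action."""
    match = FORM_ACTION_RE.search(html)
    if match is None:
        raise ValueError("no <form action=...> found in the page")
    action = match.group(1)
    session = JSESSIONID_RE.search(action)
    if session is None:
        raise ValueError("form action carries no jsessionid")
    # The action may be relative, so resolve it against the page URL.
    return urljoin(base_url, action), session.group(1)


def search_isbn(initial_url, isbn):
    """Fetch the form page, then POST the search to the URL it advertises."""
    with urlopen(initial_url) as fp:
        # latin-1 decodes any byte string; only the ASCII form tag matters here.
        html = fp.read().decode('latin-1')
    post_url, _session_id = find_post_url(html, initial_url)
    data = [
        ('params.forzaQuery', 'N'),
        ('params.cdispo', 'A'),
        ('params.cisbnExt', isbn),
        ('params.liConceptosExt[0].texto', ''),
        ('params.orderByFormId', '1'),
        ('action', 'Buscar'),
        ('language', 'es'),
        ('prev_layout', 'busquedaisbn'),
        ('layout', 'busquedaisbn'),
    ]
    # In Python 3 the POST body must be bytes.
    with urlopen(post_url, urlencode(data).encode('ascii')) as fp:
        return fp.read()
```

Calling search_isbn(initial_url, '8484031128') with the address of the search
form page should then repeat Tim's request with a fresh session ID each time,
provided the page really is laid out as assumed. A regular expression is used
only to keep the sketch short; an HTML parser would cope better with unusual
markup.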