On Tuesday, April 21, 2020 at 3:16:51 PM UTC-4, Barry Scott wrote:
> > On 21 Apr 2020, at 18:11, dc wrote:
> >
> > On Tuesday, April 21, 2020 at 12:40:25 PM UTC-4, Dieter Maurer wrote:
> >> dc wrote at 2020-4-20 14:48 -0700:
> >>> ...
> >>> I tried telnetting to the landing page, i.e. without the specific node that
> >>> requires the login. So e.g.
> >>>
> >>> Telnet thissite.oh.gov 80
> >>>
> >>> , but it returns a 400 Bad Request. Before that, the Telnet screen is
> >>> completely blank; I have to press a key before it returns the Bad
> >>> Request.
> >>>
> >>>
> >>> Roger on knowing what the site is asking for. But I don't know how to
> >>> determine that.
> >>
> >> I use `wget -S` to learn about server responses.
> >> It has the advantage (over `telnet`) of knowing the HTTP protocol.
> >
> > Sure enough, wget DOES return a lot of information. In fact, although an
> > initial response of 401 is returned, it waits for the response and finally
> > returns a 200.
> >
> > So, I guess the question finally comes down to: How do we make the
> > requests.get() wait for a response? The timeout value isn't the same thing
> > that I thought it was. So how do we tell .get() to wait 20 or 30 seconds
> > for an OK response?
>
> The way the HTTP protocol works is that you send a request and get a
> response: 1 in, 1 out.
> The response can tell you that you need to do more work, like add
> authentication data.
>
> The only use of the timeout is to allow you to give up if a response does not
> come back before you get bored waiting.
>
> In the case of the 401 you can read what it means here:
> https://httpstatuses.com/401
>
> It is then up to your code to issue a new request with the required
> authentication headers.
> The headers you got back in the first response will tell you what type of
> authentication is required: basic, digest, etc.
>
> The library you are using should be able to handle this if you provide what
> the library requires from you to do the authentication.
>
> Personally I debug stuff using the curl command. curl -v <url> shows you the
> request and the response.
> You can then add curl options to provide authentication data
> (username/password) and how to use it: --basic and --digest, for example.
>
> Oh, and the other status that needs handling is a 302 redirect. This allows a
> web site to move a page and tell you the new location. Again, you have to
> allow your library to do this for you.
>
> Barry
>
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >
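
To make the "1 in, 1 out" cycle described above concrete, here is a minimal
sketch of how a client might decide what to do after a single response. The
function name next_action, the action labels, and the example.org URL are all
made up for illustration; nothing here comes from requests or any other
library, and a real client has more cases to handle.

```python
def next_action(status, headers):
    """Decide what to do after ONE HTTP response.

    Hypothetical helper: returns an (action, detail) tuple.
    `headers` is a plain dict here, so header names are case-sensitive.
    """
    if status == 401:
        # The server wants credentials; this header lists what it accepts.
        return ("authenticate", headers.get("WWW-Authenticate"))
    if status in (301, 302, 303, 307, 308):
        # The page has moved; the new address is in the Location header.
        return ("redirect", headers.get("Location"))
    if 200 <= status < 300:
        # Success, no further request is needed.
        return ("done", None)
    # Anything else is treated as a failure in this sketch.
    return ("give_up", None)


# Each call represents one request/response pair.
print(next_action(401, {"WWW-Authenticate": "Negotiate, NTLM"}))
print(next_action(302, {"Location": "https://example.org/new"}))  # made-up URL
print(next_action(200, {}))
```

Waiting longer never turns the 401 into a 200; a second request carrying
credentials does. With requests, the auth= argument to requests.get() is where
the credentials go, redirects are followed by default for GET, and timeout=
only limits how long to wait for each response.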
Barry,

Thanks. I'm starting to get a bigger picture now.

So I really do need to raise the status in order to get the headers. I had put
this in originally, but then thought it wasn't necessary.

So in the case of this particular site, if I understand correctly, I would be
using the NTLM to decide which type of authentication to follow up with
(I think).

Content-Length: 1293
Content-Type: text/html
WWW-Authenticate: Negotiate, NTLM

--
https://mail.python.org/mailman/listinfo/python-list
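
As a rough sketch of reading those headers: the WWW-Authenticate value lists
the authentication schemes the server offers, and a small helper can pull the
names out. The helper name auth_schemes is hypothetical, and the parsing is
deliberately simple: it assumes no commas inside quoted parameter values, so
treat it as an illustration and not a full parser. The dict below stands in
for the headers of a real response.

```python
def auth_schemes(header_value):
    """Return the scheme names offered in a WWW-Authenticate header value.

    Hypothetical helper; simplified parsing that splits on commas.
    """
    schemes = []
    for part in header_value.split(","):
        # The scheme name is the first word of each comma-separated piece.
        token = part.strip().split(" ", 1)[0]
        # Skip empty pieces and parameters such as realm="..." or nonce="...".
        if token and "=" not in token:
            schemes.append(token)
    return schemes


# Headers copied from the 401 response quoted in this message.
headers = {
    "Content-Length": "1293",
    "Content-Type": "text/html",
    "WWW-Authenticate": "Negotiate, NTLM",
}

offered = auth_schemes(headers["WWW-Authenticate"])
print(offered)  # ['Negotiate', 'NTLM']
```

As far as I know, requests makes the same information available as
response.headers on any response, including a 401, whether or not
raise_for_status() is called. Here the server offers Negotiate and NTLM, so
the follow-up request would need one of those schemes and not basic or digest.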