Re: Getting a 401 from requests.get, but not when logging in via the browser.

dcwhatthe Tue, 21 Apr 2020 12:53:25 -0700

On Tuesday, April 21, 2020 at 3:16:51 PM UTC-4, Barry Scott wrote:
> > On 21 Apr 2020, at 18:11, dc wrote:
> > 
> > On Tuesday, April 21, 2020 at 12:40:25 PM UTC-4, Dieter Maurer wrote:
> >> dc wrote at 2020-4-20 14:48 -0700:
> >>> ...
> >>> I tried telneting the landing page, i.e. without the specific node that 
> >>> requires the login.  So e.g.
> >>> 
> >>> Telnet thissite.oh.gov 80
> >>> 
> >>> , but it returns a 400 Bad Request.  Before that, the Telnet screen is 
> >>> completely blank; I have to press a key before it returns the Bad 
> >>> Request.
> >>> 
> >>> 
> >>> Roger on knowing what the site is asking for.  But I don't know how to 
> >>> determine that.
> >> 
> >> I use `wget -S` to learn about server responses.
> >> It has the advantage (over `telnet`) of knowing the HTTP protocol.
> > 
> > Sure enough, wget DOES return a lot of information.  In fact, although an 
> > initial response of 401 is returned, it waits for the response and finally 
> > returns a 200.
> > 
> > So, I guess the question finally comes down to:  How do we make the 
> > requests.get() wait for a response?  The timeout value isn't the same thing 
> > that I thought it was.  So how do we tell .get() to wait 20 or 30 seconds 
> > for an OK response?
> 
> The way the HTTP protocol works is that you send a request and get a
> response: 1 in, 1 out.
> The response can tell you that you need to do more work, like add 
> authentication data.
> 
> The only use of the timeout is to allow you to give up if a response does not
> come back before you get bored waiting.
> 
> In the case of the 401 you can read what it means here: 
> https://httpstatuses.com/401
> 
> It is then up to your code to issue a new request with the required
> authentication headers.
> The headers you got back in the first response will tell you what type of
> authentication is required: basic, digest etc.
> 
> The library you are using should be able to handle this if you provide what
> the library requires from you to do the authentication.
> 
> Personally I debug stuff using the curl command. curl -v <url> shows you the
> request and the response.
> You can then add curl options to provide authentication data
> (username/password) and how to use it, --basic and --digest for example.
> 
> Oh and the other status that needs handling is a 302 redirect. This allows a
> web site to move a page and tell you the new location. Again you have to
> allow your library to do this for you.
> 
> Barry


Barry, Thanks.  I'm starting to get a bigger picture, now.

So I really do need to raise the status, in order to get the headers.  I had put 
this in originally, but then thought it wasn't necessary.
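
A small sketch of the "1 in, 1 out" point, to show that the 401 is itself a complete response that carries the headers. It uses only the standard library rather than requests, and a hypothetical local server (ChallengeHandler) stands in for the real site, so the names and the port here are assumptions made for illustration. With requests the same information should be available as resp.status_code and resp.headers on the returned object, whether or not raise_for_status() is called.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ChallengeHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real site: always answers 401."""

    def do_GET(self):
        body = b"login required"
        self.send_response(401)
        self.send_header("WWW-Authenticate", "Negotiate, NTLM")
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_challenge():
    """Send one request to the local stand-in; return (status, challenge)."""
    server = HTTPServer(("127.0.0.1", 0), ChallengeHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # Empty ProxyHandler so a proxy set in the environment is not used.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        try:
            with opener.open(url, timeout=5) as resp:
                return resp.status, resp.headers.get("WWW-Authenticate")
        except urllib.error.HTTPError as err:
            # urllib raises on 4xx, but the error still carries the response.
            return err.code, err.headers.get("WWW-Authenticate")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_challenge())
```

The timeout=5 only bounds how long the single request may take; it does not make the client wait for a later 200.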

So in the case of this particular site, if I understand correctly, I would be 
using the NTLM to decide which type of Authentication to follow up with (I 
think).

Content-Length:          1293
Content-Type:            text/html
WWW-Authenticate:        Negotiate, NTLM
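
To make the header above concrete, here is a minimal sketch of pulling the offered scheme names out of a WWW-Authenticate value. The helper name offered_schemes is made up for this example, and the parsing is a simplification: it blanks out quoted strings and treats any comma-separated chunk whose first word has no "=" as a scheme, which is enough for values like the one shown but is not a full parser for the header grammar.

```python
import re


def offered_schemes(header_value):
    """Return the auth scheme names found in a WWW-Authenticate value.

    Simplified: does not handle escaped quotes inside quoted strings.
    """
    # Blank out quoted strings so commas inside them do not split chunks.
    cleaned = re.sub(r'"[^"]*"', '""', header_value)
    schemes = []
    for chunk in cleaned.split(","):
        words = chunk.split()
        if not words:
            continue
        first = words[0]
        # Parameters look like name=value; a scheme name has no "=".
        if "=" not in first:
            schemes.append(first)
    return schemes


if __name__ == "__main__":
    print(offered_schemes("Negotiate, NTLM"))
```

With requests, the input would presumably come from resp.headers.get("WWW-Authenticate", "") on the 401 response. For the NTLM scheme, the third-party requests_ntlm package provides an HttpNtlmAuth class that can be passed as auth= to requests.get(); that is untested against this site.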

-- 
https://mail.python.org/mailman/listinfo/python-list
