Write callback function when following HTTP redirections

Nicolas Roeser via curl-library Mon, 15 Apr 2019 08:41:49 -0700

Hello again!

I am still writing my code in PHP, but looking at the code of the PHP curl extension, I have found my question to be a general libcurl question; this is why I am writing to this list.

My code uses a write callback function, a header callback function, and a progress callback function. The latter may cause a download to be canceled (see my earlier question in thread <mid:20190407213813.gd11...@imap.uni-ulm.de>). I have enabled CURLOPT_FOLLOWLOCATION and CURLOPT_HEADER. (The code also enables CURLOPT_RETURNTRANSFER, but that is specific to the curl extension of PHP, and merely causes the internal buffer to be output if there is no error.) The write callback function appends the received data to a dynamic buffer.

I would like to parse the received data (or the first part of it) even if the download has been aborted.

My problem is that I do not know where the boundary between header and body is if the download has been aborted. To make things worse, I have the feeling that it may be difficult to properly detect.


With the prerequisites listed above, consider the following scenario:
1. client sends an HTTP GET request to the server,

2. server responds with 3xx, Location header field, no Content-Length, and a body with chunked transfer coding,

3. client reads the chunked body and then follows the redirection,

4. server responds with 200 and sends a huge document (which _might_ contain parts that look like message/http content 😉),

5. client starts reading the resource, but aborts after a certain amount of bytes.

I would like to clear the receive buffer each time the client starts reading a new resource. But I am not sure when this can safely be done. From the man pages for CURLOPT_WRITEFUNCTION and CURLOPT_HEADERFUNCTION, I can see that while the header callback function is called once per header line (to simplify their handling), the write callback function may be called with big blocks of data. So I assume that it is _not_ safe to clear the receive buffer as soon as I see an HTTP status-line.


Guessing the start of the document is not an option of course.

I first thought that I might disable CURLOPT_HEADER and handle some headers differently from what is done now. But this seems not to help with my problem of identifying when to clear my receive buffer as long as CURLOPT_FOLLOWLOCATION is on.

Now I am a bit lost, and assume that I am missing something here. This is why I would like to ask for help:

How can I extract the body of my target resource which has been partially received? Are the man pages or my interpretation of them too strict? Do I need to switch to a completely different approach?

I have a feeling that the write callback function will never be called with data from two HTTP responses at once (that is, will never cross redirections). Is my guess correct? If yes, is this guaranteed/will this stay?
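If that guess holds, and if CURLOPT_HEADER is switched off so that only body data reaches the write callback, the buffer could be cleared from the header callback instead of from the write callback. The following is a minimal sketch in C (the language of libcurl itself; PHP callbacks would mirror it), not a confirmed answer. The names `struct recv_state`, `on_header` and `on_body` are made up for illustration. The sketch also assumes that libcurl hands a response's status line to the header callback before any of that response's body reaches the write callback, which is exactly the open question here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical receive state; name and layout are illustrative only. */
struct recv_state {
    char  *body;      /* body bytes of the response currently being read */
    size_t len;       /* number of valid bytes in body */
    size_t cap;       /* allocated size of body */
    long   responses; /* number of status lines seen so far */
};

/* Same signature as a CURLOPT_HEADERFUNCTION callback, which receives
 * one header line per call. */
static size_t on_header(char *buffer, size_t size, size_t nitems, void *userdata)
{
    struct recv_state *st = userdata;
    size_t n = size * nitems;

    /* A line starting with "HTTP/" is taken to be a status line, that is,
     * the start of a new response (the redirect or its target). */
    if (n >= 5 && memcmp(buffer, "HTTP/", 5) == 0) {
        st->len = 0;  /* forget the body of the previous response */
        st->responses++;
    }
    return n;         /* a different return value aborts the transfer */
}

/* Same signature as a CURLOPT_WRITEFUNCTION callback. ASSUMPTION:
 * CURLOPT_HEADER is off, so only body data arrives here, in blocks of
 * arbitrary size, and never mixed across two responses. */
static size_t on_body(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct recv_state *st = userdata;
    size_t n = size * nmemb;

    if (n == 0)
        return 0;     /* nothing to store; 0 equals the amount passed */

    if (st->len + n > st->cap) {
        size_t cap = st->cap ? st->cap : 4096;
        while (cap < st->len + n)
            cap *= 2;
        char *p = realloc(st->body, cap);
        if (p == NULL)
            return 0; /* short count makes libcurl abort the transfer */
        st->body = p;
        st->cap = cap;
    }
    memcpy(st->body + st->len, ptr, n);
    st->len += n;
    return n;
}
```

In a real program the two functions would be registered with CURLOPT_HEADERFUNCTION and CURLOPT_WRITEFUNCTION, with a pointer to the same `struct recv_state` passed via CURLOPT_HEADERDATA and CURLOPT_WRITEDATA. After an aborted transfer, `body` and `len` would then hold whatever part of the last response's body had arrived. Text inside a body that merely looks like a status line never reaches `on_header`, so it cannot trigger a reset.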


Cheers
--
Nico

Nicolas Roeser
kiz – Information Systems Department, Ulm University