On Thu, May 21, 2020 at 8:58 PM James Read <jamesread5...@gmail.com> wrote:
>
> On Thu, May 21, 2020 at 4:18 PM Dan Fandrich via curl-library
> <curl-library@cool.haxx.se> wrote:
>
>> On Thu, May 21, 2020 at 03:46:33PM +0100, James Read via curl-library wrote:
>> > I'm implementing a simple web crawler with curl and want to retrieve the
>> > Last-Modified header so I can implement a sensible recrawl policy. I've
>> > found https://curl.haxx.se/libcurl/c/getinfo.html which is a nice easy
>> > way to retrieve the Content-Type header. Is there a similarly easy way to
>> > retrieve the Last-Modified header? Or do I need to parse the header myself?
>> >
>> > If I need to parse the header myself, I found
>> > https://curl.haxx.se/libcurl/c/sepheaders.html which prints headers to a
>> > file. Is there a way of just storing the headers in memory so I can parse
>> > them there? I don't want to have to write a file just to read it again.
>>
>> You can use that example as a basis, then set CURLOPT_HEADERFUNCTION with a
>> function like WriteMemoryCallback() in the getinmemory.c example to store the
>> headers in memory instead. Or, do something more intelligent since you're
>> only interested in a single header. libcurl writes to a file by default, so
>> by setting your own header callback function you can process them however
>> you want.
>>
>
> OK, this is as far as I got:
>
> static size_t
> write_cb(void *contents, size_t size, size_t nmemb, void *p)
> {
>   ConnInfo *conn = (ConnInfo *)p;
>   size_t realsize = size * nmemb;
>   /* keep the old pointer so the buffer is not leaked if realloc fails */
>   void *grown = realloc(conn->data, conn->size + realsize + 1);
>
>   if (grown == NULL) {
>     /* out of memory! */
>     printf("not enough memory (realloc returned NULL)\n");
>     return 0;
>   }
>   conn->data = grown;
>
>   memcpy(&(conn->data[conn->size]), contents, realsize);
>   conn->size += realsize;
>   conn->data[conn->size] = 0;
>
>   return realsize;
> }
>
> When I print out conn->data it just prints out the body. How do I get the
> header?

I forgot to add:

curl_easy_setopt(conn->easy, CURLOPT_HEADERDATA, conn);

It's working now. Thanks.

>> Dan
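
For anyone finding this thread later: setting only CURLOPT_HEADERDATA works because, when no CURLOPT_HEADERFUNCTION is set, libcurl hands header lines to the write callback, so headers and body end up in the same buffer. If only Last-Modified is wanted, a dedicated header callback along the lines Dan suggested may be simpler. The following is a minimal sketch, not code from this thread: HeaderInfo, has_prefix_nocase() and header_cb() are hypothetical names, and the 64-byte buffer size is an arbitrary assumption. The block itself uses only the C standard library so it compiles without libcurl; the libcurl calls appear in the trailing comment.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transfer state (an assumption, not from the thread). */
typedef struct {
  char last_modified[64]; /* stays an empty string until the header is seen */
} HeaderInfo;

/* Return 1 if the first bytes of line (len bytes long, not NUL-terminated)
   match prefix, ignoring ASCII case. */
static int has_prefix_nocase(const char *line, size_t len, const char *prefix)
{
  size_t plen = strlen(prefix);
  size_t i;

  if(len < plen)
    return 0;
  for(i = 0; i < plen; i++) {
    if(tolower((unsigned char)line[i]) != tolower((unsigned char)prefix[i]))
      return 0;
  }
  return 1;
}

/* Has the signature libcurl expects for CURLOPT_HEADERFUNCTION. libcurl
   calls it once per header line; the line is not NUL-terminated. */
static size_t header_cb(char *buffer, size_t size, size_t nitems,
                        void *userdata)
{
  HeaderInfo *info = (HeaderInfo *)userdata;
  static const char name[] = "last-modified:";
  size_t len = size * nitems;
  size_t start = sizeof(name) - 1;
  size_t end = len;
  size_t n;

  if(has_prefix_nocase(buffer, len, name)) {
    /* skip whitespace after the colon */
    while(start < end && (buffer[start] == ' ' || buffer[start] == '\t'))
      start++;
    /* drop the trailing CRLF and any trailing whitespace */
    while(end > start && (buffer[end - 1] == '\r' || buffer[end - 1] == '\n' ||
                          buffer[end - 1] == ' ' || buffer[end - 1] == '\t'))
      end--;
    /* copy the value, truncating if it does not fit */
    n = end - start;
    if(n >= sizeof(info->last_modified))
      n = sizeof(info->last_modified) - 1;
    memcpy(info->last_modified, buffer + start, n);
    info->last_modified[n] = '\0';
  }
  return len; /* any other return value makes libcurl abort the transfer */
}

/* Usage sketch (needs <curl/curl.h>; 'easy' is an existing CURL * handle):
 *
 *   HeaderInfo info = { "" };
 *   curl_easy_setopt(easy, CURLOPT_HEADERFUNCTION, header_cb);
 *   curl_easy_setopt(easy, CURLOPT_HEADERDATA, &info);
 */
```

If the timestamp is all that is needed, it may also be worth trying CURLOPT_FILETIME (set to 1L) together with curl_easy_getinfo() and CURLINFO_FILETIME, which reports the remote document time as a number, or -1 when the server did not provide one.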
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html