On Thu, May 21, 2020 at 8:58 PM James Read <jamesread5...@gmail.com> wrote:

>
>
> On Thu, May 21, 2020 at 4:18 PM Dan Fandrich via curl-library <
> curl-library@cool.haxx.se> wrote:
>
>> On Thu, May 21, 2020 at 03:46:33PM +0100, James Read via curl-library
>> wrote:
>> > I'm implementing a simple web crawler with curl and want to retrieve the
>> > Last-Modified header so I can implement a sensible recrawl policy. I've found
>> > https://curl.haxx.se/libcurl/c/getinfo.html which is a nice easy way to
>> > retrieve the Content-Type header. Is there a similarly easy way to retrieve the
>> > Last-Modified header? Or do I need to parse the header myself?
>> >
>> > If I need to parse the header myself I found
>> > https://curl.haxx.se/libcurl/c/sepheaders.html which prints headers to a file.
>> > Is there a way of just storing the headers in memory so I can parse them
>> > there? I don't want to have to write a file just to read it again.
>>
>> You can use that example as a basis, then set CURLOPT_HEADERFUNCTION with a
>> function like WriteMemoryCallback() in the getinmemory.c example to store the
>> headers in memory instead. Or, do something more intelligent since you're only
>> interested in a single header. libcurl writes to a file by default, so by
>> setting your own header callback function you can process them however
>> you want.
>>
>>
> OK, this is as far as I got:
>
> static size_t
> write_cb(void *contents, size_t size, size_t nmemb, void *p)
> {
>         ConnInfo *conn = (ConnInfo *)p;
>         size_t realsize = size * nmemb;
>
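>         /* use a temporary so the old buffer is not leaked if realloc fails */
>         char *tmp = realloc(conn->data, conn->size + realsize + 1);
>         if (tmp == NULL) {
>                 /* out of memory! conn->data is still valid here */
>                 printf("not enough memory (realloc returned NULL)\n");
>                 return 0;
>         }
>         conn->data = tmp;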
>
>         memcpy(&(conn->data[conn->size]), contents, realsize);
>         conn->size += realsize;
>         conn->data[conn->size] = 0;
>
>         return realsize;
> }
>
> When I print out conn->data it just prints out the body. How do I get the
> header?
>

I forgot to add:

curl_easy_setopt(conn->easy, CURLOPT_HEADERDATA, conn);

It's working now.

Thanks
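
For a crawler that only needs one header, the "something more intelligent"
option Dan mentions could look roughly like the sketch below. It is only an
illustration, not code from this thread: the struct header_info, the function
header_cb and the 64-byte buffer are hypothetical names and sizes chosen for
the example. The callback has the shape libcurl expects for
CURLOPT_HEADERFUNCTION, is called once per header line, and keeps only the
Last-Modified value instead of accumulating every header. Header lines passed
to the callback are not null-terminated, so the code works from the length.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical per-transfer state; not the ConnInfo used in the thread. */
struct header_info {
        char last_modified[64];   /* empty string if the header was not seen */
};

/* Header callback sketch: copies the value of "Last-Modified:" if present. */
static size_t
header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
        struct header_info *info = (struct header_info *)userdata;
        static const char name[] = "Last-Modified:";
        const size_t name_len = sizeof(name) - 1;
        size_t len = size * nitems;

        /* header names are case-insensitive */
        if (len > name_len && strncasecmp(buffer, name, name_len) == 0) {
                const char *value = buffer + name_len;
                size_t vlen = len - name_len;

                /* skip leading blanks */
                while (vlen > 0 && (*value == ' ' || *value == '\t')) {
                        value++;
                        vlen--;
                }
                /* strip trailing CR, LF and blanks */
                while (vlen > 0 && (value[vlen - 1] == '\r' ||
                                    value[vlen - 1] == '\n' ||
                                    value[vlen - 1] == ' '))
                        vlen--;

                /* truncate rather than overflow the fixed-size buffer */
                if (vlen >= sizeof(info->last_modified))
                        vlen = sizeof(info->last_modified) - 1;
                memcpy(info->last_modified, value, vlen);
                info->last_modified[vlen] = '\0';
        }

        /* returning anything other than len makes libcurl abort the transfer */
        return len;
}
```

It would be registered with CURLOPT_HEADERFUNCTION, with a pointer to the
struct passed through CURLOPT_HEADERDATA, the option that turned out to be
missing above. If a timestamp is enough and the raw header text is not needed,
setting CURLOPT_FILETIME to 1 and reading CURLINFO_FILETIME with
curl_easy_getinfo() after the transfer may avoid the parsing altogether.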

>
>
>> Dan
>> -------------------------------------------------------------------
>> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
>> Etiquette:   https://curl.haxx.se/mail/etiquette.html
>
>
