On Mon, 22 Jul 2013, Michael Dowling wrote:

I've noticed that there appears to be a significant performance hit when using CURLOPT_READFUNCTION. This issue seems to be platform dependent as I've only been able to get poor performance on Linux (Amazon Linux m1.large 64-bit ami-0358ce33) across multiple versions of cURL and PHP. I've not seen any performance issues on my Mac running PHP 5.3.15 and cURL 7.21.4.

Hi Michael,

Thanks for your email and detailed report. I'm having some trouble sorting it all out, and the many layers of different software with unknown behaviours don't really make things easier. Let me start out with a bunch of questions...

So this version pair has problems on Linux but not on Mac? And if you run another version set on your Mac, do you get the perceived problems?

Which versions on Mac are fine?

When sending PUT requests containing a 10 byte body (testing123) to a node.js
server (others have reported issues with Jetty as well) using
CURLOPT_READFUNCTION, the read and write times returned from
CURLINFO_SPEED_UPLOAD and CURLINFO_SPEED_DOWNLOAD are very poor: ~833 upload and
1333 download.

Doing transfer performance measurements on 10 bytes is going to be very shaky and unreliable. Send 10 million bytes or something and you can start getting something to measure!

Also, I'm not convinced both approaches count those numbers exactly the same way internally, since the POSTFIELDS approach sends the body as part of the initial request.

I suggest you use an external measuring method!
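One way to measure from the outside could look like the sketch below. It takes a wall-clock reading around the whole loop instead of relying on the per-transfer speed counters. The names time_loops, work_fn and count_call are made up for this illustration and are not part of libcurl or of the attached test program; count_call only stands in for a real transfer.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#include <stdio.h>
#include <time.h>

/* Hypothetical callback type: one unit of work, e.g. a single transfer. */
typedef void (*work_fn)(void *ctx);

/* Return the wall-clock seconds spent running fn(ctx) `loops` times. */
static double time_loops(work_fn fn, void *ctx, long loops)
{
  struct timespec start, end;
  long i;

  clock_gettime(CLOCK_MONOTONIC, &start);
  for(i = 0; i < loops; i++)
    fn(ctx);
  clock_gettime(CLOCK_MONOTONIC, &end);

  return (double)(end.tv_sec - start.tv_sec) +
         (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Placeholder work item (an assumption for this sketch); a real test would
   run one complete request here instead of bumping a counter. */
static void count_call(void *ctx)
{
  long *calls = ctx;
  (*calls)++;
}
```

Calling time_loops(count_call, &calls, 10000) returns the elapsed seconds for the whole run, which can then be compared between the two upload methods. Running the program under time(1) gives the same kind of number without touching the code.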

I wrote a very simple test script that demonstrates the performance issue: https://gist.github.com/mtdowling/6059009. You'll need to have a node.js server running to handle the requests. I've written up a simple bash script that will install PHP, node.js, start the test server, and run the performance test: https://gist.github.com/anonymous/6059035.

I would really prefer a test case that requires nothing more than a libcurl-using application on the client side. I don't want PHP in there; it makes my life far too complicated and things are much harder to follow. For the server side, we can just send it to anything that will simply eat what we send to it.

Thinking that this might be an issue with a specific version of cURL or PHP, I manually compiled different versions of PHP and cURL and ran the performance tests. There was no improvement using the version combination I had success with on my mac or using the latest version of cURL (7.31) and PHP (5.5.1). This does not appear to be version dependent. Here are the results of that testing: https://github.com/guzzle/guzzle/issues/349#issuecomment-21284834

For the plain HTTP (without SSL) POST case, there's basically no difference between the Mac and the Linux version. They run the same code. But if you saw a machine-specific difference, then surely you'd see the same difference even when you run other versions.

I ran strace on the PHP script and found that using CURLOPT_POSTFIELDS appears to send the headers and the entire payload before receiving anything from the server, while CURLOPT_READFUNCTION appears to send the request headers, receive the response headers, then sends the body afterwards.

Yes, and that seems quite natural to me. If you send a small POST with POSTFIELDS, you get away with fewer system calls and less checking on the socket, as everything is sent off in a single go.

When using the callback approach, we don't have the data around so it has to be split up in multiple writes.
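To illustrate why the callback case ends up in pieces, here is a minimal sketch. read_cb uses the same prototype as a CURLOPT_READFUNCTION callback, while struct upload and drain() are hypothetical names for this example only. drain() is a stand-in for "somebody pulling the body through the callback" and does not claim to show how libcurl does it internally.

```c
#include <stddef.h>
#include <string.h>

/* Upload state handed to the callback through the user pointer. */
struct upload {
  const char *data;  /* next byte to deliver */
  size_t left;       /* bytes not yet delivered */
};

/* Same prototype as a CURLOPT_READFUNCTION callback: fill `buffer` with at
   most size*nitems bytes and return the count, or 0 when nothing is left. */
static size_t read_cb(char *buffer, size_t size, size_t nitems, void *userp)
{
  struct upload *up = userp;
  size_t room = size * nitems;
  size_t n = up->left < room ? up->left : room;

  memcpy(buffer, up->data, n);
  up->data += n;
  up->left -= n;
  return n;
}

/* Hypothetical stand-in for the caller side: pull the body in pieces of at
   most `bufsize` bytes and count how many separate pieces were produced. */
static size_t drain(struct upload *up, char *out, size_t bufsize,
                    size_t *pieces)
{
  size_t total = 0, n;

  *pieces = 0;
  while((n = read_cb(out + total, 1, bufsize, up)) > 0) {
    total += n;
    (*pieces)++;
  }
  return total;
}
```

The body is only known once the callback has been asked for it, so the request headers necessarily go out before the first piece is available. With a fixed string, the headers and the body can be handed over together.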

The loop used to execute the curl_multi handles is very simple and can be found in the test script at https://gist.github.com/mtdowling/6059009#file-readfuction_perf-php-L5.

It isn't exactly following best practices when it comes to using libcurl's API, but I doubt it matters a lot in your case. (It is written to use older libcurl, it has no timeout in the curl_multi_select use, and if curl_multi_exec returns something other than OK it'll busy-loop, etc.)

I converted your test case to plain C and used the plain easy API instead [1], and then I had it send the POST 10000 times and measured how long it took on my old laptop, sending the data to the curl test suite's HTTP server (which certainly isn't in any way a fast server implementation). The response to the request is very small, just a bunch of headers and a couple of bytes of body.

My results contradict your results quite significantly:

$ time ./debugit

real    0m9.412s
user    0m1.752s
sys     0m1.732s

$ time ./debugit 1
runs fixed string version

real    0m9.457s
user    0m1.528s
sys     0m1.712s

Roughly 1000 requests per second with both solutions.
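For reference, the arithmetic behind that figure, as a small sketch (requests_per_second is a made-up helper name, not something in the test program): 10000 requests in 9.412 seconds is about 1062 per second, and 10000 in 9.457 seconds is about 1057 per second.

```c
/* Requests per second for `loops` transfers that took `seconds` in total.
   Hypothetical helper, only here to show the calculation. */
static double requests_per_second(long loops, double seconds)
{
  return (double)loops / seconds;
}
```

The two results differ by roughly half a percent, which is well within what I would expect as noise for a run like this.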

This test ran on a dual-core 1.83GHz thing, Linux kernel 3.9.8 in 32bit mode.

curl -V:
curl 7.32.0-DEV (i686-pc-linux-gnu) libcurl/7.32.0-DEV OpenSSL/1.0.1e zlib/1.2.8 c-ares/1.9.2-DEV libidn/1.25 libssh2/1.4.3_DEV librtmp/2.3 Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smtp smtps telnet tftp Features: AsynchDNS Debug TrackMemory IDN IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP

Can you modify this test code to show the differences you saw?

[1] = I chose the easy interface just out of laziness since it meant I had to write less code. We can of course make it use the multi API instead to make it mimic your code even closer, but I seriously doubt it will make any performance difference.

--

 / daniel.haxx.se
#include <stdio.h>
#include <string.h>
#include <curl/curl.h>

const char data[]="0123456789";

struct WriteThis {
  const char *readptr;
  long sizeleft;
};

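/* Discard the response body; returning size*nmemb tells libcurl that all
   of it was handled. */
static size_t
write_callback(void *contents, size_t size, size_t nmemb, void *userp)
{
  (void)contents;
  (void)userp;
  return size*nmemb;
}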

static size_t read_callback(void *ptr, size_t size, size_t nmemb, void *userp)
{
  struct WriteThis *pooh = (struct WriteThis *)userp;

  if(size*nmemb < sizeof(data))
    return 0;

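  if(pooh->sizeleft) {
    /* copy the 10 body bytes, not the terminating NUL that sizeof() counts */
    memcpy(ptr, data, sizeof(data) - 1);
    pooh->sizeleft -= (long)(sizeof(data) - 1);
    return sizeof(data) - 1;
  }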

  return 0;                          /* no more data left to deliver */
}

static int runonce(CURL *curl,
                   struct WriteThis *p)
{
  CURLcode res;

  p->sizeleft = (long)strlen(data);

  /* First set the URL that is about to receive our POST. */
  curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8999/1");

  /* Now specify we want to POST data */
  curl_easy_setopt(curl, CURLOPT_POST, 1L);

  /* we want to use our own read function */
  curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_callback);

  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);

  /* pointer to pass to our read function */
  curl_easy_setopt(curl, CURLOPT_READDATA, p);

  /* keep verbose debug output off; set to 1L to enable it */
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 0L);

  /* Set the expected POST size. If you want to POST large amounts of data,
     consider CURLOPT_POSTFIELDSIZE_LARGE */
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, p->sizeleft);
  
  /*
    Using POST with HTTP 1.1 implies the use of a "Expect: 100-continue"
    header.  You can disable this header with CURLOPT_HTTPHEADER as usual.
    NOTE: if you want chunked transfer too, you need to combine these two
    since you can only set one list of headers with CURLOPT_HTTPHEADER. */
  
  /* A less good option would be to enforce HTTP 1.0, but that might also
     have other implications. */
  {
    struct curl_slist *chunk = NULL;

    chunk = curl_slist_append(chunk, "Expect:");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, chunk);

    /* Perform the request, res will get the return code */
    res = curl_easy_perform(curl);

    /* detach and free the list so it does not leak on every iteration */
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, NULL);
    curl_slist_free_all(chunk);
  }

  /* Check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
    return 1;
  }
  return 0;
}


static int runfixed(CURL *curl,
                    struct WriteThis *p)
{
  CURLcode res;

  /* First set the URL that is about to receive our POST. */
  curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8999/1");

  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "0123456789");

  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);

  /* keep verbose debug output off; set to 1L to enable it */
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 0L);

  /* Set the expected POST size. If you want to POST large amounts of data,
     consider CURLOPT_POSTFIELDSIZE_LARGE */
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, 10L);
  
  /*
    Using POST with HTTP 1.1 implies the use of a "Expect: 100-continue"
    header.  You can disable this header with CURLOPT_HTTPHEADER as usual.
    NOTE: if you want chunked transfer too, you need to combine these two
    since you can only set one list of headers with CURLOPT_HTTPHEADER. */
  
  /* A less good option would be to enforce HTTP 1.0, but that might also
     have other implications. */
  {
    struct curl_slist *chunk = NULL;

    chunk = curl_slist_append(chunk, "Expect:");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, chunk);

    /* Perform the request, res will get the return code */
    res = curl_easy_perform(curl);

    /* detach and free the list so it does not leak on every iteration */
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, NULL);
    curl_slist_free_all(chunk);
  }

  /* Check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
    return 1;
  }
  return 0;
}

#define LOOPS 10000

int main(int argc, char **argv)
{
  CURL *curl;
  CURLcode res;
  struct WriteThis pooh;
  int alt=0;

  if(argc > 1) {
    alt = 1;
    printf("runs fixed string version\n");
  }

  pooh.readptr = data;

  /* In windows, this will init the winsock stuff */
  res = curl_global_init(CURL_GLOBAL_DEFAULT);
  /* Check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_global_init() failed: %s\n",
            curl_easy_strerror(res));
    return 1;
  }

  /* get a curl handle */
  curl = curl_easy_init();
  if(curl) {
    int i;
    if(alt) {
      for(i=0; i< LOOPS; i++)
        runfixed(curl, &pooh);
    }
    else {
      for(i=0; i< LOOPS; i++)
        runonce(curl, &pooh);
    }

    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  curl_global_cleanup();
  return 0;
}
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
