On Mon, 22 Jul 2013, Michael Dowling wrote:
> I've noticed that there appears to be a significant performance hit when
> using CURLOPT_READFUNCTION. This issue seems to be platform dependent, as
> I've only been able to get poor performance on Linux (Amazon Linux m1.large
> 64-bit ami-0358ce33) across multiple versions of cURL and PHP. I've not seen
> any performance issues on my Mac running PHP 5.3.15 and cURL 7.21.4.

Hi Michael,

Thanks for your email and detailed report. I'm having some trouble sorting it
all out, and the many layers of different software with unknown behaviours
don't really make things easier. Let me start out with a bunch of questions...
So this version pair has problems on Linux but not on the Mac? And if you run
another version set on your Mac, do you see the same problems? Which versions
are fine on the Mac?
> When sending PUT requests containing a 10 byte body (testing123) to a node.js
> server (others have reported issues with Jetty as well) using
> CURLOPT_READFUNCTION, the upload and download speeds returned from
> CURLINFO_SPEED_UPLOAD and CURLINFO_SPEED_DOWNLOAD are very poor: ~833 upload
> and ~1333 download.

Doing transfer performance measurements on 10 bytes is going to be very shaky
and unreliable. Send 10 million bytes or something and you can start getting
something to measure!
Also, I'm not convinced both ways will count the numbers exactly the same
internally since the postfields approach will send the body as part of the
initial request send.
I suggest you use an external measuring method!
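To illustrate what I mean, here's a rough sketch of timing a whole transfer
externally with clock_gettime() (the URL is just a placeholder, and a single
tiny POST like this is still too little - loop it and grow the payload for
real numbers):

#include <stdio.h>
#include <time.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    struct timespec start, end;
    curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8999/1");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "testing123");

    clock_gettime(CLOCK_MONOTONIC, &start);
    curl_easy_perform(curl);  /* response body goes to stdout here */
    clock_gettime(CLOCK_MONOTONIC, &end);

    printf("wall time: %.6f seconds\n",
           (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9);
    curl_easy_cleanup(curl);
  }
  return 0;
}

(Link with -lcurl, and with -lrt on older glibc versions to get
clock_gettime.)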
> I wrote a very simple test script that demonstrates the performance issue:
> https://gist.github.com/mtdowling/6059009. You'll need to have a node.js
> server running to handle the requests. I've written up a simple bash script
> that will install PHP, node.js, start the test server, and run the
> performance test: https://gist.github.com/anonymous/6059035.

I would really prefer a test case that requires nothing other than a
libcurl-using application on the client side. I don't want PHP in there; it
makes my life far too complicated and things are much harder to follow. For
the server side, we can just send the data to whatever can eat what we send
to it.
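Something as dumb as this sketch would do as the server side (plain C with
POSIX sockets, untested and without error checking; it hard-wires the 10 byte
body of this particular test and does one connection per request, so it's
only good for eyeballing, not a general HTTP server):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
  static const char resp[] =
    "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK";
  struct sockaddr_in addr;
  int on = 1;
  int s = socket(AF_INET, SOCK_STREAM, 0);

  setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(8999);
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  bind(s, (struct sockaddr *)&addr, sizeof(addr));
  listen(s, 16);

  for(;;) {
    char buf[4096];
    size_t total = 0;
    ssize_t n;
    int c = accept(s, NULL, NULL);

    /* eat the request: read until we have seen the end of the headers
       plus the 10 byte body this particular test sends */
    while((n = read(c, buf + total, sizeof(buf) - 1 - total)) > 0) {
      const char *hdr_end;
      total += (size_t)n;
      buf[total] = 0;
      hdr_end = strstr(buf, "\r\n\r\n");
      if(hdr_end && (buf + total) - (hdr_end + 4) >= 10)
        break;
    }
    write(c, resp, sizeof(resp) - 1);
    close(c);
  }
  return 0;
}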
> Thinking that this might be an issue with a specific version of cURL or PHP,
> I manually compiled different versions of PHP and cURL and ran the
> performance tests. There was no improvement using the version combination I
> had success with on my Mac or using the latest version of cURL (7.31) and
> PHP (5.5.1). This does not appear to be version dependent. Here are the
> results of that testing:
> https://github.com/guzzle/guzzle/issues/349#issuecomment-21284834

For the plain HTTP (without SSL) POST case, there's basically no difference
between the Mac and the Linux version. They run the same code. But if you saw
a machine-specific difference, then surely you'd see the same difference even
when you run other versions.
> I ran strace on the PHP script and found that using CURLOPT_POSTFIELDS
> appears to send the headers and the entire payload before receiving anything
> from the server, while CURLOPT_READFUNCTION appears to send the request
> headers, receive the response headers, and then send the body afterwards.

Yes, and that seems quite natural to me. If you send a small POST with
POSTFIELDS, you get away with fewer system calls and less checking on the
socket since everything is sent off in a single go.

When using the callback approach, we don't have the data around, so it has to
be split up into multiple writes.
> The loop used to execute the curl_multi handles is very simple and can be
> found in the test script at
> https://gist.github.com/mtdowling/6059009#file-readfuction_perf-php-L5.

It isn't exactly following best practices when it comes to using libcurl's
API, but I doubt it matters a lot in your case. (It is written to use an
older libcurl, it has no timeout in the curl_multi_select use, and if
curl_multi_exec returns something other than OK it'll busy-loop, etc.)
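For the record, here's roughly how I would drive the multi interface in plain
C with a recent libcurl (7.28.0 or later for curl_multi_wait); a sketch,
minus the easy handle setup and most error handling:

CURLM *multi = curl_multi_init();
int still_running = 0;

curl_multi_add_handle(multi, curl); /* 'curl' is an already set up
                                       easy handle */
do {
  int numfds;
  CURLMcode mc = curl_multi_perform(multi, &still_running);
  if(mc != CURLM_OK)
    break;                  /* bail out instead of busy-looping */

  /* wait for activity, but never longer than 1000 milliseconds */
  mc = curl_multi_wait(multi, NULL, 0, 1000, &numfds);
  if(mc != CURLM_OK)
    break;
} while(still_running);

curl_multi_remove_handle(multi, curl);
curl_multi_cleanup(multi);

The PHP equivalent would be to pass a timeout to curl_multi_select and to
check what curl_multi_exec returns instead of looping blindly.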
I converted your test case to plain C and used the plain easy API instead [1],
and then I had it send the POST 10000 times and measured how long it took
on my old laptop, sending the data to the curl test suite's HTTP server (which
certainly isn't in any way a fast server implementation). The response to the
request is very small, just a bunch of headers and a couple of bytes of body.
My results contradict your results quite significantly:
$ time ./debugit
real 0m9.412s
user 0m1.752s
sys 0m1.732s
$ time ./debugit 1
runs fixed string version
real 0m9.457s
user 0m1.528s
sys 0m1.712s
Roughly 1000 requests per second with both solutions.
This test ran on a dual-core 1.83GHz thing, Linux kernel 3.9.8 in 32bit mode.
curl -V:
curl 7.32.0-DEV (i686-pc-linux-gnu) libcurl/7.32.0-DEV OpenSSL/1.0.1e
zlib/1.2.8 c-ares/1.9.2-DEV libidn/1.25 libssh2/1.4.3_DEV librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3
pop3s rtmp rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IDN IPv6 Largefile NTLM NTLM_WB SSL libz
TLS-SRP
Can you modify this test code to show the differences you saw?
[1] = I chose the easy interface just out of laziness since it meant writing
less code. We can of course make it use the multi API instead to mimic your
code even more closely, but I seriously doubt it would make any performance
difference.
--
/ daniel.haxx.se
#include <stdio.h>
#include <string.h>
#include <curl/curl.h>
const char data[]="0123456789";
struct WriteThis {
  const char *readptr;
  long sizeleft;
};
static size_t
write_callback(void *contents, size_t size, size_t nmemb, void *userp)
{
  /* discard the response body, just report it all as consumed */
  (void)contents;
  (void)userp;
  return size*nmemb;
}
static size_t read_callback(void *ptr, size_t size, size_t nmemb, void *userp)
{
  struct WriteThis *pooh = (struct WriteThis *)userp;
  if(size*nmemb < sizeof(data) - 1)
    return 0;
  if(pooh->sizeleft) {
    /* copy the payload, excluding the terminating NUL byte */
    memcpy(ptr, data, sizeof(data) - 1);
    pooh->sizeleft -= (long)(sizeof(data) - 1);
    return sizeof(data) - 1;
  }
  return 0; /* no more data left to deliver */
}
static int runonce(CURL *curl,
                   struct WriteThis *p)
{
  CURLcode res;
  struct curl_slist *chunk = NULL;
  p->sizeleft = (long)strlen(data);
  /* First set the URL that is about to receive our POST. */
  curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8999/1");
  /* Now specify we want to POST data */
  curl_easy_setopt(curl, CURLOPT_POST, 1L);
  /* we want to use our own read function */
  curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_callback);
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
  /* pointer to pass to our read function */
  curl_easy_setopt(curl, CURLOPT_READDATA, p);
  /* get verbose debug output please */
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 0L);
  /* Set the expected POST size. If you want to POST large amounts of data,
     consider CURLOPT_POSTFIELDSIZE_LARGE */
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, p->sizeleft);
  /* Using POST with HTTP 1.1 implies the use of an "Expect: 100-continue"
     header, which we disable with CURLOPT_HTTPHEADER here. NOTE: if you want
     chunked transfer too, you need to combine the two headers in one list
     since you can only set a single list with CURLOPT_HTTPHEADER. A less
     good option would be to enforce HTTP 1.0, but that might also have other
     implications. */
  chunk = curl_slist_append(chunk, "Expect:");
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, chunk);
  /* Perform the request, res will get the return code */
  res = curl_easy_perform(curl);
  /* clear the custom headers from the handle and free the list again */
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, NULL);
  curl_slist_free_all(chunk);
  /* Check for errors */
  if(res != CURLE_OK)
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
  return (int)res;
}
static int runfixed(CURL *curl,
                    struct WriteThis *p)
{
  CURLcode res;
  struct curl_slist *chunk = NULL;
  (void)p; /* unused in this variant */
  /* First set the URL that is about to receive our POST. */
  curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8999/1");
  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "0123456789");
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
  /* get verbose debug output please */
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 0L);
  /* Set the expected POST size. If you want to POST large amounts of data,
     consider CURLOPT_POSTFIELDSIZE_LARGE */
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, 10L);
  /* disable the "Expect: 100-continue" header here too, so both versions
     send the same request headers */
  chunk = curl_slist_append(chunk, "Expect:");
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, chunk);
  /* Perform the request, res will get the return code */
  res = curl_easy_perform(curl);
  /* clear the custom headers from the handle and free the list again */
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, NULL);
  curl_slist_free_all(chunk);
  /* Check for errors */
  if(res != CURLE_OK)
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
  return (int)res;
}
#define LOOPS 10000

int main(int argc, char **argv)
{
  CURL *curl;
  CURLcode res;
  struct WriteThis pooh;
  int alt = 0;
  if(argc > 1) {
    alt = 1;
    printf("runs fixed string version\n");
  }
  pooh.readptr = data;
  /* In windows, this will init the winsock stuff */
  res = curl_global_init(CURL_GLOBAL_DEFAULT);
  /* Check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_global_init() failed: %s\n",
            curl_easy_strerror(res));
    return 1;
  }
  /* get a curl handle */
  curl = curl_easy_init();
  if(curl) {
    int i;
    if(alt) {
      for(i = 0; i < LOOPS; i++)
        runfixed(curl, &pooh);
    }
    else {
      for(i = 0; i < LOOPS; i++)
        runonce(curl, &pooh);
    }
    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  curl_global_cleanup();
  return 0;
}
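For completeness: assuming the libcurl development files are installed, the
program above should build and run with something like

  cc debugit.c -o debugit -lcurl
  time ./debugit       # read callback version
  time ./debugit 1     # fixed string version

with a server listening on port 8999 (the sink sketch from earlier in this
mail works).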