Evgeny Kotkov <evgeny.kot...@visualsvn.com> writes:

> In the meanwhile, apparently, there is an oversight in the core V1 patch
> (3-reduce-syscalls-for-buffered-writes-v1.patch.txt):
>
> If the buffer is not empty, and the caller issues a write with a chunk
> that slightly exceeds the buffer size, for example, 4100 bytes, it would
> result in flushing the buffer _and_ writing the remaining couple of bytes
> with WriteFile().  An appropriate thing to do here would be to flush the
> buffer, but put the few remaining bytes into the buffer, instead of writing
> them to disk and thus making an unnecessary syscall.
>
> With all this in mind, I will prepare the V2 version of the core patch.

I have attached the V2 version of the core patch.

To solve the aforementioned oversight in the V1 patch, I implemented the
optimization in a slightly different manner, by keeping the existing loop
but also handling a condition where we can write the remaining data with
a single WriteFile() call.  Apart from this, the V2 patch also includes an
additional test, test_small_and_large_writes_buffered().

Speaking of the discussed idea of adjusting the strategy to avoid a memcpy()
with the write pattern like

    WriteFile(13)
    WriteFile(86267)
    ...

instead of

    WriteFile(4096)
    WriteFile(82185)
    ...

I preferred to keep the latter approach which keeps the minimum size of
the WriteFile() chunk equal to 4096, for the following reasons:

  - My benchmarks do not show that the alternative pattern that also avoids
    a memcpy() results in a quantifiable performance improvement.

  - The existing code had a property that all WriteFile() calls, except
    for maybe the last one, were performed in chunks with sizes that are
    never less than 4096.  Although I can't reproduce this in my environment,
    it could be that introducing a possibility of writing in smaller chunks
    would result in the decreased performance with specific hardware, non-
    file handles or in situations when the OS write caching is disabled.
    Therefore, currently, I think that it would be better to keep this
    existing property.

  - Probably, implementing the first approach would result in a bit more
    complex code, as I think that it would require introducing an additional
    if-else code path.

The log message is included in the beginning of the patch file.


Thanks,
Evgeny Kotkov
Win32: Improve apr_file_write() performance on buffered files by reducing
the amount of WriteFile() calls for large writes.

Previously, writing has been implemented with a loop that keeps copying the
data to the internal 4KB buffer and writing this buffer to disk by calling
WriteFile(4096).  This patch reduces the amount of syscalls for large writes
by performing them with a single syscall, if possible.  If the buffer is
not empty at the moment when the large write occurs, it is first filled
up to its 4KB capacity, flushed, and the remaining part of the data is
written with a single syscall.

* file_io/win32/readwrite.c
  (write_buffered): Within the write loop, check if we have a situation
   with an empty buffer and a large chunk pending to be written.  In this
   case, bypass the buffering and write the remaining chunk with a single
   syscall.  Return an appropriate number of written bytes to satisfy
   the apr_file_write() function contract.
  (apr_file_write): Adjust call to write_buffered().

* test/testfile.c
  (test_large_write_buffered,
   test_two_large_writes_buffered,
   test_small_and_large_writes_buffered,
   test_write_buffered_spanning_over_bufsize): New tests.
  (testfile): Run the new tests.

Patch by: Evgeny Kotkov <evgeny.kotkov {at} visualsvn.com>

Index: file_io/win32/readwrite.c
===================================================================
--- file_io/win32/readwrite.c   (revision 1805607)
+++ file_io/win32/readwrite.c   (working copy)
@@ -290,9 +290,8 @@ static apr_status_t write_helper(HANDLE filehand,
 }
 
 static apr_status_t write_buffered(apr_file_t *thefile, const char *buf,
-                                   apr_size_t len)
+                                   apr_size_t len, apr_size_t *pwritten)
 {
-    apr_size_t blocksize;
     apr_status_t rv;
 
     if (thefile->direction == 0) {
@@ -306,20 +305,41 @@ static apr_status_t write_buffered(apr_file_t *the
         thefile->direction = 1;
     }
 
-    rv = 0;
-    while (rv == 0 && len > 0) {
-        if (thefile->bufpos == thefile->bufsize)   /* write buffer is full */
+    *pwritten = 0;
+
+    while (len > 0) {
+        if (thefile->bufpos == thefile->bufsize) { /* write buffer is full */
             rv = apr_file_flush(thefile);
+            if (rv) {
+                return rv;
+            }
+        }
+        /* If our buffer is empty, and we cannot fit the remaining chunk
+         * into it, write the chunk with a single syscall and return.
+         */
+        if (thefile->bufpos == 0 && len > thefile->bufsize) {
+            apr_size_t written;
 
-        blocksize = len > thefile->bufsize - thefile->bufpos ?
-                    thefile->bufsize - thefile->bufpos : len;
-        memcpy(thefile->buffer + thefile->bufpos, buf, blocksize);
-        thefile->bufpos += blocksize;
-        buf += blocksize;
-        len -= blocksize;
+            rv = write_helper(thefile->filehand, buf, len, &written);
+            thefile->filePtr += written;
+            *pwritten += written;
+            return rv;
+        }
+        else {
+            apr_size_t blocksize = len;
+
+            if (blocksize > thefile->bufsize - thefile->bufpos) {
+                blocksize = thefile->bufsize - thefile->bufpos;
+            }
+            memcpy(thefile->buffer + thefile->bufpos, buf, blocksize);
+            thefile->bufpos += blocksize;
+            buf += blocksize;
+            len -= blocksize;
+            *pwritten += blocksize;
+        }
     }
 
-    return rv;
+    return APR_SUCCESS;
 }
 
 APR_DECLARE(apr_status_t) apr_file_write(apr_file_t *thefile, const void *buf, 
apr_size_t *nbytes)
Index: test/testfile.c
===================================================================
--- test/testfile.c     (revision 1805607)
+++ test/testfile.c     (working copy)
@@ -1446,6 +1446,188 @@ static void test_append(abts_case *tc, void *data)
     apr_file_close(f2);
 }
 
+static void test_large_write_buffered(abts_case *tc, void *data)
+{
+    apr_status_t rv;
+    apr_file_t *f;
+    const char *fname = "data/testlarge_write_buffered.dat";
+    apr_size_t len;
+    apr_size_t bytes_written;
+    apr_size_t bytes_read;
+    char *buf;
+    char *buf2;
+
+    apr_file_remove(fname, p);
+
+    rv = apr_file_open(&f, fname,
+                       APR_FOPEN_CREATE | APR_FOPEN_WRITE | APR_FOPEN_BUFFERED,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for writing", rv);
+
+    /* Test a single large write. */
+    len = 80000;
+    buf = apr_palloc(p, len);
+    memset(buf, 'a', len);
+    rv = apr_file_write_full(f, buf, len, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, (int)len, (int)bytes_written);
+    apr_file_close(f);
+
+    rv = apr_file_open(&f, fname, APR_FOPEN_READ,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for reading", rv);
+
+    buf2 = apr_palloc(p, len + 1);
+    rv = apr_file_read_full(f, buf2, len + 1, &bytes_read);
+    ABTS_INT_EQUAL(tc, APR_EOF, rv);
+    ABTS_INT_EQUAL(tc, (int)len, (int)bytes_read);
+    ABTS_TRUE(tc, memcmp(buf, buf2, len) == 0);
+    apr_file_close(f);
+
+    apr_file_remove(fname, p);
+}
+
+static void test_two_large_writes_buffered(abts_case *tc, void *data)
+{
+    apr_status_t rv;
+    apr_file_t *f;
+    const char *fname = "data/testtwo_large_writes_buffered.dat";
+    apr_size_t len;
+    apr_size_t bytes_written;
+    apr_size_t bytes_read;
+    char *buf;
+    char *buf2;
+
+    apr_file_remove(fname, p);
+
+    rv = apr_file_open(&f, fname,
+                       APR_FOPEN_CREATE | APR_FOPEN_WRITE | APR_FOPEN_BUFFERED,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for writing", rv);
+
+    /* Test two consecutive large writes. */
+    len = 80000;
+    buf = apr_palloc(p, len);
+    memset(buf, 'a', len);
+
+    rv = apr_file_write_full(f, buf, len / 2, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, (int)(len / 2), (int)bytes_written);
+
+    rv = apr_file_write_full(f, buf, len / 2, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, (int)(len / 2), (int)bytes_written);
+
+    apr_file_close(f);
+
+    rv = apr_file_open(&f, fname, APR_FOPEN_READ,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for reading", rv);
+
+    buf2 = apr_palloc(p, len + 1);
+    rv = apr_file_read_full(f, buf2, len + 1, &bytes_read);
+    ABTS_INT_EQUAL(tc, APR_EOF, rv);
+    ABTS_INT_EQUAL(tc, (int) len, (int)bytes_read);
+    ABTS_TRUE(tc, memcmp(buf, buf2, len) == 0);
+    apr_file_close(f);
+
+    apr_file_remove(fname, p);
+}
+
+static void test_small_and_large_writes_buffered(abts_case *tc, void *data)
+{
+    apr_status_t rv;
+    apr_file_t *f;
+    const char *fname = "data/testtwo_large_writes_buffered.dat";
+    apr_size_t len;
+    apr_size_t bytes_written;
+    apr_size_t bytes_read;
+    char *buf;
+    char *buf2;
+
+    apr_file_remove(fname, p);
+
+    rv = apr_file_open(&f, fname,
+                       APR_FOPEN_CREATE | APR_FOPEN_WRITE | APR_FOPEN_BUFFERED,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for writing", rv);
+
+    /* Test small write followed by a large write. */
+    len = 80000;
+    buf = apr_palloc(p, len);
+    memset(buf, 'a', len);
+
+    rv = apr_file_write_full(f, buf, 5, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, 5, (int)bytes_written);
+
+    rv = apr_file_write_full(f, buf, len - 5, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, (int)(len - 5), (int)bytes_written);
+
+    apr_file_close(f);
+
+    rv = apr_file_open(&f, fname, APR_FOPEN_READ,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for reading", rv);
+
+    buf2 = apr_palloc(p, len + 1);
+    rv = apr_file_read_full(f, buf2, len + 1, &bytes_read);
+    ABTS_INT_EQUAL(tc, APR_EOF, rv);
+    ABTS_INT_EQUAL(tc, (int) len, (int)bytes_read);
+    ABTS_TRUE(tc, memcmp(buf, buf2, len) == 0);
+    apr_file_close(f);
+
+    apr_file_remove(fname, p);
+}
+
+static void test_write_buffered_spanning_over_bufsize(abts_case *tc, void 
*data)
+{
+    apr_status_t rv;
+    apr_file_t *f;
+    const char *fname = "data/testwrite_buffered_spanning_over_bufsize.dat";
+    apr_size_t len;
+    apr_size_t bytes_written;
+    apr_size_t bytes_read;
+    char *buf;
+    char *buf2;
+
+    apr_file_remove(fname, p);
+
+    rv = apr_file_open(&f, fname,
+                       APR_FOPEN_CREATE | APR_FOPEN_WRITE | APR_FOPEN_BUFFERED,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for writing", rv);
+
+    /* Test three writes than span over the default buffer size. */
+    len = APR_BUFFERSIZE + 1;
+    buf = apr_palloc(p, len);
+    memset(buf, 'a', len);
+
+    rv = apr_file_write_full(f, buf, APR_BUFFERSIZE - 1, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, APR_BUFFERSIZE - 1, (int)bytes_written);
+
+    rv = apr_file_write_full(f, buf, 2, &bytes_written);
+    APR_ASSERT_SUCCESS(tc, "write to file", rv);
+    ABTS_INT_EQUAL(tc, 2, (int)bytes_written);
+
+    apr_file_close(f);
+
+    rv = apr_file_open(&f, fname, APR_FOPEN_READ,
+                       APR_FPROT_OS_DEFAULT, p);
+    APR_ASSERT_SUCCESS(tc, "open test file for reading", rv);
+
+    buf2 = apr_palloc(p, len + 1);
+    rv = apr_file_read_full(f, buf2, len + 1, &bytes_read);
+    ABTS_INT_EQUAL(tc, APR_EOF, rv);
+    ABTS_INT_EQUAL(tc, (int)len, (int)bytes_read);
+    ABTS_TRUE(tc, memcmp(buf, buf2, len) == 0);
+    apr_file_close(f);
+
+    apr_file_remove(fname, p);
+}
+
 abts_suite *testfile(abts_suite *suite)
 {
     suite = ADD_SUITE(suite)
@@ -1495,6 +1677,10 @@ abts_suite *testfile(abts_suite *suite)
     abts_run_test(suite, test_buffer_set_get, NULL);
     abts_run_test(suite, test_xthread, NULL);
     abts_run_test(suite, test_append, NULL);
+    abts_run_test(suite, test_large_write_buffered, NULL);
+    abts_run_test(suite, test_two_large_writes_buffered, NULL);
+    abts_run_test(suite, test_small_and_large_writes_buffered, NULL);
+    abts_run_test(suite, test_write_buffered_spanning_over_bufsize, NULL);
 
     return suite;
 }

Reply via email to