New write performance optimizations coming

Damien Katz Thu, 23 Jun 2011 15:49:54 -0700

Hi everyone,

As many of you know, Filipe and I have been working on improving 
performance, especially write performance [1]. This work has been public in the 
Couchbase GitHub account since the beginning, and the changes that are not 
Couchbase specific are now isolated in [2] and [3].
In [3] there’s an Erlang module that tests performance when writing and 
updating batches of documents concurrently; we used it, among other tools, to 
measure the performance gains. This module bypasses the network stack and the 
JSON parsing, so it shows more directly how significant the changes in 
couch_file, couch_db and couch_db_updater are.


The main and most important change is asynchronous writes. The file module no 
longer blocks callers until the write calls complete. Instead it immediately 
replies to the caller with the position in the file where the data is going to 
be written. The data is then sent from the couch_file gen_server to a dedicated 
loop process that continuously writes the data it receives to disk (batching 
when possible). This allows callers (such as the db updater) to issue write 
calls and keep doing other work (preparing documents, etc.) while the writes 
are done in parallel. After issuing all the writes, callers simply call the new 
‘flush’ function in the couch_file gen_server, which blocks the caller until 
everything has actually been written to disk. Normally this flush call either 
does not block the caller at all or blocks it only for a very short period.
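
To make the idea concrete, here is a minimal sketch of the same pattern in Python rather than Erlang. It is not the couch_file code: the class name AsyncFile and its methods are made up for illustration, a thread and a queue stand in for the gen_server and the writer loop process, and an fsync is left out.

```python
import os
import queue
import threading


class AsyncFile:
    """Append-only file whose writes are done by a background thread.

    Hypothetical illustration only; not the couch_file implementation.
    """

    def __init__(self, path):
        self._fd = open(path, "ab")
        self._eof = os.path.getsize(path)  # next position to hand out
        self._lock = threading.Lock()
        self._pending = queue.Queue()
        threading.Thread(target=self._write_loop, daemon=True).start()

    def append(self, data):
        """Reserve a position for data and return it without waiting for I/O."""
        with self._lock:
            pos = self._eof
            self._eof += len(data)
            # Enqueue under the lock so queue order matches reserved positions.
            self._pending.put(data)
        return pos

    def flush(self):
        """Block until every chunk queued so far has been written."""
        self._pending.join()

    def _write_loop(self):
        while True:
            batch = [self._pending.get()]  # wait for at least one chunk
            try:
                while True:  # then batch whatever else is already queued
                    batch.append(self._pending.get_nowait())
            except queue.Empty:
                pass
            self._fd.write(b"".join(batch))
            self._fd.flush()  # hand the data to the OS; no fsync in this sketch
            for _ in batch:
                self._pending.task_done()
```

A caller would issue several append calls, keep preparing the next documents, and call flush once before relying on the data being in the file. Because the positions are handed out before the bytes are written, they are only safe to read back after flush returns.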

There are other changes, such as avoiding 2 btree lookups per document ID 
(COUCHDB-1084 [4]), faster sorting in the updater (O(n log n) vs O(n^2)), and 
avoiding sorting already sorted lists in the updater.
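
The last point can be sketched as follows, in Python for brevity. This is an illustration of the idea only, not the updater code; the function names and the (id, body) tuple layout are assumptions.

```python
def is_sorted(items, key):
    """Return True if items are already in non-decreasing key order (one pass)."""
    return all(key(a) <= key(b) for a, b in zip(items, items[1:]))


def sort_docs_by_id(docs):
    """Sort (doc_id, body) pairs by id, skipping the sort when not needed."""
    by_id = lambda doc: doc[0]
    if is_sorted(docs, by_id):
        return docs  # already ordered: reuse the list as is
    return sorted(docs, key=by_id)  # O(n log n) comparison sort
```

The check costs one linear pass, which is cheap next to a full sort when clients already send documents ordered by ID.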

Checking whether attachments are compressible was also moved into a new 
module/process. We found this took a lot of CPU time when all or most of the 
documents to write/update have attachments: building the regexps and matching 
against them for every single attachment is surprisingly expensive.

There’s also a new couch_db:update_doc/s flag named ‘optimistic’, which 
changes the behaviour to write the document bodies before entering the updater 
and to skip some attachment-related checks (duplicate names, for example). This 
flag is not yet exposed in the HTTP API, but it could be, for example via an 
X-Optimistic-Write header in the doc PUT/POST and _bulk_docs requests. We’ve 
seen this help when the client knows that the documents to write don’t yet 
exist in the database and we aren’t already IO bound, such as when SSDs are 
used.

We used relaximation, Filipe’s basho bench based tests [5] and the Erlang test 
module mentioned before [6, 7], exposed via HTTP. Some benchmark results 
follow.


# Using the Erlang test module (test output)

## 1Kb documents, 10 concurrent writers, batches of 500 docs

trunk before snappy was added:

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":270071}

trunk:  

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":157328}

trunk + async writes (and snappy):

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":121518}

## 2.5Kb documents, 10 concurrent writers, batches of 500 docs

trunk before snappy was added:

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":507098}

trunk:

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":230391}

trunk + async writes (and snappy):

{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":190151}
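
For readers who want to compare these runs, a small Python sketch can turn the result lines into throughput figures. It assumes that "total" is the number of documents written and that "total_time_ms" is wall-clock time for the whole run; the function names are made up for the example, and the two lines used are the 1Kb trunk and async results above.

```python
import json


def docs_per_second(result_line):
    """Throughput for one test-module result line (a JSON object)."""
    result = json.loads(result_line)
    return result["total"] / (result["total_time_ms"] / 1000.0)


def speedup(baseline_line, candidate_line):
    """How many times faster the candidate run is than the baseline run."""
    return docs_per_second(candidate_line) / docs_per_second(baseline_line)


# 1Kb documents, copied from the output above.
TRUNK_1KB = '{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":157328}'
ASYNC_1KB = '{"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":121518}'

if __name__ == "__main__":
    print("trunk docs/s:", round(docs_per_second(TRUNK_1KB), 1))
    print("async docs/s:", round(docs_per_second(ASYNC_1KB), 1))
    print("speedup:", round(speedup(TRUNK_1KB, ASYNC_1KB), 2))
```

The same two functions work unchanged on the 2.5Kb lines.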


# basho bench tests, via the public HTTP APIs

## batches of 1 1Kb docs, 50 writers, 5 minutes run

trunk:     147 702 docs written
branch:  149 534 docs written

## batches of 10 1Kb docs, 50 writers, 5 minutes run

trunk:     878 520 docs written
branch:  991 330 docs written

## batches of 100 1Kb docs, 50 writers, 5 minutes run

trunk:    1 627 600 docs written
branch: 1 865 800 docs written

## batches of 1 2.5Kb docs, 50 writers, 5 minutes run

trunk:    142 531 docs written
branch: 143 012 docs written

## batches of 10 2.5Kb docs, 50 writers, 5 minutes run

trunk:     724 880 docs written
branch:   780 690 docs written

## batches of 100 2.5Kb docs, 50 writers, 5 minutes run

trunk:      1 028 600 docs written
branch:   1 152 800 docs written


# basho bench tests, via the internal Erlang APIs
## batches of 100 2.5Kb docs, 50 writers, 5 minutes run

trunk:    3 170 100 docs written
branch: 3 359 900 docs written
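
The relative gain in each of these runs is simple arithmetic on the two counts. A short Python sketch, with the counts copied from the results above and a made-up helper name:

```python
def gain_percent(trunk_docs, branch_docs):
    """Extra documents written by the branch, as a percentage of trunk."""
    return (branch_docs - trunk_docs) / trunk_docs * 100.0


# (description, trunk, branch), all 50 writers, 5 minute runs.
RUNS = [
    ("HTTP, 1 x 1Kb", 147702, 149534),
    ("HTTP, 10 x 1Kb", 878520, 991330),
    ("HTTP, 100 x 1Kb", 1627600, 1865800),
    ("HTTP, 1 x 2.5Kb", 142531, 143012),
    ("HTTP, 10 x 2.5Kb", 724880, 780690),
    ("HTTP, 100 x 2.5Kb", 1028600, 1152800),
    ("Erlang, 100 x 2.5Kb", 3170100, 3359900),
]

if __name__ == "__main__":
    for name, trunk, branch in RUNS:
        print(f"{name}: {gain_percent(trunk, branch):+.2f}%")
```

Printed side by side, the single-document runs gain far less than the batched ones.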


# Relaximation tests

1Kb docs:

http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b83002a1a

2.5Kb docs:

http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b830022c0

4Kb docs:

http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b8300330d


All the documents used for these tests can be found at:  
https://github.com/fdmanana/basho_bench_couch/tree/master/couch_docs


Now some view indexing tests.

# indexer_test_2 database 
(http://fdmanana.couchone.com/_utils/database.html?indexer_test_2)

## trunk

$ time curl http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}

real    20m51.388s
user    0m0.040s
sys     0m0.000s


## branch async writes

$ time curl http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}

real    15m17.908s
user    0m0.008s
sys     0m0.020s


# indexer_test_3 database 
(http://fdmanana.couchone.com/_utils/database.html?indexer_test_3)

## trunk

$ time curl http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}

real    21m17.346s
user    0m0.012s
sys     0m0.028s

## branch async writes

$ time curl http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}

real    16m28.558s
user    0m0.012s
sys     0m0.020s
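
The "real" times above can be compared with a few lines of Python. This is only a convenience sketch: the helper names are made up, and the parser handles just the XmY.YYYs form that time prints here.

```python
import re


def parse_real(text):
    """Convert a time(1) value such as '20m51.388s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def reduction_percent(before, after):
    """Percentage of indexing time saved going from before to after."""
    b, a = parse_real(before), parse_real(after)
    return (b - a) / b * 100.0


if __name__ == "__main__":
    # (database, trunk, async writes branch), copied from the runs above.
    for db, trunk, branch in [
        ("indexer_test_2", "20m51.388s", "15m17.908s"),
        ("indexer_test_3", "21m17.346s", "16m28.558s"),
    ]:
        print(db, f"{reduction_percent(trunk, branch):.1f}% less time")
```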

We don’t see nearly as large an improvement in the single write per request 
benchmarks as we do with bulk writes. This is due to the HTTP request overhead 
and our own inefficiencies at that layer. We still have lots of room for 
optimization at the networking layer.

We’d like to merge this code into trunk next week, by Wednesday. Please 
respond with any improvements, objections or comments by then. Thanks!

-Damien


[1] - http://blog.couchbase.com/driving-performance-improvements-couchbase-single-server-two-dot-zero
[2] - https://github.com/fdmanana/couchdb/compare/async_file_writes_no_test
[3] - https://github.com/fdmanana/couchdb/compare/async_file_writes
[4] - https://issues.apache.org/jira/browse/COUCHDB-1084
[5] - https://github.com/fdmanana/basho_bench_couch
[6] - https://github.com/fdmanana/couchdb/blob/async_file_writes/gen_load.sh
[7] - https://github.com/fdmanana/couchdb/blob/async_file_writes/src/couchdb/couch_internal_load_gen.erl
