On 16.06.2011 12:50, Philip Martin wrote:
Philip Martin<philip.mar...@wandisco.com>  writes:

That's using the sync option on the NFS server.  Using async

Checkout using 1.6:
Elapsed:  73s CPU: 16s
[...]

Checkout using 1.7
Elapsed: 180s CPU: 26s
By comparison the same checkout to a local disk takes about 5s elapsed
for both 1.6 and 1.7.

I tried an experiment with the update editor used by checkout.  At
present it inserts a not-present NODES row for each file in add_file()
and then replaces it with a normal NODES row in close_file().  I removed
the code that inserts the not-present row; the checkout still works
provided it runs to completion.  This change removes one transaction
per file from the checkout, and reduces the elapsed time by 27s or 15%.
This matches exactly what I discovered 3 weeks ago
but haven't yet found the time to investigate in detail.
So, take the following with a grain of salt.

My hypothesis is that we need only a single db transaction
(plus maybe one for managing the pristine store). Without
changing the editor logic, a file c/o into an empty directory
should look like this:

(1) Receive content and stream to pristine temp
(2) Move to pristine store
(3) Copy & translate to w/c temp
(4) Set flags and time stamp
(5) Add row to NODES
(6) Move from w/c temp to w/c final location
    (preserving flags and time stamp)

The above should be a valid workflow that can be interrupted
at any point without corrupting the w/c:

(before 2 is finished) w/c is locked, temps to be cleared upon cleanup
(after 2 is finished) orphaned pristine entry; I don't know whether we
  are strict about that today. It will most likely get used after the
  next w/c update
(before 5 is finished) w/c is locked, temps to be cleared upon cleanup
(after 5 is finished) w/c looks like the file had been added but got
   deleted manually; update should simply "restore" it
(6) is supposed to be atomic and non-modifying
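
To make the six steps concrete, here is a minimal sketch in Python.
It is not Subversion code: the ".wc" directory layout, the two-function
API (open_db, checkout_file) and the three-column "nodes" table are all
made up for illustration, and keyword/EOL translation in step (3) is
reduced to a plain copy. The only point is that step (5) is the single
db transaction per file.

```python
import hashlib
import os
import sqlite3
import tempfile


def open_db(path=":memory:"):
    """Open a toy metadata db. The schema is hypothetical, not wc.db's."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "relpath TEXT PRIMARY KEY, checksum TEXT, presence TEXT)"
    )
    db.commit()
    return db


def checkout_file(wc_root, db, relpath, content, mtime=1308221400):
    """Install one file into an existing directory using steps (1)-(6)."""
    pristine_dir = os.path.join(wc_root, ".wc", "pristine")
    tmp_dir = os.path.join(wc_root, ".wc", "tmp")
    os.makedirs(pristine_dir, exist_ok=True)
    os.makedirs(tmp_dir, exist_ok=True)

    # (1) Receive content and stream it to a pristine temp file.
    fd, pristine_tmp = tempfile.mkstemp(dir=tmp_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(content)
    checksum = hashlib.sha1(content).hexdigest()

    # (2) Move it into the pristine store (a rename on one filesystem).
    pristine_path = os.path.join(pristine_dir, checksum)
    os.replace(pristine_tmp, pristine_path)

    # (3) Copy and "translate" to a w/c temp file (translation omitted).
    fd, wc_tmp = tempfile.mkstemp(dir=tmp_dir)
    with os.fdopen(fd, "wb") as out, open(pristine_path, "rb") as src:
        out.write(src.read())

    # (4) Set flags and time stamp on the temp file.
    os.chmod(wc_tmp, 0o644)
    os.utime(wc_tmp, (mtime, mtime))

    # (5) Add the row to NODES: the only db transaction for this file.
    with db:
        db.execute(
            "INSERT INTO nodes (relpath, checksum, presence) VALUES (?, ?, ?)",
            (relpath, checksum, "normal"),
        )

    # (6) Move from w/c temp to the final location. A rename keeps the
    #     mode and mtime set in step (4).
    final_path = os.path.join(wc_root, relpath)
    os.replace(wc_tmp, final_path)
    return final_path
```

If the process dies before (5) commits, only files under the temp
directory (and possibly an unreferenced pristine file) are left behind;
if it dies between (5) and (6), the row exists but the working file is
missing, which is the "deleted manually" case above.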

In the editor, one of the transactions is commented with
"mark the parent as incomplete". That mark seems to be
unnecessary: In case of an interruption, the w/c will need
to be cleaned up as described above. After that, it looks
like either the file had not been sent at all or the user
removed it manually. There is no intermediate state that
needs to be tracked here.

So to get better performance on network disks we have to remove or
combine transactions.
Eliminating transactions should speed up c/o for any
type of "disk".
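
As a rough illustration of what "removing a transaction" means here,
the following Python/SQLite sketch compares the two-transactions-per-file
pattern (not-present row in add_file(), replaced in close_file()) with a
single insert per file. The table, the function names and the use of
INSERT OR REPLACE are assumptions for the sake of the example and do not
reflect the real wc.db schema or the update editor's code. Both variants
end up with the same rows; only the number of commits differs, and each
commit is what costs a round trip to the (network) disk.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for the NODES table.
SCHEMA = "CREATE TABLE nodes (relpath TEXT PRIMARY KEY, presence TEXT)"


def new_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    db.commit()
    return db


def checkout_two_txn_per_file(db, paths):
    """Insert a not-present row, then replace it: two commits per file."""
    commits = 0
    for p in paths:
        with db:  # corresponds to the transaction in add_file()
            db.execute("INSERT INTO nodes VALUES (?, 'not-present')", (p,))
        commits += 1
        with db:  # corresponds to the transaction in close_file()
            db.execute("INSERT OR REPLACE INTO nodes VALUES (?, 'normal')", (p,))
        commits += 1
    return commits


def checkout_one_txn_per_file(db, paths):
    """Insert the normal row directly: one commit per file."""
    commits = 0
    for p in paths:
        with db:
            db.execute("INSERT INTO nodes VALUES (?, 'normal')", (p,))
        commits += 1
    return commits


def final_rows(db):
    return db.execute(
        "SELECT relpath, presence FROM nodes ORDER BY relpath"
    ).fetchall()
```

To see the effect on elapsed time, pass a file path on the disk of
interest to new_db() instead of ":memory:" and time both functions
over a few thousand paths.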

-- Stefan^2.
