Hi,

On Fri, Nov 30, 2012 at 8:19 PM, Philip Martin
<philip.mar...@wandisco.com> wrote:
> Stefan Küng <tortoise...@gmail.com> writes:
>
>> Here's how to reproduce:
>>
>> $ svn co https://tortoisesvn.googlecode.com/svn/trunk/src/Resources/tools tools
>>
>> get the file here:
>> https://skydrive.live.com/redir?resid=D000F60A347E5B37!11352
>> and replace the one in 'tools' with this one.
>
> I can reproduce locally by importing tools into a local repository,
> checking out, replacing the file and attempting the commit.  That is
> using serf 1.1.x.  Using serf trunk the commit goes into a loop.
>

I see the same problem in a local repository. With some extra logging I
see that one of the delta windows isn't handled correctly by the server:

This is svn trunk with serf:
write_handler window: {sview_offset = 102400, sview_len = 102400, tview_len = 102400, num_ops = 55, src_ops = 27, ops->action = svn_txdelta_new, new_data = 0x15cbc28}
write_handler window: {sview_offset = 204800, sview_len = 102400, tview_len = 102400, num_ops = 143, src_ops = 71, ops->action = svn_txdelta_new, new_data = 0x15c0028}
write_handler window: {sview_offset = 307200, sview_len = 102400, tview_len = 102400, num_ops = 23, src_ops = 11, ops->action = svn_txdelta_new, new_data = 0x15be428}
write_handler window: {sview_offset = 0, sview_len = 0, tview_len = 102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new, new_data = 0x17e8028}

This is svn 1.7.7 with neon:
write_handler window: {sview_offset = 102400, sview_len = 102400, tview_len = 102400, num_ops = 55, src_ops = 27, ops->action = svn_txdelta_new, new_data = 0x15cbc28}
write_handler window: {sview_offset = 204800, sview_len = 102400, tview_len = 102400, num_ops = 143, src_ops = 71, ops->action = svn_txdelta_new, new_data = 0x15c0028}
write_handler window: {sview_offset = 307200, sview_len = 102400, tview_len = 102400, num_ops = 23, src_ops = 11, ops->action = svn_txdelta_new, new_data = 0x15be428}
write_handler window: {sview_offset = 0, sview_len = 0, tview_len = 102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new, new_data = 0x17e8028}
...

The core issue seems to have been introduced in r1390435 as part of the
svndiff optimizations.

The attached patch fixes the issue for me. I don't know how it affects
other parts of the code, so review is appreciated. The patch still
contains logging, so it is not meant to be applied directly!

> As far as I can tell the problem is the client causing mod_dav_svn to
> SEGV (serf trunk keep retrying and causing multiple SEGVs).
> The mod_dav_svn stack trace isn't very useful, I'll need a httpd debug
> build:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fe2c42e7700 (LWP 31534)]
> 0x00007fe2c98245cc in apr_brigade_cleanup () from /usr/lib/libaprutil-1.so.0
> (gdb) bt
> #0  0x00007fe2c98245cc in apr_brigade_cleanup ()
>    from /usr/lib/libaprutil-1.so.0
> #1  0x00007fe2c75258bf in ?? () from /usr/lib/apache2/modules/mod_dav.so
> #2  0x00007fe2c7528960 in ?? () from /usr/lib/apache2/modules/mod_dav.so
> #3  0x00007fe2c9ee51f0 in ap_run_handler ()
> #4  0x00007fe2c9ee563b in ap_invoke_handler ()
> #5  0x00007fe2c9ef5448 in ap_process_request ()
> #6  0x00007fe2c9ef2308 in ?? ()
> #7  0x00007fe2c9eebbb0 in ap_run_process_connection ()
> #8  0x00007fe2c9efb55d in ?? ()
> #9  0x00007fe2c960f597 in ?? () from /usr/lib/libapr-1.so.0
> #10 0x00007fe2c93cbb50 in start_thread (arg=<optimized out>)
>     at pthread_create.c:304
> #11 0x00007fe2c9115a7d in clone ()
>     at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #12 0x0000000000000000 in ?? ()
>
> I'd guess it's memory corruption in the server.

Well, besides the client seemingly sending incorrect svndiff windows, the
server should not crash. I got the following stack trace from httpd in
the debugger:

Out of memory - terminating application.

Program received signal SIGABRT, Aborted.
0x00007fff88cd7ce2 in __pthread_kill ()
(gdb) bt
#0  0x00007fff88cd7ce2 in __pthread_kill ()
#1  0x00007fff8381f7d2 in pthread_kill ()
#2  0x00007fff83810a7a in abort ()
#3  0x00000001011ef651 in abort_on_pool_failure (retcode=12) at pool.c:55
#4  0x000000010030e290 in apr_palloc ()
#5  0x00000001012067c7 in svn_stringbuf_create_ensure (blocksize=12804161111182623672, pool=0x100a72428) at string.c:329
#6  0x0000000101206867 in svn_stringbuf_ncreate (bytes=0x1017dd035 "??", size=12804161111182623667, pool=0x100a72428) at string.c:346
#7  0x0000000101199dbe in write_handler (baton=0x100a048b8, buffer=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
#8  0x00000001012011fa in svn_stream_write (stream=0x100a04900, data=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at stream.c:162
#9  0x000000010102d30f in write_stream (stream=0x1009c8ba8, buf=0x1009bfc48, bufsize=2048) at repos.c:2892
#10 0x00000001007969d4 in dav_handler ()
#11 0x0000000100001cd6 in ap_invoke_handler ()
#12 0x0000000100021433 in ap_process_request ()
#13 0x000000010001eb50 in ap_process_http_connection ()
#14 0x000000010000da28 in ap_process_connection ()
#15 0x0000000100027219 in child_main ()
#16 0x000000010002696a in make_child ()
#17 0x000000010002600b in ap_mpm_run ()
#18 0x0000000100007139 in main ()
(gdb) frame 7
#7  0x0000000101199dbe in write_handler (baton=0x100a048b8, buffer=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
886           db->buffer =
(gdb) p *len
$1 = 2048
(gdb) p remaining
$2 = 12804161111182623667
..
(gdb) p db->buffer->data
$5 = 0xe4d8c0d9ec42b70f <Address 0xe4d8c0d9ec42b70f out of bounds>

Looks like the db->buffer struct is overwritten with data, thereby
invalidating the db->buffer->data pointer.

A third issue is that serf is either segfaulting or retrying when the
server aborts the connection due to this segfault. I'll look into this
further.

Lieven
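
P.S. On the point that the server should not crash on bad input: below is
a minimal sketch of a length check done before allocating. It is not the
attached patch and not Subversion code. The helper names are hypothetical,
and the 102400-byte cap is an assumption taken from the tview_len values
in the logs above. A check like this would only turn the abort into a
recoverable error; it would not fix whatever overwrites db->buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cap, for illustration only: matches the tview_len of 102400
   seen in the write_handler logs. Not taken from the Subversion source. */
#define MAX_PLAUSIBLE_LEN UINT64_C(102400)

/* Hypothetical helper: returns 1 if a decoded length is small enough
   to be a real window length, 0 otherwise. */
static int length_is_plausible(uint64_t len)
{
  return len <= MAX_PLAUSIBLE_LEN;
}

/* Hypothetical helper: allocate len bytes only if len passes the check.
   Returns NULL for implausible lengths, so the caller can report an
   error instead of letting the allocator abort the process. */
static void *alloc_checked(uint64_t len)
{
  if (!length_is_plausible(len))
    return NULL;
  return malloc((size_t)len);
}
```

With the values from the gdb session, alloc_checked(2048) goes through to
malloc, while alloc_checked(12804161111182623667) returns NULL without
attempting the allocation.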