danielsh had two comments on IRC, https://colabti.org/irclogger/irclogger_log/svn-dev?date=2021-08-09, which I will try to address here. I'm taking the liberty of copying from the IRC log to get some context.
Den mån 9 aug. 2021 kl 03:12 skrev Daniel Sahlberg < daniel.l.sahlb...@gmail.com>: > Hi, > > I've been researching issue #4711 [1] and, as requested by danielsh in > JIRA, I'm turning to dev@ to discuss a possible solution. > > As already found by danielsh, the problem lies in the filtering code in > svn_cl__log_entry_receiver_xml. If the log entry doesn't match the > search pattern, the code will return without emitting any opening tag. > However when the library calls back to close a tag, it is done without > checking if the opening tag was emitted or not. > > Without filtering, the tree is recursive. I have made a test case where > I have three branches, a, b and c. I've made commits to a, which I have > subsequently merged to b and then from b to c. The XML is as follows: > > [[[ > <logentry revision="12"> > <author>dsahlberg</author> > <date>2021-08-08T19:29:38.411358Z</date> > <msg>merge b to c</msg> > <logentry reverse-merge="false" revision="11"> > <author>dsahlberg</author> > <date>2021-08-08T19:29:26.563564Z</date> > <msg>merge a to b</msg> > <logentry revision="10" reverse-merge="false"> > <author>dsahlberg</author> > <date>2021-08-08T19:29:10.860725Z</date> > <msg>add 1 file</msg> > </logentry> > </logentry> > </logentry> > ]]] > > I'm proposing to keep the XML tree intact, but filter out the > information that doesn't match the search pattern. If searching for "add > 1 file", the result would be as follows: > > [[[ > <logentry revision="12"> > <logentry revision="11"> > <logentry revision="10" reverse-merge="false"> > <author>dsahlberg</author> > <date>2021-08-08T19:29:10.860725Z</date> > <msg>add 1 file</msg> > </logentry> > </logentry> > </logentry> > ]]] > danielsh: 1. Re your "I'm proposing", how would with/without XML compare to each other? Cf. site/tools/ compiling XML output into plain output. (It seems I didn't keep my test repo, but I hope the examples should be readable anyway). For readability, I have reformatted the XML with proper indentation: [[[ dsahlberg@daniel-2020:~/temp/wc/c$ ~/svn_trunk_4711/subversion/svn/svn log --use-merge-history --search 'merge a' ------------------------------------------------------------------------ r5 | dsahlberg | 2021-08-09 02:31:34 +0200 (Mon, 09 Aug 2021) | 1 line Merged via: r6 merge a to b ------------------------------------------------------------------------ dsahlberg@daniel-2020:~/temp/wc/c$ ~/svn_trunk_4711/subversion/svn/svn log --use-merge-history --search 'merge a' --xml <?xml version="1.0" encoding="UTF-8"?> <log> <logentry revision="6"> <logentry reverse-merge="false" revision="5"> <author>dsahlberg</author> <date>2021-08-09T00:31:34.969440Z</date> <msg>merge a to b</msg> </logentry> </logentry> </log> ]]] In plain text output the inheritance is show by the "Merged via:" line, in XML is found by examining the tree. I checked site/tools/upcoming.py. It doesn't use "-g" (which is responsible for generating the recursive tree) so it expects all logentry elements to be children of <log>, ie, it would not consider revision 5 above. Any tool that is using "-g" should already be able to traverse a tree with logentry elements that are children of other logentry elements (which could be children of other logentry elements which ...). > The most simple solution is to add the opening tag (<logentry > revision="n">) whenever a log entry has children: > [[[ > > Index: subversion/svn/log-cmd.c > =================================================================== > --- subversion/svn/log-cmd.c (revision 1892053) > +++ subversion/svn/log-cmd.c (working copy) > @@ -552,6 +552,7 @@ > return SVN_NO_ERROR; > } > > + revstr = apr_psprintf(pool, "%ld", log_entry->revision); > /* Match search pattern before XML-escaping. */ > if (lb->search_patterns && > ! match_search_patterns(lb->search_patterns, author, date, message, > @@ -559,6 +560,9 @@ > { > if (log_entry->has_children) > { > + svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "logentry", > + "revision", revstr, SVN_VA_NULL); > + SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout)); > if (! lb->merge_stack) > lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(svn_revnum_t)); > > @@ -575,7 +579,6 @@ > if (message) > message = svn_xml_fuzzy_escape(message, pool); > > - revstr = apr_psprintf(pool, "%ld", log_entry->revision); > /* <logentry revision="xxx"> */ > if (lb->merge_stack && lb->merge_stack->nelts > 0) > { > > ]]] > > The problem with this solution is that it might emit empty XML elements, > like this: > > [[[ > > <logentry revision="6"> > </logentry> > ]]] > > Depending on how the XML parser is built and how the information is > used/displayed this may or may not be a problem. As far as I understand, > most (if not all) tags are optional so it should not be an issue that > there are no children (for example <msg></msg>). But it may be that an > empty revision 6 is displayed in this case. > > A more complex solution would require keeping track of the XML tags that > should be emitted if a child matches the search pattern. There is > already an array (lb->merge_stack) that could probably be used for this > purpose. Instead of just storing the revision number it would have to > store if the opening tag has been emitted or not. When a child matches > the search pattern, all parent nodes that has not previously been > emitted will be emitted. When the library signals to close a tag, the > code checks if the corresponding open tag has been emitted or not. > > The following is a concept for such a patch (I know HACKING has some > things to say about how to name typedefs etc, but I don't have time to > study it now, there are also whitespace issues). If this is a more > desirable solution (ie, I get some +1 (concept)) then I will cleanup the > patch and repost. > > [[[ > > Index: subversion/svn/log-cmd.c > =================================================================== > --- subversion/svn/log-cmd.c (revision 1892053) > +++ subversion/svn/log-cmd.c (working copy) > @@ -45,6 +45,11 @@ > > #include "svn_private_config.h" > > +typedef struct merge_stack { > + svn_revnum_t rev; > + svn_boolean_t emitted; > +} merge_stack_t; > + > > /*** Code. ***/ > > @@ -355,10 +360,14 @@ > { > if (log_entry->has_children) > { > + merge_stack_t ms; > + > if (! lb->merge_stack) > - lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(svn_revnum_t)); > + lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(merge_stack_t)); > > - APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = > log_entry->revision; > + ms.rev = log_entry->revision; > + ms.emitted = FALSE; > + APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms; > } > > return SVN_NO_ERROR; > @@ -435,10 +444,10 @@ > iterpool = svn_pool_create(pool); > for (i = 0; i < lb->merge_stack->nelts; i++) > { > - svn_revnum_t rev = APR_ARRAY_IDX(lb->merge_stack, i, > svn_revnum_t); > + merge_stack_t ms = APR_ARRAY_IDX(lb->merge_stack, i, > merge_stack_t); > > svn_pool_clear(iterpool); > - SVN_ERR(svn_cmdline_printf(iterpool, " r%ld%c", rev, > + SVN_ERR(svn_cmdline_printf(iterpool, " r%ld%c", ms.rev, > i == lb->merge_stack->nelts - 1 ? > '\n' : ',')); > } > @@ -475,10 +484,14 @@ > > if (log_entry->has_children) > { > + merge_stack_t ms; > + > if (! lb->merge_stack) > - lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(svn_revnum_t)); > + lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(merge_stack_t)); > > - APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision; > + ms.rev = log_entry->revision; > + ms.emitted = FALSE; > + APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms; > } > > return SVN_NO_ERROR; > @@ -533,6 +546,7 @@ > const char *author; > const char *date; > const char *message; > + int i; > > if (lb->ctx->cancel_func) > SVN_ERR(lb->ctx->cancel_func(lb->ctx->cancel_baton)); > @@ -544,11 +558,14 @@ > > if (! SVN_IS_VALID_REVNUM(log_entry->revision)) > { > - svn_xml_make_close_tag(&sb, pool, "logentry"); > - SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout)); > - if (lb->merge_stack) > - apr_array_pop(lb->merge_stack); > - > + if (lb->merge_stack) { > + if (lb->merge_stack->nelts > 0 && (APR_ARRAY_IDX(lb->merge_stack, > lb->merge_stack->nelts-1, merge_stack_t).emitted)) > + { > + svn_xml_make_close_tag(&sb, pool, "logentry"); > + SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout)); > + } > + apr_array_pop(lb->merge_stack); > + } > return SVN_NO_ERROR; > } > > @@ -559,15 +576,32 @@ > { > if (log_entry->has_children) > { > + merge_stack_t ms; > if (! lb->merge_stack) > - lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(svn_revnum_t)); > + lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(merge_stack_t)); > > - APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = > log_entry->revision; > + ms.rev = log_entry->revision; > + ms.emitted = FALSE; > + APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms; > } > > return SVN_NO_ERROR; > } > - > + else if (lb->search_patterns && lb->merge_stack) { > + /* match_search_patterns returned true */ > + for (i = 0; i < lb->merge_stack->nelts; i++) > + { > + if (!APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).emitted) > + { > + revstr = apr_psprintf(pool, "%ld", > + APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).rev); > + svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "logentry", > + "revision", revstr, SVN_VA_NULL); > + APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).emitted = TRUE; > + } > + } > + } > + > if (author) > author = svn_xml_fuzzy_escape(author, pool); > if (date) > @@ -606,7 +640,6 @@ > if (log_entry->changed_paths2) > { > apr_array_header_t *sorted_paths; > - int i; > > /* <paths> */ > svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "paths", > @@ -684,10 +717,14 @@ > > if (log_entry->has_children) > { > + merge_stack_t ms; > + > if (! lb->merge_stack) > - lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(svn_revnum_t)); > + lb->merge_stack = apr_array_make(lb->pool, 1, > sizeof(merge_stack_t)); > > - APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision; > + ms.rev = log_entry->revision; > + ms.emitted = TRUE; > + APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms; > } > else > svn_xml_make_close_tag(&sb, pool, "logentry"); > > ]]] > > I have run the testsuite with both patch versions without any problems. > (I have obviously removed XFail() from the test for #4711 in log_tests.py). > danielsh: 2. Wondering about a new svn_stream_t type that wraps another stream, caches writes until svn_stream_flush(), and has an undo_last_write() function danielsh: A 2-tuple struct may well be better :-) There is still the need to store "state" somewhere to know if we should flush() or undo(). Pseudo call-sequence (each line starting with - is a call to the receiver), I'm using the same WC as above with an additional revision 7. The calls with r-1 is the invalid revision number used to generate the </logentry> closing tag after evaluating the merge history (the -g). - receiver r7 ... match_search_patterns succeed. We want this revision. Output contents. No children so output the closing tag immediately. Flush(). - receiver r6 ... match_search_patterns fail, so we don't want this revision. It has children so output anyway (but possibly ignoring <msg> etc.). - receiver r5 (because of -g) ... match_search_patterns succeed, so we want this revision. Output contents. No children so output the closing tag immediately. Flush(). - receiver -1 (this was the end of the merge history) ... Output closing tag. Now we need to decide if to Flush() or Undo() depending on if anything was emitted since r6. (The contents of r5 has already been flush()ed but we also need to flush the </logentry> tag). The end [riding off into the sunset]. The only way I can come up with is to store this information in the merge_stack. That being said, I'm not opposed to use an undo()-able stream but I don't think it is enough by itself. Kind regards, Daniel Sahlberg