>> * subversion/svnsync/main.c >> (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list >> of acceptable options for the init, sync, and copy-revprops >> subcommands. > > Missing indentation on the non-first lines. Should be in one of these forms: > > [[[ > * subversion/svnsync/main.c > (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list > of acceptable options for the init, sync, and copy-revprops subcommands. > ]]] > > [[[ > * subversion/svnsync/main.c > (svnsync_cmd_table): > Added svnsync_opt_source_encoding to the list of acceptable options for > the init, sync, and copy-revprops subcommands. > ]]]
Gmail added in the line breaks, but they aren't in my log message text file. Should I wrap the log message to a certain number of characters? >> (log_properties_normalized): Removed this obsolete function. > > In other words, you dropped the entire "count changes" feature, without > any discussion. (It's good that the log message is clear about this > change, though.) > > Please don't do that. Someone spent some time writing and debugging > this feature, you can't just come in and drop their code. If you really > think this is redundant, then start a discussion and get consensus to > drop this functionality. > > And at any rate, your "recode log messages" work needn't depend on it. > It can (IMO: should) even introduce counters for how many properties it > recoded. When I was working on my changes, I was looking for a "to UTF-8" function that would return whether it actually re-encoded the input string, but did not find one. The re-encoding function that I used, `svn_subst_translate_string`, actually converts line endings to LF as well as re-encodes from the given encoding to UTF-8, but it does not inform the caller of whether it took either action. I guess that I could write a utility function, kind of like a `strcmp`, but which ignores any differences at line endings. Unfortunately, this adds another scan through every property value that is encountered. Already there is a noticeable decrease in the performance of the modified `svnsync` as a result of calling `svn_subst_translate_string` on basically every property value, and adding an additional scan through each property value would decrease performance further. I for one do not think that the final "NOTE: Normalized ## properties to LF line endings" messages are particularly helpful because they do not say *which* properties of specific revisions were normalized. As a possible compromise, I could modify a Perl script that I wrote for the purpose of identifying all, non-ASCII property values from the GNU Nano repository to list revisions and property values that *would* require normalization and/or re-encoding. Just a thought... >> * subversion/svnsync/sync.c >> Standardized terminology by replacing "normalize" with "translate". > > Please don't do this. It makes it harder to read the diff (it obscures > the functional changes), it's a bikeshed (http://12a2a2.bikeshed.com), > and in my opinion "normalize" is the more accurate term for what you're > doing now after the patch. :-) > > Just stick to the existing terminology please. Okay. I'll change it back. >> @@ -785,7 +772,7 @@ >> { >> const char *to_url, *from_url; >> svn_ra_session_t *to_session; >> - opt_baton_t *opt_baton = b; >> + opt_baton_t *opt_baton = (opt_baton_t*)b; > > Unnecessary change. As a rule, patches shouldn't have them. Reverted. >> @@ -1625,9 +1589,9 @@ >> /* SUBCOMMAND: help */ >> static svn_error_t * >> -help_cmd(apr_getopt_t *os, void *baton, apr_pool_t *pool) >> +help_cmd(apr_getopt_t *os, void *b, apr_pool_t *pool) >> { > > Why did you rename the formal parameter? Isn't it another unnecessary > change (as I explained above) that shouldn't be randomly included in > a patch? I renamed it because all of the other functions that take a void* parameter for passing the opt_baton_t* call it `b`. It doesn't matter, though. I'll change it back. >> @@ -1982,6 +1951,8 @@ >> >> config = apr_hash_get(opt_baton.config, SVN_CONFIG_CATEGORY_CONFIG, >> APR_HASH_KEY_STRING); >> + >> + opt_baton.source_encoding = source_encoding; >> > > It seems that most other options are parsed into opt_baton.foo directly, > skipping a local variable. Okay. I'll change that. >> Index: subversion/svnsync/sync.c >> =================================================================== >> --- subversion/svnsync/sync.c (revision 995839) >> +++ subversion/svnsync/sync.c (working copy) >> @@ -45,29 +45,39 @@ >> -/* Normalize the line ending style of *STR, so that it contains only >> - * LF (\n) line endings. After return, *STR may point at a new >> - * svn_string_t* allocated from POOL. >> +/* Translate the encoding and line ending style of *STR, so that it contains >> + * only LF (\n) line endings and is encoded in UTF-8. After return, *STR may >> + * point at a new svn_string_t* allocated from POOL if *WAS_TRANSLATED. If >> + * ENCODING is not NULL, then *STR is presumed to be encoded in UTF-8. >> * >> - * *WAS_NORMALIZED is set to TRUE when *STR needed to be normalized, >> + * *WAS_TRANSLATED is set to TRUE when *STR needed to be translated, > > Do you want to maintain separate counters for "encoding had to be > changed" and "linefeeds had to be fixed", or a combined counter? > > (As I said: if you think counters should be removed completely, make > your case in a separate thread. That's orthogonal to the encoding work.) Would dev@ or users@ be better? >> @@ -501,13 +507,11 @@ >> + SVN_ERR(translate_string(&value, &was_translated, eb->prop_encoding, >> pool)); > > Please wrap to 80 characters. Fixed