On Tue, 2010-11-30, Danny Trebbien wrote:
> Attached is version 4 of the patch and corresponding log message that
> address Daniel Shahaf's feedback from November 9.
> 
> Per a message from Julian
> (http://article.gmane.org/gmane.comp.version-control.subversion.devel/124073)
> and Daniel Shahaf
> (http://article.gmane.org/gmane.comp.version-control.subversion.devel/124082),
> I have removed the changes to optimize translate_newline() for now.

Thanks, Danny.

This looks fine.

I added doc strings to the new static functions stream_translated() and
translate_cstring().  I clarified in all doc strings whether
TRANSLATED_EOL is set to FALSE or untouched if there is no translation.
(In some cases you documented these more in the log message than in the
code :-)

Just about to commit this ... FAIL: autoprop_tests.py 15, blame_tests.py
7, commit_tests.py 40, ...

subst.c' line 668: assertion failed (STRING_IS_EOL(eol_str,
eol_str_len))

Whassup?  Attaching my version, as I'm out of time tonight.  Hope I
didn't mess it up.  I was testing tr...@1042741 (patched as attached),
svnserve and FSFS, in case it matters.
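
For anyone who wants to poke at the new assertion outside the test suite,
here is a minimal standalone sketch. The two macros are copied from the patch
below; the helper names eol_assert_would_pass(), eol_differs() and
eol_self_check() are hypothetical and exist only for this illustration. One
case that may be worth checking (an assumption, not a confirmed diagnosis):
create_translation_baton() stores an EOL length of 0 when eol_str is NULL, and
STRING_IS_EOL() is false for a length of 0.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the patch to subst.c. */
#define STRING_IS_EOL(str, str_len) \
  (((str_len) == 2 &&  (str)[0] == '\r' && (str)[1] == '\n') || \
   ((str_len) == 1 && ((str)[0] == '\n' || (str)[0] == '\r')))

#define DIFFERENT_EOL_STRINGS(eol1, eol1_len, eol2, eol2_len) \
  (((eol1_len) != (eol2_len)) || (*(eol1) != *(eol2)))

/* Hypothetical helper: returns 1 if SVN_ERR_ASSERT(STRING_IS_EOL(...))
   would pass for this string and length, 0 if it would fail.  A length
   of 0 short-circuits both branches, so the string is never read. */
int eol_assert_would_pass(const char *eol_str, size_t eol_str_len)
{
  return STRING_IS_EOL(eol_str, eol_str_len) ? 1 : 0;
}

/* Hypothetical helper: returns 1 if the two EOL strings differ.  Like the
   macro, it assumes both arguments are "\n", "\r" or "\r\n". */
int eol_differs(const char *a, size_t a_len, const char *b, size_t b_len)
{
  return DIFFERENT_EOL_STRINGS(a, a_len, b, b_len) ? 1 : 0;
}

/* Hypothetical self-check covering the cases discussed above. */
void eol_self_check(void)
{
  assert(eol_assert_would_pass("\n", 1));
  assert(eol_assert_would_pass("\r", 1));
  assert(eol_assert_would_pass("\r\n", 2));
  assert(!eol_assert_would_pass(NULL, 0));   /* the "no EOL string" case */
  assert(eol_differs("\r\n", 2, "\n", 1));
  assert(!eol_differs("\n", 1, "\n", 1));
}
```

Calling eol_self_check() from a small main() is enough to exercise it. If a
caller reaches translate_newline() with no EOL string at all, the first
assertion would fire in this sketch, so that call path may be a place to look.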

Cheers,
- Julian


Add a public API function, svn_subst_translate_string2(), an extension of
svn_subst_translate_string(), that has two additional output parameters for
determining whether re-encoding and/or line ending translation were performed.

As discussed at:
  <http://thread.gmane.org/gmane.comp.version-control.subversion.devel/122550>
  <http://thread.gmane.org/gmane.comp.version-control.subversion.devel/123020>

The essential changes are to the translate_newline() function, which now takes
an svn_boolean_t pointer, the value at which is set to TRUE if the pointer is
non-NULL and a different newline is written out. Most other changes are to pass
the svn_boolean_t pointer through to translate_newline().
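
The flag contract described above can be sketched in isolation as follows.
This is an illustration under stated assumptions, not the Subversion code:
demo_boolean_t, demo_note_newline(), demo_translate_to_lf() and
demo_self_check() are hypothetical stand-ins, and the comparison uses memcmp()
rather than the patch's DIFFERENT_EOL_STRINGS macro. The low-level function
only ever sets the flag to TRUE (or leaves it untouched), a NULL pointer means
"caller does not care", and the outer function resets the flag to FALSE once
before translating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int demo_boolean_t;  /* hypothetical stand-in for svn_boolean_t */

/* Stand-in for translate_newline(): pretend EOL_STR was written in place of
   NEWLINE_BUF, and set *TRANSLATED_EOL to 1 only if the two differ.
   Otherwise leave *TRANSLATED_EOL untouched.  TRANSLATED_EOL may be NULL. */
void demo_note_newline(const char *eol_str, size_t eol_str_len,
                       const char *newline_buf, size_t newline_len,
                       demo_boolean_t *translated_eol)
{
  if (translated_eol != NULL
      && (eol_str_len != newline_len
          || memcmp(eol_str, newline_buf, newline_len) != 0))
    *translated_eol = 1;
}

/* Stand-in for translate_cstring(): reset the flag once, then report every
   line ending found in SRC as if it were being translated to LF. */
void demo_translate_to_lf(const char *src, demo_boolean_t *translated_eol)
{
  const char *p;

  if (translated_eol)
    *translated_eol = 0;

  for (p = src; *p; p++)
    {
      if (*p == '\r')
        {
          size_t len = (p[1] == '\n') ? 2 : 1;  /* CRLF or lone CR */
          demo_note_newline("\n", 1, p, len, translated_eol);
          p += len - 1;
        }
      else if (*p == '\n')
        demo_note_newline("\n", 1, p, 1, translated_eol);
    }
}

/* Hypothetical self-check. */
void demo_self_check(void)
{
  demo_boolean_t flag = 1;

  demo_translate_to_lf("a\nb\n", &flag);
  assert(flag == 0);            /* already LF: nothing changed */
  demo_translate_to_lf("a\r\nb\r\n", &flag);
  assert(flag == 1);            /* CRLF would become LF */
  demo_translate_to_lf("a\r\n", NULL);  /* NULL flag is allowed */
}
```

Splitting the work this way lets the per-newline function stay ignorant of
earlier calls, while the outer function decides what "no translation" means.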

* subversion/include/svn_subst.h
  (svn_subst_translate_string2): New function.
  (svn_subst_translate_string): Deprecate in favor of
    svn_subst_translate_string2().

* subversion/libsvn_subr/subst.c
  (STRING_IS_EOL): New macro that tests whether a string is an end-of-line
    string ("\n", "\r", "\r\n").
  (DIFFERENT_EOL_STRINGS): New macro that tests whether two end-of-line strings
    are different.
  (translate_newline): Add the TRANSLATED_EOL parameter. If the function
    writes out a different newline, then it sets TRANSLATED_EOL to TRUE.
  (translation_baton): Add the TRANSLATED_EOL field.
  (create_translation_baton): Add a new parameter TRANSLATED_EOL that is
    passed to the resulting translation_baton.
  (translate_chunk): When calling translate_newline(), pass TRANSLATED_EOL from
    the translation_baton.
  (stream_translated): New static function. Its implementation is the old
    implementation of svn_subst_stream_translated(), but accepting another
    parameter, TRANSLATED_EOL, that is passed to the in/out translation batons
    that it creates.
  (svn_subst_stream_translated): Now a wrapper for stream_translated().
  (translate_cstring): New static function. Its implementation is the old
    implementation of svn_subst_translate_cstring2(), but modified to accept
    another parameter, TRANSLATED_EOL, that is passed to stream_translated().
  (svn_subst_translate_cstring2): Now a wrapper for translate_cstring().
  (svn_subst_translate_string): Move to deprecated.c.
  (svn_subst_translate_string2): New function. It takes three additional
    parameters: TRANSLATED_TO_UTF8, TRANSLATED_LINE_ENDINGS, and another pool
    parameter. The task of recording whether it translates a line ending is
    delegated to translate_cstring().

* subversion/libsvn_subr/deprecated.c
  (svn_subst_translate_string): Now a wrapper for svn_subst_translate_string2().

Patch by: Danny Trebbien <dtrebbien{_AT_}gmail.com>
--This line, and those below, will be ignored--

Index: subversion/include/svn_subst.h
===================================================================
--- subversion/include/svn_subst.h	(revision 1042741)
+++ subversion/include/svn_subst.h	(working copy)
@@ -592,19 +592,46 @@ svn_subst_stream_detranslated(svn_stream_t **strea
 
 /* EOL conversion and character encodings */
 
+/** Similar to svn_subst_translate_string2(), except that the information about
+ * whether re-encoding or line ending translation were performed is discarded.
+ *
+ * @deprecated Provided for backward compatibility with the 1.6 API.
+ */
+SVN_DEPRECATED
+svn_error_t *svn_subst_translate_string(svn_string_t **new_value,
+                                        const svn_string_t *value,
+                                        const char *encoding,
+                                        apr_pool_t *pool);
+
 /** Translate the string @a value from character encoding @a encoding to
  * UTF8, and also from its current line-ending style to LF line-endings.  If
  * @a encoding is @c NULL, translate from the system-default encoding.
  *
+ * If @a translated_to_utf8 is not @c NULL, then set @a *translated_to_utf8
+ * to @c TRUE if at least one character of @a value in the source character
+ * encoding was translated to UTF-8, or to @c FALSE otherwise.
+ *
+ * If @a translated_line_endings is not @c NULL, then set @a
+ * *translated_line_endings to @c TRUE if at least one line ending was
+ * changed to LF, or to @c FALSE otherwise.
+ *
  * Recognized line endings are LF, CR, CRLF.  If @a value has inconsistent
  * line endings, return @c SVN_ERR_IO_INCONSISTENT_EOL.
  *
- * Set @a *new_value to the translated string, allocated in @a pool.
+ * Set @a *new_value to the translated string, allocated in @a result_pool.
+ *
+ * @a scratch_pool is used for temporary allocations.
+ *
+ * @since New in 1.7.
  */
-svn_error_t *svn_subst_translate_string(svn_string_t **new_value,
-                                        const svn_string_t *value,
-                                        const char *encoding,
-                                        apr_pool_t *pool);
+svn_error_t *
+svn_subst_translate_string2(svn_string_t **new_value,
+                            svn_boolean_t *translated_to_utf8,
+                            svn_boolean_t *translated_line_endings,
+                            const svn_string_t *value,
+                            const char *encoding,
+                            apr_pool_t *result_pool,
+                            apr_pool_t *scratch_pool);
 
 /** Translate the string @a value from UTF8 and LF line-endings into native
  * character encoding and native line-endings.  If @a for_output is TRUE,
Index: subversion/libsvn_subr/subst.c
===================================================================
--- subversion/libsvn_subr/subst.c	(revision 1042741)
+++ subversion/libsvn_subr/subst.c	(working copy)
@@ -606,10 +606,34 @@ translate_keyword(char *buf,
   return FALSE;
 }
 
+/* A boolean expression that evaluates to true if the first STR_LEN characters
+   of the string STR are one of the end-of-line strings LF, CR, or CRLF;
+   to false otherwise.  */
+#define STRING_IS_EOL(str, str_len) \
+  (((str_len) == 2 &&  (str)[0] == '\r' && (str)[1] == '\n') || \
+   ((str_len) == 1 && ((str)[0] == '\n' || (str)[0] == '\r')))
 
+/* A boolean expression that evaluates to true if the end-of-line string EOL1,
+   having length EOL1_LEN, and the end-of-line string EOL2, having length
+   EOL2_LEN, are different, assuming that EOL1 and EOL2 are both from the
+   set {"\n", "\r", "\r\n"};  to false otherwise.
+
+   Given that EOL1 and EOL2 are either "\n", "\r", or "\r\n", then if
+   EOL1_LEN is not the same as EOL2_LEN, then EOL1 and EOL2 are of course
+   different. If EOL1_LEN and EOL2_LEN are both 2 then EOL1 and EOL2 are both
+   "\r\n" and *EOL1 == *EOL2. Otherwise, EOL1_LEN and EOL2_LEN are both 1.
+   We need only check the one character for equality to determine whether
+   EOL1 and EOL2 are different in that case. */
+#define DIFFERENT_EOL_STRINGS(eol1, eol1_len, eol2, eol2_len) \
+  (((eol1_len) != (eol2_len)) || (*(eol1) != *(eol2)))
+
+
 /* Translate the newline string NEWLINE_BUF (of length NEWLINE_LEN) to
    the newline string EOL_STR (of length EOL_STR_LEN), writing the
    result (which is always EOL_STR) to the stream DST.
+
+   This function assumes that each of NEWLINE_BUF and EOL_STR is either "\n",
+   "\r", or "\r\n".
 
    Also check for consistency of the source newline strings across
    multiple calls, using SRC_FORMAT (length *SRC_FORMAT_LEN) as a cache
@@ -620,6 +644,11 @@ translate_keyword(char *buf,
    newline in the file, and copy it to {SRC_FORMAT, *SRC_FORMAT_LEN} to
    use for later consistency checks.
 
+   If TRANSLATED_EOL is not NULL, then set *TRANSLATED_EOL to TRUE if the
+   newline string that was written (EOL_STR) is not the same as the newline
+   string that was translated (NEWLINE_BUF), otherwise leave *TRANSLATED_EOL
+   untouched.
+
    Note: all parameters are required even if REPAIR is TRUE.
    ### We could require that REPAIR must not change across a sequence of
        calls, and could then optimize by not using SRC_FORMAT at all if
@@ -633,17 +662,20 @@ translate_newline(const char *eol_str,
                   const char *newline_buf,
                   apr_size_t newline_len,
                   svn_stream_t *dst,
+                  svn_boolean_t *translated_eol,
                   svn_boolean_t repair)
 {
+  SVN_ERR_ASSERT(STRING_IS_EOL(eol_str, eol_str_len));
+  SVN_ERR_ASSERT(STRING_IS_EOL(newline_buf, newline_len));
+
   /* If we've seen a newline before, compare it with our cache to
      check for consistency, else cache it for future comparisons. */
   if (*src_format_len)
     {
       /* Comparing with cache.  If we are inconsistent and
          we are NOT repairing the file, generate an error! */
-      if ((! repair) &&
-          ((*src_format_len != newline_len) ||
-           (strncmp(src_format, newline_buf, newline_len))))
+      if ((! repair) && DIFFERENT_EOL_STRINGS(src_format, *src_format_len,
+                                              newline_buf, newline_len))
         return svn_error_create(SVN_ERR_IO_INCONSISTENT_EOL, NULL, NULL);
     }
   else
@@ -653,8 +685,15 @@ translate_newline(const char *eol_str,
       strncpy(src_format, newline_buf, newline_len);
       *src_format_len = newline_len;
     }
+
   /* Write the desired newline */
-  return translate_write(dst, eol_str, eol_str_len);
+  SVN_ERR(translate_write(dst, eol_str, eol_str_len));
+
+  if (translated_eol != NULL && DIFFERENT_EOL_STRINGS(eol_str, eol_str_len,
+                                                      newline_buf, newline_len))
+    *translated_eol = TRUE;
+
+  return SVN_NO_ERROR;
 }
 
 
@@ -765,10 +804,12 @@ svn_subst_keywords_differ2(apr_hash_t *a,
   return FALSE;
 }
 
+
 /* Baton for translate_chunk() to store its state in. */
 struct translation_baton
 {
   const char *eol_str;
+  svn_boolean_t *translated_eol;
   svn_boolean_t repair;
   apr_hash_t *keywords;
   svn_boolean_t expand;
@@ -813,6 +854,7 @@ struct translation_baton
  */
 static struct translation_baton *
 create_translation_baton(const char *eol_str,
+                         svn_boolean_t *translated_eol,
                          svn_boolean_t repair,
                          apr_hash_t *keywords,
                          svn_boolean_t expand,
@@ -826,6 +868,7 @@ create_translation_baton(const char *eol_str,
 
   b->eol_str = eol_str;
   b->eol_str_len = eol_str ? strlen(eol_str) : 0;
+  b->translated_eol = translated_eol;
   b->repair = repair;
   b->keywords = keywords;
   b->expand = expand;
@@ -888,6 +931,9 @@ eol_unchanged(struct translation_baton *b,
  * To finish a series of chunk translations, flush all buffers by calling
  * this routine with a NULL value for BUF.
  *
+ * If B->translated_eol is not NULL, then set *B->translated_eol to TRUE if
+ * an end-of-line sequence was changed, otherwise leave it untouched.
+ *
  * Use POOL for temporary allocations.
  */
 static svn_error_t *
@@ -924,7 +970,8 @@ translate_chunk(svn_stream_t *dst,
               SVN_ERR(translate_newline(b->eol_str, b->eol_str_len,
                                         b->src_format,
                                         &b->src_format_len, b->newline_buf,
-                                        b->newline_off, dst, b->repair));
+                                        b->newline_off, dst, b->translated_eol,
+                                        b->repair));
 
               b->newline_off = 0;
             }
@@ -1070,7 +1117,8 @@ translate_chunk(svn_stream_t *dst,
                                             b->src_format,
                                             &b->src_format_len,
                                             b->newline_buf,
-                                            b->newline_off, dst, b->repair));
+                                            b->newline_off, dst,
+                                            b->translated_eol, b->repair));
 
                   b->newline_off = 0;
                   break;
@@ -1086,7 +1134,7 @@ translate_chunk(svn_stream_t *dst,
           SVN_ERR(translate_newline(b->eol_str, b->eol_str_len,
                                     b->src_format, &b->src_format_len,
                                     b->newline_buf, b->newline_off,
-                                    dst, b->repair));
+                                    dst, b->translated_eol, b->repair));
           b->newline_off = 0;
         }
 
@@ -1350,13 +1398,20 @@ svn_subst_read_specialfile(svn_stream_t **stream,
 }
 
 
-svn_stream_t *
-svn_subst_stream_translated(svn_stream_t *stream,
-                            const char *eol_str,
-                            svn_boolean_t repair,
-                            apr_hash_t *keywords,
-                            svn_boolean_t expand,
-                            apr_pool_t *result_pool)
+/* Same as svn_subst_stream_translated(), except for the following.
+ *
+ * If TRANSLATED_EOL is not NULL, then reading and/or writing to the stream
+ * will set *TRANSLATED_EOL to TRUE if an end-of-line sequence was changed,
+ * otherwise leave it untouched.
+ */
+static svn_stream_t *
+stream_translated(svn_stream_t *stream,
+                  const char *eol_str,
+                  svn_boolean_t *translated_eol,
+                  svn_boolean_t repair,
+                  apr_hash_t *keywords,
+                  svn_boolean_t expand,
+                  apr_pool_t *result_pool)
 {
   struct translated_stream_baton *baton
     = apr_palloc(result_pool, sizeof(*baton));
@@ -1398,9 +1453,11 @@ svn_subst_read_specialfile(svn_stream_t **stream,
   /* Setup the baton fields */
   baton->stream = stream;
   baton->in_baton
-    = create_translation_baton(eol_str, repair, keywords, expand, result_pool);
+    = create_translation_baton(eol_str, translated_eol, repair, keywords,
+                               expand, result_pool);
   baton->out_baton
-    = create_translation_baton(eol_str, repair, keywords, expand, result_pool);
+    = create_translation_baton(eol_str, translated_eol, repair, keywords,
+                               expand, result_pool);
   baton->written = FALSE;
   baton->readbuf = svn_stringbuf_create("", result_pool);
   baton->readbuf_off = 0;
@@ -1417,15 +1474,33 @@ svn_subst_read_specialfile(svn_stream_t **stream,
   return s;
 }
 
+svn_stream_t *
+svn_subst_stream_translated(svn_stream_t *stream,
+                            const char *eol_str,
+                            svn_boolean_t repair,
+                            apr_hash_t *keywords,
+                            svn_boolean_t expand,
+                            apr_pool_t *result_pool)
+{
+  return stream_translated(stream, eol_str, NULL, repair, keywords, expand,
+                           result_pool);
+}
 
-svn_error_t *
-svn_subst_translate_cstring2(const char *src,
-                             const char **dst,
-                             const char *eol_str,
-                             svn_boolean_t repair,
-                             apr_hash_t *keywords,
-                             svn_boolean_t expand,
-                             apr_pool_t *pool)
+
+/* Same as svn_subst_translate_cstring2(), except for the following.
+ *
+ * If TRANSLATED_EOL is not NULL, then set *TRANSLATED_EOL to TRUE if an
+ * end-of-line sequence was changed, or to FALSE otherwise.
+ */
+static svn_error_t *
+translate_cstring(const char **dst,
+                  svn_boolean_t *translated_eol,
+                  const char *src,
+                  const char *eol_str,
+                  svn_boolean_t repair,
+                  apr_hash_t *keywords,
+                  svn_boolean_t expand,
+                  apr_pool_t *pool)
 {
   svn_stringbuf_t *dst_stringbuf;
   svn_stream_t *dst_stream;
@@ -1442,9 +1517,12 @@ svn_subst_read_specialfile(svn_stream_t **stream,
   dst_stringbuf = svn_stringbuf_create("", pool);
   dst_stream = svn_stream_from_stringbuf(dst_stringbuf, pool);
 
+  if (translated_eol)
+    *translated_eol = FALSE;
+
   /* Another wrapper to translate the content. */
-  dst_stream = svn_subst_stream_translated(dst_stream, eol_str, repair,
-                                           keywords, expand, pool);
+  dst_stream = stream_translated(dst_stream, eol_str, translated_eol, repair,
+                                 keywords, expand, pool);
 
   /* Jam the text into the destination stream (to translate it). */
   SVN_ERR(svn_stream_write(dst_stream, src, &len));
@@ -1456,6 +1534,19 @@ svn_subst_read_specialfile(svn_stream_t **stream,
   return SVN_NO_ERROR;
 }
 
+svn_error_t *
+svn_subst_translate_cstring2(const char *src,
+                             const char **dst,
+                             const char *eol_str,
+                             svn_boolean_t repair,
+                             apr_hash_t *keywords,
+                             svn_boolean_t expand,
+                             apr_pool_t *pool)
+{
+  return translate_cstring(dst, NULL, src, eol_str, repair, keywords, expand,
+                            pool);
+}
+
 /* Given a special file at SRC, generate a textual representation of
    it in a normal file at DST.  Perform all allocations in POOL. */
 /* ### this should be folded into svn_subst_copy_and_translate3 */
@@ -1768,14 +1859,16 @@ svn_subst_stream_from_specialfile(svn_stream_t **s
 
 /*** String translation */
 svn_error_t *
-svn_subst_translate_string(svn_string_t **new_value,
-                           const svn_string_t *value,
-                           const char *encoding,
-                           apr_pool_t *pool)
+svn_subst_translate_string2(svn_string_t **new_value,
+                            svn_boolean_t *translated_to_utf8,
+                            svn_boolean_t *translated_line_endings,
+                            const svn_string_t *value,
+                            const char *encoding,
+                            apr_pool_t *result_pool,
+                            apr_pool_t *scratch_pool)
 {
   const char *val_utf8;
   const char *val_utf8_lf;
-  apr_pool_t *scratch_pool = svn_pool_create(pool);
 
   if (value == NULL)
     {
@@ -1793,16 +1886,19 @@ svn_error_t *
       SVN_ERR(svn_utf_cstring_to_utf8(&val_utf8, value->data, scratch_pool));
     }
 
-  SVN_ERR(svn_subst_translate_cstring2(val_utf8,
-                                       &val_utf8_lf,
-                                       "\n",  /* translate to LF */
-                                       FALSE, /* no repair */
-                                       NULL,  /* no keywords */
-                                       FALSE, /* no expansion */
-                                       scratch_pool));
+  if (translated_to_utf8)
+    *translated_to_utf8 = (strcmp(value->data, val_utf8) != 0);
 
-  *new_value = svn_string_create(val_utf8_lf, pool);
-  svn_pool_destroy(scratch_pool);
+  SVN_ERR(translate_cstring(&val_utf8_lf,
+                            translated_line_endings,
+                            val_utf8,
+                            "\n",  /* translate to LF */
+                            FALSE, /* no repair */
+                            NULL,  /* no keywords */
+                            FALSE, /* no expansion */
+                            scratch_pool));
+
+  *new_value = svn_string_create(val_utf8_lf, result_pool);
   return SVN_NO_ERROR;
 }
 
Index: subversion/libsvn_subr/deprecated.c
===================================================================
--- subversion/libsvn_subr/deprecated.c	(revision 1042741)
+++ subversion/libsvn_subr/deprecated.c	(working copy)
@@ -250,6 +250,16 @@ svn_subst_stream_translated_to_normal_form(svn_str
 }
 
 svn_error_t *
+svn_subst_translate_string(svn_string_t **new_value,
+                           const svn_string_t *value,
+                           const char *encoding,
+                           apr_pool_t *pool)
+{
+  return svn_subst_translate_string2(new_value, NULL, NULL, value,
+                                     encoding, pool, pool);
+}
+
+svn_error_t *
 svn_subst_stream_detranslated(svn_stream_t **stream_p,
                               const char *src,
                               svn_subst_eol_style_t eol_style,
