On 16.08.2012 14:30, Corinna Vinschen wrote:
On Aug 16 14:11, Thomas Wolff wrote:
Hi Corinna,

On 16.08.2012 11:33, Corinna Vinschen wrote:
Hi Thomas,

Thanks for the patch. I have a few minor nits:

On Aug 14 22:56, Thomas Wolff wrote:
...
+         char cprabuf [8 + 1]; /* need this length for surrogates */
+         if (len < 8)
+           {
+             _ptr = cprabuf;
+             _len = 8;
+           }
8?  Why 8?  The size appears to be rather artificial.  The code should
use MB_CUR_MAX instead.
MB_CUR_MAX does not work because its value is 1 at this point.
So what about MB_LEN_MAX then?  There's no problem using a multiplier,
but a symbolic constant is always better than a numerical constant.
I've now used _MB_LEN_MAX from newlib.h rather than MB_LEN_MAX from limits.h (note the "_" distinction :) ). According to the comment preceding it, the latter reserves the option of being turned into a dynamic function in the future, which could then have the same problem as MB_CUR_MAX.
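
To illustrate the difference between the two kinds of constant, here is a minimal sketch in standard C++ only. It assumes a hosted environment and uses MB_LEN_MAX from <climits> as a stand-in, because newlib's _MB_LEN_MAX is not visible outside newlib; kConvBufLen, mb_cur_max_in_c_locale and needs_local_buffer are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <climits>   // MB_LEN_MAX: compile-time upper bound
#include <clocale>   // std::setlocale
#include <cstddef>
#include <cstdlib>   // MB_CUR_MAX: run-time value, depends on the locale

// Assumption: stand-in for newlib's _MB_LEN_MAX, usable as an array size.
static const std::size_t kConvBufLen = MB_LEN_MAX;

// Hypothetical helper: report MB_CUR_MAX as seen in the plain "C" locale.
// Note that this switches the process locale to "C".
static std::size_t mb_cur_max_in_c_locale ()
{
  const char *ok = std::setlocale (LC_ALL, "C");
  assert (ok != nullptr);
  (void) ok;
  return MB_CUR_MAX;
}

// Hypothetical helper: true if the caller's buffer may be too small for
// one maximum-size multibyte character, so a local buffer is needed.
static bool needs_local_buffer (std::size_t caller_len)
{
  return caller_len < kConvBufLen;
}
```

A run-time value such as MB_CUR_MAX can be too small at the moment the buffer is sized, whereas a compile-time bound can be used directly as an array dimension.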

About the surrogates problem, I think I've found a solution:
I've added an explicit test to that loop so that a surrogate pair is never split during processing; this seems to work now.
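
The idea behind such a test can be sketched in isolation as follows. This is an illustration only, not the code from the patch: it assumes UTF-16 code units in a char16_t array and an absolute, exclusive end index, and trim_split_surrogate is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// True if wc is a UTF-16 high (leading) surrogate, U+D800..U+DBFF.
static bool is_high_surrogate (char16_t wc)
{
  return (wc & 0xFC00) == 0xD800;
}

// Hypothetical helper: shrink the range buf[begin, end) until it no longer
// ends in a high surrogate whose low half would be cut off.  A range of a
// single code unit is left alone so the caller always makes progress.
static std::size_t trim_split_surrogate (const char16_t *buf,
                                         std::size_t begin, std::size_t end)
{
  assert (begin <= end);
  while (end - begin > 1 && is_high_surrogate (buf[end - 1]))
    --end;
  return end;
}
```

For the sequence 'a', 0xD83D, 0xDE00, 'b', a range covering the first two units is cut back to one unit, while a range covering the first three units (the complete pair) is left unchanged.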

+             /* If using read-ahead buffer, copy to class read-ahead buffer
+                and deliver first byte. */
+             if (_ptr == cprabuf)
+               {
+                 puts_readahead (cprabuf, ret);
+                 * (char *) ptr = get_readahead ();
+                 ret = 1;
(*) Ok, that works, but wouldn't it be more efficient to do that in
a tiny loop along the lines of

                  int x;
                  ret = 0;
                  while (ret < len && (x = get_readahead ()) >= 0)
                    ((char *) ptr)[ret++] = x;

?
I can add it if you prefer; I just didn't think it was worth the
effort. As for efficiency, it hardly matters after the preceding
trial-and-error count-down loop...
Yeah, that's a valid point.  But maybe we shouldn't make it slower
than necessary?  If you have a good idea how to avoid the other
loop, don't hesitate to submit a patch.
Added the loop to use up the caller's buffer.
About avoiding the trial-and-error loop, I think that would require digging into sys_mbstowcs (which doesn't even seem to behave as documented).
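
For readers unfamiliar with the read-ahead mechanism, the "use up the caller's buffer" loop can be sketched stand-alone like this. The readahead struct below is a hypothetical, fixed-size toy queue and not Cygwin's implementation behind puts_readahead and get_readahead; drain is likewise a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal read-ahead queue (illustration only).
struct readahead
{
  char data[16];
  std::size_t head = 0;
  std::size_t tail = 0;

  // Append n bytes; the toy queue simply asserts that they fit.
  void put_bytes (const char *s, std::size_t n)
  {
    assert (tail + n <= sizeof data);
    std::memcpy (data + tail, s, n);
    tail += n;
  }

  // Return the next byte as a non-negative value, or -1 if empty.
  int get ()
  {
    if (head == tail)
      {
        head = tail = 0;
        return -1;
      }
    return (unsigned char) data[head++];
  }
};

// Copy as many queued bytes as fit into the caller's buffer.  The length
// check comes first, so no byte is consumed once the buffer is full.
static std::size_t drain (readahead &ra, void *ptr, std::size_t len)
{
  char *out = static_cast<char *> (ptr);
  std::size_t ret = 0;
  int ch;
  while (ret < len && (ch = ra.get ()) >= 0)
    {
      *out++ = static_cast<char> (ch);
      ret++;
    }
  return ret;
}
```

With a three-byte character queued and a two-byte caller buffer, the first call delivers two bytes and the next call delivers the remaining one.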

------
Thomas
--- sav/fhandler_clipboard.cc   2012-07-08 02:36:47.000000000 +0200
+++ ./fhandler_clipboard.cc     2012-08-16 16:08:23.782692300 +0200
@@ -222,6 +222,7 @@ fhandler_dev_clipboard::read (void *ptr,
   UINT formatlist[2];
   int format;
   LPVOID cb_data;
+  int rach;
 
   if (!OpenClipboard (NULL))
     {
@@ -243,12 +244,24 @@ fhandler_dev_clipboard::read (void *ptr,
       cygcb_t *clipbuf = (cygcb_t *) cb_data;
 
       if (pos < clipbuf->len)
-       {
+       {
          ret = ((len > (clipbuf->len - pos)) ? (clipbuf->len - pos) : len);
          memcpy (ptr, clipbuf->data + pos , ret);
          pos += ret;
        }
     }
+  else if ((rach = get_readahead ()) >= 0)
+    {
+      /* Deliver from read-ahead buffer. */
+      char * out_ptr = (char *) ptr;
+      * out_ptr++ = rach;
+      ret = 1;
+      while (ret < len && (rach = get_readahead ()) >= 0)
+       {
+         * out_ptr++ = rach;
+         ret++;
+       }
+    }
   else
     {
       wchar_t *buf = (wchar_t *) cb_data;
@@ -256,25 +269,54 @@ fhandler_dev_clipboard::read (void *ptr,
       size_t glen = GlobalSize (hglb) / sizeof (WCHAR) - 1;
       if (pos < glen)
        {
+         /* If caller's buffer is too small to hold at least one 
+            max-size character, redirect algorithm to local 
+            read-ahead buffer, finally fill class read-ahead buffer 
+            with result and feed caller from there. */
+         char * conv_ptr = (char *) ptr;
+         size_t conv_len = len;
+#define cprabuf_len _MB_LEN_MAX        /* newlib's max MB_CUR_MAX of all encodings */
+         char cprabuf [cprabuf_len];
+         if (len < cprabuf_len)
+           {
+             conv_ptr = cprabuf;
+             conv_len = cprabuf_len;
+           }
+
          /* Comparing apples and oranges here, but the below loop could become
             extremly slow otherwise.  We rather return a few bytes less than
             possible instead of being even more slow than usual... */
-         if (glen > pos + len)
-           glen = pos + len;
+         if (glen > pos + conv_len)
+           glen = pos + conv_len;
          /* This loop is necessary because the number of bytes returned by
             sys_wcstombs does not indicate the number of wide chars used for
             it, so we could potentially drop wide chars. */
          while ((ret = sys_wcstombs (NULL, 0, buf + pos, glen - pos))
                  != (size_t) -1
-                && ret > len)
+                && (ret > conv_len 
+                       /* Skip separated high surrogate: */
+                    || ((buf [pos + glen - 1] & 0xFC00) == 0xD800 && glen - pos > 1)))
             --glen;
          if (ret == (size_t) -1)
            ret = 0;
          else
            {
-             ret = sys_wcstombs ((char *) ptr, (size_t) -1,
+             ret = sys_wcstombs ((char *) conv_ptr, (size_t) -1,
                                  buf + pos, glen - pos);
              pos = glen;
+             /* If using read-ahead buffer, copy to class read-ahead buffer
+                and deliver as many bytes as fit into the caller's buffer. */
+             if (conv_ptr == cprabuf)
+               {
+                 puts_readahead (cprabuf, ret);
+                 char * out_ptr = (char *) ptr;
+                 ret = 0;
+                 while (ret < len && (rach = get_readahead ()) >= 0)
+                   {
+                     * out_ptr++ = rach;
+                     ret++;
+                   }
+               }
            }
        }
     }
