Re: /dev/clipboard pasting with small read() buffer

2012-08-17 Thread Thomas Wolff

On 16.08.2012 18:22, Corinna Vinschen wrote:

On Aug 16 09:24, Eric Blake wrote:

On 08/16/2012 08:20 AM, Thomas Wolff wrote:


MB_CUR_MAX does not work because its value is 1 at this point

So what about MB_LEN_MAX then?  There's no problem using a multiplier,
but a symbolic constant is always better than a numerical constant.

I've now used _MB_LEN_MAX from newlib.h, rather than MB_LEN_MAX from
limits.h (note the "_" distinction :) ),
because the latter, by its preceding comment, reserves the option to be
changed into a dynamic function in the future, which could then possibly
have the same problems as MB_CUR_MAX.

POSIX requires MB_LEN_MAX to be a constant, only MB_CUR_MAX can be
dynamic.  We cannot change MB_LEN_MAX to be dynamic in the future.
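
The distinction drawn here can be illustrated with a minimal sketch. It only assumes the standard headers; the helper names `cur_max_in` and `scratch_size` are hypothetical and do not come from the patch. `MB_LEN_MAX` is usable wherever a constant expression is required (for example, as an array size), while `MB_CUR_MAX` has to be evaluated at run time and depends on the currently selected locale.

```cpp
#include <cassert>
#include <climits>   // MB_LEN_MAX
#include <clocale>   // setlocale
#include <cstddef>   // std::size_t
#include <cstdlib>   // MB_CUR_MAX

// MB_LEN_MAX is an integer constant expression, so it can be checked at
// compile time and used to size a plain array.
static_assert (MB_LEN_MAX >= 1, "MB_LEN_MAX must be a positive constant");

// Hypothetical helper: MB_CUR_MAX as seen after selecting `locale_name`.
// Returns 0 if the locale cannot be selected.
std::size_t cur_max_in (const char *locale_name)
{
  if (!std::setlocale (LC_CTYPE, locale_name))
    return 0;
  std::size_t n = MB_CUR_MAX;   // evaluated at run time, per locale
  assert (n <= static_cast<std::size_t> (MB_LEN_MAX));
  return n;
}

// Hypothetical helper: a scratch buffer big enough for one multibyte
// character in any supported locale, sized without consulting the locale.
std::size_t scratch_size ()
{
  char scratch[MB_LEN_MAX];
  return sizeof scratch;
}
```

Calling `cur_max_in ("C")` typically yields 1, which matches the observation above that MB_CUR_MAX is of no use for sizing the buffer before a multibyte locale is in effect.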

...also, Cygwin's include/limits.h doesn't mention to convert to
a function.

Not sure how to interpret exactly what it mentions. Anyway, my updated
patch (using MB_LEN_MAX) proposes a change here as well.

--
Thomas
diff -rup sav/fhandler_clipboard.cc ./fhandler_clipboard.cc
--- sav/fhandler_clipboard.cc   2012-07-08 02:36:47.0 +0200
+++ ./fhandler_clipboard.cc 2012-08-17 10:34:41.96875 +0200
@@ -222,6 +222,7 @@ fhandler_dev_clipboard::read (void *ptr,
   UINT formatlist[2];
   int format;
   LPVOID cb_data;
+  int rach;
 
   if (!OpenClipboard (NULL))
 {
@@ -243,12 +244,24 @@ fhandler_dev_clipboard::read (void *ptr,
   cygcb_t *clipbuf = (cygcb_t *) cb_data;
 
   if (pos < clipbuf->len)
-   {
+   {
  ret = ((len > (clipbuf->len - pos)) ? (clipbuf->len - pos) : len);
  memcpy (ptr, clipbuf->data + pos , ret);
  pos += ret;
}
 }
+  else if ((rach = get_readahead ()) >= 0)
+{
+  /* Deliver from read-ahead buffer. */
+  char * out_ptr = (char *) ptr;
+  * out_ptr++ = rach;
+  ret = 1;
+  while (ret < len && (rach = get_readahead ()) >= 0)
+   {
+ * out_ptr++ = rach;
+ ret++;
+   }
+}
   else
 {
   wchar_t *buf = (wchar_t *) cb_data;
@@ -256,25 +269,54 @@ fhandler_dev_clipboard::read (void *ptr,
   size_t glen = GlobalSize (hglb) / sizeof (WCHAR) - 1;
   if (pos < glen)
{
+ /* If caller's buffer is too small to hold at least one 
+max-size character, redirect algorithm to local 
+read-ahead buffer, finally fill class read-ahead buffer 
+with result and feed caller from there. */
+ char * conv_ptr = (char *) ptr;
+ size_t conv_len = len;
+#define cprabuf_len MB_LEN_MAX /* max MB_CUR_MAX of all encodings */
+ char cprabuf [cprabuf_len];
+ if (len < cprabuf_len)
+   {
+ conv_ptr = cprabuf;
+ conv_len = cprabuf_len;
+   }
+
  /* Comparing apples and oranges here, but the below loop could become
 extremly slow otherwise.  We rather return a few bytes less than
 possible instead of being even more slow than usual... */
- if (glen > pos + len)
-   glen = pos + len;
+ if (glen > pos + conv_len)
+   glen = pos + conv_len;
  /* This loop is necessary because the number of bytes returned by
 sys_wcstombs does not indicate the number of wide chars used for
 it, so we could potentially drop wide chars. */
  while ((ret = sys_wcstombs (NULL, 0, buf + pos, glen - pos))
  != (size_t) -1
-&& ret > len)
+&& (ret > conv_len 
+   /* Skip separated high surrogate: */
+|| ((buf [pos + glen - 1] & 0xFC00) == 0xD800 && glen - pos > 1)))
 --glen;
  if (ret == (size_t) -1)
ret = 0;
  else
{
- ret = sys_wcstombs ((char *) ptr, (size_t) -1,
+ ret = sys_wcstombs ((char *) conv_ptr, (size_t) -1,
  buf + pos, glen - pos);
  pos = glen;
+ /* If using read-ahead buffer, copy to class read-ahead buffer
+and deliver first byte. */
+ if (conv_ptr == cprabuf)
+   {
+ puts_readahead (cprabuf, ret);
+ char * out_ptr = (char *) ptr;
+ ret = 0;
+ while (ret < len && (rach = get_readahead ()) >= 0)
+   {
+ * out_ptr++ = rach;
+ ret++;
+   }
+   }
}
}
 }
diff -rup sav/include/limits.h ./include/limits.h
--- sav/include/limits.h    2011-07-21 22:21:49.0 +0200
+++ ./include/limits.h  2012-08-16 17:48:34.847141100 +0200
@@ -36,8 +36,7 @@ details. */
 
 /* Maximum length of a multibyte character.  */
 #ifndef MB_LEN_MAX
-/* TODO: This is newlib's max value.  We should probably rather define our
-   own _mbtowc_r and _wctomb_r functions which are only codepage dependent. */
+/* Use value from newlib although 
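
The small-buffer handling in the fhandler_clipboard.cc hunk above can be illustrated with a simplified, self-contained sketch. It is not the Cygwin code: `SmallReadSource`, `kMaxCharBytes`, `encode` and `read_all` are hypothetical names, a `std::deque` stands in for the class read-ahead buffer behind `get_readahead ()` and `puts_readahead ()`, and the sketch converts UTF-32 rather than UTF-16 clipboard data. Unlike the patch, which switches to the local buffer only when `len` is smaller than MB_LEN_MAX, the sketch always converts into a scratch buffer and spills whatever does not fit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Assumption for this sketch: 4 bytes hold any UTF-8 sequence produced
// below.  The patch uses MB_LEN_MAX for the same purpose.
constexpr std::size_t kMaxCharBytes = 4;

class SmallReadSource
{
public:
  explicit SmallReadSource (std::u32string text) : text_ (std::move (text)) {}

  // Copy up to `len` bytes of UTF-8 into `out`; returns 0 at end of data.
  std::size_t read (char *out, std::size_t len)
  {
    std::size_t n = 0;
    // Deliver bytes left over from an earlier call first.
    while (n < len && !readahead_.empty ())
      {
        out[n++] = readahead_.front ();
        readahead_.pop_front ();
      }
    // Convert whole characters into a local buffer; bytes that do not
    // fit into the caller's buffer are kept for the next call.
    while (n < len && pos_ < text_.size ())
      {
        char scratch[kMaxCharBytes];
        std::size_t k = encode (text_[pos_++], scratch);
        for (std::size_t i = 0; i < k; ++i)
          {
            if (n < len)
              out[n++] = scratch[i];
            else
              readahead_.push_back (scratch[i]);
          }
      }
    return n;
  }

private:
  // Encode one code point (up to U+10FFFF) as UTF-8; returns byte count.
  static std::size_t encode (char32_t cp, char *out)
  {
    if (cp < 0x80)
      {
        out[0] = static_cast<char> (cp);
        return 1;
      }
    if (cp < 0x800)
      {
        out[0] = static_cast<char> (0xC0 | (cp >> 6));
        out[1] = static_cast<char> (0x80 | (cp & 0x3F));
        return 2;
      }
    if (cp < 0x10000)
      {
        out[0] = static_cast<char> (0xE0 | (cp >> 12));
        out[1] = static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
        out[2] = static_cast<char> (0x80 | (cp & 0x3F));
        return 3;
      }
    out[0] = static_cast<char> (0xF0 | (cp >> 18));
    out[1] = static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
    out[2] = static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
    out[3] = static_cast<char> (0x80 | (cp & 0x3F));
    return 4;
  }

  std::u32string text_;
  std::size_t pos_ = 0;
  std::deque<char> readahead_;
};

// Hypothetical driver: read everything using a caller buffer of `chunk` bytes.
std::string read_all (SmallReadSource &src, std::size_t chunk)
{
  std::string result;
  std::string buf (chunk, '\0');
  for (std::size_t n; (n = src.read (&buf[0], chunk)) > 0; )
    result.append (buf, 0, n);
  return result;
}
```

Reading the same text with a 1-byte buffer and with a large buffer should give identical byte streams, which is the behaviour the patch aims for with `read()` calls smaller than one multibyte character.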

Re: /dev/clipboard pasting with small read() buffer

2012-08-17 Thread Corinna Vinschen
On Aug 17 10:44, Thomas Wolff wrote:
> On 16.08.2012 18:22, Corinna Vinschen wrote:
> >On Aug 16 09:24, Eric Blake wrote:
> >>On 08/16/2012 08:20 AM, Thomas Wolff wrote:
> >>
> >MB_CUR_MAX does not work because its value is 1 at this point
> So what about MB_LEN_MAX then?  There's no problem using a multiplier,
> but a symbolic constant is always better than a numerical constant.
> >>>I've now used _MB_LEN_MAX from newlib.h, rather than MB_LEN_MAX from
> >>>limits.h (note the "_" distinction :) ),
> >>>because the latter, by its preceding comment, reserves the option to be
> >>>changed into a dynamic function in the future, which could then possibly
> >>>have the same problems as MB_CUR_MAX.
> >>POSIX requires MB_LEN_MAX to be a constant, only MB_CUR_MAX can be
> >>dynamic.  We cannot change MB_LEN_MAX to be dynamic in the future.
> >...also, Cygwin's include/limits.h doesn't mention to convert to
> >a function.
> Not sure how to interpret exactly what it mentions.

This is from the time I was working on the extended locale support
in Cygwin 1.7.  I have not the faintest idea anymore what I was trying
to say with this comment.

>  Anyway, my
> updated patch (using MB_LEN_MAX) proposes a change here as well.

Thanks.  I dropped the hint that 4 is enough.  I'm not so sure about
that.  Linux, for instance, defines MB_LEN_MAX as 16.

Other than that, patch applied.


Thanks,
Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat


Re: /dev/clipboard pasting with small read() buffer

2012-08-17 Thread Thomas Wolff

On 17.08.2012 11:22, Corinna Vinschen wrote:

...

  Anyway, my updated patch (using MB_LEN_MAX) proposes a change here as well.

Thanks.  I dropped the hint that 4 is enough.  I'm not so sure about
that.  Linux, for instance, defines MB_LEN_MAX as 16.

SunOS defines it as 5.
http://www.kernel.org/doc/man-pages/online/pages/man3/MB_LEN_MAX.3.html
says that in glibc it is typically 6 (which would be needed for the original
UTF-8 covering 31-bit ISO-10646).
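
The figure of 6 can be checked with a short sketch of the byte-length rule of the original UTF-8 scheme, which extended to 31-bit code points (UTF-8 as restricted to U+10FFFF needs at most 4 bytes). The function name `utf8_len_31bit` is hypothetical and is not taken from any of the libraries mentioned.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes the original (31-bit) UTF-8 scheme needs for `cp`.
// Each extra byte contributes 6 payload bits; the lead byte's payload
// shrinks by one bit per additional byte.
int utf8_len_31bit (std::uint32_t cp)
{
  assert (cp <= 0x7FFFFFFFu);      // outside the 31-bit ISO 10646 range
  if (cp < 0x80u)      return 1;   //  7 bits
  if (cp < 0x800u)     return 2;   // 11 bits
  if (cp < 0x10000u)   return 3;   // 16 bits
  if (cp < 0x200000u)  return 4;   // 21 bits
  if (cp < 0x4000000u) return 5;   // 26 bits
  return 6;                        // 31 bits
}
```

With this rule, U+10FFFF (the highest Unicode code point) takes 4 bytes, and only values from 0x4000000 upwards reach the 6-byte form.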



Other than that, patch applied.

Thanks
Thomas