Thomas Wolff
2012-07-03 16:29:19 UTC
[taking this thread to cygwin-developers]
You know, we just love STCs. Send your small test program here, plus a
short instruction how you created the clipboard content and how to run
the testcase to see the problem.
Sure, so here it is. Open clipboard.txt with notepad, ^A^C to copy
all, then run the program to see bytes skipped.
Actually it seems to skip as many bytes per read() as there were
additional UTF-8 bytes (more bytes than characters) in the preceding
read block.
Checking the code again, the variable pos seems to be used both as an
index into the clipboard buffer (WCHAR units) and as an offset into
the resulting string (char units), which would explain the effect
(I have not checked all the details, though, as I'm not familiar with
the APIs used).
Thanks for the testcase. I applied a patch which is supposed to fix the
problem. It should be in the next developer snapshot. Please give it
a try.
The patch (loaded from CVS) seems to almost fix the issue, but another
bug has crept in.
* Looking at the code was quite confusing as long as I assumed
sys_wcstombs would work like wcstombs; the latter is obviously
designed to convert only complete nul-terminated wide-character
strings, as there is no way to control the number of wide characters,
neither to learn how many were consumed (as your new comment also
mentions) nor to set a limit. sys_wcstombs is apparently different,
judging from the comment in strfuncs.cc.
I had tried a patch using wctomb instead, as follows, but it didn't
work; maybe for some reason the standard functions cannot be used in
this context?
int outlen = 0;
/* Make sure the buffer has room for a maximum-length UTF-8 (or
   GB18030/UHC) sequence plus the final NUL; this does not work if the
   total buffer is shorter, so some read-ahead will be needed for a
   complete solution.  */
while (outlen < (int) len - 4 && pos < (int) glen /* IS THIS CORRECT? */)
  {
    int ret1 = wctomb ((char *) ptr + outlen, buf[pos]);
    if (ret1 == -1)
      {
        ((char *) ptr)[outlen] = 0x7F;  /* ?? */
        ret1 = 1;
      }
    pos++;            /* clipboard buffer position */
    outlen += ret1;   /* output size */
  }
ret = outlen;
* The current (CVS) code will not work if even the first character to
be converted needs more bytes than the buffer provides, e.g. if the
application calls read() with a length of only 1. Some extra buffering
would be needed to make that work.
* I assume the current code will also fail in non-UTF-8 locales: if
the wcs block being converted contains a non-convertible character, it
would abort, since wcstombs returns -1 (assuming here that
sys_wcstombs behaves alike in this respect), and not even deliver the
characters before the failing one.
* I had previously observed that with a read size of n only n-1 bytes
would be delivered, and thought this was on purpose because wcstombs
appends a final nul to its result. Now n bytes are returned (if
available), and in fact the byte behind the read() buffer is
overwritten (see the modified test program).
------
Thomas