Hi, Martin!
I have ported rsync to OS/390 Open Edition, currently named z/OS Unix
System Services. With my port it is possible to transfer text files
between IBM mainframe and UNIX machines with automatic code page
conversion, which is what I need rsync for. Please see the attached
file for detailed explanations as well as the patch itself.
Best regards,
Hartmut Schaefer
--
----------------------------------------------------------------------
Zuercher Kantonalbank ZKB Internet : [EMAIL PROTECTED]
Neue Hard 9 Telefon : ++41 01-292 71 91
Postfach Fax : ++41 01-292 86 51
CH-8010 Zuerich
----------------------------------------------------------------------
IBM Mainframe port for rsync
I have ported rsync to OS/390 Open Edition, currently named z/OS Unix
System Services. With my port it is possible to transfer text files
between IBM mainframe and UNIX machines with automatic code page
conversion, which is what I need rsync for. Unfortunately, this
conversion prevents using this rsync port to transfer binary files, so
it would be desirable to make the conversion of file contents a runtime
choice. I'll discuss this below.
In the port, I had to resolve the following issues:
1. getting it to build from the original package
2. getting the rsync protocol functioning between EBCDIC and ASCII
machines
3. getting automatic character set conversion of file data (see above)
4. issues I could not resolve
The main challenge of the port was that the IBM mainframe represents
characters in EBCDIC (in our case, Unix System Services using
character set IBM-1047) instead of ASCII, so that, among other
issues, protocol string data must be translated for the two parties
to understand each other. I defined the wire protocol to use code
page ISO8859-1, which is what it currently uses by implication (for
UNIX machines only; Windows uses another code page, but the part
relevant for the communication to work is the common ASCII subset,
so only file names may get changed).
Below you will find detailed explanations of the above issues and a
discussion of the implications of the modifications, followed
by the patch, divided under the same section headings.
1. getting it to build
----------------------
1.1 ./configure'ing
-------------------
I don't know autoconf well enough to decide whether and where changes
should be made so that ./configure would run without parameters. Here
is how I invoked it:
./configure \
CC="c89" \
CPP="c89 -E" \
CFLAGS="-D_POSIX_SOURCE -D_XOPEN_SOURCE_EXTENDED -D_ALL_SOURCE"
1.2 compiler arguments
----------------------
The compiler requires the source file to be the last parameter on the
command line, since it does not allow flags and non-flag parameters to
be mixed. The flags must come first, as is common on UNIX. There is a
compiler flag that lifts this restriction, but I decided to correct the
compiler command line instead.
1.3 getaddrinfo
---------------
Although the function is declared in the header files, the linker did
not find it. ./configure detected this correctly and activated the
code provided with rsync. That code did not compile because of some
constants that are probably related to recent TCP/IP developments. I
have solved this problem by #ifdef-ing out the offending code and some
tweaking, which is not especially clean but works. It appears to me
that the difficulties might be caused by the patch
rsync_zoong_tru64.diff - at least I found modifications there similar
to those that prevented the code from compiling on OS/390.
You might want to check what I have done here before incorporating
these changes.
2. getting the rsync protocol functioning
-----------------------------------------
2.1 character set translation of protocol data
----------------------------------------------
Since the IBM mainframe uses EBCDIC for character representation,
string literals (protocol tokens) as well as string data (e.g. file
names) are in EBCDIC and must be translated to and from ASCII when
written to or read from the protocol stream. This translation must be
done at the application level, since binary and file data are also
transferred over the link.
By having different functions for reading and writing binary and string
data (read_buf() and write_buf() versus read_sbuf() and write_sbuf()),
rsync was already prepared to handle string data specially. I
exploited this to put the character set conversion into the *_sbuf()
routines.
As it turned out, there were several places in the code where strings
were handled through the *_buf() routines; I corrected this. As a
result, write_sbuf() had to be made non-static.
I also had to add conversion to read_unbuffered(), before the rprintf()
that uses line as a string, to get things right. I don't fully
understand this, since read_unbuffered() is also used through readfd()
to read binary data. It seems to have to do with multiplexing, meaning
that execution reaches my code only when *text* is transferred (tag ==
FERROR || tag == FINFO), but I didn't understand how multiplexing
works.
The complementary translation went into io_multiplex_write(), which is
somewhat asymmetric, but both places seem to be the right ones for my
code page conversion. There is another conversion in err_list_add(),
which was very difficult for me to find, but I have the feeling that I
got it right. Since I got file mode errors in the beginning, I could
check that the error messages show up correctly in both directions.
You may want to check this with your overall knowledge of how rsync
and the rsync protocol work.
The conversion routines themselves, translate_buffer_from_isolatin()
and translate_buffer_to_isolatin(), have gone into a separate new
source file, cpconv.c, which also contains the conversion tables. Both
functions translate a given buffer in place, which could cause problems
if the program used the buffer again after the transfer. I have not
checked all the source code for this, but I did not encounter any
problems with this in-place conversion.
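A minimal sketch of how the in-place risk could be avoided, assuming a
256-entry table like the ones in cpconv.c. The helper name
translate_copy() is hypothetical and not part of the patch; it
translates into a freshly allocated buffer and leaves the caller's data
(which may be a string literal or a file name that is used again later)
untouched:

```c
#include <stdlib.h>

/*
 * HYPOTHETICAL helper, not part of the patch: translate len bytes of src
 * through a 256-entry table into a newly allocated buffer.
 * The source buffer is never modified. The caller must free() the result.
 * Returns NULL if the allocation fails.
 */
unsigned char *translate_copy(const unsigned char table[256],
                              const unsigned char *src, size_t len)
{
    /* allocate at least one byte so that len == 0 still yields a valid pointer */
    unsigned char *dst = malloc(len ? len : 1);
    size_t i;

    if (!dst)
        return NULL;
    for (i = 0; i < len; i++)
        dst[i] = table[src[i]];
    return dst;
}
```

The price is one allocation and one copy per string written, which is
why the patch as posted keeps the simpler in-place variant.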
The two functions are exact inverses of one another, that is, applying
both transformations to a buffer, in either order, yields the
unmodified buffer. I regard this property as mandatory for not breaking
the rsync algorithm if the transformation is also applied to file
contents (see 3. below).
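The inverse property can be checked mechanically. The following is a
sketch only; the function name tables_are_inverse() is an assumption
and does not appear in the patch. It relies on the fact that if
backward[forward[c]] == c holds for all 256 byte values, then forward
is a permutation and the reverse order holds as well, so one pass is
enough:

```c
/*
 * HYPOTHETICAL self-check, not part of the patch.
 * Returns 1 if translating any byte with `forward` and then with
 * `backward` gives the original byte back, 0 otherwise.
 * Because both tables have exactly 256 entries, passing this test in
 * one direction implies that it also passes in the other direction.
 */
int tables_are_inverse(const unsigned char forward[256],
                       const unsigned char backward[256])
{
    int c;

    for (c = 0; c < 256; c++) {
        if (backward[forward[c]] != c)
            return 0;
    }
    return 1;
}
```

Such a check could be run once at startup or in a test program against
the two tables in cpconv.c.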
Note: My patch, as provided, adds cpconv.o to the list of object
modules for rsync regardless of whether it is used at all. Since all
use of the conversion functions is encapsulated in #ifdefs, they are
invoked only on the platforms concerned, which is currently only
__MVS__. (With the appropriate conversion table it could also be
Windows, so that file names and user IDs would get translated
correctly.) All other platforms will carry this for nothing. While the
small size of the conversion code will not hurt much, it would be
better to have ./configure include cpconv.c in the compilation only
for the relevant platforms. Again, since I don't know how the magic of
autoconf works, I could not make the necessary modifications.
2.2 file mode translation
-------------------------
rsync 2.5.5 transfers the file modes in plain binary, exactly as they
are represented on the platform it is running on according to the
definitions in sys/mode.h or sys/modes.h. This causes problems if
different platforms use different definitions, as is the case with
OS/390 OE and AIX. Judging from the existence of the two functions
from_wire_mode() and to_wire_mode(), it appears to me that similar
problems have arisen in the past and have been dealt with by
providing these functions in their current implementation. This
implementation, however, does not define a truly portable wire mode,
which would be necessary to avoid depending on system specifics. As a
result, rsync failed to communicate between OS/390 and AIX.
I changed from_wire_mode() and to_wire_mode() to translate the
system-specific mode flags to explicitly defined ones that go over the
wire. I would have to do this for all flags to get really portable
code, but since the lower 12 bits caused no difficulties I left them
unchanged. Probably no system defines the lower 12 bits differently
from the rest of the world, so there is no special reason to extend the
conversion functions to define portable values for them as well, other
than cleanliness.
I defined the portable wire modes to have the same values as the AIX
flags, so the protocol should remain compatible with the unchanged
rsync. (It worked at least with my AIX rsync.)
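To illustrate the idea independently of any real header file, here is a
table-driven sketch. The WIRE_* values are the octal constants used in
the patch; the PLAT_* values are invented for this example and only
stand in for a platform whose file type bits differ from them. The
function names carry a _sketch suffix to make clear they are not the
patched functions themselves:

```c
#include <stddef.h>

/* Portable file type bits for the wire (octal values as in the patch). */
#define WIRE_IFIFO  0010000
#define WIRE_IFCHR  0020000
#define WIRE_IFDIR  0040000
#define WIRE_IFBLK  0060000
#define WIRE_IFREG  0100000
#define WIRE_IFLNK  0120000
#define WIRE_IFSOCK 0140000
#define WIRE_PERM   07777      /* lower 12 bits pass through unchanged */

/* INVENTED platform encoding for this sketch: type in the top byte. */
#define PLAT_IFMT   0xFF000000UL
#define PLAT_IFDIR  0x01000000UL
#define PLAT_IFCHR  0x02000000UL
#define PLAT_IFREG  0x03000000UL
#define PLAT_IFLNK  0x04000000UL
#define PLAT_IFIFO  0x05000000UL
#define PLAT_IFBLK  0x06000000UL
#define PLAT_IFSOCK 0x07000000UL

struct type_pair { unsigned long plat; int wire; };

static const struct type_pair type_pairs[] = {
    { PLAT_IFLNK, WIRE_IFLNK }, { PLAT_IFBLK, WIRE_IFBLK },
    { PLAT_IFCHR, WIRE_IFCHR }, { PLAT_IFSOCK, WIRE_IFSOCK },
    { PLAT_IFIFO, WIRE_IFIFO }, { PLAT_IFDIR, WIRE_IFDIR },
    { PLAT_IFREG, WIRE_IFREG }
};
#define N_PAIRS (sizeof type_pairs / sizeof type_pairs[0])

/* platform mode -> wire mode; unknown types map to type bits 0 */
int to_wire_mode_sketch(unsigned long mode)
{
    int wire = 0;
    size_t i;

    for (i = 0; i < N_PAIRS; i++)
        if ((mode & PLAT_IFMT) == type_pairs[i].plat)
            wire = type_pairs[i].wire;
    return wire | (int)(mode & WIRE_PERM);
}

/* wire mode -> platform mode; unknown types map to type bits 0 */
unsigned long from_wire_mode_sketch(int mode)
{
    unsigned long plat = 0;
    size_t i;

    for (i = 0; i < N_PAIRS; i++)
        if ((mode & ~WIRE_PERM) == type_pairs[i].wire)
            plat = type_pairs[i].plat;
    return plat | (unsigned long)(mode & WIRE_PERM);
}
```

With a single table, both directions stay consistent by construction,
which is the property the wire format needs.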
There may, however, be problems with my code that you might want to
check for:
1. I decided that the previous translation contained in the two
functions is now obsolete and removed it. However, I did not
fully understand what it does, so problems may arise.
2. The mode flag constants I refer to in from_wire_mode() might
not be defined in the same way on all systems. The code would
not compile if they are defined with a leading underscore. A
solution would be, for instance, to add the definitions needed
by the code following this scheme:
#if ! defined(S_IFLNK)
#define S_IFLNK _S_IFLNK
#endif
However, I did not want to clutter the code in this way if not
necessary, so I left this out.
2.3 clean_flist()
-----------------
As it turned out, files were mixed up in the transfer because rsync
sorts the file list *after* it has been transferred between the two
systems, and does so on each system *separately*. Since EBCDIC
character codes sort differently from ASCII character codes, the two
peers ended up with different file lists, making the indices sent
by the receiving peer point to the wrong file (or directory!) on the
sending peer.
I resolved the problem by introducing a horrible hack into u_strcmp()
(util.c) that converts EBCDIC characters to ASCII before comparison. I
did not try to make this more elegant, since I believe the correct
solution for the problem is to have the list cleaned *before* it
is transmitted to the other peer. However, I do not feel competent
enough to change the relevant code; it would also probably introduce
a protocol incompatibility, causing more work to be done. I added a
verbose comment to u_strcmp().
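The ordering difference can be shown with three characters. This is a
sketch, not the code from util.c: strcmp_translated() and the tiny demo
table are made up for illustration, and the table fills in only the
IBM-1047 code points for 'a' (0x81), 'A' (0xC1) and '1' (0xF1):

```c
/*
 * Demo table for this sketch only: maps three EBCDIC (IBM-1047) code
 * points to ASCII. All other entries are 0, so it is only meaningful
 * for strings made of these three characters.
 */
static const unsigned char demo_ebcdic_to_ascii[256] = {
    [0x81] = 0x61,  /* 'a' */
    [0xC1] = 0x41,  /* 'A' */
    [0xF1] = 0x31   /* '1' */
};

/*
 * HYPOTHETICAL comparison in the spirit of the u_strcmp() hack:
 * find the first differing byte, then compare the *translated* values,
 * so that an EBCDIC host orders names the way an ASCII host does.
 */
int strcmp_translated(const unsigned char table[256],
                      const char *a, const char *b)
{
    const unsigned char *s1 = (const unsigned char *)a;
    const unsigned char *s2 = (const unsigned char *)b;

    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (int)table[*s1] - (int)table[*s2];
}
```

On the raw EBCDIC bytes, 'a' sorts before 'A' and letters sort before
digits; after translation the order is the ASCII one, with digits first
and 'A' before 'a'.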
Note: Cleaning the file list, as I have read in your comments, is
indeed too trivial. Have you thought about examining the inode number
to remove duplicates?
2.4 authentication
------------------
User IDs and passwords must be translated from EBCDIC to ASCII for the
authentication code to work. Since the challenge is not base64-
*de*coded on the receiver side, but its ASCII representation goes into
generate_hash() directly, we have to convert it to ASCII too. This is
a bit quirky, and it would be cleaner to base64-decode the challenge
before using it.
3. getting automatic character set conversion of file data
----------------------------------------------------------
3.1 What I have done
--------------------
The two modifications in fileio.c make rsync see file data on disk as
ASCII instead of EBCDIC (ISO8859-1 instead of IBM-1047). As it turns
out, this is fully transparent to the rsync algorithm and causes
automatic code page conversion on the OS/390 side. (As noted above,
for this to hold, that is, to not break the rsync algorithm, the two
conversions must be exact inverses of one another, as is the case in
the code I provide.)
The patch for fileio.c is the *only* modification causing an effective
change in rsync behaviour that can be *observed* by the user, since
everything up to this point only enabled the rsync *protocol* to run
across system boundaries. Without the patch to fileio.c, rsync will
transfer files unmodified, which is the normal and expected behaviour.
I have not made a documentation change since I don't expect you to
incorporate *this* part of the patch, as it would break normal rsync
behaviour, which is binary transfer. (The other modifications, as
stated above, don't show up to the user.)
3.2 Introducing character set conversion the clean way (discussion)
-------------------------------------------------------------------
While I haven't done it the clean way, automatic code page conversion
when transferring text files is of significant value. Indeed, for *my*
intended use of rsync - editing source code comfortably on UNIX and
transferring it to the mainframe in one step, ready for compilation -
it was a must. I'm much less interested in transferring binary files;
if I had to, I would use FTP. Of course, not being able to
transfer binary files is a significant shortcoming, so in a real
distribution it would be necessary to make the file data conversion
a runtime choice, not a compile time choice.
I have thought about how to do this, and it appears to me that the
conversion must be done on the side running on the system that owns
the character set (mainframe for EBCDIC, Windows for 1252 etc.),
regardless of whether this is the client or the server and whether it
is the sender or the receiver (this is the case with my
implementation). It would not be appropriate if every possible
translation had to be carried with every possible platform
implementation, or even worse, if the user were allowed to provide his
own conversion tables to be read and used by rsync and had to provide
them twice - on both ends of the communication link - in case he
occasionally swaps the server and client roles or the sender and
receiver roles.
There are two implications of this, independent of how the request
for translation is specified:
1. The peers must be able to communicate this fact to one another
(protocol extension).
2. When translation takes place, each peer does only its part of
the translation; both communicate using a common character
representation. (Currently, I have defined this to be ISO8859-1,
but Unicode would be better, although it prevents in-place buffer
conversion and makes things much more complicated.)
Interestingly, the problem of text conversion exists independently of
whether we provide translation of file contents: File and user name
transfer is *text* transfer and as such has to undergo code page
translation to be correct if non-ASCII characters can be used. The
problem is minor, of course, as long as file names and user IDs are
normally composed of ASCII characters only. But we can see from this
that we are confronted with not only one but three different code
pages on each side of the transfer:
a) the code page the compiler uses
The string literals are in this code page. Since it is known at
compile time, automatic translation can be implemented without
adding run time parameters to rsync.
b) the code page the operating system uses
File names and user IDs are in this code page. Theoretically,
this code page can be either known at compile time or queried
from the operating system. In practice, probably a default will
be compiled into rsync and options or environment variables
will allow the user to specify a different code page.
c) the code page the file data is in
The application for which the transferred files are intended uses
this code page. This is the conversion the user might want to
specify. The default would be the code page used by the operating
system (see above)
(In my implementation all three code pages are the same (IBM-1047), and
indeed for most situations it will be sufficient to provide exactly one
code page translation per platform, which is used for a, b and c. (This
translation would be the one built into the program.) But the
usefulness of the program improves greatly if the user can specify
different translations explicitly, providing the corresponding
translation tables himself.)
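The defaulting rules for the three code pages could be expressed as
follows. This is only a sketch of the idea; the struct, its field names
and the two accessor functions are hypothetical and correspond to
nothing in rsync or in the patch:

```c
#include <stddef.h>

/* Built-in code page of this (assumed) platform build. */
#define BUILTIN_CP "IBM-1047"

/* HYPOTHETICAL settings record for one side of a transfer. */
struct codepage_settings {
    const char *literal_cp; /* (a) compiler code page, fixed at build time */
    const char *system_cp;  /* (b) file names and user IDs; NULL = built-in */
    const char *data_cp;    /* (c) file contents; NULL = same as (b) */
};

/* (b): user override if present, otherwise the compiled-in default */
const char *effective_system_cp(const struct codepage_settings *s)
{
    return s->system_cp ? s->system_cp : BUILTIN_CP;
}

/* (c): user override if present, otherwise whatever (b) resolves to */
const char *effective_data_cp(const struct codepage_settings *s)
{
    return s->data_cp ? s->data_cp : effective_system_cp(s);
}
```

With all overrides left at NULL, a, b and c collapse to the single
built-in code page, which is the situation in my implementation.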
Since code pages are used on both sides of the communication (with or
without a common character representation on the wire), we see 4 (four)
code pages a user might want to specify for a transfer. In most cases,
however, he will specify 1 or 2 code pages (for the file data only,
leaving file name and user ID translation to the default), if any at
all (having file data translation use the default code pages).
Regardless of whether the code pages can also be specified in the
rsyncd.conf modules or only on the command line, the protocol must be
able to communicate the relevant instructions to the other peer. In the
simplest case of providing only *one* (built in) code page translation
per platform (as is the case in my implementation), the only extension
to the rsync protocol would be to allow for a text transfer mode in
addition to the current binary-only transfer.
Since introducing code page conversion inevitably means doing a text
transfer, CR/LF conversion follows immediately. Independent of whether
Unicode is used on the wire, this finally makes the in-place conversion
I'm exploiting impossible, making things more difficult. (Speaking
of supporting many different code pages with disjoint character sets,
Unicode would be the natural solution anyway.) As I understand it, the
rsync algorithm should not be affected by the translation changing the
length of the file, since in text transfer mode all computations would
be done on the on-the-wire representation (Unicode or ISO-Latin 1/UNIX).
Probably, map_ptr() would become considerably more complicated in order
to read the file sections to be transferred correctly. On the other
hand, providing character set conversion without *true* text mode
transfer (that is, including CR/LF conversion) would seem inconsistent.
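Why line-end conversion rules out in-place translation is easy to see
in a sketch. The function lf_to_crlf() below is hypothetical and not
part of the patch; it expands every LF to CR LF, so the output can be
up to twice as long as the input and needs its own buffer:

```c
#include <stddef.h>

/*
 * HYPOTHETICAL helper: copy len bytes from src to dst, writing CR LF
 * for every LF. At most dst_cap bytes are written. The return value is
 * the number of bytes the fully converted text needs, so a caller can
 * pass dst_cap == 0 first to size the buffer.
 */
size_t lf_to_crlf(const char *src, size_t len, char *dst, size_t dst_cap)
{
    size_t out = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (src[i] == '\n') {
            if (out < dst_cap)
                dst[out] = '\r';
            out++;
        }
        if (out < dst_cap)
            dst[out] = src[i];
        out++;
    }
    return out;
}
```

A four-byte input such as "a\nb\n" needs six bytes after conversion, so
offsets into the file on disk and offsets into the wire representation
no longer agree.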
The implications of
1. extending the rsync protocol, and
2. not being able to convert buffers in place
kept me from trying to implement the text transfer mode cleanly as a
runtime choice. I didn't feel able to check all the implications of
the necessary modifications. However, the modifications from the
patch below might serve as a good preparation for rsync to obtain
multi character set capabilities - at least, in terms of *talking* to
an EBCDIC machine while keeping the transfer binary, and later possibly
by adding true text mode transfer with or without user selectable code
pages.
4. unresolved issues
--------------------
For some reason, when trying to write from OS/390 to a read-only
module on AIX, I don't get a proper error message but
building file list ... done
rsync: error writing 3001 unbuffered bytes - exiting: EDC5140I Broken pipe.
rsync error: error in rsync protocol data stream (code 12) at ./io.c(472)
In the opposite direction the error message comes through fine. Maybe
I have missed one conversion?
Below you find the patch, divided into sections 1...3 according to
the discussion above:
1. getting it to build
----------------------
diff -cr rsync-2.5.5-orig/Makefile.in rsync-2.5.5-new/Makefile.in
*** rsync-2.5.5-orig/Makefile.in Mon Mar 25 05:36:56 2002
--- rsync-2.5.5-new/Makefile.in Wed May 15 11:32:50 2002
***************
*** 29,35 ****
ZLIBOBJ=zlib/deflate.o zlib/infblock.o zlib/infcodes.o zlib/inffast.o \
zlib/inflate.o zlib/inftrees.o zlib/infutil.o zlib/trees.o \
zlib/zutil.o zlib/adler32.o
! OBJS1=rsync.o generator.o receiver.o cleanup.o sender.o exclude.o util.o main.o checksum.o match.o syscall.o log.o backup.o
OBJS2=options.o flist.o io.o compat.o hlink.o token.o uidlist.o socket.o fileio.o batch.o \
clientname.o
DAEMON_OBJ = params.o loadparm.o clientserver.o access.o connection.o authenticate.o
--- 29,35 ----
ZLIBOBJ=zlib/deflate.o zlib/infblock.o zlib/infcodes.o zlib/inffast.o \
zlib/inflate.o zlib/inftrees.o zlib/infutil.o zlib/trees.o \
zlib/zutil.o zlib/adler32.o
! OBJS1=rsync.o generator.o receiver.o cleanup.o sender.o exclude.o util.o main.o checksum.o match.o syscall.o log.o backup.o cpconv.o
OBJS2=options.o flist.o io.o compat.o hlink.o token.o uidlist.o socket.o fileio.o batch.o \
clientname.o
DAEMON_OBJ = params.o loadparm.o clientserver.o access.o connection.o authenticate.o
***************
*** 45,51 ****
# note that the -I. is needed to handle config.h when using VPATH
.c.o:
@OBJ_SAVE@
! $(CC) -I. -I$(srcdir) $(CFLAGS) -c $< @CC_SHOBJ_FLAG@
@OBJ_RESTORE@
all: rsync
--- 45,51 ----
# note that the -I. is needed to handle config.h when using VPATH
.c.o:
@OBJ_SAVE@
! $(CC) -I. -I$(srcdir) $(CFLAGS) @CC_SHOBJ_FLAG@ -c $<
@OBJ_RESTORE@
all: rsync
diff -cr rsync-2.5.5-orig/lib/getaddrinfo.c rsync-2.5.5-new/lib/getaddrinfo.c
*** rsync-2.5.5-orig/lib/getaddrinfo.c Fri Dec 14 06:33:12 2001
--- rsync-2.5.5-new/lib/getaddrinfo.c Tue May 14 14:12:24 2002
***************
*** 259,266 ****
--- 259,277 ----
/* error check for hints */
if (hints->ai_addrlen || hints->ai_canonname ||
hints->ai_addr || hints->ai_next)
+ #if defined(__MVS__)
+ /* hints seem to be unknown to OS390 Open Edition, use EAI_MAX */
+ ERR(EAI_MAX); /* xxx */
+ #else
ERR(EAI_BADHINTS); /* xxx */
+ #endif
+ #if defined(__MVS__)
+ /* when compiling on OS390 Open Edition AI_MASK appears to be */
+ /* undefined, include expansion literally here */
+ if (hints->ai_flags & ~(AI_PASSIVE | AI_CANONNAME | AI_NUMERICHOST))
+ #else
if (hints->ai_flags & ~AI_MASK)
+ #endif
ERR(EAI_BADFLAGS);
switch (hints->ai_family) {
case PF_UNSPEC:
***************
*** 294,306 ****
--- 305,327 ----
case SOCK_DGRAM:
if (pai->ai_protocol != IPPROTO_UDP &&
pai->ai_protocol != ANY)
+ #if defined(__MVS__)
+ /* hints seem to be unknown to OS390 Open Edition, use EAI_MAX */
+ ERR(EAI_MAX); /*xxx*/
+ #else
ERR(EAI_BADHINTS); /*xxx*/
+ #endif
pai->ai_protocol = IPPROTO_UDP;
break;
case SOCK_STREAM:
if (pai->ai_protocol != IPPROTO_TCP &&
pai->ai_protocol != ANY)
+ #if defined(__MVS__)
+ /* hints seem to be unknown to OS390 Open Edition, use EAI_MAX */
+ ERR(EAI_MAX); /*xxx*/
+ #else
ERR(EAI_BADHINTS); /*xxx*/
+ #endif
pai->ai_protocol = IPPROTO_TCP;
break;
default:
***************
*** 350,356 ****
--- 371,382 ----
pai->ai_socktype = SOCK_STREAM;
pai->ai_protocol = IPPROTO_TCP;
} else
+ #if defined(__MVS__)
+ /* hints seem to be unknown to OS390 Open Edition, use EAI_MAX */
+ ERR(EAI_MAX); /*xxx*/
+ #else
ERR(EAI_PROTOCOL); /*xxx*/
+ #endif
}
}
}
***************
*** 401,410 ****
--- 427,444 ----
switch (afdl[i].a_af) {
case AF_INET:
v4a = ((struct in_addr *)pton)->s_addr;
+ #if ! defined(__MVS__)
+ /* this is unknown on OS390 Open Edition, drop it (???) */
if (IN_MULTICAST(v4a) || IN_EXPERIMENTAL(v4a))
pai->ai_flags &= ~AI_CANONNAME;
+ #endif
v4a >>= IN_CLASSA_NSHIFT;
+ #if ! defined(__MVS__)
+ /* this is unknown on OS390 Open Edition, drop it (???) */
if (v4a == 0 || v4a == IN_LOOPBACKNET)
+ #else
+ if (v4a == 0)
+ #endif
pai->ai_flags &= ~AI_CANONNAME;
break;
#ifdef INET6
diff -cr rsync-2.5.5-orig/lib/getnameinfo.c rsync-2.5.5-new/lib/getnameinfo.c
*** rsync-2.5.5-orig/lib/getnameinfo.c Wed Dec 5 14:40:03 2001
--- rsync-2.5.5-new/lib/getnameinfo.c Tue May 14 14:08:56 2002
***************
*** 135,144 ****
--- 135,152 ----
switch (sa->sa_family) {
case AF_INET:
v4a = ((struct sockaddr_in *)sa)->sin_addr.s_addr;
+ #if ! defined(__MVS__)
+ /* this is unknown on OS390 Open Edition, drop it (???) */
if (IN_MULTICAST(v4a) || IN_EXPERIMENTAL(v4a))
flags |= NI_NUMERICHOST;
+ #endif
v4a >>= IN_CLASSA_NSHIFT;
+ #if ! defined(__MVS__)
+ /* this is unknown on OS390 Open Edition, drop it (???) */
if (v4a == 0 || v4a == IN_LOOPBACKNET)
+ #else
+ if (v4a == 0)
+ #endif
flags |= NI_NUMERICHOST;
break;
#ifdef INET6
2. getting the rsync protocol functioning
-----------------------------------------
diff -cr rsync-2.5.5-orig/authenticate.c rsync-2.5.5-new/authenticate.c
*** rsync-2.5.5-orig/authenticate.c Thu Jan 24 03:33:45 2002
--- rsync-2.5.5-new/authenticate.c Thu May 16 08:20:53 2002
***************
*** 253,258 ****
--- 253,266 ----
return NULL;
}
+ #if defined(__MVS__)
+ /* translate module password read from config file to ASCII */
+ translate_buffer_to_isolatin( (unsigned char *) secret, strlen( secret ) );
+ /* The other side doesn't base64-decode the challenge but uses the ASCII */
+ /* representation. We must do equally, so translate the b64_challenge. */
+ translate_buffer_to_isolatin( (unsigned char *) b64_challenge, strlen( b64_challenge ) );
+ #endif
+
generate_hash(secret, b64_challenge, pass2);
memset(secret, 0, sizeof(secret));
***************
*** 281,286 ****
--- 289,302 ----
if (!pass || !*pass) {
pass = "";
}
+
+ #if defined(__MVS__)
+ /* translate console read user password to ASCII */
+ translate_buffer_to_isolatin( (unsigned char *) pass, strlen( pass ) );
+ /* The other side has worked with the ASCII base64 representation of */
+ /* the challenge. We must do equally, so translate it back to ASCII */
+ translate_buffer_to_isolatin( (unsigned char *) challenge, strlen( challenge ) );
+ #endif
generate_hash(pass, challenge, pass2);
io_printf(fd, "%s %s\n", user, pass2);
diff -cr rsync-2.5.5-orig/cpconv.c rsync-2.5.5-new/cpconv.c
*** rsync-2.5.5-orig/cpconv.c Thu May 16 10:34:41 2002
--- rsync-2.5.5-new/cpconv.c Thu May 16 08:31:31 2002
***************
*** 0 ****
--- 1,108 ----
+ /*
+ * This table translates ISO8859-1 to EBCDIC-1047.
+ *
+ * It was generated by piping the 256 character codes
+ * 0x00...0xff through iconv -f ISO8859-1 -t IBM-1047
+ */
+ static unsigned char isolatin_to_ebcdic[] = {
+ 0x00, 0x01, 0x02, 0x03, 0x37, 0x2d, 0x2e, 0x2f,
+ 0x16, 0x05, 0x15, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x11, 0x12, 0x13, 0x3c, 0x3d, 0x32, 0x26,
+ 0x18, 0x19, 0x3f, 0x27, 0x1c, 0x1d, 0x1e, 0x1f,
+ 0x40, 0x5a, 0x7f, 0x7b, 0x5b, 0x6c, 0x50, 0x7d,
+ 0x4d, 0x5d, 0x5c, 0x4e, 0x6b, 0x60, 0x4b, 0x61,
+ 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+ 0xf8, 0xf9, 0x7a, 0x5e, 0x4c, 0x7e, 0x6e, 0x6f,
+ 0x7c, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+ 0xc8, 0xc9, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6,
+ 0xd7, 0xd8, 0xd9, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6,
+ 0xe7, 0xe8, 0xe9, 0xad, 0xe0, 0xbd, 0x5f, 0x6d,
+ 0x79, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96,
+ 0x97, 0x98, 0x99, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6,
+ 0xa7, 0xa8, 0xa9, 0xc0, 0x4f, 0xd0, 0xa1, 0x07,
+ 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x06, 0x17,
+ 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x09, 0x0a, 0x1b,
+ 0x30, 0x31, 0x1a, 0x33, 0x34, 0x35, 0x36, 0x08,
+ 0x38, 0x39, 0x3a, 0x3b, 0x04, 0x14, 0x3e, 0xff,
+ 0x41, 0xaa, 0x4a, 0xb1, 0x9f, 0xb2, 0x6a, 0xb5,
+ 0xbb, 0xb4, 0x9a, 0x8a, 0xb0, 0xca, 0xaf, 0xbc,
+ 0x90, 0x8f, 0xea, 0xfa, 0xbe, 0xa0, 0xb6, 0xb3,
+ 0x9d, 0xda, 0x9b, 0x8b, 0xb7, 0xb8, 0xb9, 0xab,
+ 0x64, 0x65, 0x62, 0x66, 0x63, 0x67, 0x9e, 0x68,
+ 0x74, 0x71, 0x72, 0x73, 0x78, 0x75, 0x76, 0x77,
+ 0xac, 0x69, 0xed, 0xee, 0xeb, 0xef, 0xec, 0xbf,
+ 0x80, 0xfd, 0xfe, 0xfb, 0xfc, 0xba, 0xae, 0x59,
+ 0x44, 0x45, 0x42, 0x46, 0x43, 0x47, 0x9c, 0x48,
+ 0x54, 0x51, 0x52, 0x53, 0x58, 0x55, 0x56, 0x57,
+ 0x8c, 0x49, 0xcd, 0xce, 0xcb, 0xcf, 0xcc, 0xe1,
+ 0x70, 0xdd, 0xde, 0xdb, 0xdc, 0x8d, 0x8e, 0xdf
+ };
+
+ /*
+ * This table translates EBCDIC-1047 to ISO8859-1.
+ *
+ * It was generated by piping the 256 character codes
+ * 0x00...0xff through iconv -f IBM-1047 -t ISO8859-1
+ */
+ static unsigned char ebcdic_to_isolatin[] = {
+ 0x00, 0x01, 0x02, 0x03, 0x9c, 0x09, 0x86, 0x7f,
+ 0x97, 0x8d, 0x8e, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x11, 0x12, 0x13, 0x9d, 0x0a, 0x08, 0x87,
+ 0x18, 0x19, 0x92, 0x8f, 0x1c, 0x1d, 0x1e, 0x1f,
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x17, 0x1b,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x05, 0x06, 0x07,
+ 0x90, 0x91, 0x16, 0x93, 0x94, 0x95, 0x96, 0x04,
+ 0x98, 0x99, 0x9a, 0x9b, 0x14, 0x15, 0x9e, 0x1a,
+ 0x20, 0xa0, 0xe2, 0xe4, 0xe0, 0xe1, 0xe3, 0xe5,
+ 0xe7, 0xf1, 0xa2, 0x2e, 0x3c, 0x28, 0x2b, 0x7c,
+ 0x26, 0xe9, 0xea, 0xeb, 0xe8, 0xed, 0xee, 0xef,
+ 0xec, 0xdf, 0x21, 0x24, 0x2a, 0x29, 0x3b, 0x5e,
+ 0x2d, 0x2f, 0xc2, 0xc4, 0xc0, 0xc1, 0xc3, 0xc5,
+ 0xc7, 0xd1, 0xa6, 0x2c, 0x25, 0x5f, 0x3e, 0x3f,
+ 0xf8, 0xc9, 0xca, 0xcb, 0xc8, 0xcd, 0xce, 0xcf,
+ 0xcc, 0x60, 0x3a, 0x23, 0x40, 0x27, 0x3d, 0x22,
+ 0xd8, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
+ 0x68, 0x69, 0xab, 0xbb, 0xf0, 0xfd, 0xfe, 0xb1,
+ 0xb0, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, 0x70,
+ 0x71, 0x72, 0xaa, 0xba, 0xe6, 0xb8, 0xc6, 0xa4,
+ 0xb5, 0x7e, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78,
+ 0x79, 0x7a, 0xa1, 0xbf, 0xd0, 0x5b, 0xde, 0xae,
+ 0xac, 0xa3, 0xa5, 0xb7, 0xa9, 0xa7, 0xb6, 0xbc,
+ 0xbd, 0xbe, 0xdd, 0xa8, 0xaf, 0x5d, 0xb4, 0xd7,
+ 0x7b, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47,
+ 0x48, 0x49, 0xad, 0xf4, 0xf6, 0xf2, 0xf3, 0xf5,
+ 0x7d, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, 0x50,
+ 0x51, 0x52, 0xb9, 0xfb, 0xfc, 0xf9, 0xfa, 0xff,
+ 0x5c, 0xf7, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58,
+ 0x59, 0x5a, 0xb2, 0xd4, 0xd6, 0xd2, 0xd3, 0xd5,
+ 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
+ 0x38, 0x39, 0xb3, 0xdb, 0xdc, 0xd9, 0xda, 0x9f
+ };
+
+ /*
+ * Convert buffer in place
+ */
+ static void translate_buffer( unsigned char * pConversionTable, unsigned char * pBuffer, int pLength )
+ {
+ unsigned char * lStop = pBuffer + pLength;
+
+ while (pBuffer < lStop)
+ {
+ *pBuffer = pConversionTable[*pBuffer];
+ ++pBuffer;
+ }
+ }
+
+ /*
+ * These are the exported functions
+ */
+ void translate_buffer_from_isolatin( unsigned char * pBuffer, int pLength )
+ {
+ translate_buffer( isolatin_to_ebcdic, pBuffer, pLength );
+ }
+
+ void translate_buffer_to_isolatin( unsigned char * pBuffer, int pLength )
+ {
+ translate_buffer( ebcdic_to_isolatin, pBuffer, pLength );
+ }
diff -cr rsync-2.5.5-orig/exclude.c rsync-2.5.5-new/exclude.c
*** rsync-2.5.5-orig/exclude.c Mon Feb 18 20:10:28 2002
--- rsync-2.5.5-new/exclude.c Fri May 10 14:12:39 2002
***************
*** 290,300 ****
exit_cleanup(RERR_UNSUPPORTED);
}
write_int(f,l+2);
! write_buf(f,"+ ",2);
} else {
write_int(f,l);
}
! write_buf(f,pattern,l);
}
write_int(f,0);
--- 290,300 ----
exit_cleanup(RERR_UNSUPPORTED);
}
write_int(f,l+2);
! write_sbuf(f,"+ ");
} else {
write_int(f,l);
}
! write_sbuf(f,pattern);
}
write_int(f,0);
diff -cr rsync-2.5.5-orig/flist.c rsync-2.5.5-new/flist.c
*** rsync-2.5.5-orig/flist.c Thu Mar 14 22:20:20 2002
--- rsync-2.5.5-new/flist.c Tue May 14 11:24:51 2002
***************
*** 265,282 ****
static int to_wire_mode(mode_t mode)
{
! if (S_ISLNK(mode) && (_S_IFLNK != 0120000)) {
! return (mode & ~(_S_IFMT)) | 0120000;
! }
! return (int) mode;
}
static mode_t from_wire_mode(int mode)
{
! if ((mode & (_S_IFMT)) == 0120000 && (_S_IFLNK != 0120000)) {
! return (mode & ~(_S_IFMT)) | _S_IFLNK;
}
! return (mode_t) mode;
}
--- 265,301 ----
static int to_wire_mode(mode_t mode)
{
! int lPortableFlags;
!
! if (S_ISLNK(mode)) lPortableFlags = 0120000;
! else if (S_ISBLK(mode)) lPortableFlags = 0060000;
! else if (S_ISCHR(mode)) lPortableFlags = 0020000;
! else if (S_ISSOCK(mode)) lPortableFlags = 0140000;
! else if (S_ISFIFO(mode)) lPortableFlags = 0010000;
! else if (S_ISDIR(mode)) lPortableFlags = 0040000;
! else if (S_ISREG(mode)) lPortableFlags = 0100000;
! else lPortableFlags = 0;
!
! return (lPortableFlags | (((int) mode) & 07777));
}
static mode_t from_wire_mode(int mode)
{
! mode_t lPlatformFlags;
!
! switch (mode & (07777 ^ -1))
! {
! case 0120000: lPlatformFlags = S_IFLNK; break;
! case 0060000: lPlatformFlags = S_IFBLK; break;
! case 0020000: lPlatformFlags = S_IFCHR; break;
! case 0140000: lPlatformFlags = S_IFSOCK; break;
! case 0010000: lPlatformFlags = S_IFIFO; break;
! case 0040000: lPlatformFlags = S_IFDIR; break;
! case 0100000: lPlatformFlags = S_IFREG; break;
! default: lPlatformFlags = 0; break;
}
!
! return (mode_t) (lPlatformFlags | (mode & 07777));
}
***************
*** 381,387 ****
write_int(f, l2);
else
write_byte(f, l2);
! write_buf(f, fname + l1, l2);
write_longint(f, file->length);
if (!(flags & SAME_TIME))
--- 400,406 ----
write_int(f, l2);
else
write_byte(f, l2);
! write_sbuf(f, fname + l1);
write_longint(f, file->length);
if (!(flags & SAME_TIME))
***************
*** 403,409 ****
#if SUPPORT_LINKS
if (preserve_links && S_ISLNK(file->mode)) {
write_int(f, strlen(file->link));
! write_buf(f, file->link, strlen(file->link));
}
#endif
--- 422,428 ----
#if SUPPORT_LINKS
if (preserve_links && S_ISLNK(file->mode)) {
write_int(f, strlen(file->link));
! write_sbuf(f, file->link);
}
#endif
diff -cr rsync-2.5.5-orig/io.c rsync-2.5.5-new/io.c
*** rsync-2.5.5-orig/io.c Fri Mar 22 06:14:44 2002
--- rsync-2.5.5-new/io.c Thu May 16 08:20:52 2002
***************
*** 307,312 ****
--- 307,316 ----
read_loop(fd, line, remaining);
line[remaining] = 0;
+ #if defined(__MVS__)
+ translate_buffer_from_isolatin( (unsigned char *) line, remaining );
+ #endif
+
rprintf((enum logcode) tag, "%s", line);
remaining = 0;
}
***************
*** 377,382 ****
--- 381,391 ----
void read_sbuf(int f,char *buf,size_t len)
{
read_buf (f,buf,len);
+
+ #if defined(__MVS__)
+ translate_buffer_from_isolatin( (unsigned char *) buf, len );
+ #endif
+
buf[len] = 0;
}
***************
*** 610,618 ****
}
/* write a string to the connection */
! static void write_sbuf(int f,char *buf)
{
! write_buf(f, buf, strlen(buf));
}
--- 619,633 ----
}
/* write a string to the connection */
! void write_sbuf(int f,char *buf)
{
! int lLength = strlen(buf);
!
! #if defined(__MVS__)
! translate_buffer_to_isolatin( (unsigned char *) buf, lLength );
! #endif
!
! write_buf(f, buf, lLength );
}
***************
*** 633,639 ****
{
while (maxlen) {
buf[0] = 0;
! read_buf(f, buf, 1);
if (buf[0] == 0)
return 0;
if (buf[0] == '\n') {
--- 648,654 ----
{
while (maxlen) {
buf[0] = 0;
! read_sbuf(f, buf, 1);
if (buf[0] == 0)
return 0;
if (buf[0] == '\n') {
***************
*** 694,699 ****
--- 709,719 ----
io_flush();
stats.total_written += (len+4);
+
+ #if defined(__MVS__)
+ translate_buffer_to_isolatin( (unsigned char *) buf, len );
+ #endif
+
mplex_write(multiplex_out_fd, code, buf, len);
return 1;
}
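A note on write_sbuf() as patched above: it translates the caller's buffer in place, so after the call the string in memory holds ISO8859-1 bytes rather than EBCDIC. If a caller needs the string again afterwards, translating a copy is one option. What follows is only a sketch under assumptions: translated_copy() and demo_translate_to_isolatin() are hypothetical names that are not part of the patch, and the stand-in table covers just the IBM-1047 letters a to i, not the full code page.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for translate_buffer_to_isolatin(): maps only the
   IBM-1047 lowercase letters a..i (0x81..0x89) to ISO8859-1 (0x61..0x69). */
void demo_translate_to_isolatin(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (buf[i] >= 0x81 && buf[i] <= 0x89)
            buf[i] = (unsigned char)(0x61 + (buf[i] - 0x81));
    }
}

/* Hypothetical helper: returns a malloc'ed, translated copy of str and
   stores its length in *out_len. The original string is left untouched.
   Returns NULL if the allocation fails; the caller frees the result. */
unsigned char *translated_copy(const char *str, size_t *out_len)
{
    size_t len = strlen(str);
    unsigned char *copy = (unsigned char *)malloc(len + 1);

    if (!copy)
        return NULL;
    memcpy(copy, str, len + 1);          /* include the terminating NUL */
    demo_translate_to_isolatin(copy, len);
    *out_len = len;
    return copy;
}
```

A variant of write_sbuf() could pass such a copy to write_buf() and free it afterwards, at the cost of one allocation per string.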
diff -cr rsync-2.5.5-orig/log.c rsync-2.5.5-new/log.c
*** rsync-2.5.5-orig/log.c Mon Feb 18 20:51:12 2002
--- rsync-2.5.5-new/log.c Thu May 16 08:20:54 2002
***************
*** 94,99 ****
--- 94,104 ----
el->buf = malloc(len+4);
if (!el->buf) exit_cleanup(RERR_MALLOC);
memcpy(el->buf+4, buf, len);
+
+ #if defined(__MVS__)
+ translate_buffer_to_isolatin( (unsigned char *) el->buf+4, len );
+ #endif
+
SIVAL(el->buf, 0, ((code+MPLEX_BASE)<<24) | len);
el->len = len+4;
el->written = 0;
diff -cr rsync-2.5.5-orig/proto.h rsync-2.5.5-new/proto.h
*** rsync-2.5.5-orig/proto.h Mon Mar 25 04:51:17 2002
--- rsync-2.5.5-new/proto.h Thu May 16 08:20:52 2002
***************
*** 55,60 ****
--- 55,62 ----
int daemon_main(void);
void setup_protocol(int f_out,int f_in);
int claim_connection(char *fname,int max_connections);
+ void translate_buffer_from_isolatin( unsigned char * pBuffer, int pLength );
+ void translate_buffer_to_isolatin( unsigned char * pBuffer, int pLength );
int check_exclude(char *name, struct exclude_struct **local_exclude_list,
STRUCT_STAT *st);
void add_exclude_list(const char *pattern, struct exclude_struct ***list, int include);
***************
*** 107,112 ****
--- 109,115 ----
void write_int(int f,int32 x);
void write_longint(int f, int64 x);
void write_buf(int f,char *buf,size_t len);
+ void write_sbuf(int f,char *buf);
void write_byte(int f,unsigned char c);
int read_line(int f, char *buf, size_t maxlen);
void io_printf(int fd, const char *format, ...);
diff -cr rsync-2.5.5-orig/uidlist.c rsync-2.5.5-new/uidlist.c
*** rsync-2.5.5-orig/uidlist.c Mon Mar 1 22:16:50 1999
--- rsync-2.5.5-new/uidlist.c Fri May 10 14:15:48 2002
***************
*** 203,209 ****
int len = strlen(list->name);
write_int(f, list->id);
write_byte(f, len);
! write_buf(f, list->name, len);
list = list->next;
}
--- 203,209 ----
int len = strlen(list->name);
write_int(f, list->id);
write_byte(f, len);
! write_sbuf(f, list->name);
list = list->next;
}
***************
*** 218,224 ****
int len = strlen(list->name);
write_int(f, list->id);
write_byte(f, len);
! write_buf(f, list->name, len);
list = list->next;
}
write_int(f, 0);
--- 218,224 ----
int len = strlen(list->name);
write_int(f, list->id);
write_byte(f, len);
! write_sbuf(f, list->name);
list = list->next;
}
write_int(f, 0);
diff -cr rsync-2.5.5-orig/util.c rsync-2.5.5-new/util.c
*** rsync-2.5.5-orig/util.c Wed Mar 20 02:09:49 2002
--- rsync-2.5.5-new/util.c Thu May 16 08:20:55 2002
***************
*** 853,863 ****
--- 853,903 ----
const uchar *s1 = (const uchar *)cs1;
const uchar *s2 = (const uchar *)cs2;
+ #if defined(__MVS__)
+ uchar c1;
+ uchar c2;
+ #endif
+
while (*s1 && *s2 && (*s1 == *s2)) {
s1++; s2++;
}
+ #if defined(__MVS__)
+ /*
+ *This is a horrible hack.* The problem is that send_file_list() and
+ recv_file_list() (flist.c) *separately* sort the file list *after*
+ it has been transmitted between the two peers. This way, if the sort
+ yields different results on different platforms, the files get mixed
+ up in the transfer. As stated above, this has already caused problems
+ which were addressed by providing a dedicated function, u_strcmp(), whose
+ only purpose is string comparison. Alas, this implementation
+ depends on the layout of the character set used by each platform.
+ In particular, it yields different results on an EBCDIC machine as
+ opposed to an ASCII machine. (In EBCDIC, the character values of the
+ uppercase letters are greater than those of the lowercase letters,
+ and those of the numbers are the highest of all.)
+
+ The problem would better be corrected in the correct place, that is
+ doing the sort of the file list (duplicate removal, as I understand
+ the goal of the procedure) exactly once *before* its transmission.
+ This would, of course, introduce an incompatibility with existing
+ rsync servers that *do* the sort after the transmission. This would
+ have to be handled by establishing a new protocol version.
+
+ For now, I'll keep compatibility with existing rsync servers by
+ leaving the sort where it is and converting EBCDIC to ASCII before
+ comparing characters.
+
+ Hartmut Schaefer, May 14, 2002
+ */
+ c1 = *s1;
+ translate_buffer_to_isolatin( &c1, 1 );
+ c2 = *s2;
+ translate_buffer_to_isolatin( &c2, 1 );
+ return (int)c1 - (int)c2;
+ #else
return (int)*s1 - (int)*s2;
+ #endif
}
static OFF_T last_ofs;
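To illustrate the ordering problem described in the comment above, here is a small sketch. raw_cmp() mirrors the unpatched u_strcmp(), and wire_order_cmp() mirrors the patched one. Both names are hypothetical, as is ebcdic_alnum_to_ascii(), which stands in for translate_buffer_to_isolatin() and covers only the IBM-1047 letters and digits. The inputs are given as IBM-1047 byte values so the sketch behaves the same on either kind of machine.

```c
/* Hypothetical stand-in for the patch's translation: maps IBM-1047 letters
   and digits to their ASCII values. Any other byte is returned unchanged,
   so this is not a full code page table. */
unsigned char ebcdic_alnum_to_ascii(unsigned char c)
{
    if (c >= 0x81 && c <= 0x89) return (unsigned char)(0x61 + (c - 0x81)); /* a-i */
    if (c >= 0x91 && c <= 0x99) return (unsigned char)(0x6A + (c - 0x91)); /* j-r */
    if (c >= 0xA2 && c <= 0xA9) return (unsigned char)(0x73 + (c - 0xA2)); /* s-z */
    if (c >= 0xC1 && c <= 0xC9) return (unsigned char)(0x41 + (c - 0xC1)); /* A-I */
    if (c >= 0xD1 && c <= 0xD9) return (unsigned char)(0x4A + (c - 0xD1)); /* J-R */
    if (c >= 0xE2 && c <= 0xE9) return (unsigned char)(0x53 + (c - 0xE2)); /* S-Z */
    if (c >= 0xF0 && c <= 0xF9) return (unsigned char)(0x30 + (c - 0xF0)); /* 0-9 */
    return c;
}

/* Compares raw byte values, as the unpatched u_strcmp() does. */
int raw_cmp(const unsigned char *s1, const unsigned char *s2)
{
    while (*s1 && *s2 && (*s1 == *s2)) {
        s1++; s2++;
    }
    return (int)*s1 - (int)*s2;
}

/* Compares the first differing bytes after mapping them to ASCII, so an
   EBCDIC host sorts in the same order as an ASCII peer. */
int wire_order_cmp(const unsigned char *s1, const unsigned char *s2)
{
    while (*s1 && *s2 && (*s1 == *s2)) {
        s1++; s2++;
    }
    return (int)ebcdic_alnum_to_ascii(*s1) - (int)ebcdic_alnum_to_ascii(*s2);
}
```

With the IBM-1047 bytes for "a" (0x81), "A" (0xC1) and "1" (0xF1), raw_cmp() orders them a, A, 1, while wire_order_cmp() orders them 1, A, a, as an ASCII peer would.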
3. getting automatic character set conversion of file data
----------------------------------------------------------
diff -cr rsync-2.5.5-orig/fileio.c rsync-2.5.5-new/fileio.c
*** rsync-2.5.5-orig/fileio.c Sat Jan 26 00:07:34 2002
--- rsync-2.5.5-new/fileio.c Thu May 16 08:20:51 2002
***************
*** 75,80 ****
--- 75,84 ----
{
int ret = 0;
+ #if defined(__MVS__)
+ translate_buffer_from_isolatin( (unsigned char *) buf, len );
+ #endif
+
if (!sparse_files) {
return write(f,buf,len);
}
***************
*** 193,198 ****
--- 197,207 ----
has changed mid transfer! */
memset(map->p+read_offset+nread, 0, read_size - nread);
}
+
+ #if defined(__MVS__)
+ translate_buffer_to_isolatin( (unsigned char *) map->p + read_offset,
+ read_size );
+ #endif
+
map->p_fd_offset += nread;
}
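Since the fileio.c changes above convert file contents unconditionally on __MVS__, binary files cannot be transferred intact. One way to make the conversion a runtime choice could look like the sketch below. Everything in it is an assumption for illustration: convert_file_data is a hypothetical switch (the patch defines no such option), maybe_translate_file_data() is a hypothetical wrapper, and demo_translate_from_isolatin() is a stand-in that maps only ISO8859-1 'A' (0x41) to IBM-1047 'A' (0xC1).

```c
#include <stddef.h>

/* Hypothetical runtime switch: nonzero means "treat file data as text". */
int convert_file_data = 1;

/* Hypothetical stand-in for translate_buffer_from_isolatin(): converts only
   ISO8859-1 'A' (0x41) to IBM-1047 'A' (0xC1) to keep the sketch short. */
void demo_translate_from_isolatin(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (buf[i] == 0x41)
            buf[i] = 0xC1;
    }
}

/* Converts file data only when the switch is on, so binary data can pass
   through unchanged when it is off. */
void maybe_translate_file_data(unsigned char *buf, size_t len)
{
    if (convert_file_data)
        demo_translate_from_isolatin(buf, len);
}
```

The two call sites in fileio.c would then call the wrapper instead of the translation functions directly. How the switch gets set (a command line option, for instance) is left open here.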
--- End Message ---