OK, I'm brand new to this group, brand new to rsync, and brand new to Unix in general. I'm trying to play catch-up with this discussion, so I likely have many misconceptions about these issues.

My goal is to create a tool that does backup and restore while transferring only changes. It will connect from Mac OS X to a server running Linux and preserve all metadata without the user ever knowing there is an issue. I've found that the rsync algorithm is a good start, and it sounds like you all have the same idea.
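For anyone else catching up: the core of the rsync algorithm is a weak rolling checksum that lets the sender slide a window over its file one byte at a time and cheaply look for blocks the receiver already has. Here is a minimal Python sketch of that idea (my own simplification of the checksum described in the rsync documentation, not the actual implementation):

```python
# Minimal sketch of an rsync-style weak rolling checksum. This is a
# simplification for illustration, not the real rsync source: the two
# 16-bit sums are combined into one 32-bit value, and the "roll" step
# updates the checksum in O(1) when the window slides one byte.

def weak_checksum(block: bytes) -> int:
    """Compute the two-part checksum over a whole block."""
    a = sum(block) % 65536
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % 65536
    return (b << 16) | a

def roll(checksum: int, old_byte: int, new_byte: int, block_len: int) -> int:
    """Slide the window one byte: drop old_byte, append new_byte."""
    a = checksum & 0xFFFF
    b = checksum >> 16
    a = (a - old_byte + new_byte) % 65536
    b = (b - block_len * old_byte + a) % 65536
    return (b << 16) | a

data = b"the quick brown fox jumps over the lazy dog"
block_len = 8
# Rolling from data[0:8] to data[1:9] must match a fresh computation.
c0 = weak_checksum(data[:block_len])
c1 = roll(c0, data[0], data[block_len], block_len)
assert c1 == weak_checksum(data[1:block_len + 1])
```

The point for this discussion is that the checksum is computed over whatever byte stream we hand it, so the fork/metadata question is really about how we serialize a Mac file into streams before this step ever runs.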

I don't think I like the idea of the MacBinary solution, in that I can see some configuration of the tool that the user will have to worry about. We obviously don't want the overhead of flattening files that have no forks, or files whose FileInfo can be determined by other metadata strategies. The user might have to maintain a list of the files they use... how do I handle this file or that (à la the Mac CVS tools)?

I see another user-experience issue with the MacBinary solution and the protocol change. What do the files look like when they get backed up? If I connect to the server via the Finder, am I going to see a bunch of files that are 'archived', or do I get the real deal? As a user I wouldn't use rsync if I couldn't just go and grab the files that got backed up. Not that running the file through StuffIt is a big deal, but it's going to seem a bit clunky, even if the solution is in fact much more extensible. What format is this new protocol going to produce? Will the only way to get at the files be to use the rsync client? Sorry, that's just not acceptable.
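For what it's worth, the 'archive' a MacBinary-based backup would leave on the server is just a flat file: a 128-byte header followed by the data fork and then the resource fork, each zero-padded to a 128-byte boundary. A simplified sketch of the packing step follows; the offsets are from my reading of the MacBinary II layout, and dates, Finder flags, and the trailing CRC are omitted, so treat it as illustrative rather than spec-complete:

```python
import struct

# Simplified sketch of packing a Mac file into a MacBinary-style
# container: 128-byte header, then data fork, then resource fork,
# each padded to a 128-byte boundary. Field offsets (name at 1,
# type/creator at 65/69, fork lengths at 83/87) follow my reading of
# the MacBinary II spec; dates, Finder flags, and CRC are omitted.

def pad128(data: bytes) -> bytes:
    """Zero-pad to the next 128-byte boundary."""
    return data + b"\x00" * (-len(data) % 128)

def macbinary_pack(name: bytes, ftype: bytes, creator: bytes,
                   data_fork: bytes, rsrc_fork: bytes) -> bytes:
    header = bytearray(128)
    header[1] = len(name)                      # Pascal-style name length
    header[2:2 + len(name)] = name
    header[65:69] = ftype                      # e.g. b"TEXT"
    header[69:73] = creator                    # e.g. b"ttxt"
    struct.pack_into(">I", header, 83, len(data_fork))   # big-endian
    struct.pack_into(">I", header, 87, len(rsrc_fork))
    return bytes(header) + pad128(data_fork) + pad128(rsrc_fork)
```

So a user browsing the server with the Finder would see exactly this opaque blob per file, which is the clunkiness I'm objecting to.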

The only solution left is to pre-process the file by splitting it before creating the change lists, so that comparisons can be made if the file is split on the server. There will have to be some intelligence about which method of splitting is used on the server, but I'm positive that couldn't be too hard to determine. Directory metadata just has to be handled in another file as well; isn't that what .DSInfo files are? I'm starting to think that what I'm proposing is more of a combination of 2) and 3). Wouldn't it be great if we could support ACLs as well? Please tell me if I'm way off base here.
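The splitting I have in mind could be as simple as writing each fork and the metadata out as its own plain file, so a stock rsync diffs each piece independently and the Finder still sees a real data file. A hypothetical sketch, where the `.rsrc`/`.meta` sidecar naming is my own invention (not any standard) and the resource fork bytes are passed in rather than read through a real Mac API:

```python
import json

# Hypothetical sketch of splitting a Mac file into three plain files
# before handing them to rsync, and reassembling them afterwards.
# The sidecar names ("file.rsrc", "file.meta") are invented for
# illustration; a real tool would need an agreed-upon convention and
# would read the resource fork via the appropriate OS API.

def split_for_rsync(path: str, resource_fork: bytes, metadata: dict) -> list:
    """Write the resource fork and metadata beside the data fork."""
    with open(path + ".rsrc", "wb") as f:
        f.write(resource_fork)
    with open(path + ".meta", "w") as f:
        json.dump(metadata, f)           # e.g. type/creator, Finder flags
    return [path, path + ".rsrc", path + ".meta"]

def join_after_rsync(path: str) -> tuple:
    """Read the sidecar pieces back on the destination."""
    with open(path + ".rsrc", "rb") as f:
        rsrc = f.read()
    with open(path + ".meta") as f:
        meta = json.load(f)
    return rsrc, meta
```

The data fork stays untouched at its original path, which is exactly the "real deal" behavior I want when browsing the server.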

One other question that I'm sure will show my ignorance of Darwin development: what is the issue with using the high-level APIs if the output is compatible with the other platforms running rsync? What is the advantage of striving for POSIX purity, or coding at the "Darwin level", if the code is only going to be used on Macs running the higher-level stuff anyway? If you don't have a forked file system, why would you care that you don't know how to handle forks?

I'm planning on taking this project on full time, and we would all benefit if we could agree on a direction.

Let's get this thing going,
Terrence Geernaert

Mark Valence wrote:

1) Convert (on the fly) all files to MacBinary before comparing/sending them to the destination. MacBinary is a well-documented way to package an HFS file into a single data file. The benefits of this method are compatibility with existing rsync versions that are not MacBinary-aware, while the drawbacks are speed, maintainability, and the fact that directory metadata is not addressed at all.

2) Treat the two forks and metadata as three separate files for the purposes of comparison/sending, and then reassemble them on the destination. Same drawbacks and benefits as the MacBinary route. This would also take more memory (potentially three times the number of files in the flist).

3) Change the protocol and implementation to handle arbitrary metadata and multiple forks. This could be made sort-of compatible with existing rsyncs by using various tricks, but the most efficient way would be to alter the protocol. The benefit is that this would make the protocol extensible. Metadata can be "tagged" so that you could add any values needed, and ignore those tags that are not understood or supported. Any number of forks could be supported, which gives a step up in supporting NTFS, where a file can have any number of "data streams". In fact, forks and metadata could all be done in the same way in the protocol.
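The "tagged" metadata in option 3) sounds like a plain type-length-value encoding: a receiver walks the tags and skips any it doesn't recognize, so new metadata (forks, ACLs, NTFS streams) never breaks old clients. A hedged sketch of the idea, with tag numbers and field widths invented purely for illustration:

```python
import struct

# Sketch of a type-length-value (TLV) encoding for extensible file
# metadata, as option 3) suggests. Tag numbers and the u16/u32 field
# widths are invented for illustration; the point is that a receiver
# skips unknown tags, so the format stays forward-compatible.

TAG_TYPE_CREATOR = 1
TAG_RSRC_FORK = 2
TAG_ACL = 3        # hypothetical future extension

def encode(tags: list) -> bytes:
    """Encode (tag, payload) pairs as tag:u16, length:u32, value."""
    out = b""
    for tag, payload in tags:
        out += struct.pack(">HI", tag, len(payload)) + payload
    return out

def decode(blob: bytes, known: set) -> dict:
    """Decode a TLV stream, silently skipping unknown tags."""
    result, offset = {}, 0
    while offset < len(blob):
        tag, length = struct.unpack_from(">HI", blob, offset)
        offset += 6
        if tag in known:
            result[tag] = blob[offset:offset + length]
        offset += length
    return result

blob = encode([(TAG_TYPE_CREATOR, b"TEXTttxt"), (TAG_ACL, b"future")])
# A client that only knows tag 1 still parses the stream cleanly.
assert decode(blob, {TAG_TYPE_CREATOR}) == {TAG_TYPE_CREATOR: b"TEXTttxt"}
```

If each fork is itself just one more tagged stream, then forks and metadata really do travel the same way through the protocol, as 3) suggests.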
