Any clue as to how to tackle this problem, or any trick around it?

I really do not understand the problem here. But you might be able to
detect sparse files by comparing the size vs. the number of blocks they use.
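
For what it's worth, a rough sketch of that check, assuming GNU find and
Linux, where st_blocks is counted in 512-byte units (/data is just a
placeholder path):

    # print every file whose apparent size is larger than the space
    # actually allocated on disk, i.e. files containing holes (sparse files)
    find /data -type f -printf '%s %b %p\n' |
    awk '$1 > $2 * 512 { $1 = $2 = ""; sub(/^  /, ""); print }'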

Without making a big write-up out of it: let's say the problem is, for now, a storage capacity problem on the destination servers, a timing problem in the extended transfer process, and the additional bandwidth required at some of the destination points, all multiplied by the volume of files. If it were just syncing 100K files it would be a piece of cake, but it's much bigger than that.

Just for example, a badly sparse source file doesn't really have its disk blocks allocated yet, but when copied over via scp or rsync it will actually consume that space on the destination servers. All the servers are identical (or are supposed to be, anyway), but the copies are running out of space partway through the copy process. While the files are being copied, they may easily use twice the amount of space, sadly filling up the destinations; the sync process then stops, making the distribution of the load unusable. Yes, I need to increase the capacity, except that it will take me time to do so.
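
A quick way to see the effect, as a sketch (truncate and cp --sparse are
GNU coreutils; the 1 GiB size is just for illustration):

    # create a 1 GiB file with no blocks allocated
    truncate -s 1G sparse.img
    ls -l sparse.img     # apparent size: 1 GiB
    du -h sparse.img     # space actually used: ~0

    # a naive copy, as scp does, writes out every null byte for real
    cp --sparse=never sparse.img copy.img
    du -h copy.img       # space actually used: 1 GiB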

Sparse files are a very good thing for databases, for example, but not for everything.

The problem is not the sparse file at the source; that can surely stay as is. It's just holes in the block map anyway.

The problem is in the sync process between multiple servers over the Internet: the bandwidth wasted, as well as the lack of space available at the destination. Plus, because the copy ends up a different size on disk, the sync process sees them as different files and will copy them again.

Or the files can be copied using -S with rsync; however, that process still inflates the files at the destination, running out of space along the way, and only makes them smaller at the end. Plus this obviously takes a lot more time, so the timely sync process that was good for a long time is now, let's say, not reliable. Syncing without concern for sparseness is done in just a few minutes, but then uses a lot more space on the destination. Doing it with -S fixes the capacity issue, but then it takes a HUGE amount more time, and sadly there is a useless transfer of null data caused by the empty space in the sparse source.
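
For reference, the two variants above look roughly like this (host and
paths are placeholders):

    # fast, but every hole is written out as real blocks at the destination:
    rsync -av /data/ remote:/data/

    # keeps the destination sparse; much slower here, since rsync still has
    # to read through all the null data (-z should at least shrink the runs
    # of nulls on the wire -- an assumption worth testing on your link):
    rsync -avSz /data/ remote:/data/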

I can manage: I find ways to use ls -laR or du -k, diff the listings between servers, find the files that are getting out of whack, replace them and then continue, but this really is painful.
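
A sketch of automating that comparison, assuming ssh access and GNU find
on both sides (remote and /data are placeholders):

    # list allocated blocks per file on both ends, then diff the listings;
    # the lines that differ are the files that have ballooned
    find /data -type f -printf '%b %p\n' | sort -k2 > /tmp/local.lst
    ssh remote "find /data -type f -printf '%b %p\n' | sort -k2" > /tmp/remote.lst
    diff /tmp/local.lst /tmp/remote.lst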

Obviously once the capacity is there it will be a non-issue; however, I am sadly not at that point yet and it will take me some time.

Not sure if that explains it any better; I hope so.

But I was looking to see if it's possible to identify these files in a more efficient way.

If not, I will just deal with it.

It's just going to be painful for some time, that's all.

The issue is really in the transfer process and at the final destination, not at the source.

I hope it makes more sense explained this way; if not, I apologize for not thinking of a better way to explain it at the moment.

Best,

Daniel
