On 9/12/22 10:35, J wrote:
> Hi,
>
> I'm trying to clone a machine.
>
> The source and target machines are identical in every way; they have the
> same hardware and memory.
>
> I am using a live-dvd to boot the source machine, and then run the
> following command from a third machine (a workstation) to create the image.

If you can run both machines at once with a network connection, I usually
just boot the recipient from a livecd, run an ssh server on the donor, and

  ssh -C root@donor "cat /dev/sda" > /dev/thingy

(The other direction works fine too, but I find it's usually easier to get
an ssh server up on an established machine...)
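(A sketch of that other direction, pushed from the donor's side instead;
"recipient" here is hypothetical, whatever hostname or IP the target
answers to, and both disks are assumed to be /dev/sda:

  cat /dev/sda | ssh -C root@recipient "cat > /dev/sda"

Same data flow, just initiated from the other end.)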
> ssh root@source "dd if=/dev/sda bs=128M conv=sparse | gzip -9 -" | dd
> of=raw.img.gz

Ok, I'll bite. How do you sparse a pipe? (conv=sparse tells dd to seek over
all-zero output blocks instead of writing them, and a pipe into gzip isn't
seekable.)

> The source hard-disk is a 500 GB disk.
>
> This succeeds and produces a file that is ~29 GB in size.
>
> But when I try to restore this image onto the target machine, the
> operation never completes.

Does "zcat raw.img.gz | wc -c" complete without error and say you have a
valid file of the right size (the full 500 GB, decompressed)?

> I boot the target machine with the live-dvd and issue the following
> commands
>
> mkdir /mnt/remotefs
> sshfs user@workstation:/ /mnt/remotefs/ -C
> cat remotefs/raw.img.gz | gunzip | dd of=/dev/sda

Too many steps; find out where the failure is. Does this:

  gunzip < remotefs/raw.img.gz | sha1sum

complete and produce the same hash as sha1sum on the original machine you
exported from? If not, dd of=/dev/sda on the target is irrelevant.

> By monitoring the network traffic, it appears that at first, there is
> data transfer happening.
>
> After some time though, the traffic seems to stop, and the target machine
> appears to have locked up.

If you can ssh to the other machine, why do you need a FUSE mount to
transfer data from it...? It could be sshfs hanging (never used that), in
which case sha1sum should tell you whether all the data arrived intact.

For years I used a dumb little program, something like:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
    char buf[4096];
    long long ll = 0;  // running byte count
    int len;

    while (0 < (len = read(0, buf, sizeof(buf)))) {
      dprintf(2, "%lld\r", ll += len);         // progress to stderr
      if (len != write(1, buf, len)) exit(1);  // pass data through
    }

    return 0;
  }

and stuck that in the pipeline, a la

  ssh source | count > /dev/thingy

to give me a basic progress indicator. (When I sent a slightly fluffier
version to the busybox mailing list they bikeshedded it into
"pipe_progress", which is apparently now a separate package? But anyway,
useful for this sort of thing...)

> What I suspect is that the target machine may in fact be receiving the
> data and passing it into gunzip,

You can stick the above program _before_ gunzip too. If the number stops
moving, it's hung.

> but (according to my limited understanding of all of this) gunzip must
> first receive ALL the data (29 GB) before it finishes and pipes its
> output to dd.

Nope. It outputs results almost immediately, compression and decompression
both. Its largest internal data structure is a 32k ring buffer. Maybe
you're thinking of xz or something?

> The target (and source) machines only have 8 GB of RAM, so the target
> machine's memory must be exceeded.

Nah, most likely sshfs is going "boing" or your output device is bad.
(There's a very small chance your network stack is using a marginal
obsolete driver and dropping an interrupt or something, but that's
unlikely, and current kernels have timeouts to recover from that anyway...)

> What am I doing wrong?

Combining too many steps at once, so you can't figure out which one's
failing.

> Is there a better way to restore the image?

Yes.

> Thank you

Rob
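P.S. To make the "better way" concrete: skip the FUSE mount and pull the
image over plain ssh. A minimal sketch, assuming raw.img.gz sits in user's
home directory on the workstation and the target disk really is /dev/sda:

  ssh -C user@workstation "cat raw.img.gz" | gunzip | dd of=/dev/sda bs=1M

That's one pipeline, and you can cut it in half (or wedge the count program
into it) to see which stage stalls.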