On 9/12/22 10:35, J wrote:
> Hi,
>
> I'm trying to clone a machine.
>
> The source and target machines are identical in every way; they have the
> same hardware and memory.
>
> I am using a live-dvd to boot the source machine, and then run the
> following command from a third machine (a workstation) to create the image.

If you can run both machines at once with a network connection, I usually
just boot the recipient from a livecd, run an ssh server on the donor, and

  ssh -C root@donor "cat /dev/sda" > /dev/thingy

(The other direction works fine too, but I find it's usually easier to get
an ssh server up on an established machine...)
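(A sketch of that other direction, pushed from the donor's side instead;
"recipient" here is hypothetical, whatever hostname or IP the target
answers to, and both disks are assumed to be /dev/sda:

  cat /dev/sda | ssh -C root@recipient "cat > /dev/sda"

Same data flow, just initiated from the other end.)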
> ssh root@source "dd if=/dev/sda bs=128M conv=sparse | gzip -9 -" | dd
> of=raw.img.gz

Ok, I'll bite. How do you sparse a pipe? (conv=sparse tells dd to seek over
all-zero output blocks instead of writing them, and a pipe into gzip isn't
seekable.)

> The source hard-disk is a 500 GB disk.
>
> This succeeds and produces a file that is ~29 GB in size.
>
> But when I try to restore this image onto the target machine, the
> operation never completes.

Does "zcat raw.img.gz | wc -c" complete without error and say you have a
valid file of the right size (the full 500 GB, decompressed)?

> I boot the target machine with the live-dvd and issue the following
> commands
>
> mkdir /mnt/remotefs
> sshfs user@workstation:/ /mnt/remotefs/ -C
> cat remotefs/raw.img.gz | gunzip | dd of=/dev/sda

Too many steps; find out where the failure is. Does this:

  gunzip < remotefs/raw.img.gz | sha1sum

complete and produce the same hash as sha1sum on the original machine you
exported from? If not, dd of=/dev/sda on the target is irrelevant.

> By monitoring the network traffic, it appears that at first, there is
> data transfer happening.
>
> After some time though, the traffic seems to stop, and the target machine
> appears to have locked up.

If you can ssh to the other machine, why do you need a FUSE mount to
transfer data from it...? It could be sshfs hanging (never used that), in
which case sha1sum should tell you whether all the data arrived intact.

For years I used a dumb little program, something like:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
    char buf[4096];
    long long ll = 0;  // running byte count
    int len;

    while (0 < (len = read(0, buf, sizeof(buf)))) {
      dprintf(2, "%lld\r", ll += len);         // progress to stderr
      if (len != write(1, buf, len)) exit(1);  // pass data through
    }

    return 0;
  }

and stuck that in the pipeline, a la

  ssh source | count > /dev/thingy

to give me a basic progress indicator. (When I sent a slightly fluffier
version to the busybox mailing list they bikeshedded it into
"pipe_progress", which is apparently now a separate package? But anyway,
useful for this sort of thing...)

> What I suspect is that the target machine may in fact be receiving the
> data and passing it into gunzip,

You can stick the above program _before_ gunzip too. If the number stops
moving, it's hung.

> but (according to my limited understanding of all of this) gunzip must
> first receive ALL the data (29 GB) before it finishes and pipes its
> output to dd.

Nope. It outputs results almost immediately, compression and decompression
both. Its largest internal data structure is a 32k ring buffer. Maybe
you're thinking of xz or something?

> The target (and source) machines only have 8 GB of RAM, so the target
> machine's memory must be exceeded.

Nah, most likely sshfs is going "boing" or your output device is bad.
(There's a very small chance your network stack is using a marginal
obsolete driver and dropping an interrupt or something, but that's
unlikely, and current kernels have timeouts to recover from that anyway...)

> What am I doing wrong?

Combining too many steps at once, so you can't figure out which one's
failing.

> Is there a better way to restore the image?

Yes.

> Thank you

Rob
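P.S. To make the "better way" concrete: skip the FUSE mount and pull the
image over plain ssh. A minimal sketch, assuming raw.img.gz sits in user's
home directory on the workstation and the target disk really is /dev/sda:

  ssh -C user@workstation "cat raw.img.gz" | gunzip | dd of=/dev/sda bs=1M

That's one pipeline, and you can cut it in half (or wedge the count program
into it) to see which stage stalls.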