On 8/7/20 6:24 PM, Leslie Rhorer wrote:
On 8/7/2020 6:23 PM, David Christensen wrote:
??Filesystem?????????? Size?? Used Avail Use% Mounted on
Your editor seems to replace multiple spaces with two question marks for
each leading space (?). Please disable the feature if you can.
The NAS array
Now it is 8 x 8 + 8
The backup system array is 8 @ 8 TB data drives and 1 @ 8 TB hot spare
I assume you filled your 16 drive rack with 8 TB drives (?). Is there a
reason why you did not use a smaller number of larger drives, partially
fill the rack, and leave open bays for future expansion and/or
additional servers?
I don't feel a need for LVM on the data arrays. I use the
entire, unpartitioned drive for /RAID.
I was leading in to LVM's ability to add capacity, but you seem to have
solved this with mdadm (see below).
Are you concerned [about bit rot]?
Yes. I have routines that compare the data on the main array and
the backup array via checksum. When needed, the backups supply a third
vote. The odds of two bits flipping at the very same spot are
astronomically low. There has been some bit rot, but so far it has been
manageable.
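A minimal sketch of that kind of comparison, for anyone following
along, assuming the main array is mounted at /RAID and the backup at
a hypothetical /backup:
`(cd /RAID && find . -type f -print0 | sort -z | xargs -0 sha256sum) > /tmp/main.sha256`
`(cd /backup && find . -type f -print0 | sort -z | xargs -0 sha256sum) > /tmp/backup.sha256`
`diff /tmp/main.sha256 /tmp/backup.sha256`
Any file that differs, or exists on only one side, shows up in the
diff; a third copy (the archive set) can break the tie.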
I had similar experiences and used similar methods in the past. BSD's
mtree(8) is built for this purpose, but lacks a cache. The Debian
version is behind FreeBSD (even when built from Sid source) and lacks
key features. I resorted to writing a Perl script with caching. ZFS
and replication made all of that unnecessary.
To add a drive:
`mdadm /dev/md0 --add /dev/sdX`
`mdadm -v /dev/md0 --grow --raid-devices=Y`
Note if an internal bitmap is set, it must be removed prior to
growing the array. It can be added back once the grow operation is
complete.
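For reference, on the mdadm versions I have used that is roughly
(check mdadm(8) for yours):
`mdadm /dev/md0 --grow --bitmap=none`
and, after the reshape finishes:
`mdadm /dev/md0 --grow --bitmap=internal`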
To increase the drive size, replace any smaller drives with larger
drives one at a time:
`mdadm /dev/md0 --add /dev/sdX`
`mdadm /dev/md0 --fail /dev/sdY`
`mdadm /dev/md0 --remove /dev/sdY`
Once all the drives are larger than the current device size used by
the array:
`mdadm /dev/md0 --grow --size=max`
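One note: after either kind of grow, the filesystem on /dev/md0 still
has to be enlarged separately. If /RAID is ext4 (an assumption on my
part), that would be roughly:
`resize2fs /dev/md0`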
Nice. :-)
Have you considered putting the additional files on another server
that is not backed up, only archived?
They should no longer be needed. Once I confirm that (in a few
minutes from now, actually), they will be deleted. If any of the files
in question turn out to be necessary, I will do that very thing.
If DAR maintains a catalog of archive media and the files they contain,
this would facilitate a data retention policy of "some files only exist
on archive media".
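DAR's companion tool dar_manager does roughly that, as I recall;
worth checking against dar_manager(1), but the workflow is something
like:
`dar_manager -C catalog.dmd`
to create an empty catalogue database,
`dar_manager -B catalog.dmd -A /path/to/archive_basename`
to record one archive's contents, and
`dar_manager -B catalog.dmd -f some/file`
to list which archives hold a given file.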
22E+12 bytes in 2.8 days is ~90 MB/s. That is a fraction of 4 Gbps
and an even smaller fraction of 10 Gbps. Have you identified the
bottleneck?
That should have been about 15 hours or so. The transfer
rate for a large file is close to 4 Gbps, which is about the best I would
expect from this hardware. It's good enough.
22E+12 bytes in 15 hours is ~408 MB/s (about 3.3 Gbps). That makes
more sense.
Are you using hot-swap for the archive drives?
Yes on the hot swap. I just use a little eSATA docking station
attached to an eSATA port on the motherboard. 'Definitely a poor man's
solution.
My 2011 desktop motherboard with dual eSATA ports (150 MB/s?) gives very
satisfactory performance.
If you have two HDD hot-swap bays, can DAR leap-frog destination media?
I believe it can, yes. A script to handle that should be pretty
simple. I have never done so.
The
script I use right now pauses and waits for the user to replace the
drive and press <Enter>. It would be trivial to have the script
continue with a different device ID instead of pausing. Iterating
through a list of IDs is hardly any more difficult.
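Something along these lines, say; the device IDs and the two
placeholder functions are hypothetical:

    # Hypothetical by-id names for the two hot-swap bays.
    DEVICES=(/dev/disk/by-id/ata-ARCHIVE_A /dev/disk/by-id/ata-ARCHIVE_B)
    mkdir -p /mnt/archive
    i=0
    until backup_is_complete; do          # placeholder: more slices to write?
        dev=${DEVICES[i % 2]}
        mount "$dev" /mnt/archive || exit 1
        write_next_slices /mnt/archive    # placeholder for the existing DAR step
        umount /mnt/archive
        echo "Finished with $dev; swap it while the other bay is written."
        i=$((i + 1))
    done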
Hmm. You have given me an idea. Thanks!
YW. :-) Let us know if you can reduce the time to create an archive set.
If you have many HDD hot-swap bays, can DAR write in parallel? With
leap-frog?
No, I don't think so, at least not in general. I suppose one could
create a front-end process which divides up the source and passes the
individual chunks to multiple DAR processes. A Python script should be
able to handle it pretty well.
I have pondered writing a script to read a directory and create a set of
hard link trees, each tree of size N bytes or less; filtered, sorted,
and grouped by configurable parameters. If anyone knows of such a
utility, please reply.
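In the meantime, here is a rough sketch of the idea in shell,
splitting purely by cumulative size; the source path, the tree root,
and the 8 TB budget are all assumptions, and the hard-link trees have
to live on the same filesystem as the source:

    SRC=/srv/data                     # source tree (hypothetical)
    TREES=/srv/data/.archive-trees    # must be on the same filesystem as $SRC
    N=$((8 * 1000 ** 4))              # ~8 TB per tree, to match one archive drive
    i=0; total=0
    mkdir -p "$TREES/tree-$i"
    find "$SRC" -path "$TREES" -prune -o -type f -printf '%s\t%p\n' |
    sort -t "$(printf '\t')" -k2 |
    while IFS=$(printf '\t') read -r size path; do
        if [ $((total + size)) -gt "$N" ]; then
            i=$((i + 1)); total=0; mkdir -p "$TREES/tree-$i"
        fi
        rel=${path#"$SRC"/}
        mkdir -p "$TREES/tree-$i/$(dirname "$rel")"
        ln "$path" "$TREES/tree-$i/$rel"
        total=$((total + size))
    done

Filtering, sorting, and grouping knobs are left as an exercise, and
filenames containing newlines are not handled.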
In my experience, HDDs that are stored for long periods have the bad
habit of failing within hours of being put back into service. Does
this concern you?
No, not really. If a target drive fails during a backup, I can
just isolate the existing portion and then start a new backup on the
isolate. A failed drive during a restore could be a bitch, but that's
pretty unlikely. Something like dd_rescue could be a great help.
As I understand ddrescue, it is designed for multiple copies of some
content (e.g. a file or a raw device) that were originally identical,
each copy was damaged in a different area, and none of the damaged areas
overlap. ddrescue can then scan all the copies, identify the undamaged
areas, and assemble a correct version.
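If I remember the GNU ddrescue manual correctly, the mechanics of that
multi-copy case look like this; the second pass only retries the
sectors the first pass could not read (file names are illustrative):
`ddrescue copy1.img rescued.img rescue.map`
`ddrescue copy2.img rescued.img rescue.map`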
As I understand DAR, it uses a specialized binary format with
compression, hashing, encryption, etc. If you burn one archive media
set using DAR, retain a few previous archive media sets, later need to
do a restore, and one drive from the most recent archive media set is
bad, I am uncertain if ddrescue will be of any help.
What is your data destruction policy?
You mean for live data? I don't have one. Do you mean for the
backups? There is no formal one.
Likewise. It's a conundrum.
For an enterprise system, ZFS is the top contender, in my
book. These are for my own use, and my business is small, however. If
I ever get to the point where I have more than 10 employees, I will no
doubt switch to ZFS.
Let me put it this way: if a business has the need for a separate
IT manager, his filesystem of choice for the file server(s) is pretty
much without question ZFS. For a small business or for personal use the
learning curve may be a bit more than the non-IT user might want to tackle.
Or not. I certainly would not discourage anyone who wants to take
on the challenge.
Migrating my SOHO servers from Linux, md, LVM, ext4, and btrfs to
FreeBSD and ZFS has been a non-trivial undertaking. I've learned a lot
and I think my data is better protected, but I still have more work to
do for disaster preparedness. You have an order of magnitude more data,
backups, and archives than I do. If and when you decide to try ZFS, I
suggest that you break off a piece and work with that.
David