[gentoo-user] Making sense of reported ZFS disk usage

2015-10-31 Thread Remy Blank
I'm trying to make sense of the disk usage reported by "zfs list".
Here's what I get:

$ zfs list \
-o name,used,avail,refer,usedbydataset,usedbychildren,usedbysnapshots \
-t all

NAME                   USED  AVAIL  REFER  USEDDS  USEDCHILD  USEDSNAP
pool/data             58.0G   718G  46.7G   46.7G          0     11.3G
pool/data@2015-10-03      0      -  46.5G       -          -         -
pool/data@2015-10-04      0      -  46.5G       -          -         -
pool/data@2015-10-05      0      -  46.5G       -          -         -
pool/data@2015-10-06      0      -  46.5G       -          -         -
pool/data@2015-10-07      0      -  46.5G       -          -         -
pool/data@2015-10-08      0      -  46.5G       -          -         -
pool/data@2015-10-09      0      -  46.5G       -          -         -
pool/data@2015-10-10      0      -  46.5G       -          -         -
pool/data@2015-10-11      0      -  46.5G       -          -         -
pool/data@2015-10-12      0      -  46.5G       -          -         -
pool/data@2015-10-13   734M      -  46.7G       -          -         -
pool/data@2015-10-14      0      -  46.7G       -          -         -
pool/data@2015-10-15      0      -  46.7G       -          -         -
pool/data@2015-10-16      0      -  46.7G       -          -         -
pool/data@2015-10-17      0      -  46.7G       -          -         -
pool/data@2015-10-18      0      -  46.7G       -          -         -
pool/data@2015-10-19      0      -  46.7G       -          -         -
pool/data@2015-10-20      0      -  46.7G       -          -         -
pool/data@2015-10-21      0      -  46.7G       -          -         -
pool/data@2015-10-22      0      -  46.7G       -          -         -
pool/data@2015-10-23      0      -  46.7G       -          -         -
pool/data@2015-10-24      0      -  46.7G       -          -         -
pool/data@2015-10-25      0      -  46.7G       -          -         -
pool/data@2015-10-26      0      -  46.7G       -          -         -
pool/data@2015-10-27      0      -  46.7G       -          -         -
pool/data@2015-10-28      0      -  46.7G       -          -         -
pool/data@2015-10-29   755M      -  46.7G       -          -         -
pool/data@2015-10-30   757M      -  46.7G       -          -         -
pool/data@2015-10-31      0      -  46.7G       -          -         -

What I don't understand: I have 29 snapshots, only three of them use
~750M each, but in total they take 11.3G. Where does the excess 9.1G
come from?

I'm using sys-fs/zfs-0.6.5.3, in case that matters.

-- Remy




Re: [gentoo-user] Making sense of reported ZFS disk usage

2015-10-31 Thread Jeremi Piotrowski
On Sat, Oct 31, 2015 at 10:37:55AM +0100, Remy Blank wrote:
> I'm trying to make sense of the disk usage reported by "zfs list".
> Here's what I get:
> 
> $ zfs list \
> -o name,used,avail,refer,usedbydataset,usedbychildren,usedbysnapshots \
> -t all
> 
> NAME                   USED  AVAIL  REFER  USEDDS  USEDCHILD  USEDSNAP
> pool/data             58.0G   718G  46.7G   46.7G          0     11.3G
> pool/data@2015-10-03      0      -  46.5G       -          -         -
> ...
> pool/data@2015-10-12      0      -  46.5G       -          -         -
> pool/data@2015-10-13   734M      -  46.7G       -          -         -
> pool/data@2015-10-14      0      -  46.7G       -          -         -
> ...
> pool/data@2015-10-28      0      -  46.7G       -          -         -
> pool/data@2015-10-29   755M      -  46.7G       -          -         -
> pool/data@2015-10-30   757M      -  46.7G       -          -         -
> pool/data@2015-10-31      0      -  46.7G       -          -         -
> 
> What I don't understand: I have 29 snapshots, only three of them use
> ~750M each, but in total they take 11.3G. Where does the excess 9.1G
> come from?
> 

I'm going to go out on a limb and assume that zfs works in a similar way
to btrfs here (my quick googling suggests that, at least in this case, it
does). You then have to understand the numbers in the following way:

USEDSNAP refers to _data_ that is not in pool/data but in the snapshots.
The value for USED is _data_ that is only present in *this one* snapshot,
and not in any other snapshots or in pool/data. _data_ that is shared
between at least two snapshots is not shown as USED because removing one of
the snapshots would not free it (it is still referenced by another
snapshot).

So in your case you have 3 snapshots which each hold ~750 MB exclusively,
and the remaining ~9 GB is shared between two or more of the snapshots. If
you were to delete any one of those 3 snapshots, you would free ~750 MB;
if you were to delete all snapshots you would free 11.3 GB. But note that
deleting any one snapshot can change the USED count of the others.
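
If zfs's destroy command supports a dry run (-n), verbose output (-v) and
snapshot ranges (oldest%newest) the way its man page seems to suggest, you
should be able to verify this without deleting anything. A rough sketch,
using the names from your mail:

# how much would be reclaimed by removing a single snapshot
$ zfs destroy -nv pool/data@2015-10-13

# same for a whole range of snapshots; this figure also includes data
# shared only among the snapshots inside the range
$ zfs destroy -nv pool/data@2015-10-03%2015-10-31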

This is one of the problems with copy-on-write filesystems - they make
disk space accounting more complicated, especially with snapshots. Perhaps
zfs has something similar to btrfs qgroups, which would allow you to group
snapshots in arbitrary ways to find out how much space any group of
snapshots uses. Here's example output of 'btrfs qgroup show' on my machine:

qgroupid      rfer      excl  parent  child
--------      ----      ----  ------  -----
0/5        0.00GiB   0.00GiB  ---     ---
0/262      6.37GiB   0.03GiB  ---     ---
0/265      3.52GiB   2.38GiB  1/0     ---
0/270      6.38GiB   0.16GiB  ---     ---
0/275      0.00GiB   0.00GiB  ---     ---
0/276      4.38GiB   0.35GiB  1/0     ---
0/277      0.00GiB   0.00GiB  ---     ---
0/278      4.98GiB   0.40GiB  1/1     ---
0/279      4.62GiB   0.12GiB  1/0     ---
0/285      5.59GiB   0.01GiB  1/0     ---
0/286      5.69GiB   0.01GiB  1/0     ---
0/289      6.34GiB   0.42GiB  1/1     ---
0/290      6.35GiB   0.01GiB  1/0     ---
0/291      6.38GiB   0.15GiB  1/1     ---
1/0       10.02GiB   3.68GiB  ---     0/265,0/276,0/279,0/285,0/286,0/290
1/1        7.20GiB   0.98GiB  ---     0/278,0/289,0/291

0/262 is /
0/270 is /home
1/0 contains all snapshots of /
1/1 contains all snapshots of /home 

but I could also have grouped any subset of the snapshots in some other
way to find out how much space they use exclusively, i.e. how much space
would be freed if that whole subset were deleted.
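
For completeness, the grouping itself is only a few commands. A sketch
from memory (the 1/2 group id and the /mnt mountpoint are made up, and
quota has to be enabled once per filesystem):

$ btrfs quota enable /mnt

# create a higher-level qgroup and assign some snapshots' qgroups to it
$ btrfs qgroup create 1/2 /mnt
$ btrfs qgroup assign 0/278 1/2 /mnt
$ btrfs qgroup assign 0/289 1/2 /mnt

# the 'excl' column of 1/2 then shows how much would be freed by deleting
# exactly that set of snapshots
$ btrfs qgroup show -pc /mnt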



Re: [gentoo-user] Making sense of reported ZFS disk usage

2015-10-31 Thread Rich Freeman
On Sat, Oct 31, 2015 at 10:25 AM, Jeremi Piotrowski wrote:
>
> This is one of the problems with copy-on-write filesystems - they make
> disk space accounting more complicated especially with snapshots.

Indeed, it is one of the problems with copy-on-write anything.  Shared
memory is a similar situation - do you count glibc in RAM once, or once
per process that maps it?

It also gets complicated with compression and dynamic mirroring and
such, though at least in those cases a better estimate could probably be
made than is usually done.

Free space reporting in these cases really needs to distinguish between
shared and exclusive space, but for shared space it will probably never
be easy to display concisely just what you would need to delete to
reclaim it.  Obviously shared space can only be reclaimed once all of
its references are deleted.

The flip-side of this is that copying data is really cheap.  Alias cp
to cp --reflink=auto and you can use a copy in many situations where
you might have previously used a hard/symbolic link.  Obviously all
three of those do different things, but before I had reflinks I found
myself using hard/symbolic links when they were less than ideal.
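
For example, with GNU coreutils on a CoW filesystem (file names made up):

# in ~/.bashrc
alias cp='cp --reflink=auto'

# the "copy" shares all of its blocks with the original until one side is
# modified; --reflink=auto falls back to a normal copy where unsupported
$ cp --reflink=auto big-image.img big-image.img.bak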


-- 
Rich



[gentoo-user] Re: Making sense of reported ZFS disk usage

2015-10-31 Thread Remy Blank
Jeremi Piotrowski wrote on 2015-10-31 15:25:
> USEDSNAP refers to _data_ that is not in pool/data but in the snapshots.
> The value for USED is _data_ that is only present in *this one* snapshot,
> and not in any other snapshots or in pool/data. _data_ that is shared
> between at least two snapshots is not shown as USED because removing one of
> the snapshots would not free it (it is still referenced by another
> snapshot).

Indeed, this makes sense. The man page says about "used":

  As the file system changes, space that was previously shared becomes
unique to the snapshot, and counted in the snapshot's space used.

So "used" means "used exclusively by this snapshot". And the missing
9.1G is data shared by two or more snapshots. I hadn't understood that
before. Thank you.

I guess I was looking for something like "how much data is shared
between this snapshot and the next one", but since there's no link
between snapshots (only between the dataset and the snapshots), ZFS
can't provide it in "zfs list".

But maybe there's a way to get this information? How can I find the
amount of data shared between two snapshots of the same dataset?

-- Remy




[gentoo-user] Re: Making sense of reported ZFS disk usage

2015-10-31 Thread Remy Blank
Remy Blank wrote on 2015-10-31 16:35:
> I guess I was looking for something like "how much data is shared
> between this snapshot and the next one", but since there's no link
> between snapshots (only between the dataset and the snapshots), ZFS
> can't provide it in "zfs list".
> 
> But maybe there's a way to get this information? How can I find the
> amount of data shared between two snapshots of the same dataset?

Some more manpage reading shows that the "written" property is exactly
what I'm looking for:

  The amount of referenced space written to this dataset since the
previous snapshot.
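
So, if I read the man page correctly, something like this should do it
(the written@<snapshot> form measures what was written since an arbitrary
earlier snapshot, not just the previous one):

# space written between each snapshot and its predecessor
$ zfs list -t snapshot -o name,used,written

# space written to the 2015-10-30 snapshot since the 2015-10-13 one
$ zfs get written@2015-10-13 pool/data@2015-10-30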

Thanks for the hints!

-- Remy




[gentoo-user] Calamares installer

2015-10-31 Thread James
Hello,


Well, it looks as though some brilliant folks have indeed quenched the
installer thirst of numerous distros, by creating a generic framework that
works with many Linux distros:: [1] A '' version is in portage (thanks!),
but other versions can be found in the overlays ("eix -R calamares").

Calamares has one 'killer feature' on disk partitioning::

"Calamares is a distribution-independent system installer, with an advanced
partitioning feature for both manual and automated partitioning operations.
It is the first installer with an automated “Replace Partition” option,
which makes it easy to reuse a partition over and over for distribution
testing. Calamares is designed to be customizable by distribution
maintainers without need for cumbersome patching, thanks to third party
branding and external modules support." [2]


Best of all:: Calamares is purported to be very 'openrc' friendly! 
For you kids, that's '420 friendly' :: very cool? That's actually
the literal, Klingon translation.

Has anyone tried this out (calamares)? 

If so, can you put your (iso) somewhere for wider testing?

It even seems our pals @sabayon have caught the calamares bug:: [3]


enjoy,
James


[1] https://blogs.gentoo.org/johu/

[2] http://calamares.io

[3]
http://emka.web.id/linux/2015/linux-news-today-gentoo-based-sabayon-15-10-officially-released-with-kde-plasma-5-and-calamares/