Bill Sommerfeld wrote:

On Wed, 2006-05-03 at 11:52, eric kustarz wrote:
The following case is about to go to PSARC.  Comments are welcome.

and just so folks outside sun can see what a fast-track review looks
like, I'll send my comments here rather than wait for it to show up at
PSARC..
(I don't thing absolutely everything in here needs to be in the initial
version integrated.  This is mostly food for thought more than anything
else).

1) I don't see a change to the on-disk format document.  During
commitment review of ZFS we asked that the on-disk format document be
logged as part of case materials (but let it be delivered later as it
was a work in progress; we just wanted the document archived for
posterity).

Yep, i'll make sure to update the on-disk document, and get it logged with PSARC.

2) structured or un-structured log information?  there are at least two
fields now, and potentially more later (if you add the hostname,
username, etc.).  If it's unstructured, why?  if it's structured, how do
I get at individual fields of each record?

Its really just a simple plain text file of strings of the following entries:

<string type><string value><delimiter>

The delimiter is the '\0' character. So right now we record time (string type 't') and command (string type 'c').

That will look like:
<'\0'><'t'><time encoded as string><'\0'><'c'><"zfs pool create hewitt c1t1d0s3"><'\0'>

The extra NULL character preceding the time string is done so i can always find the beginning of a "record" (where a record contains several entries). This is important since its a ring buffer and data is overwritten (though the 'zpool create' never is).

So to find individual fields (entries) you need to parse through the strings.

In the future we will have something like:
<'\0'><'t'><time string><'\0'><'c'><"zfs pool create hewitt c1t1d0s3"><'\0'><'h'><"super_server"><'\0'><'u'><"ek110237"><'\0'>

3) more specific version of #2: is there any ambiguity in either the
on-disk format or in the "zpool history" output if the command arguments
contain whitespace?

(IMHO, it's more critical to get this right on-disk than in the command,
which is part of why I asked #1.  I think you want to log the argv[]
array rather than a single string for the command...)
Hmm, i'm just taking the argv array and taking each of its entries to form one string and pass that down. So that's going to potentially change the whitespace. Why does that matter?

4) In part, you're reinventing a simple variant of the solaris audit
subsystem which is specialized for zfs.  I understand perfectly why
you're reinventing this wheel rather than trying to use audit directly. But have you looked at audit to see if you're forgetting to log something important? given that there are RBAC profiles for ZFS, we already have a crude form
of delegated administration present in the system.

I haven't looked at audit, do you have a pointer?

The nice thing is that if we do forget something or invent something later that we would like to log, we can add it.

5) I assume that new zfs and zpool subcommands will need to specify
whether or not they create a zpool history entry; is there a general
rule here going forward?  Working backwards from your list of "logged"
vs "not logged", how about:
- A command which could cause a permanent change to the configuration
of the pool MUST be logged.
- Events which are operationally interesting (such as the start of a
scrub) MAY be logged.

The rule is if it modifies state, then we log it. But you're right, we may change that when we add internal events (a scrub might or might not actually modify state) - but i would like events either logged or not (no MAYBEs).

6) could something be logged at boot time when the pool is brought up
and likewise during a clean shutdown?   seems like those might be of
value in looking into trouble around an unclean shutdown.

Sure thing, we'll look into that when we see what internal events we want to log.

Nits:

if a subcommand was successfully written to disk, but the command
history log I/O failed, then a counter of failed log write I/Os is incremented.

I'm assuming you mean "if a subcommand successfully executes".

So its really both - the subcommand successfully executes when its actually written to disk and txg group is synced. This section is all about handling write failures, hence the way i wrote it. Basically, what happens if the subcommand write I/O succeeds, but the log command write I/O doesn't, what do we do...

   zpool attach
   zpool replace
   zpool scrub

is that logged at the start of the operation, the completion of the
operation, or both?  I'd think I'd want to see "scrub complete" or
"resilver complete" in the log so I could know (a) how to tune a scrub
interval (b) know how long I was operating with reduced redundancy..

Still thinking about this one. I'm leaning towards logging just when the scrub starts in this first phase (where the first phase is all about knowing what the admin did). When we add internal events, then we can log when the scrub finishes.

zpool destroy

"neat trick".  I'm assuming that you have to do a subsequent "zpool
import -f -D" to actually see this...

heh, yep.

eric

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to