There are several problems I can see:

- This is what the original '-f' flag is for.  I think a better approach
  is to expand the default message of 'zpool import' with more
  information, such as which was the last host to access the pool and
  when.  The point of '-f' is that you have recognized that the pool
  is potentially in use, but as an administrator you've made a higher
  level determination that it is in fact safe to import.

- You are going to need a flag to override this behavior for clustering
  situations.  Forcing the user to always wait 5 minutes is
  unacceptable.

- By creating a new flag (let's say '-F'), you are just going to
  introduce more complexity, and customers will get equally used to
  issuing 'zpool import -fF', and now you're back to the same problem
  all over again.

- A pool which is in use on another host but inactive for more than 5
  minutes will slip past this check (since no transactions will have
  been pushed), but the other host could still write data after the
  pool has been imported.

- This breaks existing behavior.  The CLI utilities are documented as
  committed (a.k.a. stable), and breaking existing customer scripts
  isn't acceptable.
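
To make the idle-pool problem concrete, here is a minimal Python sketch
of the proposed check. Everything in it is a hypothetical model for
illustration (PoolState, proposed_check_allows_import, and the field
names are not ZFS interfaces); it only assumes the check compares the
time of the last pushed transaction against a 5-minute window.

```python
from dataclasses import dataclass

IDLE_WINDOW_SECS = 5 * 60  # the 5-minute window proposed in the RFE


@dataclass
class PoolState:
    """Hypothetical model of a pool as seen from the importing host."""
    imported_elsewhere: bool  # ground truth; the importing host cannot see this
    last_txg_time: float      # time (in seconds) of the last pushed transaction


def proposed_check_allows_import(pool: PoolState, now: float) -> bool:
    """Assumed form of the RFE's check: refuse only if written in the window."""
    return (now - pool.last_txg_time) >= IDLE_WINDOW_SECS


# Host A still has the pool imported but has been idle for 10 minutes.
idle_pool = PoolState(imported_elsewhere=True, last_txg_time=0.0)
now = 600.0

allowed = proposed_check_allows_import(idle_pool, now)
unsafe = allowed and idle_pool.imported_elsewhere
print(f"import allowed: {allowed}, actually unsafe: {unsafe}")
```

The check lets the import through even though the other host can start
writing again at any moment, so the time window alone does not establish
that the pool is safe to import.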

This seems to take the wrong approach to the root problem.  Depending on
how you look at it, the real root problem is either:

a) ZFS is not a clustered filesystem, and actively using the same pool
   on multiple systems (even opening said pool) will corrupt data.

b) 'zpool import' doesn't present enough information for an
   administrator to reliably determine if a pool is actually in use
   on multiple systems.

The former is obviously a ton of work and something we're thinking about
but won't address any time soon.  The latter can be addressed by
presenting more useful information when 'zpool import' is run without
the '-f' flag.
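
As a rough illustration of the kind of message I mean, here is a short
Python sketch. The function name, its parameters, and the message
wording are all made up for this example and are not actual 'zpool
import' output; it assumes the pool records the last host to access it
and when.

```python
from datetime import datetime, timezone


def import_warning(pool_name: str, last_host: str, last_access: datetime) -> str:
    """Build a hypothetical refusal message for 'zpool import' without '-f'.

    last_host and last_access are assumed to be recorded with the pool;
    this sketch does not say where they would come from.
    """
    return (
        f"cannot import '{pool_name}': pool may be in use on another system\n"
        f"  last accessed by: {last_host}\n"
        f"  last accessed at: {last_access:%Y-%m-%d %H:%M:%S %Z}\n"
        "use '-f' to import anyway once you have confirmed it is safe"
    )


# Example values only; the host name and timestamp are invented.
message = import_warning(
    "datapool1",
    "hostA",
    datetime(2006, 9, 13, 17, 0, 0, tzinfo=timezone.utc),
)
print(message)
```

With the host and timestamp in front of them, the administrator can make
the higher-level determination that '-f' is meant to express, instead of
reaching for it by habit.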

- Eric

On Wed, Sep 13, 2006 at 12:14:06PM -0500, James Dickens wrote:
> I filed this RFE earlier. Since there is no way for non-Sun personnel
> to see this RFE for a while, I am posting it here and asking for
> feedback from the community.
> 
> [Fwd: CR 6470231 Created P5 opensolaris/triage-queue Add an inuse
> check that is enforced even if import -f is used.]
> *Synopsis*: Add an inuse check that is enforced even if import -f is used.
> http://bt2ws.central.sun.com/CrPrint?id=6470231
> 
> 
> *Change Request ID*: 6470231
> 
> *Synopsis*: Add an inuse check that is enforced even if import -f is used.
> 
> Product: solaris
> Category: opensolaris
> Subcategory: triage-queue
> Type: RFE
> Subtype:
> Status: 1-Dispatched
> Substatus:
> Priority: 5-Very Low
> Introduced In Release:
> Introduced In Build:
> Responsible Manager: [EMAIL PROTECTED]
> Responsible Engineer:
> Initial Evaluator: [EMAIL PROTECTED]
> Keywords: opensolaris
> 
> === *Description* 
> ============================================================
> Category
>  kernel
> Sub-Category
>  zfs
> Description
>  Currently many people have been trying to import ZFS pools on
> multiple systems at once. This is unsupported and causes
> massive data corruption to the pool.
> ZFS should refuse to import any pool that was used in the last 5
> minutes and was not cleanly exported. This would prevent the
> filesystem from being mounted on multiple systems at once.
> Frequency
>  Always
> Regression
>  No
> Steps to Reproduce
>  import the same storage pool on more than one machine or domain.
> 
> Expected Result
>  #zpool import -f datapool1
> Error: ZFS pool datapool1 is currently imported on another system and
> was accessed less than 5 minutes ago. ZFS does not currently
> support concurrent access. If this filesystem is no longer in use on
> the other system, please export the filesystem from the other system or
> try again in 5 minutes.
> 
> Actual Result
>  #zpool import -f datapool1
> #
> a few minutes later the system crashes because of concurrent use.
> Error Message(s)
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock