Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-20 Thread Theodore Ts'o
On Wed, Mar 16, 2016 at 10:18:19PM -0700, Gregory Farnum wrote: > would've been nice if they were upstream. What *is* a big deal for > FileStore (and would be easy to take advantage of) is the thematically > similar O_NOMTIME flag, which is also about reducing metadata updates > and got blocked on

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Martin K. Petersen
> "Jeff" == Jeff Moyer writes: Jeff> TRIM/UNMAP isn't just supported on solid state devices, though. I Jeff> do recall some enterprise thinly provisioned storage that would Jeff> take ages to discard large regions. I think that caused us to Jeff> change the defaults for mkfs, right? I thin

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Gregory Farnum
On Wed, Mar 16, 2016 at 5:33 PM, Eric Sandeen wrote: > I may have lost the thread at this point, with poor Darrick's original > patch submission devolving into a long thread about a NO_HIDE_STALE patch > used at Google, but I don't *think* Ceph ever asked for NO_HIDE_STALE. > > At least I can't fi

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Theodore Ts'o
On Thu, Mar 17, 2016 at 02:00:18PM -0700, Chris Mason wrote: > > Thinking more, my guess is that google will just keep doing what they > are already doing ;) But there could be a flag in sysfs dedicated to > trim-for-fallocate so admins can see what their devices are reporting. > readonly in main

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread NeilBrown
On Wed, Mar 16 2016, Theodore Ts'o wrote: > On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: >> >> Stale data escaping containment is a security issue. Enabling >> generic kernel mechanisms to *enable containment escape* is >> fundamentally wrong, and relying on userspace to Do The R

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Chris Mason
On Thu, Mar 17, 2016 at 02:49:06PM -0600, Andreas Dilger wrote: > On Mar 17, 2016, at 12:35 PM, Chris Mason wrote: > > > > On Thu, Mar 17, 2016 at 10:47:29AM -0700, Linus Torvalds wrote: > >> On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: > >>> > >>> So we've not asked for NO_HIDE_STAL

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Linus Torvalds
On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: > > So we've not asked for NO_HIDE_STALE on the mailing lists, but I think > it was one of the problems Sage had using xfs in his BlueStore > implementation and was a big part of why it moved to pure userspace. > FileStore might use NO_HIDE_S

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Jeff Moyer
"Theodore Ts'o" writes: > I do think that using TRIM in various causes where we are doing an > fallocate does make sense for non-rotational devices. In general TRIM > should be fast enough that that I'd be surprised that people would be > complaining --- especially since most of the time, falloc

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Theodore Ts'o
On Wed, Mar 16, 2016 at 03:45:49PM -0600, Andreas Dilger wrote: > > Clearly, the performance hit of unwritten extent conversion is large > > enough to tempt people to ask for no-hide-stale. But I'd rather hear > > that directly from a developer, Ceph or otherwise. > > I suspect that this gets sig

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Linus Torvalds
On Thu, Mar 17, 2016 at 10:50 AM, Ric Wheeler wrote: >> >> That argues against worrying about this all in the kernel unless there >> are other users. > > Just a note, when Greg says "user space solution", Ceph is looking at > writing directly to raw block devices which is kind of a throwback to

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Chris Mason
On Tue, Mar 15, 2016 at 05:51:17PM -0700, Chris Mason wrote: > On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote: > > On 3/15/16 7:06 PM, Linus Torvalds wrote: > > > On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote: > > >> > > > >> > It is pretty clear that the onus is on the patch s

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Chris Mason
On Thu, Mar 17, 2016 at 10:47:29AM -0700, Linus Torvalds wrote: > On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: > > > > So we've not asked for NO_HIDE_STALE on the mailing lists, but I think > > it was one of the problems Sage had using xfs in his BlueStore > > implementation and was a b

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Dave Chinner
On Tue, Mar 15, 2016 at 06:51:39PM -0700, Darrick J. Wong wrote: > On Tue, Mar 15, 2016 at 06:52:24PM -0400, Theodore Ts'o wrote: > > On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: > > > > > > Stale data escaping containment is a security issue. Enabling > > > generic kernel mechani

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Andreas Dilger
On Mar 15, 2016, at 7:51 PM, Darrick J. Wong wrote: > > On Tue, Mar 15, 2016 at 06:52:24PM -0400, Theodore Ts'o wrote: >> On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: >>> >>> Stale data escaping containment is a security issue. Enabling >>> generic kernel mechanisms to *enable c

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Linus Torvalds
On Thu, Mar 17, 2016 at 11:52 PM, Gregory Farnum wrote: > > I wasn't really involved in this stuff but I gather from looking at > http://www.spinics.net/lists/xfs/msg36869.html that any durability > command other than fdatasync is going to write out the mtime updates > to the inodes on disk. Given

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Darrick J. Wong
On Thu, Mar 17, 2016 at 12:01:16PM +1100, Dave Chinner wrote: > On Tue, Mar 15, 2016 at 06:51:39PM -0700, Darrick J. Wong wrote: > > On Tue, Mar 15, 2016 at 06:52:24PM -0400, Theodore Ts'o wrote: > > > On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: > > > > > > > > Stale data escapin

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Andreas Dilger
On Mar 17, 2016, at 12:35 PM, Chris Mason wrote: > > On Thu, Mar 17, 2016 at 10:47:29AM -0700, Linus Torvalds wrote: >> On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: >>> >>> So we've not asked for NO_HIDE_STALE on the mailing lists, but I think >>> it was one of the problems Sage had

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-19 Thread Theodore Ts'o
On Wed, Mar 16, 2016 at 07:33:55PM -0500, Eric Sandeen wrote: > I may have lost the thread at this point, with poor Darrick's original > patch submission devolving into a long thread about a NO_HIDE_STALE patch > used at Google, but I don't *think* Ceph ever asked for NO_HIDE_STALE. > > At least I

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-18 Thread Ric Wheeler
On 03/16/2016 06:23 PM, Chris Mason wrote: > On Tue, Mar 15, 2016 at 05:51:17PM -0700, Chris Mason wrote: >> On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote: >>> On 3/15/16 7:06 PM, Linus Torvalds wrote: >>>> On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote: >>>>> It is pretty clear that the onu

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-18 Thread Eric Sandeen
On 3/16/16 7:15 PM, Theodore Ts'o wrote: > On Wed, Mar 16, 2016 at 03:45:49PM -0600, Andreas Dilger wrote: >>> Clearly, the performance hit of unwritten extent conversion is large >>> enough to tempt people to ask for no-hide-stale. But I'd rather hear >>> that directly from a developer, Ceph or o

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-18 Thread Ric Wheeler
On 03/17/2016 01:47 PM, Linus Torvalds wrote: > On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: >> So we've not asked for NO_HIDE_STALE on the mailing lists, but I think it was one of the problems Sage had using xfs in his BlueStore implementation and was a big part of why it moved to pure u

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-18 Thread Gregory Farnum
On Thu, Mar 17, 2016 at 10:47 AM, Linus Torvalds wrote: > On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote: >> >> So we've not asked for NO_HIDE_STALE on the mailing lists, but I think >> it was one of the problems Sage had using xfs in his BlueStore >> implementation and was a big part of

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Darrick J. Wong
On Tue, Mar 15, 2016 at 06:52:24PM -0400, Theodore Ts'o wrote: > On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: > > > > Stale data escaping containment is a security issue. Enabling > > generic kernel mechanisms to *enable containment escape* is > > fundamentally wrong, and relying

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Chris Mason
On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote: > On 3/15/16 7:06 PM, Linus Torvalds wrote: > > On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote: > >> > > >> > It is pretty clear that the onus is on the patch submitter to > >> > provide justification for inclusion, not for the rev

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Eric Sandeen
On 3/15/16 7:06 PM, Linus Torvalds wrote: > On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote: >> > >> > It is pretty clear that the onus is on the patch submitter to >> > provide justification for inclusion, not for the reviewer/Maintainer >> > to have to prove that the solution is unworkable.

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Dave Chinner
On Tue, Mar 15, 2016 at 04:14:32PM -0700, Linus Torvalds wrote: > On Tue, Mar 15, 2016 at 4:06 PM, Linus Torvalds > wrote: > > > > And yes, "keep the patch entirely inside google" is obviously one good > > way to limit the interface. But if there are really other groups that > > want to explore th

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Linus Torvalds
On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote: > > It is pretty clear that the onus is on the patch submitter to > provide justification for inclusion, not for the reviewer/Maintainer > to have to prove that the solution is unworkable. I agree, but quite frankly, performance is a good justi

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Dave Chinner
On Tue, Mar 15, 2016 at 04:06:10PM -0700, Linus Torvalds wrote: > On Tue, Mar 15, 2016 at 3:33 PM, Dave Chinner wrote: > > > >> There's no "group based containment wall" that is some kind of > >> absolute protection border. > > > > Precisely my point - it's being pitched as a generic containment >

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Linus Torvalds
On Tue, Mar 15, 2016 at 4:06 PM, Linus Torvalds wrote: > > And yes, "keep the patch entirely inside google" is obviously one good > way to limit the interface. But if there are really other groups that > want to explore this, then that sounds like a pretty horrible model > too. Side note: I reall

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Linus Torvalds
On Tue, Mar 15, 2016 at 3:33 PM, Dave Chinner wrote: > >> There's no "group based containment wall" that is some kind of >> absolute protection border. > > Precisely my point - it's being pitched as a generic containment > mechanism, but it really isn't. No it hasn't. It has been pitched as "C

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Theodore Ts'o
On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote: > > Stale data escaping containment is a security issue. Enabling > generic kernel mechanisms to *enable containment escape* is > fundamentally wrong, and relying on userspace to Do The Right Thing > is even more of a gamble, IMO. We a

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Eric Sandeen
On 3/15/16 3:14 PM, Dave Chinner wrote: > What we are missing is actual numbers that show that exposing stale > data is a /significant/ win for these applications that are > demanding it. And then we need evidence proving that the problem is > actually systemic and not just a hack around a bad impl

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Dave Chinner
On Tue, Mar 15, 2016 at 01:43:01PM -0700, Linus Torvalds wrote: > On Tue, Mar 15, 2016 at 1:14 PM, Dave Chinner wrote: > > > > Root can still change the group id of a file that has exposed stale > > data and hence make it visible outside of the group based > > containment wall. > > Ok, Dave, now

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Theodore Ts'o
On Tue, Mar 15, 2016 at 01:43:01PM -0700, Linus Torvalds wrote: > Put another way: this is not about theoretical leaks - because those > are totally irrelevant (in theory, the original discard writer had > access to all that stale data anyway). This is about making it a > practical interface that d

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Linus Torvalds
On Tue, Mar 15, 2016 at 1:14 PM, Dave Chinner wrote: > > Root can still change the group id of a file that has exposed stale > data and hence make it visible outside of the group based > containment wall. Ok, Dave, now you're just being ridiculous. The issue has never been - and *should* never b

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-15 Thread Dave Chinner
On Mon, Mar 14, 2016 at 10:46:03AM -0400, Theodore Ts'o wrote: > On Mon, Mar 14, 2016 at 06:34:00AM -0400, Ric Wheeler wrote: > > I think that once we enter this mode, the local file system has effectively > > ceded its role to prevent stale data exposure to the upper layer. In effect, > > this cea

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-14 Thread Theodore Ts'o
On Mon, Mar 14, 2016 at 06:34:00AM -0400, Ric Wheeler wrote: > I think that once we enter this mode, the local file system has effectively > ceded its role to prevent stale data exposure to the upper layer. In effect, > this ceases to become a normal file system for any enabled process if we > cont

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-14 Thread Ric Wheeler
On 03/13/2016 07:30 PM, Dave Chinner wrote: > On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote: >> On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o wrote: >>> At the end of the day it's about whether you trust the userspace program or not. >> There's a big difference between "give the user rope", and "tie the rope in a noose and put a banana peel so that the user might stumble into the rope and h

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-13 Thread Dave Chinner
On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote: > On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o wrote: > > > > At the end of the day it's about whether you trust the userspace > > program or not. > > There's a big difference between "give the user rope", and "tie the > rope in a

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-12 Thread Thomas Schoebel-Theuer
On 03/12/2016 08:19 AM, Theodore Ts'o wrote: > On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote: >> There's a big difference between "give the user rope", and "tie the rope in a noose and put a banana peel so that the user might stumble into the rope and hang himself", though. [...]

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Theodore Ts'o
On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote: > On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o wrote: > > > > At the end of the day it's about whether you trust the userspace > > program or not. > > There's a big difference between "give the user rope", and "tie the > rope in a

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Linus Torvalds
On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o wrote: > > At the end of the day it's about whether you trust the userspace > program or not. There's a big difference between "give the user rope", and "tie the rope in a noose and put a banana peel so that the user might stumble into the rope and h

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Theodore Ts'o
On Sat, Mar 12, 2016 at 09:30:47AM +1100, Dave Chinner wrote: > It's all well and good to restrict access to the fallocate() call to > limit who can expose stale data, but it doesn't remove the fact it > is easy for stale data to unintentionally escape the privileged > group once it has been expose

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Linus Torvalds
On Fri, Mar 11, 2016 at 2:30 PM, Dave Chinner wrote: > On Fri, Mar 11, 2016 at 10:25:30AM -0800, Linus Torvalds wrote: >> >> So you'd have to explicitly say "my setup is ok with hole punching". > > Except it's not hole punching that is the problem. [..] > The problem here is > preallocation of unw

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Dave Chinner
On Fri, Mar 11, 2016 at 10:25:30AM -0800, Linus Torvalds wrote: > On Fri, Mar 11, 2016 at 9:30 AM, Andy Lutomirski wrote: > > > > What if we had an ioctl to do these data-leaking operations that took, > > as an extra parameter, an fd to the block device node. They allow > > access if the fd point

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Linus Torvalds
On Fri, Mar 11, 2016 at 9:30 AM, Andy Lutomirski wrote: > > What if we had an ioctl to do these data-leaking operations that took, > as an extra parameter, an fd to the block device node. They allow > access if the fd points to the right inode and has FMODE_READ (and LSM > checks say it's okay).

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Andy Lutomirski
On Fri, Mar 11, 2016 at 9:23 AM, Linus Torvalds wrote: > On Fri, Mar 11, 2016 at 5:59 AM, One Thousand Gnomes > wrote: >> >> > > We can do the security check at the filesystem level, because we have >> > > sb->s_bdev->bd_inode, and if you have read and write permissions to >> > > that inode, you

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Linus Torvalds
On Fri, Mar 11, 2016 at 5:59 AM, One Thousand Gnomes wrote: > > > > We can do the security check at the filesystem level, because we have > > > sb->s_bdev->bd_inode, and if you have read and write permissions to > > > that inode, you might as well have permission to create a unsafe hole. > > Not i

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread Theodore Ts'o
On Fri, Mar 11, 2016 at 01:59:52PM +, One Thousand Gnomes wrote: > > > We can do the security check at the filesystem level, because we have > > > sb->s_bdev->bd_inode, and if you have read and write permissions to > > > that inode, you might as well have permission to create a unsafe hole. >

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-11 Thread One Thousand Gnomes
> > We can do the security check at the filesystem level, because we have > > sb->s_bdev->bd_inode, and if you have read and write permissions to > > that inode, you might as well have permission to create a unsafe hole. Not if you don't have access to a block device node to open it, or there are

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-10 Thread Ric Wheeler
On 03/11/2016 12:03 AM, Linus Torvalds wrote: > On Thu, Mar 10, 2016 at 6:58 AM, Ric Wheeler wrote: >> What was objectionable at the time this patch was raised years back (not just to me, but to pretty much every fs developer at LSF/MM that year) centered on the concern that this would be viewed as

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-10 Thread Theodore Ts'o
On Thu, Mar 10, 2016 at 10:33:49AM -0800, Linus Torvalds wrote: > On Thu, Mar 10, 2016 at 6:58 AM, Ric Wheeler wrote: > > > > What was objectionable at the time this patch was raised years back (not > > just to me, but to pretty much every fs developer at LSF/MM that year) > > centered on the conc

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-10 Thread Linus Torvalds
On Thu, Mar 10, 2016 at 6:58 AM, Ric Wheeler wrote: > > What was objectionable at the time this patch was raised years back (not > just to me, but to pretty much every fs developer at LSF/MM that year) > centered on the concern that this would be viewed as a "performance" mode > and we get pressur

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-10 Thread Ric Wheeler
On 03/10/2016 04:38 AM, Theodore Ts'o wrote: > On Wed, Mar 09, 2016 at 02:20:31PM -0800, Gregory Farnum wrote: >> I really am sensitive to the security concerns, just know that if it's a permanent blocker you're essentially blocking out a growing category of disk users (who run on an awfully large nu

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-09 Thread Theodore Ts'o
On Wed, Mar 09, 2016 at 02:20:31PM -0800, Gregory Farnum wrote: > I really am sensitive to the security concerns, just know that if it's > a permanent blocker you're essentially blocking out a growing category > of disk users (who run on an awfully large number of disks!). Or they just have to use

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-09 Thread Gregory Farnum
On Thu, Mar 3, 2016 at 3:10 PM, Dave Chinner wrote: > On Thu, Mar 03, 2016 at 05:39:52PM -0500, Theodore Ts'o wrote: >> On Thu, Mar 03, 2016 at 01:54:54PM -0500, Martin K. Petersen wrote: >> > > "Christoph" == Christoph Hellwig writes: >> > >> > Christoph> - FALLOC_FL_PUNCH_HOLE assures zero

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Thomas Schoebel-Theuer
On 03/03/2016 11:56 PM, Dave Chinner wrote: > That "new kind of write command" would enable delayed allocation > algorithms to continue to work at the filesystem level on block > devices that freespace management completely is offloaded to... > Cheers, Dave. This would advocate a uniform /interna

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Theodore Ts'o
On Fri, Mar 04, 2016 at 10:10:50AM +1100, Dave Chinner wrote: > You can tempt all you want, but it does not change the basic fact > that it is dangerous and compromises system security. As such, it > does not belong in upstream kernels. Especially in this day and age > where ensuring the fundamenta

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Dave Chinner
On Thu, Mar 03, 2016 at 05:39:52PM -0500, Theodore Ts'o wrote: > On Thu, Mar 03, 2016 at 01:54:54PM -0500, Martin K. Petersen wrote: > > > "Christoph" == Christoph Hellwig writes: > > > > Christoph> - FALLOC_FL_PUNCH_HOLE assures zeroes are returned, but > > Christoph> space is deallocated a

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Dave Chinner
On Thu, Mar 03, 2016 at 01:54:54PM -0500, Martin K. Petersen wrote: > > "Christoph" == Christoph Hellwig writes: > > Christoph> - FALLOC_FL_PUNCH_HOLE assures zeroes are returned, but > Christoph> space is deallocated as much as possible - > Christoph> FALLOC_FL_ZERO_RANGE assures zeroes are

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Theodore Ts'o
On Thu, Mar 03, 2016 at 01:54:54PM -0500, Martin K. Petersen wrote: > > "Christoph" == Christoph Hellwig writes: > > Christoph> - FALLOC_FL_PUNCH_HOLE assures zeroes are returned, but > Christoph> space is deallocated as much as possible - > Christoph> FALLOC_FL_ZERO_RANGE assures zeroes are

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Martin K. Petersen
> "Christoph" == Christoph Hellwig writes: Christoph> - FALLOC_FL_PUNCH_HOLE assures zeroes are returned, but Christoph> space is deallocated as much as possible - Christoph> FALLOC_FL_ZERO_RANGE assures zeroes are returned, AND blocks Christoph> are actually allocated That works for me. I

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Theodore Ts'o
On Thu, Mar 03, 2016 at 09:55:38AM -0800, Linus Torvalds wrote: > But that essentially says that we shouldn't expose this interface at > all (unless we trust our white-lists - I'm sure they are getting > better, but if nobody has ever really _relied_ on the zeroing behavior > of trim, then I guess

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Linus Torvalds
On Thu, Mar 3, 2016 at 10:01 AM, Martin K. Petersen wrote: >> "Linus" == Linus Torvalds writes: > > Linus> .. but the flag doesn't even set that. Even if you avoid TRIM, > Linus> there is absolutely zero guarantees that WRITE_SAME would do > Linus> "real storage blocks full of zeroes backing

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Martin K. Petersen
> "Linus" == Linus Torvalds writes: Linus> On Thu, Mar 3, 2016 at 9:02 AM, Theodore Ts'o wrote: >> >> There is a massive bug in the SATA specs about trim, which is that it >> is considered advisory. So the storage device can throw it away >> whenever it feels like it. (In practice, when i

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Darrick J. Wong
On Thu, Mar 03, 2016 at 10:09:24AM -0800, Christoph Hellwig wrote: > On Thu, Mar 03, 2016 at 01:01:11PM -0500, Martin K. Petersen wrote: > > That's not entirely true. Writing the blocks may cause them to be > > allocated on the storage device (depending on which flags we feed it in > > WRITE SAME).

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Christoph Hellwig
On Thu, Mar 03, 2016 at 01:01:11PM -0500, Martin K. Petersen wrote: > That's not entirely true. Writing the blocks may cause them to be > allocated on the storage device (depending on which flags we feed it in > WRITE SAME). > > The filesystem people wanted the following semantics: > > - d

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Martin K. Petersen
> "Linus" == Linus Torvalds writes: Linus> .. but the flag doesn't even set that. Even if you avoid TRIM, Linus> there is absolutely zero guarantees that WRITE_SAME would do Linus> "real storage blocks full of zeroes backing the LBAs they just Linus> wrote out". That's not entirely true. Wri

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Christoph Hellwig
On Thu, Mar 03, 2016 at 09:55:38AM -0800, Linus Torvalds wrote: > Ugh. > > But that essentially says that we shouldn't expose this interface at > all (unless we trust our white-lists - I'm sure they are getting > better, but if nobody has ever really _relied_ on the zeroing behavior > of trim, the

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Linus Torvalds
On Thu, Mar 3, 2016 at 9:02 AM, Theodore Ts'o wrote: > > There is a massive bug in the SATA specs about trim, which is that it > is considered advisory. So the storage device can throw it away > whenever it feels like it. (In practice, when it's too busy doing > other things). Ugh. But that es

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-03 Thread Theodore Ts'o
On Wed, Mar 02, 2016 at 03:49:53PM -0800, Linus Torvalds wrote: > > No. This is not about enabling use of "that idiotic discard behavior", for > > that there's BLKDISCARD. This ioctl does NOT use the handwavy old TRIM > > advisory request thing that could return "fuzzy wuzzy" without violating th

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-02 Thread Linus Torvalds
On Wed, Mar 2, 2016 at 2:56 PM, Darrick J. Wong wrote: > > Oh yes we do. Adding required-zero padding to allow for future increases of > the expressiveness of an ioctl is very common. > > $ egrep -rn '(reserved|padding).*;' include/uapi/ | wc -l > 564 Most of those should be for alignment reason

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-02 Thread Darrick J. Wong
On Wed, Mar 02, 2016 at 10:52:01AM -0800, Linus Torvalds wrote: > On Tue, Mar 1, 2016 at 8:09 PM, Darrick J. Wong > wrote: > > Create a new ioctl to expose the block layer's newfound ability to > > issue either a zeroing discard, a WRITE SAME with a zero page, or a > > regular write with the zero

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-02 Thread Linus Torvalds
On Tue, Mar 1, 2016 at 8:09 PM, Darrick J. Wong wrote: > Create a new ioctl to expose the block layer's newfound ability to > issue either a zeroing discard, a WRITE SAME with a zero page, or a > regular write with the zero page. This BLKZEROOUT2 ioctl takes > {start, length, flags} as parameters

Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-02 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

[PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

2016-03-01 Thread Darrick J. Wong
Create a new ioctl to expose the block layer's newfound ability to issue either a zeroing discard, a WRITE SAME with a zero page, or a regular write with the zero page. This BLKZEROOUT2 ioctl takes {start, length, flags} as parameters. So far, the only flag available is to enable the zeroing disc