Re: [PATCH v3 03/27] btrfs: Check and enable HMZONED mode

2019-08-16 Thread Damien Le Moal
On 2019/08/16 16:57, Anand Jain wrote: > > > On 8/16/19 10:23 PM, Damien Le Moal wrote: >> On 2019/08/15 22:46, Anand Jain wrote: >>> On 8/8/19 5:30 PM, Naohiro Aota wrote: HMZONED mode cannot be used together with the RAID5/6 profile for now. Introduce the function btrfs_check_hmzoned_

Re: [PATCH v3 03/27] btrfs: Check and enable HMZONED mode

2019-08-16 Thread Anand Jain
On 8/16/19 10:23 PM, Damien Le Moal wrote: On 2019/08/15 22:46, Anand Jain wrote: On 8/8/19 5:30 PM, Naohiro Aota wrote: HMZONED mode cannot be used together with the RAID5/6 profile for now. Introduce the function btrfs_check_hmzoned_mode() to check this. This function will also check if HM

Re: [PATCH v3 02/27] btrfs: Get zone information of zoned block devices

2019-08-16 Thread Damien Le Moal
On 2019/08/16 16:48, Anand Jain wrote: [...] >>> How many zones do we see in a disk? Not many I presume. >> >> A 15 TB SMR drive with 256 MB zones (which is a fairly common value for >> products >> out there) has over 55,000 zones. "Not many" is subjective... I personally >> consider 55000 a large
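The zone count quoted in this message can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the patch series: `zone_count` is a hypothetical helper, and it assumes the drive capacity is given in decimal terabytes (as vendors usually state it) while the zone size is 256 MiB.

```python
def zone_count(capacity_bytes: int, zone_size_bytes: int) -> int:
    """Number of full zones that fit in the given capacity (hypothetical helper)."""
    return capacity_bytes // zone_size_bytes


# Assumption: 15 TB means 15 * 10**12 bytes, and a "256 MB" zone is 256 MiB.
capacity = 15 * 10**12
zone_size = 256 * 2**20

zones = zone_count(capacity, zone_size)
print(zones)  # 55879
```

If the zone size is instead read as a decimal 256 MB, the count is higher still, so "over 55,000 zones" holds under either interpretation.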

Re: [PATCH v3 02/27] btrfs: Get zone information of zoned block devices

2019-08-16 Thread Anand Jain
On 8/16/19 10:19 PM, Damien Le Moal wrote: On 2019/08/15 21:47, Anand Jain wrote: On 8/8/19 5:30 PM, Naohiro Aota wrote: If a zoned block device is found, get its zone information (number of zones and zone size) using the new helper function btrfs_get_dev_zonetypes(). To avoid costly run-ti

Re: [RFC PATCH 4/5] fs: export rw_verify_area()

2019-08-16 Thread Josef Bacik
On Thu, Aug 15, 2019 at 02:04:05PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > I'm adding a Btrfs ioctl to write compressed data, and rather than > duplicating the checks in rw_verify_area(), let's just export it. > > Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik Thanks, J

Re: [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree

2019-08-16 Thread Josef Bacik
On Thu, Aug 15, 2019 at 02:04:04PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > Since commit fee187d9d9dd ("Btrfs: do not set EXTENT_DIRTY along with > EXTENT_DELALLOC"), we never set EXTENT_DIRTY in inode->io_tree, so we > can simplify and stop trying to clear it. > > Signed-off-by: Om

Re: [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs

2019-08-16 Thread Josef Bacik
On Thu, Aug 15, 2019 at 02:04:03PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > In btrfs_file_write_iter(), we treat a write as synchronous if the > file is marked as synchronous. However, with pwritev2(), a write with > RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't

Re: [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter()

2019-08-16 Thread Josef Bacik
On Thu, Aug 15, 2019 at 02:04:02PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > generic_write_checks() may modify iov_iter_count(), so we must get the > count after the call, not before. Using the wrong one has a couple of > consequences: > > 1. We check a longer range in check_can_noco

Re: [PATCH 3/3] btrfs: global reserve fallback should use metadata_size

2019-08-16 Thread Josef Bacik
On Fri, Aug 16, 2019 at 04:35:42PM +0100, Filipe Manana wrote: > On Fri, Aug 16, 2019 at 4:08 PM Josef Bacik wrote: > > > > We only use the global reserve fallback for truncates, so use > > For truncates? > I would say only for unlinks, rmdir and removing empty block groups. > Or did some of your

Re: [PATCH 3/3] btrfs: global reserve fallback should use metadata_size

2019-08-16 Thread Filipe Manana
On Fri, Aug 16, 2019 at 4:08 PM Josef Bacik wrote: > > We only use the global reserve fallback for truncates, so use For truncates? I would say only for unlinks, rmdir and removing empty block groups. Or did some of your previous patches change that, and I missed it, and now only truncates use i

[PATCH 4/5] btrfs: do not account global reserve in can_overcommit

2019-08-16 Thread Josef Bacik
We ran into a problem in production where a box with plenty of space was getting wedged doing ENOSPC flushing. These boxes only had 20% of the disk allocated, but their metadata space + global reserve was right at the size of their metadata chunk. In this case can_overcommit should be allowing al

[PATCH 1/5] btrfs: change the minimum global reserve size

2019-08-16 Thread Josef Bacik
It made sense to have the global reserve set at 16M in the past, but since it is used less nowadays set the minimum size to the number of items we'll need to update the main trees we update during a transaction commit, plus some slop area so we can do unlinks if we need to. In practice this doesn'

[PATCH 3/5] btrfs: use add_old_bytes when updating global reserve

2019-08-16 Thread Josef Bacik
We have some annoying xfstests tests that will create a very small fs, fill it up, delete it, and repeat to make sure everything works right. This trips btrfs up sometimes because we may commit a transaction to free space, but most of the free metadata space was being reserved by the global reserve

[PATCH 0/5] Fix global reserve size and can overcommit

2019-08-16 Thread Josef Bacik
We hit a pretty crappy corner case in production that resulted in boxes slowing down to a crawl. can_overcommit() will not allow us to overcommit if there is not enough "real" space to satisfy the global reserve. This is for hysterical raisins, we used to not be able to allocate block groups a tr

[PATCH 5/5] btrfs: add enospc debug messages for ticket failure

2019-08-16 Thread Josef Bacik
When debugging weird enospc problems it's handy to be able to dump the space info when we wake up all tickets, and see what the ticket values are. This helped me figure out cases where we were enospc'ing when we shouldn't have been. Signed-off-by: Josef Bacik --- fs/btrfs/space-info.c | 32

[PATCH 2/5] btrfs: always reserve our entire size for the global reserve

2019-08-16 Thread Josef Bacik
While messing with the overcommit logic I noticed that sometimes we'd ENOSPC out when really we should have run out of space much earlier. It turns out it's because we'll only reserve up to the free amount left in the space info for the global reserve, but that doesn't make sense with overcommit b

[PATCH 1/3] btrfs: rename the btrfs_calc_*_metadata_size helpers

2019-08-16 Thread Josef Bacik
btrfs_calc_trunc_metadata_size differs from trans_metadata_size in that it doesn't take into account any splitting at the levels, because truncate will never split nodes. However truncate _and_ changing will never split nodes, so rename btrfs_calc_trunc_metadata_size to btrfs_calc_metadata_size.

[PATCH 2/3] btrfs: only reserve metadata_size for inodes

2019-08-16 Thread Josef Bacik
Historically we reserved worst case for every btree operation, and generally speaking we want to do that in cases where it could be the worst case. However for updating inodes we know the inode items are already in the tree, so it will only be an update operation and never an insert operation. Th

[PATCH 3/3] btrfs: global reserve fallback should use metadata_size

2019-08-16 Thread Josef Bacik
We only use the global reserve fallback for truncates, so use calc_metadata_size instead of calc_insert_metadata_size. Signed-off-by: Josef Bacik --- fs/btrfs/transaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index f

[PATCH 0/3] Rework the worst case calculations for space reservation

2019-08-16 Thread Josef Bacik
We have two worst case calculations for space reservation, one that takes into account splitting at every level when cow'ing down the btree, and another that doesn't account for splitting at all. The first is used everywhere, and the second is used mostly for truncate. However we also do not spli

Re: [Bug 204371] BUG kmalloc-4k (Tainted: G W ): Object padding overwritten

2019-08-16 Thread Christophe Leroy
On 16/08/2019 at 16:38, bugzilla-dae...@bugzilla.kernel.org wrote: https://bugzilla.kernel.org/show_bug.cgi?id=204371 --- Comment #34 from Erhard F. (erhar...@mailbox.org) --- On Fri, 16 Aug 2019 08:22:31 + bugzilla-dae...@bugzilla.kernel.org wrote: https://bugzilla.kernel.org/show_bu

Re: [PATCH v3 03/27] btrfs: Check and enable HMZONED mode

2019-08-16 Thread Damien Le Moal
On 2019/08/15 22:46, Anand Jain wrote: > On 8/8/19 5:30 PM, Naohiro Aota wrote: >> HMZONED mode cannot be used together with the RAID5/6 profile for now. >> Introduce the function btrfs_check_hmzoned_mode() to check this. This >> function will also check if HMZONED flag is enabled on the file syste

[PATCH 8/8] btrfs: remove orig_bytes from reserve_ticket

2019-08-16 Thread Josef Bacik
Now that we do not do partial filling of tickets simply remove orig_bytes, it is no longer needed. Signed-off-by: Josef Bacik --- fs/btrfs/space-info.c | 15 --- fs/btrfs/space-info.h | 1 - 2 files changed, 16 deletions(-) diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c

[PATCH 7/8] btrfs: fix may_commit_transaction to deal with no partial filling

2019-08-16 Thread Josef Bacik
Now that we aren't partially filling tickets we may have some slack space left in the space_info. We need to account for this in may_commit_transaction, otherwise we may choose to not commit the transaction despite it actually having enough space to satisfy our ticket. Calculate the free space we

[PATCH 4/8] btrfs: rework btrfs_space_info_add_old_bytes

2019-08-16 Thread Josef Bacik
If there are pending tickets and we are overcommitted we will simply return free'd reservations to space_info->bytes_may_use if we cannot overcommit any more. This is problematic because we assume any free space would have been added to the tickets, and so if we go from an overcommitted state to n

[PATCH 6/8] btrfs: rework wake_all_tickets

2019-08-16 Thread Josef Bacik
Now that we no longer partially fill tickets we need to rework wake_all_tickets to call btrfs_try_to_wakeup_tickets() in order to see if any subsequent tickets are able to be satisfied. If our tickets_id changes we know something happened and we can keep flushing. Also if we find a ticket that is

[PATCH 5/8] btrfs: refactor the ticket wakeup code

2019-08-16 Thread Josef Bacik
Now that btrfs_space_info_add_old_bytes simply checks if we can make the reservation and updates bytes_may_use, there's no reason to have both helpers in place. Factor out the ticket wakeup logic into its own helper, make btrfs_space_info_add_old_bytes() update bytes_may_use and then call the wak

[PATCH 0/8][v2] Rework reserve ticket handling

2019-08-16 Thread Josef Bacik
Hello, Just some minor tweaks that needed to be added to fix issues introduced by the next series of enospc fixes. v1->v2: - added "btrfs: fix may_commit_transaction to deal with no partial filling" - fixed "btrfs: refactor the ticket wakeup code" to return true if we find a smaller ticket than

[PATCH 1/8] btrfs: do not allow reservations if we have pending tickets

2019-08-16 Thread Josef Bacik
If we already have tickets on the list we don't want to steal their reservations. This is a preparation patch for upcoming changes, technically this shouldn't happen today because of the way we add bytes to tickets before adding them to the space_info in most cases. Signed-off-by: Josef Bacik --

[PATCH 3/8] btrfs: add space reservation tracepoint for reserved bytes

2019-08-16 Thread Josef Bacik
I noticed when folding the trace_btrfs_space_reservation() tracepoint into the btrfs_space_info_update_* helpers that we didn't emit a tracepoint when doing btrfs_add_reserved_bytes(). I know this is because we were swapping bytes_may_use for bytes_reserved, so in my mind there was no reason to ha

[PATCH 2/8] btrfs: roll tracepoint into btrfs_space_info_update helper

2019-08-16 Thread Josef Bacik
We duplicate this tracepoint everywhere we call these helpers, so update the helper to have the tracepoint as well. Signed-off-by: Josef Bacik --- fs/btrfs/block-group.c| 3 --- fs/btrfs/block-rsv.c | 5 - fs/btrfs/delalloc-space.c | 4 fs/btrfs/extent-tree.c| 9 ---

Re: [PATCH v3 02/27] btrfs: Get zone information of zoned block devices

2019-08-16 Thread Damien Le Moal
On 2019/08/15 21:47, Anand Jain wrote: > On 8/8/19 5:30 PM, Naohiro Aota wrote: >> If a zoned block device is found, get its zone information (number of zones >> and zone size) using the new helper function btrfs_get_dev_zonetypes(). To >> avoid costly run-time zone report commands to test the dev

Re: [PATCH] fstests: btrfs: Check snapshot creation and deletion with dm-logwrites

2019-08-16 Thread Eryu Guan
On Fri, Aug 16, 2019 at 05:47:33PM +0800, Qu Wenruo wrote: [...] > >> +$KILLALL_PROG -q $FSSTRESS_PROG &> /dev/null > > > > You're very inconsistent within the same test :) Using both "> > > /dev/null 2>&1" and "&> /dev/null". > > My bad, I mean 2>&1 > /dev/null. > What I mean is output stderr wh
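The redirection forms discussed in this message behave differently because the shell applies redirections left to right. The sketch below is an illustration only, not part of the test under review: `captured_stdout` is a hypothetical helper that runs a small command through POSIX `sh` and returns whatever still reaches the original standard output.

```python
import subprocess


def captured_stdout(redirect: str) -> str:
    """Run a command that writes "out" to stdout and "err" to stderr with the
    given redirections applied, and return what reaches the original stdout."""
    cmd = f"{{ echo out; echo err >&2; }} {redirect}"
    result = subprocess.run(
        ["sh", "-c", cmd],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return result.stdout


# stdout goes to /dev/null first, then stderr follows it: nothing is visible.
silent = captured_stdout("> /dev/null 2>&1")

# stderr is pointed at the original stdout first, then stdout is discarded:
# only the error text is visible.
errors_only = captured_stdout("2>&1 > /dev/null")

print(repr(silent), repr(errors_only))
```

The `&> /dev/null` form is a bash shorthand equivalent to `> /dev/null 2>&1`; it is left out here because plain POSIX `sh` does not interpret it the same way.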

Re: [PATCH v2] btrfs: transaction: Commit transaction more frequently for BPF

2019-08-16 Thread Qu Wenruo
On 2019/8/16 6:03 PM, Filipe Manana wrote: > On Fri, Aug 16, 2019 at 10:53 AM Qu Wenruo wrote: >> >> >> >> On 2019/8/16 5:33 PM, Filipe Manana wrote: >>> On Thu, Aug 15, 2019 at 9:36 AM Qu Wenruo wrote: Btrfs has btrfs_end_transaction_throttle() which could try to commit transactio

Re: [PATCH v2] btrfs: transaction: Commit transaction more frequently for BPF

2019-08-16 Thread Filipe Manana
On Fri, Aug 16, 2019 at 10:53 AM Qu Wenruo wrote: > > > > On 2019/8/16 5:33 PM, Filipe Manana wrote: > > On Thu, Aug 15, 2019 at 9:36 AM Qu Wenruo wrote: > >> > >> Btrfs has btrfs_end_transaction_throttle() which could try to commit > >> transaction when needed. > >> > >> However under most cases

Re: [PATCH v2] btrfs: transaction: Commit transaction more frequently for BPF

2019-08-16 Thread Qu Wenruo
On 2019/8/16 5:33 PM, Filipe Manana wrote: > On Thu, Aug 15, 2019 at 9:36 AM Qu Wenruo wrote: >> >> Btrfs has btrfs_end_transaction_throttle() which could try to commit >> transaction when needed. >> >> However under most cases btrfs_end_transaction_throttle() won't really >> commit transaction,

Re: [PATCH] fstests: btrfs: Check snapshot creation and deletion with dm-logwrites

2019-08-16 Thread Qu Wenruo
On 2019/8/16 5:25 PM, Filipe Manana wrote: [...] > > That will also make the test fail on systems with a page size > 4Kb. > So either make it "_notrun" for systems with a page size != 4Kb or, > preferably make the test independent of the page size. > If you want to increase the tree height easily

Re: [PATCH v2] btrfs: transaction: Commit transaction more frequently for BPF

2019-08-16 Thread Filipe Manana
On Thu, Aug 15, 2019 at 9:36 AM Qu Wenruo wrote: > > Btrfs has btrfs_end_transaction_throttle() which could try to commit > transaction when needed. > > However under most cases btrfs_end_transaction_throttle() won't really > commit transaction, due to the hard timing requirement. > > Now introduc

Re: [PATCH] fstests: btrfs: Check snapshot creation and deletion with dm-logwrites

2019-08-16 Thread Filipe Manana
On Wed, Aug 14, 2019 at 11:56 AM Qu Wenruo wrote: > > We have generic dm-logwrites with fsstress test case (generic/482), but > it doesn't cover fs specific operations like btrfs snapshot creation and > deletion. > > Furthermore, that test is not heavy enough to bump btrfs tree height by > its sho