Re: btrfs becomes read-only

2021-01-26 Thread Chris Murphy
On Wed, Jan 27, 2021 at 12:22 AM Alexey Isaev wrote: > > Hello! > > BTRFS volume becomes read-only with this messages in dmesg. > What can i do to repair btrfs partition? > > [Jan25 08:18] BTRFS error (device sdg): parent transid verify failed on > 52180048330752 wanted 132477 found 132432 > [ +0

Re: [PATCH 13/17] md: remove md_bio_alloc_sync

2021-01-26 Thread Song Liu
On Tue, Jan 26, 2021 at 7:17 AM Christoph Hellwig wrote: > > md_bio_alloc_sync is never called with a NULL mddev, and ->sync_set is > initialized in md_run, so it always must be initialized as well. Just > open code the remaining call to bio_alloc_bioset. > > Signed-off-by: Christoph Hellwig Ac

Re: [PATCH 12/17] md: simplify sync_page_io

2021-01-26 Thread Song Liu
On Tue, Jan 26, 2021 at 7:14 AM Christoph Hellwig wrote: > > Use an on-stack bio and biovec for the single page synchronous I/O. > > Signed-off-by: Christoph Hellwig Acked-by: Song Liu > --- > drivers/md/md.c | 26 +- > 1 file changed, 13 insertions(+), 13 deletions(-)

Re: [PATCH 11/17] md: remove bio_alloc_mddev

2021-01-26 Thread Song Liu
On Tue, Jan 26, 2021 at 7:12 AM Christoph Hellwig wrote: > > bio_alloc_mddev is never called with a NULL mddev, and ->bio_set is > initialized in md_run, so it always must be initialized as well. Just > open code the remaining call to bio_alloc_bioset. > > Signed-off-by: Christoph Hellwig Acked

btrfs becomes read-only

2021-01-26 Thread Alexey Isaev
Hello! BTRFS volume becomes read-only with this messages in dmesg. What can i do to repair btrfs partition? [Jan25 08:18] BTRFS error (device sdg): parent transid verify failed on 52180048330752 wanted 132477 found 132432 [  +0.007587] BTRFS error (device sdg): parent transid verify failed on
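For anyone triaging a log like the one quoted above, a minimal sketch (Python, userspace only) of pulling the fields out of a "parent transid verify failed" line. The regular expression and the dictionary keys are illustrative assumptions based solely on the message format shown in this post; the sketch extracts numbers and does not repair anything.

```python
import re

# Pattern modeled on the single dmesg line quoted in the post above.
# The group names (device, bytenr, wanted, found) are illustrative choices.
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<device>\w+)\): parent transid verify failed on "
    r"(?P<bytenr>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_error(line):
    """Return the fields of a 'parent transid verify failed' line, or None."""
    m = TRANSID_RE.search(line)
    if m is None:
        return None
    return {
        "device": m.group("device"),
        "bytenr": int(m.group("bytenr")),
        "wanted": int(m.group("wanted")),
        "found": int(m.group("found")),
    }


line = ("[Jan25 08:18] BTRFS error (device sdg): parent transid verify failed on "
        "52180048330752 wanted 132477 found 132432")
info = parse_transid_error(line)
# Difference between the expected and the found transaction id.
print(info["wanted"] - info["found"])  # prints 45
```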

Re: [PATCH 14/17] md/raid6: refactor raid5_read_one_chunk

2021-01-26 Thread Song Liu
On Tue, Jan 26, 2021 at 7:19 AM Christoph Hellwig wrote: > > Refactor raid5_read_one_chunk so that all simple checks are done > before allocating the bio. > > Signed-off-by: Christoph Hellwig Acked-by: Song Liu Thanks for the clean-up! > --- > drivers/md/raid5.c | 108 +++---

[PATCH] btrfs: fix a bug that btrfs_invalidapge() can double account ordered extent for subpage

2021-01-26 Thread Qu Wenruo
Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page") made btrfs_invalidatepage() search all ordered extents. The offending code looks like this: again: start = page_start; ordered = btrfs_lookup_ordered_range(inode, start, page_end - start +

Re: [dm-devel] [PATCH 01/17] zonefs: use bio_alloc in zonefs_file_dio_append

2021-01-26 Thread Damien Le Moal
On 2021/01/26 23:58, Christoph Hellwig wrote: > Use bio_alloc instead of open coding it. > > Signed-off-by: Christoph Hellwig > --- > fs/zonefs/super.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c > index bec47f2d074beb..faea2ed3

[PATCH v4 12/12] btrfs: add a trace class for dumping the current ENOSPC state

2021-01-26 Thread Josef Bacik
Often when I'm debugging ENOSPC related issues I have to resort to printing the entire ENOSPC state with trace_printk() in different spots. This gets pretty annoying, so add a trace state that does this for us. Then add a trace point at the end of preemptive flushing so you can see the state of the

[PATCH v4 10/12] btrfs: implement space clamping for preemptive flushing

2021-01-26 Thread Josef Bacik
Starting preemptive flushing at 50% of available free space is a good start, but some workloads are particularly abusive and can quickly overwhelm the preemptive flushing code and drive us into using tickets. Handle this by clamping down on our threshold for starting and continuing to run preempti

[PATCH v4 06/12] btrfs: rename need_do_async_reclaim

2021-01-26 Thread Josef Bacik
All of our normal flushing is asynchronous reclaim, so this helper is poorly named. This is more checking if we need to preemptively flush space, so rename it to need_preemptive_reclaim. Reviewed-by: Nikolay Borisov Signed-off-by: Josef Bacik --- fs/btrfs/space-info.c | 10 +- 1 file c

[PATCH v4 07/12] btrfs: check reclaim_size in need_preemptive_reclaim

2021-01-26 Thread Josef Bacik
If we're flushing space for tickets then we have space_info->reclaim_size set and we do not need to do background reclaim. Reviewed-by: Nikolay Borisov Signed-off-by: Josef Bacik --- fs/btrfs/space-info.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/fs/btrfs/space-info.c b/fs/btrf

[PATCH v4 05/12] btrfs: improve preemptive background space flushing

2021-01-26 Thread Josef Bacik
Currently if we ever have to flush space because we do not have enough we allocate a ticket and attach it to the space_info, and then systematically flush things in the file system that hold space reservations until our space is reclaimed. However this has a latency cost, we must go to sleep and w

[PATCH v4 03/12] btrfs: track ordered bytes instead of just dio ordered bytes

2021-01-26 Thread Josef Bacik
We track dio_bytes because the shrink delalloc code needs to know if we have more DIO in flight than we have normal buffered IO. The reason for this is because we can't "flush" DIO, we have to just wait on the ordered extents to finish. However this is true of all ordered extents. If we have mor

Re: [PATCH v3 04/12] btrfs: introduce a FORCE_COMMIT_TRANS flush operation

2021-01-26 Thread David Sterba
On Thu, Oct 29, 2020 at 06:03:30PM +0100, David Sterba wrote: > On Fri, Oct 09, 2020 at 09:28:21AM -0400, Josef Bacik wrote: > > Sole-y for preemptive flushing, we want to be able to force the > > transaction commit without any of the ambiguity of > > may_commit_transaction(). This is because may_

Re: [PATCH v3 01/12] btrfs: make flush_space take a enum btrfs_flush_state instead of int

2021-01-26 Thread David Sterba
On Fri, Oct 09, 2020 at 09:28:18AM -0400, Josef Bacik wrote: > I got a automated message from somebody who runs clang against our > kernels and it's because I used the wrong enum type for what I passed > into flush_space. Change the argument to be explicitly the enum we're > expecting to make ever

Re: [PATCH v2 0/5] Serious fixes for different error paths

2021-01-26 Thread David Sterba
On Thu, Jan 14, 2021 at 02:02:41PM -0500, Josef Bacik wrote: > v1->v2: > - Rebased onto misc-next, dropping everything that's been merged so far. > - Fixed "btrfs: splice remaining dirty_bg's onto the transaction dirty bg > list" > to handle the btrfs_alloc_path() failure and cleaned up the erro

Re: "bad tree block start" when trying to mount on ARM

2021-01-26 Thread Erik Jensen
On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen wrote: > > On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo wrote: > > On 2021/1/20 下午4:21, Qu Wenruo wrote: > > > On 2021/1/19 下午5:28, Erik Jensen wrote: > > >> On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen > > >> wrote: > > >>> > > >>> On Mon, Jan 18, 2021 a

Re: [PATCH v13 22/42] btrfs: split ordered extent when bio is sent

2021-01-26 Thread Johannes Thumshirn
On 22/01/2021 16:24, Josef Bacik wrote: >> +em_new = create_io_em(inode, em->start + pre, len, >> + em->start + pre, em->block_start + pre, len, >> + len, len, BTRFS_COMPRESS_NONE, >> + BTRFS_ORDERED_REGULAR); > This bit

[PATCH 1/2] btrfs-progs: tests: Extend cli/003

2021-01-26 Thread Nikolay Borisov
Add a test which ensures that when resize is tried on an image instead of a directory appropriate warning is produced and the command fails. Signed-off-by: Nikolay Borisov --- tests/cli-tests/003-fi-resize-args/test.sh | 7 +++ 1 file changed, 7 insertions(+) diff --git a/tests/cli-tests/00

Re: [PATCH] btrfs: rework the order of btrfs_ordered_extent::flags

2021-01-26 Thread Filipe Manana
On Thu, Jan 21, 2021 at 4:52 PM David Sterba wrote: > > On Thu, Jan 21, 2021 at 02:13:54PM +0800, Qu Wenruo wrote: > > [BUG] > > There is a long existing bug in the last parameter of > > btrfs_add_ordered_extent(), in commit 771ed689d2cd ("Btrfs: Optimize > > compressed writeback and reads") back

Re: [PATCH v13 13/42] btrfs: track unusable bytes for zones

2021-01-26 Thread Johannes Thumshirn
On 25/01/2021 11:37, Johannes Thumshirn wrote: >>> + if (btrfs_is_zoned(fs_info)) >>> + return NULL; >>> + >> This is unrelated to the rest of the changes, seems like something that was >> just >> missed? Should probably be in its own patch. > Hmm probably belongs to another patch, j

Re: [PATCH] btrfs: Remove unused variable

2021-01-26 Thread Johannes Thumshirn
On 24/01/2021 17:05, Nikolay Borisov wrote: > This fixes fs/btrfs/zoned.c:491:6: warning: variable ‘zone_size’ set but not > used [-Wunused-but-set-variable] > 491 | u64 zone_size; > > Which got introduced in 12659251ca5d ("btrfs: implement log-structured > superblock for ZONED mode") > > Si

Re: [RFC][PATCH V5] btrfs: preferred_metadata: preferred device for metadata

2021-01-26 Thread Josef Bacik
On 1/17/21 1:54 PM, Goffredo Baroncelli wrote: Hi all, This is an RFC; I wrote this patch because I find the idea interesting even though it adds more complication to the chunk allocator. The basic idea is to store the metadata chunks on the faster disks. The faster disks are marked by the "pr

[PATCH v14 01/42] block: add bio_add_zone_append_page

2021-01-26 Thread Naohiro Aota
From: Johannes Thumshirn Add bio_add_zone_append_page(), a wrapper around bio_add_hw_page() which is intended to be used by file systems that directly add pages to a bio instead of using bio_iov_iter_get_pages(). Cc: Jens Axboe Reviewed-by: Christoph Hellwig Reviewed-by: Josef Bacik Reviewed-

[PATCH v14 02/42] iomap: support REQ_OP_ZONE_APPEND

2021-01-26 Thread Naohiro Aota
A ZONE_APPEND bio must follow hardware restrictions (e.g. not exceeding max_zone_append_sectors) not to be split. bio_iov_iter_get_pages builds such restricted bio using __bio_iov_append_get_pages if bio_op(bio) == REQ_OP_ZONE_APPEND. To utilize it, we need to set the bio_op before calling bio_iov

[PATCH v14 04/42] btrfs: use regular SB location on emulated zoned mode

2021-01-26 Thread Naohiro Aota
The zoned btrfs puts a superblock at the beginning of SB logging zones if the zone is conventional. This difference causes a chicken-and-egg problem for emulated zoned mode. Since the device is a regular (non-zoned) device, we cannot know if the btrfs is regular or emulated zoned while we read the

[PATCH v14 03/42] btrfs: defer loading zone info after opening trees

2021-01-26 Thread Naohiro Aota
This is a preparation patch to implement zone emulation on a regular device. To emulate zoned mode on a regular (non-zoned) device, we need to decide on an emulated zone size. Instead of making it a compile-time static value, we'll make it configurable at mkfs time. Since we have one zone == one device

[PATCH v14 00/42] btrfs: zoned block device support

2021-01-26 Thread Naohiro Aota
This series adds zoned block device support to btrfs. Some of the patches in the previous series are already merged as preparation patches. This series is also available on github. Kernel https://github.com/naota/linux/tree/btrfs-zoned-v14 Userland https://github.com/naota/btrfs-progs/tree/btrfs

[PATCH v14 07/42] btrfs: disallow fitrim in ZONED mode

2021-01-26 Thread Naohiro Aota
The implementation of fitrim depends on the space cache, which is not used and is disabled for zoned btrfs' extent allocator. So the current code does not work with zoned btrfs. In the future, we can implement fitrim for zoned btrfs by enabling space cache (but, only for fitrim) or scanning the exten

[PATCH v14 06/42] btrfs: do not load fs_info->zoned from incompat flag

2021-01-26 Thread Naohiro Aota
From: Johannes Thumshirn Don't set the zoned flag in fs_info when encountering the BTRFS_FEATURE_INCOMPAT_ZONED on mount. The zoned flag in fs_info is in a union together with the zone_size, so setting it too early will result in setting an incorrect zone_size as well. Once the correct zone_size

[PATCH v14 08/42] btrfs: allow zoned mode on non-zoned block devices

2021-01-26 Thread Naohiro Aota
From: Johannes Thumshirn Run zoned btrfs mode on non-zoned devices. This is done by "slicing up" the block-device into static sized chunks and fake a conventional zone on each of them. The emulated zone size is determined from the size of device extent. This is mainly aimed at testing parts of t

[PATCH v14 14/42] btrfs: do sequential extent allocation in ZONED mode

2021-01-26 Thread Naohiro Aota
This commit implements a sequential extent allocator for the ZONED mode. This allocator just needs to check if there is enough space in the block group. Therefore the allocator never manages bitmaps or clusters. Also add ASSERTs to the corresponding functions. Actually, with zone append writing, it
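A minimal userspace sketch (Python, not the kernel code) of the idea in the snippet above: a sequential allocator only checks that the request fits in the remaining space of the block group and then advances a single offset. The class and method names are hypothetical.

```python
class BlockGroup:
    """Toy model of a zoned block group: one bump pointer, no bitmaps or clusters."""

    def __init__(self, start, length):
        self.start = start        # logical start address in bytes
        self.length = length      # size of the block group in bytes
        self.alloc_offset = 0     # next free byte, relative to start

    def allocate(self, num_bytes):
        """Return the logical address of a new extent, or None if it does not fit."""
        if self.length - self.alloc_offset < num_bytes:
            return None
        addr = self.start + self.alloc_offset
        self.alloc_offset += num_bytes
        return addr


bg = BlockGroup(start=1 << 30, length=1 << 20)
first = bg.allocate(4096)
second = bg.allocate(8192)
# Extents come out strictly back to back.
print(second - first)  # prints 4096
```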

[PATCH v14 11/42] btrfs: load zone's allocation offset

2021-01-26 Thread Naohiro Aota
Zoned btrfs must allocate blocks at the zones' write pointer. The device's write pointer position can be mapped to a logical address within a block group. This commit adds "alloc_offset" to track the logical address. This logical address is populated in btrfs_load_block_group_zone_info() from writ
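The mapping described above can be sketched in a few lines of Python. This is a simplified model that assumes a block group backed by exactly one zone on one device; the function names are hypothetical and all values are byte addresses.

```python
def alloc_offset_from_write_pointer(zone_start, write_pointer):
    """Offset already written in the zone, i.e. where the next allocation must go."""
    if write_pointer < zone_start:
        raise ValueError("write pointer lies before the zone start")
    return write_pointer - zone_start


def logical_from_alloc_offset(block_group_start, alloc_offset):
    """Logical address that corresponds to the device write pointer."""
    return block_group_start + alloc_offset


zone_start = 256 * 1024 * 1024            # physical start of the zone
write_pointer = zone_start + 64 * 1024    # device reports 64 KiB written
offset = alloc_offset_from_write_pointer(zone_start, write_pointer)
print(offset)  # prints 65536
```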

[PATCH v14 09/42] btrfs: implement zoned chunk allocator

2021-01-26 Thread Naohiro Aota
This commit implements a zoned chunk/dev_extent allocator. The zoned allocator aligns the device extents to zone boundaries, so that a zone reset affects only the device extent and does not change the state of blocks in the neighbor device extents. Also, it checks that a region allocation is not o
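As a rough illustration of the alignment rule in the snippet above (a Python sketch under my own assumptions, not the allocator from the patch): both the start and the length of a device extent are rounded up to the zone size, so the extent always covers whole zones and a zone reset cannot touch a neighbor. The helper names are hypothetical.

```python
def align_up(value, zone_size):
    """Round value up to the next multiple of zone_size."""
    return (value + zone_size - 1) // zone_size * zone_size


def place_dev_extent(hole_start, hole_len, num_bytes, zone_size):
    """Return (start, length) of a zone-aligned extent inside the hole, or None."""
    start = align_up(hole_start, zone_size)
    length = align_up(num_bytes, zone_size)
    if start + length > hole_start + hole_len:
        return None
    return start, length


MiB = 1024 * 1024
# A 300 MiB request in a hole starting at 100 MiB, with 256 MiB zones.
print(place_dev_extent(100 * MiB, 1024 * MiB, 300 * MiB, 256 * MiB))
# prints (268435456, 536870912)
```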

[PATCH v14 13/42] btrfs: track unusable bytes for zones

2021-01-26 Thread Naohiro Aota
In zoned btrfs a region that was once written then freed is not usable until we reset the underlying zone. So we need to distinguish such unusable space from usable free space. Therefore we need to introduce the "zone_unusable" field to the block group structure, and "bytes_zone_unusable" to the
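The accounting idea above can be modeled as follows. This is a toy Python sketch, assuming one block group per zone; only the name zone_unusable comes from the patch description, everything else is hypothetical.

```python
class ZonedSpace:
    """Toy accounting: freed bytes become unusable until the zone is reset."""

    def __init__(self, length):
        self.length = length
        self.alloc_offset = 0    # bytes handed out so far (write pointer offset)
        self.used = 0            # bytes still referenced
        self.zone_unusable = 0   # bytes written once, then freed

    def free_space(self):
        return self.length - self.alloc_offset

    def allocate(self, num_bytes):
        if self.free_space() < num_bytes:
            raise ValueError("not enough sequential space")
        self.alloc_offset += num_bytes
        self.used += num_bytes

    def free(self, num_bytes):
        # Freed space cannot be rewritten in place, so it is not free space.
        self.used -= num_bytes
        self.zone_unusable += num_bytes

    def reset_zone(self):
        if self.used:
            raise ValueError("zone still holds live data")
        self.alloc_offset = 0
        self.zone_unusable = 0


space = ZonedSpace(length=1024)
space.allocate(512)
space.free(512)
print(space.free_space(), space.zone_unusable)  # prints 512 512
```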

[PATCH v14 12/42] btrfs: calculate allocation offset for conventional zones

2021-01-26 Thread Naohiro Aota
Conventional zones do not have a write pointer, so we cannot use it to determine the allocation offset if a block group contains a conventional zone. But instead, we can consider the end of the last allocated extent in the block group as an allocation offset. For new block group, we cannot calcul

[PATCH v14 15/42] btrfs: redirty released extent buffers in ZONED mode

2021-01-26 Thread Naohiro Aota
Tree manipulating operations like merging nodes often release once-allocated tree nodes. Btrfs cleans such nodes so that pages in the node are not uselessly written out. On ZONED volumes, however, such optimization blocks the following IOs as the cancellation of the write out of the freed blocks br

[PATCH v14 17/42] btrfs: enable to mount ZONED incompat flag

2021-01-26 Thread Naohiro Aota
This final patch adds the ZONED incompat flag to BTRFS_FEATURE_INCOMPAT_SUPP and enables btrfs to mount ZONED flagged file system. Signed-off-by: Naohiro Aota Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/f

[PATCH v14 20/42] btrfs: use bio_add_zone_append_page for zoned btrfs

2021-01-26 Thread Naohiro Aota
A zoned device has its own hardware restrictions, e.g. max_zone_append_size when using REQ_OP_ZONE_APPEND. To follow the restrictions, use bio_add_zone_append_page() instead of bio_add_page(). We need the target device to use bio_add_zone_append_page(), so this commit reads the chunk information to memoiz

[PATCH v14 18/42] btrfs: reset zones of unused block groups

2021-01-26 Thread Naohiro Aota
For a ZONED volume, a block group maps to a zone of the device. For deleted unused block groups, the zone of the block group can be reset to rewind the zone write pointer to the start of the zone. Reviewed-by: Josef Bacik Signed-off-by: Naohiro Aota --- fs/btrfs/block-group.c | 8 ++-- fs

[PATCH v14 19/42] btrfs: extract page adding function

2021-01-26 Thread Naohiro Aota
This commit extracts the part that adds a page to a bio from submit_extent_page(). The page is added only when bio_flags are the same, contiguous and the added page fits in the same stripe as pages in the bio. Condition checks are reordered to allow an early return and avoid the possibly heavy btrfs_bio_fits_in_st

[PATCH v14 21/42] btrfs: handle REQ_OP_ZONE_APPEND as writing

2021-01-26 Thread Naohiro Aota
ZONED btrfs uses REQ_OP_ZONE_APPEND bios for writing to actual devices. Let btrfs_end_bio() and btrfs_op be aware of it. Reviewed-by: Josef Bacik Signed-off-by: Naohiro Aota --- fs/btrfs/disk-io.c | 4 ++-- fs/btrfs/inode.c | 10 +- fs/btrfs/volumes.c | 8 fs/btrfs/volumes.

[PATCH v14 24/42] btrfs: extend btrfs_rmap_block for specifying a device

2021-01-26 Thread Naohiro Aota
btrfs_rmap_block currently reverse-maps the physical addresses on all devices to the corresponding logical addresses. This commit extends the function to match to a specified device. The old functionality of querying all devices is left intact by specifying NULL as target device. We pass block_de

[PATCH v14 26/42] btrfs: save irq flags when looking up an ordered extent

2021-01-26 Thread Naohiro Aota
From: Johannes Thumshirn A following patch will add another caller of btrfs_lookup_ordered_extent() from a bio endio context. btrfs_lookup_ordered_extent() uses spin_lock_irq() which unconditionally disables interrupts. Change this to spin_lock_irqsave() so interrupts aren't disabled and re-enab

[PATCH v14 29/42] btrfs: introduce dedicated data write path for ZONED mode

2021-01-26 Thread Naohiro Aota
If more than one IO is issued for one file extent, these IO can be written to separate regions on a device. Since we cannot map one file extent to such a separate area, we need to follow the "one IO == one ordered extent" rule. The Normal buffered, uncompressed, not pre-allocated write path (used

[PATCH v14 27/42] btrfs: use ZONE_APPEND write for ZONED btrfs

2021-01-26 Thread Naohiro Aota
This commit enables zone append writing for zoned btrfs. When using zone append, a bio is issued to the start of a target zone and the device decides to place it inside the zone. Upon completion the device reports the actual written position back to the host. Three parts are necessary to enable zo
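A small Python model of the zone append behavior described above, purely illustrative and not a device or kernel interface: the caller addresses the zone as a whole, the device chooses the position at its write pointer and reports it on completion. Class and method names are hypothetical.

```python
class Zone:
    """Toy sequential zone that supports an append operation."""

    def __init__(self, start, capacity):
        self.start = start
        self.capacity = capacity
        self.write_pointer = start

    def append(self, length):
        """Write 'length' bytes somewhere in the zone; return where they landed."""
        if self.write_pointer + length > self.start + self.capacity:
            raise ValueError("zone full")
        written_at = self.write_pointer
        self.write_pointer += length
        return written_at


zone = Zone(start=0, capacity=1 << 20)
# Two appends target the same zone; the caller learns the positions afterwards.
positions = [zone.append(4096), zone.append(16384)]
print(positions)  # prints [0, 4096]
```

In this model the host only records the returned position after the write completes, which is why the final location of the data is not known when the IO is issued.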

[PATCH v14 34/42] btrfs: implement cloning for ZONED device-replace

2021-01-26 Thread Naohiro Aota
This is patch 2/4 to implement device-replace for ZONED mode. In zoned mode, a block group must be either copied (from the source device to the destination device) or cloned (to both devices). This commit implements the cloning part. If a block group targeted by an IO is marked to copy, we sho

[PATCH v14 28/42] btrfs: enable zone append writing for direct IO

2021-01-26 Thread Naohiro Aota
As with buffered IO, enable zone append writing for direct IO when it is used on a zoned block device. Reviewed-by: Josef Bacik Signed-off-by: Naohiro Aota --- fs/btrfs/inode.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e3

[PATCH v14 30/42] btrfs: serialize meta IOs on ZONED mode

2021-01-26 Thread Naohiro Aota
We cannot use zone append for writing metadata, because the B-tree nodes have references to each other using the logical address. Without knowing the address in advance, we cannot construct the tree in the first place. So we need to serialize write IOs for metadata. We cannot add a mutex around al

[PATCH v14 36/42] btrfs: support dev-replace in ZONED mode

2021-01-26 Thread Naohiro Aota
This is 4/4 patch to implement device-replace on ZONED mode. Even after the copying is done, the write pointers of the source device and the destination device may not be synchronized. For example, when the last allocated extent is freed before device-replace process, the extent is not copied, lea

[PATCH v14 38/42] btrfs: relocate block group to repair IO failure in ZONED

2021-01-26 Thread Naohiro Aota
When btrfs finds a checksum error and the file system has a mirror of the damaged data, btrfs reads the correct data from the mirror and writes it to the damaged blocks. This repair, however, violates the sequential write required rule. We can consider three methods to repair an IO failure

[PATCH v14 39/42] btrfs: split alloc_log_tree()

2021-01-26 Thread Naohiro Aota
This is a preparation for the next patch. This commit splits alloc_log_tree() into a part that allocates the tree structure (remains in alloc_log_tree()) and a part that allocates the tree node (moved to btrfs_alloc_log_tree_node()). The latter part is also exported to be used in the next patch. Reviewed-by: Josef Bacik

Re: [PATCH] btrfs: fix lockdep warning due to seqcount_mutex_init() with wrong address

2021-01-26 Thread Davidlohr Bueso
On Mon, 25 Jan 2021, David Sterba wrote: IMO it's not, though some kind of annotation could be useful. The patch introducing the seqcount mutex does not mention any warning so it's probably meant only for clarity of the lock nesting or maybe real-time related as there are some comments regarding

Re: [PATCH v4 08/18] btrfs: introduce helper for subpage uptodate status

2021-01-26 Thread Qu Wenruo
On 2021/1/20 下午10:55, Josef Bacik wrote: On 1/16/21 2:15 AM, Qu Wenruo wrote: This patch introduce the following functions to handle btrfs subpage uptodate status: - btrfs_subpage_set_uptodate() - btrfs_subpage_clear_uptodate() - btrfs_subpage_test_uptodate()    Those helpers can only be call

Re: [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case

2021-01-26 Thread Qu Wenruo
On 2021/1/20 上午6:35, David Sterba wrote: On Tue, Jan 19, 2021 at 04:54:28PM -0500, Josef Bacik wrote: On 1/16/21 2:15 AM, Qu Wenruo wrote: +/* For rare cases where we need to pre-allocate a btrfs_subpage structure */ +static inline int btrfs_alloc_subpage(struct btrfs_fs_info *fs_info, +

[PATCH v5 00/18] btrfs: add read-only support for subpage sector size

2021-01-26 Thread Qu Wenruo
Patches can be fetched from github: https://github.com/adam900710/linux/tree/subpage Currently the branch also contains partial RW data support (still some ordered extent and data csum mismatch problems) Great thanks to David/Nikolay/Josef for their effort reviewing and merging the preparation pat

[PATCH v5 02/18] btrfs: set UNMAPPED bit early in btrfs_clone_extent_buffer() for subpage support

2021-01-26 Thread Qu Wenruo
For the incoming subpage support, UNMAPPED extent buffer will have different behavior in btrfs_release_extent_buffer(). This means we need to set UNMAPPED bit early before calling btrfs_release_extent_buffer(). Currently there is only one caller which relies on btrfs_release_extent_buffer() in it

[PATCH v5 03/18] btrfs: introduce the skeleton of btrfs_subpage structure

2021-01-26 Thread Qu Wenruo
For sectorsize < page size support, we need a structure to record extra status info for each sector of a page. Introduce the skeleton structure, all subpage related code would go to subpage.[ch]. Reviewed-by: Josef Bacik Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David S

[PATCH v5 08/18] btrfs: introduce helpers for subpage uptodate status

2021-01-26 Thread Qu Wenruo
Introduce the following functions to handle subpage uptodate status: - btrfs_subpage_set_uptodate() - btrfs_subpage_clear_uptodate() - btrfs_subpage_test_uptodate() These helpers can only be called when the page has subpage attached and the range is ensured to be inside the page. - btrfs_page

[PATCH v5 06/18] btrfs: support subpage for extent buffer page release

2021-01-26 Thread Qu Wenruo
In btrfs_release_extent_buffer_pages(), we need to add extra handling for subpage. Introduce a helper, detach_extent_buffer_page(), to do different handling for regular and subpage cases. For subpage case, handle detaching page private. For unmapped (dummy or cloned) ebs, we can detach the page

[PATCH v5 17/18] btrfs: integrate page status update for data read path into begin/end_page_read()

2021-01-26 Thread Qu Wenruo
In btrfs data page read path, the page status update are handled in two different locations: btrfs_do_read_page() { while (cur <= end) { /* No need to read from disk */ if (HOLE/PREALLOC/INLINE){ memset();

[PATCH v5 13/18] btrfs: introduce read_extent_buffer_subpage()

2021-01-26 Thread Qu Wenruo
Introduce a helper, read_extent_buffer_subpage(), to do the subpage extent buffer read. The differences between the regular and subpage routines are: - No page locking Here we completely rely on extent locking. Page locking can reduce the concurrency greatly, as if we lock one page to read one e

[PATCH v5 18/18] btrfs: allow RO mount of 4K sector size fs on 64K page system

2021-01-26 Thread Qu Wenruo
This adds the basic RO mount ability for 4K sector size on 64K page system. Currently we only plan to support 4K and 64K page system. Signed-off-by: Qu Wenruo Signed-off-by: David Sterba --- fs/btrfs/disk-io.c | 24 +--- fs/btrfs/super.c | 7 +++ 2 files changed, 28

[PATCH v5 14/18] btrfs: support subpage in endio_readpage_update_page_status()

2021-01-26 Thread Qu Wenruo
To handle subpage status update, add the following: - Use btrfs_page_*() subpage-aware helpers to update page status Now we can handle both cases well. - No page unlock for subpage metadata Since subpage metadata doesn't utilize page locking at all, skip it. For subpage data locking, it's h

[PATCH v5 12/18] btrfs: support subpage in try_release_extent_buffer()

2021-01-26 Thread Qu Wenruo
Unlike the original try_release_extent_buffer(), try_release_subpage_extent_buffer() will iterate through all the ebs in the page, and try to release each. We can release the full page only after there's no private attached, which means all ebs of that page have been released as well. Signed-off-

[PATCH v5 05/18] btrfs: make grab_extent_buffer_from_page() handle subpage case

2021-01-26 Thread Qu Wenruo
For subpage case, grab_extent_buffer() can't really get an extent buffer just from btrfs_subpage. We have radix tree lock protecting us from inserting the same eb into the tree. Thus we don't really need to do the extra hassle, just let alloc_extent_buffer() handle the existing eb in radix tree.

[PATCH v5 09/18] btrfs: introduce helpers for subpage error status

2021-01-26 Thread Qu Wenruo
Introduce the following functions to handle subpage error status: - btrfs_subpage_set_error() - btrfs_subpage_clear_error() - btrfs_subpage_test_error() These helpers can only be called when the page has subpage attached and the range is ensured to be inside the page. - btrfs_page_set_error()

[PATCH v5 01/18] btrfs: merge PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK to PAGE_START_WRITEBACK

2021-01-26 Thread Qu Wenruo
PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK are two defines used in __process_pages_contig(), to let the function know to clear page dirty bit and then set page writeback. However page writeback and dirty bits are conflicting (at least for sector size == PAGE_SIZE case), this means these two have to b

Re: [PATCH v4 16/18] btrfs: introduce btrfs_subpage for data inodes

2021-01-26 Thread Qu Wenruo
On 2021/1/20 下午11:28, Josef Bacik wrote: On 1/16/21 2:15 AM, Qu Wenruo wrote: To support subpage sector size, data also need extra info to make sure which sectors in a page are uptodate/dirty/... This patch will make pages for data inodes to get btrfs_subpage structure attached, and detached

[PATCH v5 10/18] btrfs: support subpage in set/clear_extent_buffer_uptodate()

2021-01-26 Thread Qu Wenruo
To support subpage in set_extent_buffer_uptodate and clear_extent_buffer_uptodate we only need to use the subpage-aware helpers to update the page bits. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/extent_io.c | 11 +++ 1 file changed, 7 i

[PATCH v5 16/18] btrfs: introduce btrfs_subpage for data inodes

2021-01-26 Thread Qu Wenruo
To support subpage sector size, data also need extra info to make sure which sectors in a page are uptodate/dirty/... This patch will make pages for data inodes to get btrfs_subpage structure attached, and detached when the page is freed. This patch also slightly changes the timing when set_page_

[PATCH v5 07/18] btrfs: attach private to dummy extent buffer pages

2021-01-26 Thread Qu Wenruo
There are locations where we allocate dummy extent buffers for temporary usage, like in tree_mod_log_rewind() or get_old_root(). These dummy extent buffers will be handled by the same eb accessors, and if they don't have page::private subpage eb accessors could fail. To address such problems, mak

Re: [PATCH v14 01/42] block: add bio_add_zone_append_page

2021-01-26 Thread Jens Axboe
On 1/25/21 7:24 PM, Naohiro Aota wrote: > From: Johannes Thumshirn > > Add bio_add_zone_append_page(), a wrapper around bio_add_hw_page() which > is intended to be used by file systems that directly add pages to a bio > instead of using bio_iov_iter_get_pages(). > > Cc: Jens Axboe > Reviewed-by

[PATCH 17/17] mm: remove get_swap_bio

2021-01-26 Thread Christoph Hellwig
Just reuse the block_device and sector from the swap_info structure, just as used by the SWP_SYNCHRONOUS path. Also remove the checks for NULL returns from bio_alloc as that can't happen for sleeping allocations. Signed-off-by: Christoph Hellwig --- include/linux/swap.h | 1 - mm/page_io.c

[PATCH 16/17] nilfs2: remove cruft in nilfs_alloc_seg_bio

2021-01-26 Thread Christoph Hellwig
bio_alloc never returns NULL when it can sleep. Signed-off-by: Christoph Hellwig --- fs/nilfs2/segbuf.c | 4 1 file changed, 4 deletions(-) diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c index 1a8729eded8b14..1e75417bfe6e52 100644 --- a/fs/nilfs2/segbuf.c +++ b/fs/nilfs2/segbuf.c @@

[PATCH 15/17] nfs/blocklayout: remove cruft in bl_alloc_init_bio

2021-01-26 Thread Christoph Hellwig
bio_alloc never returns NULL when it can sleep. Signed-off-by: Christoph Hellwig --- fs/nfs/blocklayout/blocklayout.c | 5 - 1 file changed, 5 deletions(-) diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c index 3be6836074ae92..1a96ce28efb026 100644 --- a/fs/n

[PATCH 14/17] md/raid6: refactor raid5_read_one_chunk

2021-01-26 Thread Christoph Hellwig
Refactor raid5_read_one_chunk so that all simple checks are done before allocating the bio. Signed-off-by: Christoph Hellwig --- drivers/md/raid5.c | 108 +++-- 1 file changed, 45 insertions(+), 63 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[PATCH 13/17] md: remove md_bio_alloc_sync

2021-01-26 Thread Christoph Hellwig
md_bio_alloc_sync is never called with a NULL mddev, and ->sync_set is initialized in md_run, so it always must be initialized as well. Just open code the remaining call to bio_alloc_bioset. Signed-off-by: Christoph Hellwig --- drivers/md/md.c | 10 +- 1 file changed, 1 insertion(+), 9

[PATCH 12/17] md: simplify sync_page_io

2021-01-26 Thread Christoph Hellwig
Use an on-stack bio and biovec for the single page synchronous I/O. Signed-off-by: Christoph Hellwig --- drivers/md/md.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index e2b9dbb6e888f6..6a27f52007c871 100644

[PATCH 11/17] md: remove bio_alloc_mddev

2021-01-26 Thread Christoph Hellwig
bio_alloc_mddev is never called with a NULL mddev, and ->bio_set is initialized in md_run, so it always must be initialized as well. Just open code the remaining call to bio_alloc_bioset. Signed-off-by: Christoph Hellwig --- drivers/md/md.c | 12 +--- drivers/md/md.h | 2 -- dr

[PATCH 10/17] drbd: remove drbd_req_make_private_bio

2021-01-26 Thread Christoph Hellwig
Open code drbd_req_make_private_bio in the two callers to prepare for further changes. Also don't bother to initialize bi_next as the bio code already does that. Signed-off-by: Christoph Hellwig --- drivers/block/drbd/drbd_req.c| 5 - drivers/block/drbd/drbd_req.h| 12

[PATCH 03/17] blk-crypto: use bio_kmalloc in blk_crypto_clone_bio

2021-01-26 Thread Christoph Hellwig
Use bio_kmalloc instead of open coding it. Signed-off-by: Christoph Hellwig --- block/blk-crypto-fallback.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c index 50c225398e4d60..e8327c50d7c9f4 100644 --- a/block/blk-cr

[PATCH 09/17] drbd: remove bio_alloc_drbd

2021-01-26 Thread Christoph Hellwig
Given that drbd_md_io_bio_set is initialized during module initialization and the module fails to load if the initialization fails there is no need to fall back to plain bio_alloc. Signed-off-by: Christoph Hellwig --- drivers/block/drbd/drbd_actlog.c | 2 +- drivers/block/drbd/drbd_bitmap.c |

[PATCH 04/17] block: split bio_kmalloc from bio_alloc_bioset

2021-01-26 Thread Christoph Hellwig
bio_kmalloc shares almost no logic with the bio_set based fast path in bio_alloc_bioset. Split it into an entirely separate implementation. Signed-off-by: Christoph Hellwig --- block/bio.c | 167 ++-- include/linux/bio.h | 6 +- 2 files changed,

[PATCH 08/17] f2fs: remove FAULT_ALLOC_BIO

2021-01-26 Thread Christoph Hellwig
Sleeping bio allocations do not fail, which means that injecting an error into sleeping bio allocations is a little silly. Signed-off-by: Christoph Hellwig --- Documentation/filesystems/f2fs.rst | 1 - fs/f2fs/data.c | 29 - fs/f2fs/f2fs.h

[PATCH 07/17] f2fs: use blkdev_issue_flush in __submit_flush_wait

2021-01-26 Thread Christoph Hellwig
Use the blkdev_issue_flush helper instead of duplicating it. Signed-off-by: Christoph Hellwig --- fs/f2fs/data.c| 3 ++- fs/f2fs/f2fs.h| 1 - fs/f2fs/segment.c | 12 +--- 3 files changed, 3 insertions(+), 13 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 8cbf0

Re: [PATCH 01/17] zonefs: use bio_alloc in zonefs_file_dio_append

2021-01-26 Thread Johannes Thumshirn
On 26/01/2021 16:01, Christoph Hellwig wrote: > Use bio_alloc instead of open coding it. > > Signed-off-by: Christoph Hellwig > --- > fs/zonefs/super.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c > index bec47f2d074beb..faea2ed3

[PATCH 06/17] dm-clone: use blkdev_issue_flush in commit_metadata

2021-01-26 Thread Christoph Hellwig
Use blkdev_issue_flush instead of open coding it. Signed-off-by: Christoph Hellwig --- drivers/md/dm-clone-target.c | 14 +- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/drivers/md/dm-clone-target.c b/drivers/md/dm-clone-target.c index bdb255edc20043..a90bdf9b2ca6bd

[PATCH 05/17] block: use an on-stack bio in blkdev_issue_flush

2021-01-26 Thread Christoph Hellwig
There is no point in allocating memory for a synchronous flush. Signed-off-by: Christoph Hellwig --- block/blk-flush.c | 17 ++--- drivers/md/dm-zoned-metadata.c| 6 +++--- drivers/md/raid5-ppl.c| 2 +- drivers/nvme/target/io-cmd-bdev.c | 2 +- fs/b

Re: [PATCH 02/17] btrfs: use bio_kmalloc in __alloc_device

2021-01-26 Thread Josef Bacik
On 1/26/21 9:52 AM, Christoph Hellwig wrote: Use bio_kmalloc instead of open coding it. Signed-off-by: Christoph Hellwig Reviewed-by: Josef Bacik yay I contributed, Josef

[PATCH 02/17] btrfs: use bio_kmalloc in __alloc_device

2021-01-26 Thread Christoph Hellwig
Use bio_kmalloc instead of open coding it. Signed-off-by: Christoph Hellwig --- fs/btrfs/volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 0a6de859eb2226..584ba093cf4966 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volum

[PATCH 01/17] zonefs: use bio_alloc in zonefs_file_dio_append

2021-01-26 Thread Christoph Hellwig
Use bio_alloc instead of open coding it. Signed-off-by: Christoph Hellwig --- fs/zonefs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index bec47f2d074beb..faea2ed34b4a37 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@

misc bio allocation cleanups

2021-01-26 Thread Christoph Hellwig
Hi Jens, this series contains various cleanups for how bios are allocated or initialized plus related fallout. Diffstat: Documentation/filesystems/f2fs.rst |1 block/bio.c| 167 ++--- block/blk-crypto-fallback.c|2 block/

Re: [PATCH] btrfs: avoid double put of block group when emptying cluster

2021-01-26 Thread Josef Bacik
On 1/26/21 4:02 AM, Nikolay Borisov wrote: On 25.01.21 г. 23:42 ч., Josef Bacik wrote: In __btrfs_return_cluster_to_free_space we will bail doing the cleanup of the cluster if the block group we passed in doesn't match the block group on the cluster. However we drop a reference to block_group

Re: [PATCH] btrfs: avoid double put of block group when emptying cluster

2021-01-26 Thread Nikolay Borisov
On 25.01.21 г. 23:42 ч., Josef Bacik wrote: > In __btrfs_return_cluster_to_free_space we will bail doing the cleanup > of the cluster if the block group we passed in doesn't match the block > group on the cluster. However we drop a reference to block_group, as > the cluster holds a reference to

[PATCH v5 15/18] btrfs: introduce subpage metadata validation check

2021-01-26 Thread Qu Wenruo
For subpage metadata validation check, there are some differences: - Read must finish in one bvec Since we're just reading one subpage range in one page, it should never be split into two bios nor two bvecs. - How to grab the existing eb Instead of grabbing eb using page->private, we have t

[PATCH v5 11/18] btrfs: support subpage in btrfs_clone_extent_buffer

2021-01-26 Thread Qu Wenruo
btrfs_clone_extent_buffer() is mostly the same code as __alloc_dummy_extent_buffer(), except that it has an extra page copy. So to make it subpage compatible, we only need to: - Call set_extent_buffer_uptodate() instead of SetPageUptodate() This will set the correct uptodate bit for subpage and regu

[PATCH v5 04/18] btrfs: make attach_extent_buffer_page() handle subpage case

2021-01-26 Thread Qu Wenruo
For the subpage case, we need to allocate additional memory for each metadata page. So we need to: - Allow attach_extent_buffer_page() to return int to indicate allocation failure - Allow manually pre-allocating subpage memory for alloc_extent_buffer() As we don't want to use GFP_ATOMIC under spin