[PATCH 12/12] block: add special APIs for run-time disabling of discard and friends

2024-05-28 Thread Christoph Hellwig
A few drivers optimistically try to support discard, write zeroes and secure erase, and disable the features from the I/O completion handler if the hardware can't support them. This disabling can't be done through the atomic queue limits API because I/O completion handlers can't take sleeping locks.
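The key observation in the summary above can be sketched in userspace C. This is an illustrative model, not the kernel API: the `fake_limits` struct and `disable_discard` name are invented for the example. The point is that a one-way transition (limit goes to zero and stays there) needs only an atomic store, which is safe from a context that cannot sleep.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative model (NOT the kernel's queue_limits) of why a dedicated
 * "disable" helper can run from an I/O completion handler: the update is
 * one-way (limit -> 0), so a single atomic store suffices and no
 * sleeping lock is required. */
struct fake_limits {
    _Atomic unsigned int max_discard_sectors;
};

void disable_discard(struct fake_limits *lim)
{
    /* Monotonic: once 0, it stays 0, so concurrent disables are safe. */
    atomic_store(&lim->max_discard_sectors, 0);
}
```

Because the store is idempotent, racing completion handlers can all call the helper without coordination.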

[PATCH 11/12] block: remove unused queue limits API

2024-05-28 Thread Christoph Hellwig
Remove all APIs that are unused now that sd and sr have been converted to the atomic queue limits API. Signed-off-by: Christoph Hellwig --- block/blk-settings.c | 190 - include/linux/blkdev.h | 12 --- 2 files changed, 202 deletions(-) diff --git a/bl

[PATCH 10/12] sr: convert to the atomic queue limits API

2024-05-28 Thread Christoph Hellwig
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and freeze the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Also use the chance to clean up variable n

[PATCH 09/12] sd: convert to the atomic queue limits API

2024-05-28 Thread Christoph Hellwig
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and freeze the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Signed-off-by: Christoph Hellwig --- dri
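The copy-update-commit pattern this conversion uses can be sketched as a userspace analogue. All names here (`start_update`, `commit_update`, the `limits` fields) are illustrative stand-ins for the kernel's queue_limits_start_update()/queue_limits_commit_update(), and a mutex stands in for the queue freeze:

```c
#include <pthread.h>

/* Userspace analogue of the atomic-limits pattern: updaters take a
 * private copy under a lock, modify it freely, then publish the whole
 * struct at once so readers never observe a half-set mix of limits. */
struct limits {
    unsigned int max_sectors;
    unsigned int discard_granularity;
};

struct queue {
    pthread_mutex_t limits_lock;   /* serializes concurrent updaters */
    struct limits lim;             /* the published limits */
};

/* Take the lock and hand back a private copy to modify. */
struct limits start_update(struct queue *q)
{
    pthread_mutex_lock(&q->limits_lock);
    return q->lim;
}

/* Publish the copy wholesale; in the kernel this happens with the
 * queue frozen so in-flight submissions can't peek mid-update. */
void commit_update(struct queue *q, const struct limits *lim)
{
    q->lim = *lim;
    pthread_mutex_unlock(&q->limits_lock);
}
```

The same shape covers both the sd and sr conversions in this series: build the limits locally, then commit once.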

[PATCH 08/12] sd: cleanup zoned queue limits initialization

2024-05-28 Thread Christoph Hellwig
Consolidate setting zone-related queue limits in sd_zbc_read_zones instead of splitting them between sd_zbc_revalidate_zones and sd_zbc_read_zones, and move the early_zone_information initialization in sd_zbc_read_zones above setting up the queue limits. Signed-off-by: Christoph Hellwig --- driv

[PATCH 07/12] sd: factor out a sd_discard_mode helper

2024-05-28 Thread Christoph Hellwig
Split the logic to pick the right discard mode into a little helper to prepare for further changes. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 37 - 1 file changed, 20 insertions(+), 17 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/

[PATCH 06/12] sd: simplify the disable case in sd_config_discard

2024-05-28 Thread Christoph Hellwig
Fall through to the main call to blk_queue_max_discard_sectors given that max_blocks has been initialized to zero above instead of duplicating the call. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c

[PATCH 05/12] sd: add a sd_disable_write_same helper

2024-05-28 Thread Christoph Hellwig
Add helper to disable WRITE SAME when it is not supported and use it instead of sd_config_write_same in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Chris

[PATCH 04/12] sd: add a sd_disable_discard helper

2024-05-28 Thread Christoph Hellwig
Add helper to disable discard when it is not supported and use it instead of sd_config_discard in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Christoph He

[PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store

2024-05-28 Thread Christoph Hellwig
Don't reset the discard settings to no-op over and over when a user writes to the provisioning attribute as that is already the default mode for ZBC devices. In hindsight we should have made writing to the attribute fail for ZBC devices, but the code has probably been around for far too long to ch

convert the SCSI ULDs to the atomic queue limits API

2024-05-28 Thread Christoph Hellwig
Hi all, this series converts the SCSI upper level drivers to the atomic queue limits API. The first patch is a bug fix for ubd that later patches depend on and might be worth picking up for 6.10. The second patch changes the max_sectors calculation to take the optimal I/O size into account so th

[PATCH 02/12] block: take io_opt and io_min into account for max_sectors

2024-05-28 Thread Christoph Hellwig
The soft max_sectors limit is normally capped by the hardware limits and an arbitrary upper limit enforced by the kernel, but can be modified by the user. A few drivers want to increase this limit (nbd, rbd) or adjust it up or down based on hardware capabilities (sd). Change blk_validate_limits t
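The capping described above can be sketched as follows. This is a simplified model, not the actual blk_validate_limits code: the function name and the exact fallback from io_opt to io_min are assumptions made for the example, with all values in 512-byte sectors.

```c
/* Sketch: clamp the soft max_sectors to the hardware limit, then round
 * it down to a multiple of the optimal (or, failing that, minimal) I/O
 * size so large requests stay aligned to what the device prefers. */
unsigned int cap_max_sectors(unsigned int user_max,
                             unsigned int hw_max,
                             unsigned int io_opt_sectors,
                             unsigned int io_min_sectors)
{
    unsigned int max = user_max < hw_max ? user_max : hw_max;
    unsigned int align = io_opt_sectors ? io_opt_sectors : io_min_sectors;

    if (align && max >= align)
        max -= max % align;     /* round down to an aligned boundary */
    return max;
}
```

For example, a 1000-sector request cap with a 256-sector optimal I/O size rounds down to 768 sectors, so every full-size request is a whole number of optimal I/O units.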

[PATCH 01/12] ubd: untangle discard vs write zeroes not supported handling

2024-05-28 Thread Christoph Hellwig
Discard and Write Zeroes are different operations and are implemented by different fallocate opcodes in ubd. If one fails, the other can still work, and vice versa. Split the code that disables the operations in ubd_handler so that only the operation that actually failed is disabled. Fixes: 50109b5a03b4 ("um: Add s
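The fix's logic can be illustrated with a small sketch. The names here (`dev_caps`, `handle_op_error`) are invented for the example and are not the ubd code; the point is that each operation carries its own support flag, and only the flag for the opcode that returned EOPNOTSUPP is cleared:

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative sketch of the untangled error handling: discard and
 * write-zeroes support are tracked separately, so a failure of one
 * fallocate opcode disables only that operation. */
struct dev_caps {
    bool discard;
    bool write_zeroes;
};

enum op { OP_DISCARD, OP_WRITE_ZEROES };

/* Called when a fallocate-backed operation completes with an error. */
void handle_op_error(struct dev_caps *caps, enum op op, int err)
{
    if (err != -EOPNOTSUPP)
        return;                 /* real I/O error: leave the caps alone */
    if (op == OP_DISCARD)
        caps->discard = false;  /* only discard is unsupported */
    else
        caps->write_zeroes = false;
}
```

Before the fix, one shared path disabled both operations on any failure; splitting the handling keeps the working opcode available.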

Re: [PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread Tiwei Bie
On 5/28/24 7:57 PM, Johannes Berg wrote: > On Tue, 2024-05-28 at 18:16 +0800, Tiwei Bie wrote: >> On 5/28/24 4:54 PM, benja...@sipsolutions.net wrote: >>> From: Benjamin Berg >>> >>> Newer glibc versions are enabling rseq support by default. This remains >>> enabled in the cloned child process, po

Re: [PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread Johannes Berg
On Tue, 2024-05-28 at 18:16 +0800, Tiwei Bie wrote: > On 5/28/24 4:54 PM, benja...@sipsolutions.net wrote: > > From: Benjamin Berg > > > > Newer glibc versions are enabling rseq support by default. This remains > > enabled in the cloned child process, potentially causing the host kernel > > to wr

Re: [PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread Tiwei Bie
Hi Benjamin, On 5/28/24 6:30 PM, Benjamin Berg wrote: > On Tue, 2024-05-28 at 18:16 +0800, Tiwei Bie wrote: >> On 5/28/24 4:54 PM, benja...@sipsolutions.net wrote: >>> From: Benjamin Berg >>> >>> Newer glibc versions are enabling rseq support by default. This remains >>> enabled in the cloned chi

Re: [PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread Benjamin Berg
Hi Tiwei, On Tue, 2024-05-28 at 18:16 +0800, Tiwei Bie wrote: > On 5/28/24 4:54 PM, benja...@sipsolutions.net wrote: > > From: Benjamin Berg > > > > Newer glibc versions are enabling rseq support by default. This remains > > enabled in the cloned child process, potentially causing the host kerne

Re: [PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread Tiwei Bie
On 5/28/24 4:54 PM, benja...@sipsolutions.net wrote: > From: Benjamin Berg > > Newer glibc versions are enabling rseq support by default. This remains > enabled in the cloned child process, potentially causing the host kernel > to write/read memory in the child. > > It appears that this was pure

[PATCH 0/5] Increased address space for 64 bit

2024-05-28 Thread benjamin
From: Benjamin Berg This patchset fixes a few bugs, adds a new method of discovering the host task size and finally adds four-level page table support. All of this means the userspace TASK_SIZE is much larger, which in turn permits userspace applications that need a lot of virtual addresses to work

[PATCH 3/5] um: Do a double clone to disable rseq

2024-05-28 Thread benjamin
From: Benjamin Berg Newer glibc versions are enabling rseq support by default. This remains enabled in the cloned child process, potentially causing the host kernel to write/read memory in the child. It appears that this was purely not an issue because the used memory area happened to be above T

[PATCH 5/5] um: Add 4 level page table support

2024-05-28 Thread benjamin
From: Benjamin Berg The larger memory space is useful to support more applications inside UML. One example for this is ASAN instrumentation of userspace applications which requires addresses that would otherwise not be available. Signed-off-by: Benjamin Berg --- arch/um/Kconfig

[PATCH 4/5] um: Discover host_task_size from envp

2024-05-28 Thread benjamin
From: Benjamin Berg When loading the UML binary, the host kernel will place the stack at the highest possible address. It will then map the program name and environment variables onto the start of the stack. As such, an easy way to figure out the host_task_size is to use the highest pointer to a
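The scanning idea the summary describes can be sketched in plain C. This is a simplified userspace model of the approach, not the UML patch itself: the function name and the 4 KiB page-size assumption are made up for the example. Since the kernel maps the environment strings at the very top of the initial stack, the highest envp string gives a lower bound for the top of the address space:

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_GUESS 4096UL  /* assumption: 4 KiB pages */

/* Walk the environment strings and take the highest end address, then
 * round up to a page boundary to approximate the top of the task's
 * address space. */
uintptr_t top_from_envp(char **envp)
{
    uintptr_t top = 0;

    for (char **e = envp; *e; e++) {
        /* include the string's terminating NUL */
        uintptr_t end = (uintptr_t)*e + strlen(*e) + 1;
        if (end > top)
            top = end;
    }
    /* round up to the next page boundary */
    return (top + PAGE_SIZE_GUESS - 1) & ~(PAGE_SIZE_GUESS - 1);
}
```

Calling `top_from_envp(envp)` from main (or via `environ`) yields a page-aligned address above every argument and environment string.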

[PATCH 2/5] um: Limit TASK_SIZE to the addressable range

2024-05-28 Thread benjamin
From: Benjamin Berg We may have a TASK_SIZE from the host that is bigger than UML is able to address with a three-level pagetable. Guard against that by clipping the maximum TASK_SIZE to the maximum addressable area. Signed-off-by: Benjamin Berg --- arch/um/kernel/um_arch.c | 7 ++- 1 file

[PATCH 1/5] um: Fix stub_start address calculation

2024-05-28 Thread benjamin
From: Benjamin Berg The calculation was wrong as it only subtracted one and then rounded down for alignment. However, this is incorrect if host_task_size is not already aligned. This probably worked fine because on 64 bit the host_task_size is bigger than the address returned by os_get_top_address. Signed-
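The rounding mistake can be demonstrated with a toy model. This is not the UML code: the stub size and function names are invented, and the stub size is assumed to be a power of two. Subtracting one and rounding down only works when the top address is already aligned; when it isn't, the "reserved" region extends past the top of the address space:

```c
#include <stdint.h>

#define STUB_SIZE 0x100000UL    /* hypothetical 1 MiB aligned stub area */

/* Buggy variant: subtract one, then round down. The resulting region
 * [start, start + STUB_SIZE) can overrun an unaligned top. */
uintptr_t stub_start_buggy(uintptr_t top)
{
    return (top - 1) & ~(STUB_SIZE - 1);
}

/* Fixed variant: reserve the full stub size first, then round down, so
 * the region always lies entirely below top. */
uintptr_t stub_start_fixed(uintptr_t top)
{
    return (top - STUB_SIZE) & ~(STUB_SIZE - 1);
}
```

With an aligned top the two variants agree, which is presumably why the bug went unnoticed; with an unaligned top only the fixed variant keeps the region below the limit.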