low-level XSAVE helpers for saving and restoring register
states, as well as handling XSAVE buffers.
* Generalizing state data manipulations: set_rand_data()
* Introducing a generic feature query helper: get_xstate_info()
While doing so, remove unused defines in amx.c.
Signed-off-by: Chang S
tests are already established in
tools/selftests/mm.
This series is based on the tip/master branch. You can also find it in
the following repository:
git://github.com/intel/apx.git selftest-xstate_v1
Thanks,
Chang
[1]
https://lore.kernel.org/all/20211026122523.afb99...@davehans
the error handling by using ksft_exit_fail_msg(), which is
functionally equivalent to err() within the selftest framework.
Signed-off-by: Chang S. Bae
---
This change is a prerequisite for the upcoming xstate selftest, which
requires signal handling for registering and cleaning up handlers
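As a rough illustration of the registering and cleaning up mentioned above, the helpers could look like the sketch below. The names set_handler(), clear_handler() and run_signal_roundtrip() are hypothetical and only loosely modeled on the usual selftest pattern; this is not the code from the series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;

/* Hypothetical handler: only records that the signal was delivered. */
static void on_sigusr1(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)info;
	(void)ctx;
	got_signal = 1;
}

/* Install an SA_SIGINFO handler; returns 0 on success, -1 on failure. */
static int set_handler(int sig, void (*handler)(int, siginfo_t *, void *),
		       int flags)
{
	struct sigaction sa;

	assert(handler != NULL);
	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = handler;
	sa.sa_flags = SA_SIGINFO | flags;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, NULL);
}

/* Restore the default disposition for the signal. */
static int clear_handler(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = SIG_DFL;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, NULL);
}

/* Register, deliver, clean up; returns 1 if the handler ran, -1 on error. */
static int run_signal_roundtrip(void)
{
	got_signal = 0;
	if (set_handler(SIGUSR1, on_sigusr1, 0))
		return -1;
	raise(SIGUSR1);
	if (clear_handler(SIGUSR1))
		return -1;
	return got_signal;
}
```

A test would call run_signal_roundtrip() once per case and treat any value other than 1 as a failure.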
Add xstate testing specifically for those vector register states,
validating the kernel's context switching and ensuring ABI compliance.
Use the established xstate testing framework.
Signed-off-by: Chang S. Bae
---
Alternatively, this invocation could be placed directly in
xstate.c::main(). Ho
function that first verifies feature availability
from the kernel and constructs the necessary state information once. The
wrapper then sequentially invokes all tests to ensure consistent
execution.
Update the AMX test to use this unified invocation.
Signed-off-by: Chang S. Bae
---
tools/testing
The established xstate test code is designed to be generic, but certain
xstates require special handling and cannot be tested without additional
adjustments.
Clarify which xstates are currently supported, and enforce testing only
for them.
Signed-off-by: Chang S. Bae
---
tools/testing
the test from dynamic states, remove the permission request
code. In fact, the permission request inside the test wrapper was
redundant.
Additionally, replace fatal_error() with ksft_exit_fail_msg() for
consistency in error handling.
Signed-off-by: Chang S. Bae
---
Expected output:
$ amx_64
ith ksft_exit_fail_msg() for
consistency in error handling.
Signed-off-by: Chang S. Bae
---
Expected output:
$ amx_64
...
[RUN] AMX Tile data: inject xstate via ptrace().
[OK]'xfeatures' in SW reserved area was correctly written
[OK]xstate was correctly updated.
---
tools/testing/selfte
.
Signed-off-by: Chang S. Bae
---
Expected output:
$ amx_64
...
[RUN] AMX Tile data: load xstate and raise SIGUSR1
[OK]'magic1' is valid
[OK]'xfeatures' in SW reserved area is valid
[OK]'xfeatures' in XSAVE header is valid
[OK]xstate delivery was succes
xstate_info to include a name field, providing a human-readable
identifier.
Signed-off-by: Chang S. Bae
---
tools/testing/selftests/x86/amx.c| 2 -
tools/testing/selftests/x86/xstate.h | 60
2 files changed, 60 insertions(+), 2 deletions(-)
diff --git a/tools/testing
On 2024/8/30 3:26, Andrii Nakryiko wrote:
> On Tue, Aug 27, 2024 at 4:34 AM Liao, Chang wrote:
>>
>> Hi, Mark
>>
>> Would you like to discuss this patch further, or do you still believe
>> emulating
>> STP to push FP/LR into the stack in kernel is not a go
Hi, Mark
Would you like to discuss this patch further, or do you still believe emulating
STP to push FP/LR into the stack in kernel is not a good idea?
Thanks.
On 2024/8/21 15:55, Liao, Chang wrote:
> Hi, Mark
>
> My bad for taking so long to reply. I generally agree with your suggestions
On 2024/8/16 0:53, Andrii Nakryiko wrote:
> On Wed, Aug 14, 2024 at 7:58 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/15 2:42, Andrii Nakryiko wrote:
>>> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>>>
>>>>
>>>>
>>
Hi, Mark
My bad for taking so long to reply. I generally agree with your suggestions on
STP emulation.
On 2024/8/15 17:58, Mark Rutland wrote:
> On Wed, Aug 14, 2024 at 08:03:56AM +, Liao Chang wrote:
>> As Andrii pointed out, the uprobe/uretprobe selftest bench runs into a
>> c
On 2024/8/15 0:57, Andrii Nakryiko wrote:
> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/13 1:49, Andrii Nakryiko wrote:
>>> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>>>
>>>>
>>>>
On 2024/8/15 2:42, Andrii Nakryiko wrote:
> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/13 1:49, Andrii Nakryiko wrote:
>>> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>>>
>>>>
>>>>
s://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
[3] https://lore.kernel.org/all/20240801082407.1618451-1-liaocha...@huawei.com
Liao Chang (2):
uprobes: Remove redundant spinlock in uprobe_deny_signal()
uprobes: Remove the spinlock within handle_singlestep()
include/linux/u
rg
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Acked-by: Oleg Nesterov
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 8 +---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/linux/uprobes.h b/in
Since clearing a bit in thread_info is an atomic operation, the spinlock
is redundant and can be removed. Reducing lock contention is good for
performance.
Acked-by: Oleg Nesterov
Signed-off-by: Liao Chang
---
kernel/events/uprobes.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel
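The reasoning in the changelog above (an atomic bit clear needs no surrounding lock) can be sketched in userspace with C11 atomics. The structure, the flag bit position and the helper names below are all hypothetical stand-ins; the kernel uses its own bitops on thread_info rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit position standing in for a thread_info flag. */
#define MODEL_FLAG_SIGPENDING (1UL << 2)

struct model_thread_info {
	atomic_ulong flags;
};

/*
 * Clear the bits in 'mask' with a single atomic read-modify-write.
 * Other bits in 'flags' are preserved even if another thread updates
 * them concurrently, so no lock is needed around this operation.
 */
static void model_clear_flag(struct model_thread_info *ti, unsigned long mask)
{
	assert(mask != 0);
	atomic_fetch_and(&ti->flags, ~mask);
}

/* Helper for demonstration: start from 'initial', clear one flag. */
static unsigned long model_clear_sigpending(unsigned long initial)
{
	struct model_thread_info ti;

	atomic_init(&ti.flags, initial);
	model_clear_flag(&ti, MODEL_FLAG_SIGPENDING);
	return atomic_load(&ti.flags);
}
```

Note that this only shows that the clear itself is atomic; whether a lock is needed for other reasons depends on what else the critical section protected.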
and 'ret'
variants has been significantly reduced. Because emulating the 'push'
instruction needs to access userspace memory, it spends more cycles than
the others.
[0]
https://lore.kernel.org/all/caef4bzao4eg6hr2hzxypn+7uer4chs0r99zln02ezz5yruv...@mail.gmail.com/
Signed-off-by:
On 2024/8/13 1:57, Andrii Nakryiko wrote:
> On Mon, Aug 12, 2024 at 5:05 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/10 2:40, Andrii Nakryiko wrote:
>>> On Fri, Aug 9, 2024 at 11:34 AM Andrii Nakryiko
>>> wrote:
>>>>
>>>> On Fri,
On 2024/8/13 1:49, Andrii Nakryiko wrote:
> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/9 2:26, Andrii Nakryiko wrote:
>>> On Thu, Aug 8, 2024 at 1:45 AM Liao, Chang wrote:
>>>>
>>>> Hi Andrii and Oleg.
On 2024/8/12 20:07, Oleg Nesterov wrote:
> On 08/09, Liao Chang wrote:
>>
>> Since clearing a bit in thread_info is an atomic operation, the spinlock
>> is redundant and can be removed. Reducing lock contention is good for
>> performance.
>
> My ack still
On 2024/8/12 19:29, Oleg Nesterov wrote:
> On 08/09, Liao Chang wrote:
>>
>> --- a/include/linux/uprobes.h
>> +++ b/include/linux/uprobes.h
>> @@ -75,6 +75,7 @@ struct uprobe_task {
>>
>> struct uprobe *active_uprobe;
>>
On 2024/8/10 2:40, Andrii Nakryiko wrote:
> On Fri, Aug 9, 2024 at 11:34 AM Andrii Nakryiko
> wrote:
>>
>> On Fri, Aug 9, 2024 at 12:16 AM Liao, Chang wrote:
>>>
>>>
>>>
>>> On 2024/8/9 2:26, Andrii Nakryiko wrote:
>>>> On Thu, Aug 8,
On 2024/8/9 2:26, Andrii Nakryiko wrote:
> On Thu, Aug 8, 2024 at 1:45 AM Liao, Chang wrote:
>>
>> Hi Andrii and Oleg.
>>
>> This patch, which I sent two weeks ago, also aims to optimize the performance of
>> uprobe on arm64. I notice recent discussion
On 2024/8/9 2:26, Andrii Nakryiko wrote:
> On Thu, Aug 8, 2024 at 1:45 AM Liao, Chang wrote:
>>
>> Hi Andrii and Oleg.
>>
>> This patch, which I sent two weeks ago, also aims to optimize the performance of
>> uprobe on arm64. I notice recent discussion
ing the patch until Andrii's changes have settled down.
>
> Oleg.
>
>
> On 08/08, Liao, Chang wrote:
>>
>> Hi Andrii and Oleg.
>>
>> This patch, which I sent two weeks ago, also aims to optimize the performance of
>> uprobe on arm64. I
ps://lore.kernel.org/all/20240801082407.1618451-1-liaocha...@huawei.com
Liao Chang (2):
uprobes: Remove redundant spinlock in uprobe_deny_signal()
uprobes: Remove the spinlock within handle_singlestep()
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 10 +-
2 files changed
rg
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 8 +---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
Since clearing a bit in thread_info is an atomic operation, the spinlock
is redundant and can be removed. Reducing lock contention is good for
performance.
Acked-by: Oleg Nesterov
Signed-off-by: Liao Chang
---
kernel/events/uprobes.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel
On 2024/8/8 18:28, Oleg Nesterov wrote:
> On 08/08, Liao, Chang wrote:
>>
>> - pre_ssout() resets the deny signal flag
>>
>> - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is
>> cleared.
>>
>> - handle_single
he CC list for broader visibility and potential collaboration.
Thanks.
On 2024/7/27 17:44, Liao Chang wrote:
> The profiling result of single-thread model of selftests bench reveals
> performance bottlenecks in find_uprobe() and caches_clean_inval_pou() on
> ARM64. On my local testing mach
e.
>
[...]
>>
>> (To clarify. In fact I think that a new TIF_ or even PF_ flag makes more
>> sense,
>> afaics it can have more users. But I don't think that uprobes can provide
>> enough
>> justification for that right now)
I faced the same choice when Oleg suggested that I add a new flag to track the
denied signal. I haven't encountered scenarios outside of uprobes that would deny
a signal, so I'm not confident about introducing a new TIF_ flag without fully
understanding the potential impacts.
>>
>> Oleg.
>>
--
BR
Liao, Chang
nal flag when TIF_SIGPENDING is
cleared.
- handle_singlestep() checks the deny signal flag and restores TIF_SIGPENDING
if necessary.
Does this approach look correct to you, or do you have any other way to
implement the "flag"?
Thanks.
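To make the three steps listed above concrete, here is a small userspace model of the proposed flag handling. The struct and the model_*() functions are hypothetical stand-ins for the per-task state, pre_ssout(), uprobe_deny_signal() and handle_singlestep(); the real functions do far more than this.

```c
#include <assert.h>
#include <stdbool.h>

struct model_task {
	bool sigpending;	/* stands in for TIF_SIGPENDING */
	bool signal_denied;	/* the per-task "flag" under discussion */
};

/* Step 1: starting a single-step resets the deny flag. */
static void model_pre_ssout(struct model_task *t)
{
	t->signal_denied = false;
}

/* Step 2: a pending signal is hidden during the step and remembered. */
static void model_deny_signal(struct model_task *t)
{
	if (t->sigpending) {
		t->sigpending = false;
		t->signal_denied = true;
	}
}

/* Step 3: after the step, restore the pending bit if it was hidden. */
static void model_handle_singlestep(struct model_task *t)
{
	if (t->signal_denied)
		t->sigpending = true;
}

/* Run one full cycle; returns the pending state seen afterwards. */
static bool model_roundtrip(bool pending_during_step)
{
	/* Start with a stale deny flag to show that step 1 clears it. */
	struct model_task t = {
		.sigpending = pending_during_step,
		.signal_denied = true,
	};

	model_pre_ssout(&t);
	model_deny_signal(&t);
	assert(!t.sigpending);	/* nothing is pending while stepping */
	model_handle_singlestep(&t);
	return t.sigpending;
}
```

In this model a signal that was pending before the step is pending again afterwards, and a task with no pending signal is left untouched.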
>
> Oleg.
>
> On 08/06, Oleg Nestero
On 2024/8/2 17:24, Oleg Nesterov wrote:
> On 08/02, Liao, Chang wrote:
>>
>>
>> On 2024/8/1 22:06, Oleg Nesterov wrote:
>>> On 08/01, Liao Chang wrote:
>>>>
>>>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task
On 2024/8/6 4:01, Andrii Nakryiko wrote:
> On Fri, Aug 2, 2024 at 8:05 AM Andrii Nakryiko
> wrote:
>>
>> On Thu, Aug 1, 2024 at 7:41 PM Liao, Chang wrote:
>>>
>>>
>>>
>>> On 2024/8/1 5:42, Andrii Nakryiko wrote:
>>>> From:
nt_enable(struct trace_event_call *call,
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 73a6b041bcce..928c73cde32e 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -478,7 +478,8 @@ static void testmod_unregister_uprobe(void)
> mutex_lock(&testmod_uprobe_mutex);
>
> if (uprobe.uprobe) {
> - uprobe_unregister(uprobe.uprobe, &uprobe.consumer);
> + uprobe_unregister_nosync(uprobe.uprobe, &uprobe.consumer);
> + uprobe_unregister_sync();
> uprobe.offset = 0;
> uprobe.uprobe = NULL;
> }
--
BR
Liao, Chang
On 2024/8/1 22:06, Oleg Nesterov wrote:
> On 08/01, Liao Chang wrote:
>>
>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task
>> *utask, struct pt_regs *regs)
>> int err = 0;
>>
>> uprobe = utask->active_uprobe;
On 2024/8/2 0:49, Andrii Nakryiko wrote:
> On Thu, Aug 1, 2024 at 5:23 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/1 5:42, Andrii Nakryiko wrote:
>>> To avoid unnecessarily taking a (brief) refcount on uprobe during
>>> breakpoint handling in handle_swbp
vaddr, &is_swbp);
> if (!uprobe) {
> if (is_swbp > 0) {
> /* No matching uprobe; signal SIGTRAP. */
> @@ -2223,6 +2239,7 @@ static void handle_swbp(struct pt_regs *regs)
> */
> instruction_pointer_set(regs, bp_vaddr);
> }
> + srcu_read_unlock(&uprobes_srcu, srcu_idx);
> return;
> }
>
> @@ -2258,12 +2275,12 @@ static void handle_swbp(struct pt_regs *regs)
> if (arch_uprobe_skip_sstep(&uprobe->arch, regs))
> goto out;
>
> - if (!pre_ssout(uprobe, regs, bp_vaddr))
> - return;
> + if (pre_ssout(uprobe, regs, bp_vaddr))
> + goto out;
>
Regardless of what pre_ssout() returns, execution always reaches the label 'out', so the
if block is unnecessary.
> - /* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
> out:
> - put_uprobe(uprobe);
> + /* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
> + srcu_read_unlock(&uprobes_srcu, srcu_idx);
> }
>
> /*
--
BR
Liao, Chang
f which are from 4.5M/s to 6.4M/s and 3.3M/s to 5.1M/s individually.
[1] https://lore.kernel.org/all/20240731214256.3588718-1-and...@kernel.org
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
ke
Since clearing a bit in thread_info is an atomic operation, the spinlock
is redundant and can be removed. Reducing lock contention is good for
performance.
Signed-off-by: Liao Chang
---
kernel/events/uprobes.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel
was quite useful to understand CMA allocation
latency.
Signed-off-by: Richard Chang
---
* from v1 -
https://lore.kernel.org/linux-mm/20240226100045.2083962-1-richard...@google.com/
* Move the trace event int field to the end of the longs - rostedt
* Do the calculation only when tracing is
was quite useful to understand CMA allocation
latency.
Signed-off-by: Richard Chang
---
include/trace/events/kmem.h | 39 +
mm/internal.h | 3 ++-
mm/page_alloc.c | 30 +++-
mm/page_isolation.c | 2
'SIE' is masked in this new saved irqflag. After kprobe is
serviced, the CPU 'sstatus' is restored with 'SIE' masked.
This overwritten 'sstatus' causes BUG_ON() in __find_get_block.
This bug is already fixed on arm64 by Jisheng Zhang.
Fixes: c22b0bcb1dd
for 33 different real-time signals.
Also, perhaps, force_sig(SIGFAIL) here, instead of return -1 -- to die with
SIGSEGV.
Thanks,
Chang
'SIE', and reach __find_get_block, which requires that
interrupts be enabled.
Fixing this is trivial: just restore the value of 'sstatus' in pt_regs
with the one backed up at 2) when the instruction being single-stepped causes a
page fault.
Fixes: c22b0bcb1dd02 ("riscv: A
On Mar 26, 2021, at 09:34, Jann Horn wrote:
> On Sun, Feb 21, 2021 at 7:56 PM Chang S. Bae wrote:
>>
>> + if (handle_xfirstuse_event(&current->thread.fpu))
>> + return;
>
> What happens if handle_xfirstuse_event() fails because vmalloc()
> failed
sp - current->sas_ss_sp > current->sas_ss_size))) {
>
> You could've simply done
>
> if (unlikely(entering_altstack && !on_sig_stack(sp)))
>
> here.
But if sigaltstack()’ed with the SS_AUTODISARM flag, both on_sig_stack() and
sas_ss_flags() return 0 [1]. Then, it always segfaults here. v5 had exactly this
issue before [2].
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/sched/signal.h#n576
[2]
https://lore.kernel.org/lkml/CALCETrXuFrHUU-L=HMofTgEDZk9muPnVtK=ejsthqq01xhb...@mail.gmail.com/
Thanks,
Chang
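The point about SS_AUTODISARM can be illustrated with a simplified userspace model of the on_sig_stack() logic referenced in [1]. Everything prefixed with model_ or MODEL_ is a hypothetical stand-in, and the addresses are made up; this is a sketch of the behavior described above, not the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the SS_AUTODISARM flag bit. */
#define MODEL_SS_AUTODISARM (1u << 31)

struct model_altstack {
	uintptr_t base;		/* lowest address of the alternate stack */
	size_t size;
	unsigned int flags;
};

/* Pure range check: is 'sp' inside the alternate stack? */
static int model_on_sig_stack_raw(const struct model_altstack *s, uintptr_t sp)
{
	return sp > s->base && sp - s->base <= s->size;
}

/*
 * With autodisarm set, the answer is always "not on the signal stack",
 * regardless of where 'sp' actually points.
 */
static int model_on_sig_stack(const struct model_altstack *s, uintptr_t sp)
{
	if (s->flags & MODEL_SS_AUTODISARM)
		return 0;
	return model_on_sig_stack_raw(s, sp);
}

/* Probe an 'sp' in the middle of a made-up alternate stack. */
static int model_probe(unsigned int flags)
{
	struct model_altstack s = {
		.base = 0x10000, .size = 0x2000, .flags = flags,
	};

	assert(model_on_sig_stack_raw(&s, s.base + 0x1000));
	return model_on_sig_stack(&s, s.base + 0x1000);
}
```

Under this model, a check of the form "entering the alt stack and !on_sig_stack(sp)" would be true for every SS_AUTODISARM user, which matches the always-segfault behavior described above.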
kernel-induced overflow --
whether the alt stack is large enough for signal delivery itself. The stack may still
not be enough for the signal handler's own use, as the kernel has no knowledge of that.
Thanks,
Chang
just what you mean. :)
FWIW, PATCH21 [1] uses the instruction mask to skip writing zeros on sigframe.
Then, XSAVE will clear the xstate_bv for the xtile data state bit.
[1]
https://lore.kernel.org/lkml/20210221185637.19281-22-chang.seok@intel.com/
Thanks,
Chang
In order to reply in plain text, I send the mail from Gmail.
On Wed, Mar 24, 2021 at 8:16 PM Filipe Manana wrote:
>
> On Wed, Mar 24, 2021 at 11:15 AM bingjingc wrote:
> >
> > From: BingJing Chang
> >
> > In commit d77815461f04 ("btrfs: Avoid trucating page or punching
On Mar 20, 2021, at 15:13, Thomas Gleixner wrote:
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> +
>> +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
>> +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
>> +{
[i]))
> + if (xsave_cpuid_features[i] ||
> !boot_cpu_has(xsave_cpuid_features[i]))
> xfeatures_mask_all &= ~BIT_ULL(i);
>
> Even with the gaps for XTILE the table is smaller, the code is simpler…
True, I will follow your suggestion. Maybe follow-up with a new patch before
posting v5.
Thank you for the suggestion.
Chang
On Mar 20, 2021, at 14:31, Thomas Gleixner wrote:
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>>
>> +static void check_xtile_data_against_struct(int size)
>> +{
>> +u32 max_palid, palid, state_size;
>> +u32 eax, eb
On Mar 20, 2021, at 14:26, Thomas Gleixner wrote:
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> In 64-bit mode, include the AMX state components in
>> XFEATURE_MASK_USER_SUPPORTED.
>>
>> The XFD feature will be used to dynamically expand the xstate per-task
On Mar 16, 2021, at 04:52, Borislav Petkov wrote:
> On Mon, Mar 15, 2021 at 11:52:14PM -0700, Chang S. Bae wrote:
>> @@ -272,7 +275,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs
>> *regs, size_t frame_size,
>> * If we are on the alternate signal stack
The SIGSTKSZ constant may not represent enough stack size in some
architectures as the hardware state size grows.
Use getauxval(AT_MINSIGSTKSZ) to increase the stack size.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: linux-kselft...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
The test measures the kernel's signal delivery with different (enough vs.
insufficient) stack sizes.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kselft...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Revised test messages
Context Switch")
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: H.J. Lu
Cc: Fenghua Yu
Cc: Dave Martin
Cc: Michael Ellerman
Cc: x...@kernel.org
Cc: libc-al...@sourceware.org
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-ke
ds.
While the kernel allows new source code to discover and use a sufficient
alternate signal stack size, this check is still necessary to protect
binaries with insufficient alternate signal stack size from data
corruption.
Suggested-by: Jann Horn
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
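The kind of check described above can be modeled as follows, assuming a downward-growing stack and a 16-byte frame alignment. The function name and the alignment are assumptions for illustration, and the addresses are arbitrary; the actual kernel check is structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Would a signal frame of 'frame_size' bytes, placed at the top of the
 * alternate stack [ss_sp, ss_sp + ss_size) and rounded down to a
 * 16-byte boundary, stay inside that stack?
 */
static bool sigframe_fits(uintptr_t ss_sp, size_t ss_size, size_t frame_size)
{
	uintptr_t top = ss_sp + ss_size;
	uintptr_t sp;

	assert(top >= ss_sp);		/* no address wraparound */
	if (frame_size > ss_size)
		return false;		/* would write below the stack */

	sp = (top - frame_size) & ~(uintptr_t)15;
	return sp >= ss_sp;
}
```

When the frame does not fit, writing it anyway would corrupt whatever lies below the alternate stack, which is the data corruption the changelog refers to.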
, and helper functions for the
calculation to be used in a new user interface. Set max_frame_size to a
system-wide worst-case value, instead of storing multiple app-specific
values.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Acked-by: H.J. Lu
Cc: x...@kernel.org
Cc: linux-kernel
Define the AT_MINSIGSTKSZ in generic Linux. It is already used as generic
ABI in glibc's generic elf.h, and this define will prevent future namespace
conflicts. In particular, x86 is also using this generic definition.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: Carlos O'
32757-1-chang.seok@intel.com/
Chang S. Bae (6):
uapi: Define the aux vector AT_MINSIGSTKSZ
x86/signal: Introduce helpers to get the maximum signal frame size
x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
x86/s
sysctl_sched_cfs_bw_burst_enabled is introduced as a
switch for burst. It is enabled by default.
Co-developed-by: Shanpei Chen
Signed-off-by: Shanpei Chen
Signed-off-by: Huaixin Chang
---
include/linux/sched/sysctl.h | 1 +
kernel/sched/core.c | 8 +++---
kernel/sched/fair.c | 58
into cpu.stat file:
nr_burst: number of periods bandwidth burst occurs
burst_time: cumulative wall-time that any cpus has
used above quota in respective periods
Co-developed-by: Shanpei Chen
Signed-off-by: Shanpei Chen
Signed-off-by: Huaixin Chang
---
kernel/sched/core.c | 14
Basic description of usage and effect for CFS Bandwidth Control Burst.
Co-developed-by: Shanpei Chen
Signed-off-by: Shanpei Chen
Signed-off-by: Huaixin Chang
---
Documentation/admin-guide/cgroup-v2.rst | 16 +
Documentation/scheduler/sched-bwc.rst | 64
a group can consume in
a given period is "buffer" which is equivalent to "quota" + "burst" in
case that this group has done enough accumulation.
Co-developed-by: Shanpei Chen
Signed-off-by: Shanpei Chen
Signed-off-by: Huaixin Chang
---
kernel/sched/core.c | 97
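The "buffer" = "quota" + "burst" relationship described above can be sketched as a simple refill model: unused runtime carries over between periods, but the total is capped at quota + burst. The struct and function names are hypothetical and the units are arbitrary; the scheduler's real accounting has more moving parts.

```c
#include <assert.h>
#include <stdint.h>

struct model_cfs_bw {
	uint64_t quota;		/* runtime added every period */
	uint64_t burst;		/* extra runtime that may accumulate */
	uint64_t runtime;	/* runtime currently available */
};

/* Period boundary: add quota, but never exceed quota + burst. */
static void model_refill(struct model_cfs_bw *b)
{
	uint64_t cap = b->quota + b->burst;

	b->runtime += b->quota;
	if (b->runtime > cap)
		b->runtime = cap;
}

/* Consume up to 'want'; returns how much was actually granted. */
static uint64_t model_consume(struct model_cfs_bw *b, uint64_t want)
{
	uint64_t grant = want < b->runtime ? want : b->runtime;

	b->runtime -= grant;
	return grant;
}

/* How much can the group use in period 2 after using 'used_first'? */
static uint64_t model_second_period_grant(uint64_t quota, uint64_t burst,
					  uint64_t used_first)
{
	struct model_cfs_bw b = { .quota = quota, .burst = burst };

	assert(used_first <= quota);
	model_refill(&b);
	model_consume(&b, used_first);
	model_refill(&b);
	return model_consume(&b, UINT64_MAX);
}
```

With quota 100 and burst 50, a group that used only 40 in the first period can use 150 in the second; a group that used its full quota gets just 100.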
e present more latency statistics and handle overflow while
accumulating.
Huaixin Chang (4):
sched/fair: Introduce primitives for CFS bandwidth burst
sched/fair: Make CFS bandwidth controller burstable
sched/fair: Add cfs bandwidth burst statistics
sched/fair: Add document for burstable
e not
missing this check.
Thanks,
Chang
On Mar 5, 2021, at 02:43, Borislav Petkov wrote:
> On Sat, Feb 27, 2021 at 08:59:08AM -0800, Chang S. Bae wrote:
>> Historically, signal.h defines MINSIGSTKSZ (2KB) and SIGSTKSZ (8KB), for
>> use by all architectures with sigaltstack(2). Over time, the hardware state
>>
On Mar 1, 2021, at 11:09, Borislav Petkov wrote:
> On Sat, Feb 27, 2021 at 08:59:06AM -0800, Chang S. Bae wrote:
>>
>> diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
>> index abe5f2b6581b..15be98c75174 100644
>> --- a/include/uapi/linux/aux
The test measures the kernel's signal delivery with different (enough vs.
insufficient) stack sizes.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kselft...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Revised test messages
072-1-chang.seok@intel.com/
[10]:
https://lore.kernel.org/lkml/20210203172242.29644-1-chang.seok@intel.com/
Chang S. Bae (6):
uapi: Define the aux vector AT_MINSIGSTKSZ
x86/signal: Introduce helpers to get the maximum signal frame size
x86/elf: Support a new ELF aux vector AT_MINSIG
ds.
While the kernel allows new source code to discover and use a sufficient
alternate signal stack size, this check is still necessary to protect
binaries with insufficient alternate signal stack size from data
corruption.
Suggested-by: Jann Horn
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
The SIGSTKSZ constant may not represent enough stack size in some
architectures as the hardware state size grows.
Use getauxval(AT_MINSIGSTKSZ) to increase the stack size.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: linux-kselft...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Context Switch")
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: H.J. Lu
Cc: Fenghua Yu
Cc: Dave Martin
Cc: Michael Ellerman
Cc: x...@kernel.org
Cc: libc-al...@sourceware.org
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-ke
, and helper functions for the
calculation to be used in a new user interface. Set max_frame_size to a
system-wide worst-case value, instead of storing multiple app-specific
values.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Acked-by: H.J. Lu
Cc: x...@kernel.org
Cc: linux-kernel
Define the AT_MINSIGSTKSZ in generic Linux. It is already used as generic
ABI in glibc's generic elf.h, and this define will prevent future namespace
conflicts. In particular, x86 is also using this generic definition.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: Carlos O'
On Fri, Feb 26, 2021 at 1:17 PM Christoph Hellwig wrote:
>
> On Fri, Feb 26, 2021 at 12:17:50PM +0800, Claire Chang wrote:
> > Do you think I should fix this and rebase on the latest linux-next
> > now? I wonder if there are more factor and clean up coming and I
> >
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index fd9c1bd183ac..8b77fd64199e 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -836,6 +836,40 @@ late_initcall(swiotlb_create_default_debugfs);
> #endif
>
> #ifdef CONFIG_DMA_RESTRICTED_POOL
> +struct page *dev_s
' prefix and an extra
 * '\0' for termination.
*/
#define MAX_XSTATE_MASK_CHARS 24
/**
* fpu__init_parse_early_param() - parse the xstate kernel parameters
*
* Parse them early because fpu__init_system() is executed before
* parse_early_param().
*/
static void __init fpu__init_parse_early_param(void)
Thanks,
Chang
ating-point (BF16) elements.
Here we add AMX to the kernel/user ABI, by enumerating the capability.
E.g., /proc/cpuinfo: amx_tile, amx_bf16, amx_int8
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
arch/x86/include/asm/cpufeatures.
is feature).
Rename XFEATURE_MASK_USER_SUPPORTED to XFEATURE_MASK_USER_ENABLED to be
aligned with the new parameters.
While this cmdline is currently enabled only for AMX, it is intended to be
easily enabled to be useful for future XSAVE-enabled features.
Signed-off-by: Chang S. Bae
Reviewed-
in the signal handler that the signal frame
excludes AMX data when the signaled thread has initialized AMX state.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-kselft...@vger.kernel.org
---
Changes from v3:
* Removed 'no fu
be gaps in the
XCR0 feature bit numbers.
No functional change.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v1:
* Rebased on the upstream kernel (5.10)
---
arch/x86/kernel/fpu/xstate.c | 41
d has initialized its AMX state.
Collect the test cases of validating those operations together, as they
share some common setup for the AMX state.
These test cases do not depend on AMX compiler support, as they employ
userspace-XSAVE directly to access AMX state.
Signed-off-by: Chang S. Bae
Review
. Add a new field to represent the embedded
buffer.
Every child process will set the pointer on its creation. And the initial
task sets it before dealing with soft FPU.
No functional change.
Suggested-by: Borislav Petkov
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.
ptrace() may update xstate data before the target task has taken an XFD
fault and expanded the xstate buffer. Detect this case and allocate a
sufficient buffer to support the request. Also, disable the (now
unnecessary) associated first-use fault.
Signed-off-by: Chang S. Bae
Reviewed-by: Len
helpers to find a component's offset accordingly.
When copying an initial value, explicitly check the init_fpstate coverage.
If not found, reset the memory in the destination. Otherwise, copy values
from init_fpstate.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
C
abled()
xfirstuse_not_detected()
The #NM handler induces the xstate buffer expansion to save the first-used
states.
The XFD feature is enabled only for the compacted format. If the kernel
uses the standard format, the buffer always has to be large enough for all the
states.
Signed-off-by: Chang S. Bae
Reviewed-by
embedded
buffer size by excluding the dynamic user states from the maximum size.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Updated the changelog. (Borislav Petkov)
* Updated the code comment. (Borislav Petkov
In 64-bit mode, include the AMX state components in
XFEATURE_MASK_USER_SUPPORTED.
The XFD feature will be used to dynamically expand the xstate per-task
buffer on the first use.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
arch
r the address finder.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Added the function description in the kernel-doc style. (Borislav Petkov)
* Removed 'no functional change' in the changelog. (Borislav Petkov
user buffer size.
No functional change. Those sizes have no difference, as the buffer is not
dynamic yet.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
Changes from v3:
* Added as a new patch to add the
witch. The states are named as 'dynamic'
supervisor states. Some defines and helpers are not named with dynamic
supervisor states, so rename them.
No functional change.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes fr
Extend copy_xregs_to_kernel() to receive a mask argument of which states to
save, in preparation for dynamic user state handling.
Update KVM to set a valid fpu->state_mask, so it can continue to share with
the core code.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
configure XCOMP_BV for the compacted format.
No functional change.
Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
Changes from v3:
* Updated the changelog. (Borislav Petkov)
* Updated the function comment to use
. The first implementation supports
8KB.
Check the XTILEDATA state size dynamically. The feature introduces the new
tile register, TMM. Define one register struct only and read the number of
registers from CPUID. Cross-check the overall size with CPUID again.
Signed-off-by: Chang S. Bae
Reviewed-by
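The cross-check described above boils down to comparing two independently obtained numbers: the register count times the per-register size, and the overall state size. A sketch, where all three inputs are stand-ins for values that would be read from CPUID and the function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compare the size implied by the tile register layout with the state
 * size reported separately. All three arguments are assumed to come
 * from CPUID in the real code; here they are plain parameters.
 */
static bool xtile_size_matches(unsigned int num_regs,
			       unsigned int bytes_per_reg,
			       unsigned int reported_state_size)
{
	/* Widen before multiplying so the product cannot overflow. */
	uint64_t expected = (uint64_t)num_regs * bytes_per_reg;

	assert(num_regs > 0 && bytes_per_reg > 0);
	return expected == reported_state_size;
}
```

For the 8KB first implementation mentioned above, eight 1KB registers would pass the check, while any mismatch between the two sources would be flagged.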