From: Tony Luck
Errata list is included in this document:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/6th-gen-x-series-spec-update.pdf
with more details in:
https://www.intel.com/content/www/us/en/processors/xeon/scalable/xeon-scalable-spec-update.html
But
From: Tony Luck
No functional change, but lay the groundwork for other per-model
quirks.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt.c | 52 ++---
1 file changed, 28 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_rdt.c
On Sat, Sep 16, 2017 at 12:53:42PM +0900, Sergey Senozhatsky wrote:
> Hello
>
> RFC
>
> On some arches C function pointers are indirect and point to
> a function descriptor, which contains the actual pointer to the code.
> This mostly doesn't matter, except for cases when people
From: Tony Luck
About the only tricky case is trying to move a task into a monitor
group that is a subdirectory of a different control group. But cover
the simple cases too.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 15 ---
1 file changed, 12 insertion
From: Tony Luck
Can't add a cpu to a monitor group unless it belongs to the parent
group. Can't delete cpus from the default group.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 17 ++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/arch/x86/
From: Tony Luck
Chatting online with Boris to diagnose why his test cases for RDT
weren't working, we came up with either a good idea (in which case
I credit Boris) or a dumb one (in which case this is all my fault).
The basic problem is that there aren't many good error codes for
a file system
From: Tony Luck
Mostly this is about running out of RMIDs or CLOSIDs. Other
errors are various internal errors.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 28 ++--
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kerne
From: Tony Luck
Commands are given to the resctrl file system by making/removing
directories, or by writing to files. When something goes wrong
the user is generally left wondering why they got:
bash: echo: write error: Invalid argument
Add a new file "last_cmd_status" to the "info" di
From: Tony Luck
Save helpful descriptions of what went wrong when writing a
schemata file.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 61 +++--
1 file changed, 50 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_rdt_
From: Tony Luck
New file in the "info" directory helps diagnose what went wrong
when using the /sys/fs/resctrl file system
Signed-off-by: Tony Luck
---
Oops ... forgot the Documentation ... here it is.
Documentation/x86/intel_rdt_ui.txt | 11 +++
1 file changed, 11 insertions(+)
diff
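A user-space sketch of how a tool might pick up the diagnostic after a failed resctrl command. It assumes the status file is /sys/fs/resctrl/info/last_cmd_status (the "info" directory under the mount point named above); the helper name read_status_file() is hypothetical and nothing is assumed about the wording of the messages the kernel puts there.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a status file into buf and strip the trailing
 * newline. Returns 0 on success, -1 if the file cannot be opened or is
 * empty. Name and calling convention are illustrative only.
 */
static int read_status_file(const char *path, char *buf, size_t len)
{
	FILE *fp;

	if (len == 0)
		return -1;
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (!fgets(buf, (int)len, fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

After a write to a schemata file fails with "Invalid argument", a caller would pass "/sys/fs/resctrl/info/last_cmd_status" as the path and print whatever comes back.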
On Fri, Sep 08, 2017 at 03:18:30PM +0900, Sergey Senozhatsky wrote:
> if the addr is not in kernel .text, then try dereferencing it and check
> if the dereferenced addr is in kernel .text.
If it really is a function pointer, then we know that it is safe
to dereference. But if it isn't, then maybe
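The check Sergey describes can be sketched in plain C. This is only an illustration: struct func_desc, in_text() and resolve_code_addr() are hypothetical names, the descriptor layout (entry address, then a global pointer) is an assumption, and the text range is passed in as parameters rather than taken from kernel symbols. The dereference in the middle is exactly the step that is only safe when the address really is a valid pointer.

```c
#include <stdint.h>

/* Assumed layout for illustration: code entry address, then global pointer. */
struct func_desc {
	uintptr_t entry;
	uintptr_t gp;
};

/* True if addr lies inside [start, end). */
static int in_text(uintptr_t addr, uintptr_t start, uintptr_t end)
{
	return addr >= start && addr < end;
}

/*
 * If addr is already in the text range, return it unchanged. Otherwise
 * treat it as a descriptor, read the entry address, and return that if it
 * is in the text range. Fall back to the original addr when neither holds.
 */
static uintptr_t resolve_code_addr(uintptr_t addr, uintptr_t text_start,
				   uintptr_t text_end)
{
	const struct func_desc *desc;

	if (in_text(addr, text_start, text_end))
		return addr;

	/* Unsafe unless addr is known to point at readable memory. */
	desc = (const struct func_desc *)addr;
	if (in_text(desc->entry, text_start, text_end))
		return desc->entry;

	return addr;
}
```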
>> if (not_a_function_descriptor(ptr))
>> return ptr;
>
> I'm not sure if it's possible on ia64/ppc64/parisc64
> to reliably detect if it's a function descriptor or not.
Agreed. I don't know how to write this test (without changing the compiler to
put the pointers in a separate s
From: Tony Luck
The ACPI sysfs interface provides a way to read each ACPI table from
userspace via entries in /sys/firmware/acpi/tables/
The BERT table simply provides the size and address of the error
record in BIOS reserved memory and users may want access to this
record.
In an earlier age we
On Fri, Jun 23, 2017 at 10:58:24AM +0200, Borislav Petkov wrote:
> On Fri, Jun 23, 2017 at 09:48:55AM +0100, Colin King wrote:
> > From: Colin Ian King
> > -int sbi_send(int port, int off, int op, u32 *data)
> > +static int sbi_send(int port, int off, int op, u32 *data)
>
> Tony, were those suppo
On Thu, Jun 22, 2017 at 10:07:18PM -0700, Dan Williams wrote:
> On Wed, Jun 21, 2017 at 1:30 PM, Luck, Tony wrote:
> >> Persistent memory does have unpoisoning and would require this inverse
> >> operation - see drivers/nvdimm/pmem.c pmem_clear_poison() and core.c
>
From: Tony Luck
Thomas Gleixner is encouraging us to extend the /sys/fs/resctrl file system
to include monitoring data (LLC occupancy, memory bandwidth) from the
(weird) counters that come as part of "Resource Director Technology".
See Intel Software Developer Manual volume 3, section 17.17.1.
O
On Mon, May 15, 2017 at 12:27:03PM -0700, Luck, Tony wrote:
> From: Tony Luck
>
> Thomas Gleixner is encouraging us to extend the /sys/fs/resctrl file system
> to include monitoring data (LLC occupancy, memory bandwidth) from the
> (weird) counters that come as part of "
On Mon, May 15, 2017 at 09:45:58PM -0300, Arnaldo Carvalho de Melo wrote:
> I haven't been following the resctrl fs discussion
> to understand why those values couldn't be read via
> sys_perf_event_open(), so can't comment on that, but the implementation
> on the tools/ classes
On Mon, Jun 12, 2017 at 11:54:06AM -0500, Yazen Ghannam wrote:
> - severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
> -
> - if (severity == MCE_DEFERRED_SEVERITY &&
> mce_is_memory_error(&m))
> - if (m.status & MCI_STATUS_ADDRV)
> -
From: Sergei Trofimovich
Starting with gcc-5.4, gcc generates MLX
instructions in more cases to refer to local
symbols:
https://gcc.gnu.org/PR60465
That caused the ia64 module loader to choke
on such instructions:
fuse: invalid slot number 1 for IMM64
Linux kernel used to handle only case wher
> for (;;) {
> entry = mce_log_get_idx_check(mcelog.next);
Can't this get even simpler? Do we need the loop? The mutex
will now protect us while we check to see if there is a slot
to stash this new entry. Also just say:
entry = mcelog.next;
> for
>DEFINE(IA64_UPID_SHIFT, 5);
>
> Grepping for IA64_UPID_SHIFT leads us to some assembly
> code implementing fsys_getpid (why is that in assembly?!):
The fast system call path has a whole host of serious restrictions on what it
can touch. See Documentation/ia64/fsys.txt. Why is getpid() a
> here's a second attempt at a more rigorous simplification: RCU stuff is
> gone and only a single loop scans through the elements.
The dev_mce_log() changes look good now.
You can apply the axe to more bits of mce_chrdev_read() though. Like that
while (!m->finished) {
we hold the mutex
> speaking of upstream, any objections if this patch set will go through
> the printk tree, in one piece?
Seems to be a better idea than trying to coordinate pulls from three
separate "arch/" trees. Fine with me.
-Tony
On Mon, Sep 25, 2017 at 04:04:07PM +0200, Thomas Gleixner wrote:
> On Mon, 18 Sep 2017, Luck, Tony wrote:
> > @@ -208,14 +241,19 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
> > char *tok, *resname;
> > int closid, ret = 0;
> seq_buf_vprintf() is your friend. It takes va_list as last argument.
Reinette spotted that a couple of minutes ahead of you.
/me looks for paper bag to put over my head.
> While at it can you please make it a proper function? No point for inlining
> that.
There was a small point ... I need th
From: Tony Luck
About the only tricky case is trying to move a task into a monitor
group that is a subdirectory of a different control group. But cover
the simple cases too.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 +---
arch/x86/kernel/cpu/intel_rdt_rdtgro
From: Tony Luck
Can't add a cpu to a monitor group unless it belongs to the parent
group. Can't delete cpus from the default group.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/ke
From: Tony Luck
New file in the "info" directory helps diagnose what went wrong
when using the /sys/fs/resctrl file system
Signed-off-by: Tony Luck
---
Documentation/x86/intel_rdt_ui.txt | 11 +++
1 file changed, 11 insertions(+)
diff --git a/Documentation/x86/intel_rdt_ui.txt
b/Docu
From: Tony Luck
Save helpful descriptions of what went wrong when writing a
schemata file.
Signed-off-by: Tony Luck
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 49 ++---
1 file changed, 38 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_rdt_
Tested patch series on ia64 successfully.
Tested-by: Tony Luck
After this goes upstream, you should submit a patch to get rid of
all uses of %pF (70 instances in 35 files) and %pf (63 in 34).
Perhaps break the patch by top-level directory (e.g. get all the %pF
and %pf in the 17 files under drive
On Wed, Oct 11, 2017 at 06:19:39PM -0400, Gargi Sharma wrote:
> pidhash is no longer required as all the information
> can be looked up from idr tree. nr_hashed represented
> the number of pids that had been hashed. Since, nr_hashed and
> PIDNS_HASH_ADDING are no longer relevant, it has been rename
> What does? That does sound broken. How can a cache domain sanely span
> memory controllers?
Think "cluster on die" with cores on the socket split into two clusters, but
still sharing LLC.
-Tony
From: Fenghua Yu
CPUID.(EAX=0x10, ECX=res#):EBX[31:0] reports a bit mask for a resource.
Each set bit within the length of the CBM indicates the corresponding
unit of the resource allocation may be used by other entities in the
platform (e.g. an integrated graphics engine or hardware units outsid
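A small sketch of how the reported mask might be interpreted. It does not execute CPUID; the EBX value and the CBM length are passed in as parameters, and count_shared_units() is a hypothetical helper, not a kernel function. Bits at or above the CBM length are ignored, following the "within the length of the CBM" wording above.

```c
#include <stdint.h>

/*
 * Count how many allocation units are flagged as shareable with other
 * entities. shareable_mask is assumed to hold the EBX value reported for
 * the resource; cbm_len is the number of valid bits in a capacity bit mask.
 */
static unsigned int count_shared_units(uint32_t shareable_mask,
				       unsigned int cbm_len)
{
	uint32_t valid = cbm_len >= 32 ? UINT32_MAX
				       : ((UINT32_C(1) << cbm_len) - 1);
	uint32_t bits = shareable_mask & valid;
	unsigned int n = 0;

	/* Clear the lowest set bit until none remain. */
	while (bits) {
		bits &= bits - 1;
		n++;
	}
	return n;
}
```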
> Since this piece of the ACPI pile is doing RAS, it is perhaps prudent if
> we at least paid attention to it and the direction it takes. So add Tony
> and me as reviewers.
Acked-by: Tony Luck
On Mon, Jul 31, 2017 at 10:15:27AM -0600, Baicar, Tyler wrote:
> I think the better thing to do in this case is still send the ack. If
> ghes_read_estatus() fails, then
> either we are unable to read the estatus or the estatus is empty/invalid.
Right now we silently handle that failure of ghes_rea
From: Tony Luck
The ACPI Boot Error Record Table provides a method for platform
firmware to give information to the operating system about errors
that occurred prior to boot (of particular interest are problems
that caused the previous OS instance to crash).
The BERT table simply provides the siz
On Tue, Aug 15, 2017 at 08:35:51AM -0700, Kani, Toshimitsu wrote:
> User apps like ras-mc-ctl work as expected for a given (not-so-great)
> DIMM info from SMBIOS as well. I do not see a problem from user
> perspective, either.
Won't the user see all their DIMMs reported for each memory controlle
On Tue, Aug 15, 2017 at 11:22:06AM +0100, Punit Agrawal wrote:
> There is already a bert driver which prints the error record. Would it
> make sense to integrate the character device there instead of creating a
> new driver?
Like this? The source code is smaller. But it doesn't offer the option t
From: Tony Luck
Speculative processor accesses may reference any memory that has a
valid page table entry. While a speculative access won't generate
a machine check, it will log the error in a machine check bank. That
could cause escalation of a subsequent error since the overflow bit
will be th
From: Tony Luck
-f shows the absolute value from the file each time. -F shows the delta.
---
This is a proof-of-concept patch to show how "perf" might be extended to
use the new RDT file system monitoring files. "-f" is useful for llc_occupancy,
"-F" for the MBM files.
tools/perf/builtin-c2c.c
> > > Hmm... I'm not seeing any implementation that would allow setting
> > > between firmware first, hardware first or "auto", as we've discussed.
> >
> > This is all coming up. As the 0/3 message said, these 3 patches are the
> > bare minimum of reorganizing stuff only and should serve as a base
> Later, we could extend that same behavior to Intel for the common
> errors, at least, so that we can dump at least *some* string explaining
> what the error is.
s/common errors/architectural errors/
That means we don't need to keep updating for every Xeon that documents
some MCi_STATUS.MSCOD bi
>> Provided Tony agrees though... I'd venture a guess and say that he
>> doesn't have a choice, woahahhahaha...
>>
>> :-)))
>
> Well, I guess send this officially with a CC:Tony and see what he says. :-)
That's definitely part of my day job ... so yes, please add me as a reviewer.
-Tony
On Fri, Nov 10, 2017 at 08:48:29AM +0900, Sergey Senozhatsky wrote:
> -Examples::
> -
> - printk("Going to call: %pF\n", gettimeofday);
> - printk("Going to call: %pF\n", p->func);
> - printk("%s: called from %pS\n", __func__, (void *)_RET_IP_);
> - printk("%s: called from %pS\n", _
On Fri, Nov 10, 2017 at 08:48:24AM +0900, Sergey Senozhatsky wrote:
> All Ack-s/Tested-by-s were dropped, since the patch set has been
> reworked. I'm kindly asking arch-s maintainers and developers to test it
> once again. Sorry for any inconveniences and thanks for your help in
> advance.
> Hi Tony, Fenghua,
>
> Can you take a look at this patch and see if it breaks IA64?
David,
Which patch is "this patch"? I don't see any link or attachment.
-Tony
On Fri, Dec 08, 2017 at 07:04:48PM +, David Howells wrote:
> Luck, Tony wrote:
>
> > Which patch is "this patch". I don't see any link or attachment.
>
> Sorry, I cc'd a patch which I sent to the ia64 list. The 5th patch on this
> branch:
>
>
> Excellent, thanks! Can I put you down as a Tested-by?
Yes
Tested-by: Tony Luck
From: Tony Luck
If you edit a kernfs backed file with vi(1), you see an ugly error
message when you write the file because vi tries to fsync(2) the
file after writing, which fails.
We have noop_fsync() for this, use it.
Signed-off-by: Tony Luck
---
fs/kernfs/file.c | 1 +
1 file changed, 1 in
> > I wonder whether this is the proper abstraction level. We might as well do
> > the following:
> >
> > rdtresources[] = {
> > {
> > .name = "L3",
> > },
> > {
> > .name = "L3Data",
> > },
> > {
> > .name = "L3Code",
> > },
> >
> > and enable eith
On Mon, Oct 17, 2016 at 09:43:41AM -0700, Yu, Fenghua wrote:
> > > > I wonder whether this is the proper abstraction level. We might as
> > > > well do the following:
> > > >
> > > > rdtresources[] = {
> > > > {
> > > > .name = "L3",
> > > > },
> > > > {
> > > > .na
On Mon, Oct 17, 2016 at 11:14:55PM +0200, Thomas Gleixner wrote:
> > + /* Compute rdt_max_closid across all resources */
> > + rdt_max_closid = 0;
> > + for_each_rdt_resource(r)
> > + rdt_max_closid = max(rdt_max_closid, r->num_closid);
>
> Oh no! This needs to be min().
>
> Assum
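One reading of the objection is that a CLOSID is only usable everywhere if every resource supports it, so the system-wide limit is the smallest per-resource count. A minimal sketch of that computation follows; struct rdt_resource_sketch and usable_closids() are hypothetical stand-ins, not the structures from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a resource descriptor. */
struct rdt_resource_sketch {
	const char *name;
	int num_closid;
};

/*
 * Return the number of CLOSIDs supported by every resource in the array,
 * i.e. the minimum of the per-resource counts. Returns 0 for an empty array.
 */
static int usable_closids(const struct rdt_resource_sketch *res, size_t n)
{
	size_t i;
	int lowest;

	if (n == 0)
		return 0;

	lowest = res[0].num_closid;
	for (i = 1; i < n; i++) {
		if (res[i].num_closid < lowest)
			lowest = res[i].num_closid;
	}
	return lowest;
}
```

With an L3 resource reporting 16 CLOSIDs and an L2 resource reporting 8, this yields 8, whereas taking the maximum would hand out CLOSIDs that L2 cannot represent.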
On Tue, Oct 18, 2016 at 12:01:01AM +0200, Thomas Gleixner wrote:
> > + /* Don't allow if there are processes in this group */
> > + read_lock(&tasklist_lock);
> > + for_each_process(p) {
> > + if (p->closid == rdtgrp->closid) {
> > + read_unlock(&tasklist_lock);
>
> So how are we going to deal with that in the schematas? Assume the L3=16
> and L2=8 case(no CDP). So effectively any write of L2 to CLOSID=0 will
> affect the setting of L2 in CLOSID=8.
>
> Will the code tell the user that L2 cannot be set for CLOSID >= 8?
>
> Will it print the setting of CLOSID
> How so? CLOSID 9 is using CLOSID 1 L2 settings. Are we just keeping the L2
> setting of CLOSID 1 around and do not reset it to default?
No. When CLOSID 9 arrives at the L2 h/w, it doesn't just take the bits it
likes and discard the high bits to map to L2_CBM[1]. It just turns
into the maxim
On Tue, Oct 18, 2016 at 01:20:36AM +0200, Thomas Gleixner wrote:
> On Mon, 17 Oct 2016, Fenghua Yu wrote:
> > part0: L3:0=1;1=1 closid0/cbm=1 on cache0 and closid0/cbm=1 on cache1
> > (closid 15 on cache0 combined with 16 different closids on cache1)
> > ...
> > part254: L3:0=;1=7fff cl
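For readers unfamiliar with the notation, here is a sketch of parsing a line such as "L3:0=1;1=7fff" into (cache id, bit mask) pairs. This is not the kernel's parser. The format is inferred only from the examples quoted above (resource name, a colon, then semicolon-separated "id=hexmask" entries), and struct dom_cbm and parse_schemata() are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* One "id=mask" entry from a schemata line (illustrative type). */
struct dom_cbm {
	unsigned long id;
	unsigned long cbm;
};

/*
 * Split "NAME:id=hex;id=hex" into a resource name and up to max entries.
 * Returns the number of entries parsed, or -1 on malformed input.
 */
static int parse_schemata(const char *line, char *resname, size_t namelen,
			  struct dom_cbm *out, int max)
{
	const char *p = strchr(line, ':');
	char *end;
	size_t len;
	int n = 0;

	if (!p)
		return -1;
	len = (size_t)(p - line);
	if (len == 0 || len >= namelen)
		return -1;
	memcpy(resname, line, len);
	resname[len] = '\0';
	p++;

	while (*p && *p != '\n') {
		if (n >= max)
			return -1;
		out[n].id = strtoul(p, &end, 10);	/* cache id, decimal */
		if (end == p || *end != '=')
			return -1;
		p = end + 1;
		out[n].cbm = strtoul(p, &end, 16);	/* bit mask, hex */
		if (end == p)
			return -1;			/* empty mask */
		n++;
		p = end;
		if (*p == ';')
			p++;
		else if (*p && *p != '\n')
			return -1;
	}
	return n;
}
```

An entry with an empty mask is rejected rather than treated as zero.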
> I don't think this is convenient, but it's ok. Now if we create a new thread
> between 1 and 2, the new thread is in group1. The new thread pid isn't in the
> pid list we found in 1, so after 2, the new thread still is in group 1. Truly
> sysadmin can repeat the step 1 & 2 and move the new threa
> Hmm, I don't know how applications are going to use the interface. Nobody
> knows it right now. But we do have some candidate workloads which want to
> configure the cache partition at runtime, so it's not just boot time stuff.
> I'm wondering why we have such limitation. The framework is th
dec);
+ if (likely(old == c))
+ break;
+ c = old;
+ }
+ return dec;
+}
I was about to say "add a cpu_relax()" in the bottom of that loop. But none of
the other atomic ops that spin on a cmpxchg do that ... so:
Acked-by: Tony Luck
-Tony
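For reference, a user-space analogue of this kind of cmpxchg retry loop, written with C11 atomics instead of the kernel's primitives. The quoted fragment is truncated, so this assumes it is a decrement-if-positive style helper; dec_if_positive() is an illustrative name. There is no portable equivalent of cpu_relax(), so the sketch spins without one, the same as the loop being acked.

```c
#include <stdatomic.h>

/*
 * Decrement *v only if the result would stay non-negative. Returns the old
 * value minus one, whether or not the store happened, so a negative return
 * means the counter was left untouched.
 */
static int dec_if_positive(atomic_int *v)
{
	int c = atomic_load(v);
	int dec;

	for (;;) {
		dec = c - 1;
		if (dec < 0)
			break;
		/*
		 * On failure the compare-exchange reloads c with the
		 * current value and the loop retries with it.
		 */
		if (atomic_compare_exchange_weak(v, &c, dec))
			break;
	}
	return dec;
}
```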
> What branch is this based on? I can't find the relevant code/commits
tip tree. branch x86/apic
-Tony
> How to do this? Should I change the line to
>
> + if (static_branch_unlikely(&rdt_enable_key))?
See Documentation/static-keys.txt, there are some examples.
also "git grep static_branch_unlikely" to see existing users
-Tony
> For UE recovery support, currently we need mce=2 in command line
> and also disable panic_on_oops with sysctl.
Please explain. I've never given mce=2 on command line, and have
had my kernel recover from thousands of (injected) UE memory errors.
-Tony
On Sun, Aug 21, 2016 at 05:02:41PM -0700, Joe Perches wrote:
> Marking arrays as const makes for smaller data.
Joe,
"a few hundred" seems to be an exaggeration.
Before:
$ size drivers/edac/skx_edac.ko
   text    data     bss     dec     hex filename
   8435    1024      24    9483    250b drivers/e
Commit-ID: 385ddeac7ed99cf7dc62d76274d55fbd7cae1b5a
Gitweb: http://git.kernel.org/tip/385ddeac7ed99cf7dc62d76274d55fbd7cae1b5a
Author: Luck, Tony
AuthorDate: Fri, 5 Oct 2012 15:05:34 -0700
Committer: H. Peter Anvin
CommitDate: Fri, 5 Oct 2012 15:59:07 -0700
X86 ACPI: Use #ifdef not
Commit-ID: 2412aa1293a4159c610616305c17efd237c8208d
Gitweb: http://git.kernel.org/tip/2412aa1293a4159c610616305c17efd237c8208d
Author: Luck, Tony
AuthorDate: Tue, 16 Apr 2013 11:35:56 -0700
Committer: Thomas Gleixner
CommitDate: Wed, 17 Apr 2013 10:39:37 +0200
ia64: Make sure
Commit-ID: 0841c04d65937ad2808f59c43cb54a92473c8f0e
Gitweb: http://git.kernel.org/tip/0841c04d65937ad2808f59c43cb54a92473c8f0e
Author: Luck, Tony
AuthorDate: Fri, 1 Nov 2013 13:59:52 -0700
Committer: Ingo Molnar
CommitDate: Sun, 3 Nov 2013 10:40:12 +0100
dmi: Avoid unaligned memory
Commit-ID: 9ebddac7ea2a1f4b4ce3335a78312a58dfaadb4d
Gitweb: http://git.kernel.org/tip/9ebddac7ea2a1f4b4ce3335a78312a58dfaadb4d
Author: Luck, Tony
AuthorDate: Fri, 8 Nov 2013 14:03:33 -0800
Committer: Ingo Molnar
CommitDate: Mon, 11 Nov 2013 10:21:29 +0100
ACPI, x86: Fix extended error
Commit-ID: 985c78d3ff8e9c74450fa2bb08eb55e680d999ca
Gitweb: https://git.kernel.org/tip/985c78d3ff8e9c74450fa2bb08eb55e680d999ca
Author: Luck, Tony
AuthorDate: Fri, 27 Apr 2018 09:37:08 -0700
Committer: Thomas Gleixner
CommitDate: Sun, 6 May 2018 12:46:39 +0200
x86/MCE: Fix stack out