[PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

2016-11-17 Thread Luck, Tony
From: Tony Luck Intel Xeons from Ivy Bridge onwards support a processor identification number. On systems that have it, include it in the machine check record. I'm told that this would be helpful for users that run large data centers with multi-socket servers to keep track of which CPUs are seein

Re: [PATCH v3 01/18] Documentation, ABI: Add a document entry for cache id

2016-10-10 Thread Luck, Tony
On Sat, Oct 08, 2016 at 12:11:08PM -0500, Nilay Vaish wrote: > On 7 October 2016 at 21:45, Fenghua Yu wrote: > > From: Fenghua Yu > > + caches typically exist per core, but there may not be a > > + power of two cores on a socket, so these caches may be > > +

Re: [PATCH v3 05/18] Documentation, x86: Documentation for Intel resource allocation user interface

2016-10-10 Thread Luck, Tony
On Sat, Oct 08, 2016 at 01:33:06PM -0700, Fenghua Yu wrote: > On Sat, Oct 08, 2016 at 12:12:07PM -0500, Nilay Vaish wrote: > > On 7 October 2016 at 21:45, Fenghua Yu wrote: > > > From: Fenghua Yu > > > > > > +L3 details (code and data prioritization disabled) > > > +--

Re: [PATCH v3 07/18] x86/intel_rdt: Add Haswell feature discovery

2016-10-10 Thread Luck, Tony
On Sun, Oct 09, 2016 at 06:28:23PM +0200, Borislav Petkov wrote: > On Sun, Oct 09, 2016 at 10:09:37AM -0700, Fenghua Yu wrote: > > The MSR is not guaranteed on every stepping of the family and model machine > > because some parts may have the MSR fused off. And some bits in the MSR > > may not be i

Re: [PATCH v3 11/18] x86/intel_rdt: Add basic resctrl filesystem support

2016-10-10 Thread Luck, Tony
On Sun, Oct 09, 2016 at 05:31:25PM -0500, Nilay Vaish wrote: > > +static struct dentry *rdt_mount(struct file_system_type *fs_type, > > + int flags, const char *unused_dev_name, > > + void *data) > > +{ > > + struct dentry *dentry; >

RE: [PATCH v3 07/18] x86/intel_rdt: Add Haswell feature discovery

2016-10-11 Thread Luck, Tony
> I wonder what's worse - comparing SKU strings - we know that from the MCE > recovery experience - or poking at maybe nonexistent MSRs? :-) > > I guess the latter is cleaner so let's try it. Vikas got beat up for comparing SKU strings, so the probe method was offered as an alternative. It's defi

Re: [PATCH v3 05/18] Documentation, x86: Documentation for Intel resource allocation user interface

2016-10-11 Thread Luck, Tony
On Tue, Oct 11, 2016 at 12:07:33PM -0500, Nilay Vaish wrote: > On 10 October 2016 at 12:19, Luck, Tony wrote: > > The next resource coming will have values that are simple ranges {0 .. max} > > > > Regarding addition of more resources, I was wondering if one common &

[RFC PATCH 19/18] x86/intel_rdt: Add support for L2 cache allocation

2016-10-07 Thread Luck, Tony
Just in case you think that part 0010 is a bit complex setting up all that infrastructure for just the L3 cache and using for_each_rdt_resource() all over the place to loop over just one thing. Here's the payoff. Untested because I don't have a machine handy that supports L2 ... but this should be

Re: [PATCH v5 12/18] x86/intel_rdt: Add "info" files to resctrl file system

2016-10-26 Thread Luck, Tony
> >> +.mode= 0444, >> +.kf_ops= &rdtgroup_kf_single_ops, >> +.seq_show= rdt_num_closid_show, >> +}, >> +{ >> +.name= "cbm_val", > > cbm_val? Is that a value? No, it's the valid bitmask which you can set. So > cmb_mask or somethi

Re: [PATCH v5 10/18] x86/intel_rdt: Build structures for each resource based on cache topology

2016-10-26 Thread Luck, Tony
Order is visible to users when we print entries in the schemata file, and validate input that they write (we require that they provide all masks in the same order as this list). If we hot remove a socket, it disappears from the list, and from the schemata file. When we put in a replacement it r

Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

2016-11-18 Thread Luck, Tony
On Fri, Nov 18, 2016 at 02:00:22PM +0100, Borislav Petkov wrote: > On Thu, Nov 17, 2016 at 04:35:48PM -0800, Luck, Tony wrote: > > @@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str) > > } > > __setup("mce", mcheck_enable); > > > >

[PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

2016-11-18 Thread Luck, Tony
From: Tony Luck Intel Xeons from Ivy Bridge onwards support a processor identification number set in the factory. To the user this is a handy unique number to identify a particular cpu. Intel can decode this to the fab/production run to track errors. On systems that have it, include it in the mac

Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector

2017-03-22 Thread Luck, Tony
On Thu, Mar 09, 2017 at 11:08:17AM +0100, Borislav Petkov wrote: > +static bool cec_add_mce(struct mce *m) > +{ > + if (!m) > + return false; > + > + if (memory_error(m) && mce_usable_address(m)) > + if (!cec_add_elem(m->addr >> PAGE_SHIFT)) > + r
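
The `m->addr >> PAGE_SHIFT` expression in the quoted patch turns a physical address into a page frame number. A minimal userspace sketch, assuming 4K pages (`PAGE_SHIFT` of 12, as on x86); `addr_to_pfn()` is a hypothetical name used only for illustration and is not part of the patch:

```c
#include <stdint.h>

/* Assumption: 4K pages, so the low 12 bits are the offset within a page. */
#define PAGE_SHIFT 12

/* Hypothetical helper: drop the in-page offset to get the page frame number. */
static uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> PAGE_SHIFT;
}
```

Every address inside the same 4K page maps to the same PFN, which is why a collector keyed on PFNs counts errors per page rather than per byte.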

Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector

2017-03-23 Thread Luck, Tony
On Thu, Mar 23, 2017 at 04:22:28PM +0100, Borislav Petkov wrote: > On Wed, Mar 22, 2017 at 07:03:39PM +0100, Borislav Petkov wrote: > > Lemme try to write a small script exercising exactly that scenario to > > see whether I'm actually not talking crap here :-) > > Ok, here's a snapshot from the CE

Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector

2017-03-23 Thread Luck, Tony
On Thu, Mar 23, 2017 at 06:28:39PM +0100, Borislav Petkov wrote: > Meh, I don't like the idea of keeping an evergrowing list of PFNs we > can't do anything about anyway. Keeping every PFN would be overkill (most of them should be taken offline with no issues). A fixed array of a few of them with

[PATCH] x86/mce: Fix copy/paste error in exception table entries.

2017-03-20 Thread Luck, Tony
From: Tony Luck Back in commit 92b0729c34cab ("x86/mm, x86/mce: Add memcpy_mcsafe()") I made a copy/paste error setting up the exception table entries and ended up with two for label .L_cache_w3 and none for .L_cache_w2. This means that if we take a machine check on: .L_cache_w2: movq 2*8(%rsi)

Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector

2017-03-20 Thread Luck, Tony
On Thu, Mar 09, 2017 at 11:08:17AM +0100, Borislav Petkov wrote: > +config RAS_CEC > + bool "Correctable Errors Collector" > + depends on X86_MCE && MEMORY_FAILURE && DEBUG_FS > + ---help--- > + This is a small cache which collects correctable memory errors per 4K > + page P

Re: [PATCH 00/17] PCI resource mmap cleanup

2017-03-24 Thread Luck, Tony
On Fri, Mar 24, 2017 at 11:40:33AM +, David Woodhouse wrote: > That leaves IA64 as the last holdout, as the selection of vm_page_prot > there is rather complicated: > > prot = phys_mem_access_prot(NULL, vma->vm_pgoff, size, > vma->vm_page_prot); > >

[PATCH] x86/intel_rdt: Implement "update" mode when writing schemata file

2017-03-24 Thread Luck, Tony
From: Tony Luck The schemata file can have multiple lines and it is cumbersome to update from shell scripts. Remove code that requires that the user provide values for every resource (in the right order). If the user provides values for just a few resources, update them and leave the rest uncha
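
The "update" behaviour described above (change only the resources the user named, leave the rest alone) can be sketched in userspace as follows. This is not the kernel implementation: `struct schemata_update` and `apply_update()` are hypothetical names, and each resource is modelled as a single 32-bit mask purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one parsed entry per resource; 'present' is false when
 * the user did not mention that resource in the write. */
struct schemata_update {
	bool present;
	uint32_t mask;
};

/* Apply only the entries the user supplied; untouched resources keep
 * their current mask. */
static void apply_update(uint32_t *current,
			 const struct schemata_update *upd, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (upd[i].present)
			current[i] = upd[i].mask;
	}
}
```

With this shape, a script can write a single resource's value without first reading back and re-sending every other one.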

RE: [PATCH 1/1] x86/cqm: Cqm requirements

2017-03-07 Thread Luck, Tony
> That's all nice and good, but I still have no coherent explanation why > measuring across allocation domains makes sense. Is this in reaction to this one? >> 5) Put multiple threads into a single measurement group If we fix it to say "threads from the same CAT group" does it fix things?

[PATCH v2] mm, page_alloc: Add missing check for memory holes

2017-03-08 Thread Luck, Tony
From: Tony Luck commit 13ad59df67f19788f6c22985b1a33e466eceb643 ("mm, page_alloc: avoid page_to_pfn() when merging buddies") moved the check for memory holes out of page_is_buddy() and had the callers do the check. But this wasn't done correctly in one place which caused ia64 to crash very early

Re: [PATCH 2/5] x86/intel_rdt: Improvements to parsing schemata

2017-03-10 Thread Luck, Tony
On Fri, Mar 10, 2017 at 07:58:51PM +0100, Thomas Gleixner wrote: > Well, we have several options to tackle this: > > 1) Have schemata files for each resource > >schemata_l2, _l3 _mb > > 2) Request a full overwrite every time (all entries required) > >That still does not require ordering

Re: [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-03-06 Thread Luck, Tony
t kernel should > > not do anything except clearing MCG_STATUS. This is useful > > for kdump to let vmcore dumping perform as hard as it can. > > Ok, I went and rewrote the text to make it more succinct, to the point > and correct spelling and formatting. > > Tony, ACK? Yes. Looks good now. Acked-by: Tony Luck -Tony

RE: [PATCH] ia64: efi: use timespec64 for persistent clock

2016-06-17 Thread Luck, Tony
> Aside from this, we get a little closer to removing the > __weak read_persistent_clock() definition, which relies on > converting all architectures to provide read_persistent_clock64 > instead. > > Signed-off-by: Arnd Bergmann Applied. Thanks -Tony

RE: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-02-21 Thread Luck, Tony
> It's from my understanding, I didn't get the explicit description from the > intel SDM on this point. > If a broadcast SRAO comes on real hardware, will MSR_IA32_MCG_STATUS of each > cpu have MCG_STATUS_RIPV bit set? MCG_STATUS is a per-thread MSR and will contain the status appropriate for th

Re: [PATCH v3] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-22 Thread Luck, Tony
On Wed, Feb 22, 2017 at 12:11:14PM +0800, Xunlei Pang wrote: > + /* > + * Cases to bail out to avoid rendezvous process timeout: > + * 1)If this CPU is offline. > + * 2)If crashing_cpu was set, e.g. entering kdump, > + * we need to skip cpus remaining in 1st kernel. > +

RE: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-06 Thread Luck, Tony
Digging through the e-mails from last week to generate a new version of the requirements I looked harder at this: > 12) Whatever fs or syscall is provided instead of perf syscalls, it > should provide total_time_enabled in the way perf does, otherwise is > hard to interpret MBM values. This looks

RE: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-06 Thread Luck, Tony
> 12) Whatever fs or syscall is provided instead of perf syscalls, it > should provide total_time_enabled in the way perf does, otherwise is > hard to interpret MBM values. It seems that it is hard to define what we even mean by memory bandwidth. If you are measuring just one task and you find th

RE: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-06 Thread Luck, Tony
> cgroup mode gives a per-CPU breakdown of event and running time, the > tool aggregates it into running time vs event count. Both per-cpu > breakdown and the aggregate are useful. > > Piggy-backing on perf's cgroup mode would give us all the above for free. Do you have some sample output from a p

Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-07 Thread Luck, Tony
On Tue, Feb 07, 2017 at 12:08:09AM -0800, Stephane Eranian wrote: > Hi, > > I wanted to take a few steps back and look at the overall goals for > cache monitoring. > From the various threads and discussion, my understanding is as follows. > > I think the design must ensure that the following usag

RE: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-02 Thread Luck, Tony
>> 7) Must be able to measure based on existing resctrl CAT group >> 8) Can get measurements for subsets of tasks in a CAT group (to find >> the guys hogging the resources) >> 9) Measure per logical CPU (pick active RMID in same precedence for >> task/cpu as CAT picks CLOSID) > > I

RE: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-02 Thread Luck, Tony
>> Nice to have: >> 1) Readout using "perf(1)" [subset of modes that make sense ... tying >> monitoring >> to resctrl file system will make most command line usage of perf(1) close to >> impossible. > > > We discussed this offline and I still disagree that it is close to > impossible to use

Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-02 Thread Luck, Tony
On Thu, Feb 02, 2017 at 12:22:42PM -0800, David Carrillo-Cisneros wrote: > There is no need to change perf(1) to support > # perf stat -I 1000 -e intel_cqm/llc_occupancy {command} > > the PMU can work with resctrl to provide the support through > perf_event_open, with the advantage that tools oth

Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-03 Thread Luck, Tony
On Thu, Feb 02, 2017 at 06:14:05PM -0800, David Carrillo-Cisneros wrote: > If we tie allocation groups and monitoring groups, we are tying the > meaning of CPUs and we'll have to choose between the CAT meaning or > the perf meaning. > > Let's allow semantics that will allow perf like monitoring to

Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

2017-02-03 Thread Luck, Tony
On Fri, Feb 03, 2017 at 01:08:05PM -0800, David Carrillo-Cisneros wrote: > On Fri, Feb 3, 2017 at 9:52 AM, Luck, Tony wrote: > > On Thu, Feb 02, 2017 at 06:14:05PM -0800, David Carrillo-Cisneros wrote: > >> If we tie allocation groups and monitoring groups, we are tying the >

RE: MAINTAINERS without commits in the last 3 years

2016-08-26 Thread Luck, Tony
>> Maybe mark it Obsolete or Orphan. > > Yap, let's mark it Orphan and remove Jason Uhlenkott's email address. I > wouldn't go as far as removing it just yet especially if it doesn't cost > us anything to have it in there and someone might have a box with the > hardware somewhere... Agreed. I don

RE: [Ksummit-discuss] checkkpatch (in)sanity ?

2016-08-29 Thread Luck, Tony
> 80 columns is simply silly when dealing with either > long identifiers or many levels of indentation. > > One thing that 80 column limit does do is encourage > shorter identifiers and fewer levels of indentation. > > Generally, both of those are good things. I think the main complaint with the l

[PATCH V2 2/4] x86/mce, PCI: Provide quirks to identify Xeon models with machine check recovery

2016-08-30 Thread Luck, Tony
Each Xeon includes a number of capability registers in PCI space that describe some features not enumerated by CPUID. Use these to determine that we are running on a model that can recover from machine checks. Hooks for Ivybridge ... Skylake provided. Signed-off-by: Tony Luck --- V2: Boris & Li

Re: [RFC PATCH 1/4] RAS: Add a Corrected Errors Collector

2016-06-07 Thread Luck, Tony
On Tue, Jun 07, 2016 at 06:52:22PM +0200, Borislav Petkov wrote: > +void mce_log(struct mce *m) > { > unsigned next, entry; > > + if (!in_atomic() && memory_error(m) && mce_usable_address(m)) > + if (!ce_add_elem(m->addr >> PAGE_SHIFT)) > + return; > + >

RE: [PATCH v2 2/3] Documentation, ABI: Add a document entry for cache id

2016-07-08 Thread Luck, Tony
> It means one cache's id is unique in all caches with same cache index number. > For example, in all caches with index3 (i.e. level3), cache id 0 is unique to > identify > a L3 cache. But in caches with index 0 (i.e. Level0), there is also a cache > id 0. > So cache id is unique in one index. Bu

RE: [PATCH v2 2/3] Documentation, ABI: Add a document entry for cache id

2016-07-08 Thread Luck, Tony
> > > "index4" is the L3-unified cache. > > > > Crazy. What was wrong with using 'level' or 'depth'? > > It is all there: > > $ grep . /sys/devices/system/cpu/cpu0/cache/index?/level > /sys/devices/system/cpu/cpu0/cache/index0/level:1 The term "index" came from the Intel Software developer manual

Re: [lkp] [ACPI / APEI] a3e2acc5e3: kmsg.BERT:Can't_request_iomem_region<#-#>

2016-07-11 Thread Luck, Tony
Get BIOS version with: # dmidecode -t 0 Sent from my iPhone > On Jul 11, 2016, at 17:54, Ye, Xiaolong wrote: > >> On Sun, Jul 10, 2016 at 08:28:37PM -0700, Tony Luck wrote: >> I'm very surprised that there was a BERT table on an Atom machine. More >> details about the machine please. Also

[PATCH] acpi, apei: Add Boot Error Record Table (BERT) support

2016-06-29 Thread Luck, Tony
From: Huang Ying ACPI/APEI is designed to verify/report H/W errors, like Corrected Error(CE) and Uncorrected Error(UC). It contains four tables: HEST, ERST, EINJ and BERT. The first three tables have been merged for a long time, but because of lacking BIOS support for BERT, the support for BERT

RE: [PATCH] acpi, apei: Add Boot Error Record Table (BERT) support

2016-06-29 Thread Luck, Tony
>> but then it looks like it was forgotten again :-( > > Do you want me to take it? Yes please. -Tony

Re: [PATCH] x86/irq: Cure live lock in irq_force_complete_move()

2016-03-11 Thread Luck, Tony
With this patch applied my system survives me doing several rounds of: # echo 0 | tee /sys/devices/system/cpu/cpu*/online # echo 1 | tee /sys/devices/system/cpu/cpu*/online whereas without the patch the first of those went to [152455.129604] NMI watchdog: Watchdog detected hard LOCKUP on cpu 96

RE: [PATCH V6 0/6] Intel memory b/w monitoring support

2016-03-11 Thread Luck, Tony
> Please see if the branch below works for you: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core tragically no :-( The instant I started perf stat to trace some MBM events, I got a panic. But I think something went awry with the base version you applied these patche

RE: [PATCH V6 0/6] Intel memory b/w monitoring support

2016-03-11 Thread Luck, Tony
Some tracing printk() show that we are calling update_sample() with totally bogus arguments. There are a few good calls, then I see rmid=-380863112 evt_type=-30689 first=0 That turns into a wild vrmid, and we fault accessing mbm_current->prev_msr -Tony

RE: [PATCH V6 0/6] Intel memory b/w monitoring support

2016-03-12 Thread Luck, Tony
>> There are a few good calls, then I see rmid=-380863112 evt_type=-30689 >> first=0 >> >> That turns into a wild vrmid, and we fault accessing mbm_current->prev_msr > > It's because I'm a right idiot.. The below should sort that methinks. Tsk tsk ... don't insult the coder, just critique the co

Re: [RFC PATCH] Unexport do_machine_check() and machine_check_poll()

2016-03-14 Thread Luck, Tony
On Mon, Mar 14, 2016 at 05:38:54PM +0100, Borislav Petkov wrote: > Hey Tony, > > how about the below, untested change? > > Some backporting work to SLE11 got me pondering over why we're exporting > all those MCA-internal things to modules. Modules don't have any > business calling those so how ab

RE: [RFC PATCH] Unexport do_machine_check() and machine_check_poll()

2016-03-14 Thread Luck, Tony
> But the sentiment is: I want to unexport do_machine_check() and > machine_check_poll() and not let external modules call into them > directly. Why, you ask? Because they have no business doing that. EXPORT is a big hammer ... we either let every module have access to a function, or none. It sou

Re: [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id

2016-03-03 Thread Luck, Tony
> retry: > - rc = erst_get_record_id_next(&pos, record_id); > - if (rc) > - goto out; > + /* > + * Some hardware is broken and doesn't actually advance the record id I'd blame this on firmware rather than hardware. > + * returned by ACPI_ERST_GET_RECORD_ID when

RE: [PATCH 4/6] x86/mbm: Memory bandwidth monitoring event management

2016-03-07 Thread Luck, Tony
>> +bytes = mbm_current->interval_bytes * MSEC_PER_SEC; >> +do_div(bytes, diff_time); >> +mbm_current->bandwidth = bytes; >> +mbm_current->interval_bytes = 0; >> +mbm_current->interval_start = cur_time; >> +} >>> + >> +return mbm_c
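
The quoted hunk scales the bytes counted in an interval by `MSEC_PER_SEC` and divides by the elapsed time. A userspace sketch of that arithmetic, assuming `diff_time` is in milliseconds (implied by the constant, not stated in the snippet); `bandwidth_bytes_per_sec()` is a hypothetical name, and plain 64-bit division stands in for the kernel's `do_div()`:

```c
#include <stdint.h>

#define MSEC_PER_SEC 1000ULL

/* Hypothetical helper: bytes seen over diff_time_ms milliseconds,
 * expressed as bytes per second. Returns 0 for an empty interval
 * rather than dividing by zero. */
static uint64_t bandwidth_bytes_per_sec(uint64_t interval_bytes,
					uint64_t diff_time_ms)
{
	if (diff_time_ms == 0)
		return 0;
	return interval_bytes * MSEC_PER_SEC / diff_time_ms;
}
```

Multiplying before dividing keeps precision for short intervals, at the cost of overflowing if `interval_bytes` ever exceeds 2^64 / 1000.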

RE: [PATCH 04/10] x86/cpufeature: Kill cpu_has_x2apic

2016-03-29 Thread Luck, Tony
> Tony, it looks like that cpu_has_x2apic in asm/iommu.h has been > forgotten and can go now? Yes - it looks to be a relic of some common code with x86 where ia64 needed to indicate that it didn't have x2apic. Dropping it looks fine. -Tony

RE: [PATCH] x86/mce: Avoid using object after free in genpool.

2016-03-23 Thread Luck, Tony
> Acked-by: Borislav Petkov Thanks > I have a couple more RAS patches for tip, want me to pick that one up > too? I'm going to send my pile to tip guys after -rc1 is out. Yes please ... throw this one on the pile. -Tony

[GIT PULL] pstore change for 4.6

2016-03-20 Thread Luck, Tony
The following changes since commit f6cede5b49e822ebc41a099fe41ab4989f64e2cb: Linux 4.5-rc7 (2016-03-06 14:48:03 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 764fd639d794a1c

Re: [PATCH v13] x86, mce: Add memcpy_trap()

2016-02-19 Thread Luck, Tony
Make use of the EXTABLE_FAULT exception table entries. This routine returns a structure to indicate the result of the copy: struct mcsafe_ret { u64 trap_nr; u64 bytes_left; }; If the copy is successful, then both 'trap_nr' and 'bytes_left' are zero. If we faulted during the copy,
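
A caller-side sketch of how the returned structure might be checked, written for userspace with `uint64_t` standing in for the kernel's `u64`. Only the structure layout and the "both zero on success" rule come from the message above; the helper names are hypothetical, and reading `bytes_left` as "bytes not yet copied" is an assumption based on the field name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the structure quoted in the message. */
struct mcsafe_ret {
	uint64_t trap_nr;
	uint64_t bytes_left;
};

/* Hypothetical helper: the copy succeeded only if both fields are zero. */
static bool mcsafe_copy_ok(struct mcsafe_ret r)
{
	return r.trap_nr == 0 && r.bytes_left == 0;
}

/* Hypothetical helper: bytes copied before the copy stopped, assuming
 * bytes_left counts what remained of the requested length. */
static uint64_t mcsafe_bytes_done(struct mcsafe_ret r, uint64_t len)
{
	return len - r.bytes_left;
}
```

Returning both values in one structure lets a caller tell a clean copy from a partial one without a separate error channel.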

Re: [PATCH v10 4/4] x86: Create a new synthetic cpu capability for machine check recovery

2016-02-10 Thread Luck, Tony
On Wed, Feb 10, 2016 at 12:06:03PM +0100, Borislav Petkov wrote: > What about MSR_IA32_PLATFORM_ID or some other MSR or register, for > example? Bits 52:50 give us "information concerning the intended platform for the processor" ... but we don't seem to decode that vague statement into anything th

Re: [PATCH v10 3/4] x86, mce: Add __mcsafe_copy()

2016-02-10 Thread Luck, Tony
On Wed, Feb 10, 2016 at 11:58:43AM +0100, Borislav Petkov wrote: > But one could take out that function do some microbenchmarking with > different sizes and once with the current version and once with the > pushes and pops of r1[2-5] to see where the breakeven is. On a 4K page copy from a source a

RE: [PATCH v11 0/4] Machine check recovery when kernel accesses poison

2016-02-11 Thread Luck, Tony
> That's some changelog, I tell ya. Well, it took us long enough so for all 4: I'll see if Peter Jackson wants to turn it into a series of movies. > Reviewed-by: Borislav Petkov Ingo: Boris is happy ... your turn to find things for me to fix (or is it ready for 4.6 now??) -Tony

Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()

2016-03-02 Thread Luck, Tony
On Thu, Feb 18, 2016 at 11:47:26AM -0800, Tony Luck wrote: > Make use of the EXTABLE_FAULT exception table entries to write > a kernel copy routine that doesn't crash the system if it > encounters a machine check. Prime use case for this is to copy > from large arrays of non-volatile memory used as

RE: [PATCH 01/15] avr32: convert to asm-generic/memory_model.h

2015-09-28 Thread Luck, Tony
> Seems like we should simply introduce a CONFIG_VMEM_MAP for ia64 > to get this started. Does my memory trick me or did we used to have > vmem_map on other architectures as well but managed to get rid of it > everywhere but on ia64? I think ia64 hung onto this because of the SGI sn1 platforms. T

RE: [Patch V1 1/3] x86, mce: MCE log size not enough for high core parts

2015-09-24 Thread Luck, Tony
> Now that we have this shiny 2-pages sized lockless gen_pool, why are we > still dealing with struct mce_log mcelog? Why can't we rip it out and > kill it finally? And switch to the gen_pool? > > All code that reads from mcelog - /dev/mcelog chrdev - should switch to > the lockless buffer and will

RE: [Patch V1 1/3] x86, mce: MCE log size not enough for high core parts

2015-09-24 Thread Luck, Tony
> If we get new ones logged in the meantime and userspace hasn't managed > to consume and delete the present ones yet, we overwrite the oldest ones > and set MCE_OVERFLOW like mce_log does now for mcelog. And that's no > difference in functionality than what we have now. U. No.

RE: drm/mgag200: doesn't work in panic context

2015-06-26 Thread Luck, Tony
>> I'm here to report two panics which hang forever (the machine cannot >> reboot). It is because mgag200 doesn't work in panic context. It sleeps and >> allocates memory non-atomically. > > This is the same for all drm drivers, the drm atomic handling with > fbcon/fbdev is totally broken. It wou

RE: [RFC PATCH 10/12] mm: add the buddy system interface

2015-06-26 Thread Luck, Tony
> gfpflags_to_migratetype() > if (memory_mirror_enabled()) { /* We want to mirror all unmovable pages */ > if (!(gfp_mask & __GFP_MOVABLE)) >return MIGRATE_MIRROR > } I'm not sure that we can divide memory into just two buckets of "mirrored" and "movable". My expectation is

RE: [RFC v2 PATCH 7/8] mm: add the buddy system interface

2015-06-29 Thread Luck, Tony
> @@ -814,7 +814,7 @@ int __init_memblock memblock_clear_hotplug(phys_addr_t > base, phys_addr_t size) > */ > int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size) > { > - system_has_some_mirror = true; > + static_key_slow_inc(&system_has_mirror); > > retu

Re: [PATCH v4 4/4] Use 2GB memory block size on large-memory x86-64 systems

2015-08-21 Thread Luck, Tony
On Tue, Nov 04, 2014 at 04:29:44PM +0800, Daniel J Blueman wrote: > On large-memory x86-64 systems of 64GB or more with memory hot-plug > enabled, use a 2GB memory block size. Eg with 64GB memory, this reduces > the number of directories in /sys/devices/system/memory from 512 to 32, > making it mor
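
The "512 to 32" figure follows from dividing total memory by the block size. A small sketch of that arithmetic, assuming the smaller block size is 128MB (the message does not state it, but 64GB / 512 gives that value); `memory_block_count()` is a hypothetical name:

```c
#include <stdint.h>

/* Hypothetical helper: number of memory block directories for a given
 * amount of memory and block size, both in bytes. */
static uint64_t memory_block_count(uint64_t total_bytes, uint64_t block_bytes)
{
	return total_bytes / block_bytes;
}
```

For 64GB, `memory_block_count(64ULL << 30, 128ULL << 20)` gives 512 and `memory_block_count(64ULL << 30, 2ULL << 30)` gives 32, matching the numbers in the quoted changelog.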

Re: [PATCH v4 4/4] Use 2GB memory block size on large-memory x86-64 systems

2015-08-21 Thread Luck, Tony
On Fri, Aug 21, 2015 at 11:38:13AM -0700, Yinghai Lu wrote: > That commit could be reverted. > According to > https://lkml.org/lkml/2014/11/10/123 Do we really need to force the MIN_MEMORY_BLOCK_SIZE on small systems? What about this patch - which just uses max_pfn to choose the block size. It s

RE: [PATCH] x86/mce: fix failed to reenable cmci when swiching to interrupt mode

2015-08-11 Thread Luck, Tony
> Well, ok, but do it differently, please: rename > cmci_storm_disable_banks() to cmci_storm_switch_banks(bool on) which > turns them on and off. Unless Tony has a better suggestion... I like the boolean argument ... but not the "switch_banks" name. It sounds more like we are juggling between bank

RE: [PATCH 0/2] EDAC: Fix sysfs dimm_label show & store operations

2015-09-21 Thread Luck, Tony
Toshi Kani (2): 1/2 EDAC: Fix sysfs dimm_label show operation 2/2 EDAC: Fix sysfs dimm_label store operation Looks good. Both parts: Acked-by: Tony Luck

RE: [PATCH] acpi/apei: free lists of resources in einj and erst

2015-07-10 Thread Luck, Tony
--- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -379,10 +379,9 @@ static int __einj_error_trigger(u64 trigger_paddr, u32 type, rc = apei_resources_add(&addr_resources, trigger_param_region->address,

[GIT PULL] x86/ras material for 4.3 queue

2015-07-10 Thread Luck, Tony
Some of these almost made it into 4.2, then we found a bug and delayed to fix it. Bug fixes have now been merged back into the original patch series. The following changes since commit d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754: Linux 4.2-rc1 (2015-07-05 11:01:52 -0700) are available in the git

Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks

2015-12-25 Thread Luck, Tony
mce_in_kernel_recov() should check whether we have a fix up entry for the specific IP that hit the machine check before rating the severity as kernel recoverable. If we add more functions (for different cache behaviour, or to optimize for specific processor model) we can make sure to put them a

RE: [RFC] Dump interesting arch/platform info

2016-02-01 Thread Luck, Tony
> So the use case is you go, modprobe this thing and do > > cat /sys/kernel/debug/x86/archinfo > > Right now, it dumps only the GDT and even that is not fully done. But > more interesting stuff should be added to it as need arises. > > Thoughts, ideas, suggestions? On a high core count, multi-sock

RE: [PATCHV2 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-12-11 Thread Luck, Tony
>> + regs->ip = new_ip; >> + regs->ax = BIT(63) | addr; > > Can this be an actual #define? Doh! Yes, of course. That would be much better. Now I need to think of a good name for it. -Tony

RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-11 Thread Luck, Tony
> I still don't get the BIT(63) thing. Can you explain it? It will be more obvious when I get around to writing copy_from_user(). Then we will have a function that can take page faults if there are pages that are not present. If the page faults can't be fixed we have a -EFAULT condition. We can

RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-11 Thread Luck, Tony
> I'm missing something, though. The normal fixup_exception path > doesn't touch rax at all. The memory_failure path does. But couldn't > you distinguish them by just pointing the exception handlers at > different landing pads? Perhaps I'm just trying to take a short cut to avoid writing some c

RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-11 Thread Luck, Tony
> Also, are there really PCOMMIT-capable CPUs that still forcibly > broadcast MCE? If, so, that's unfortunate. PCOMMIT and LMCE arrive together ... though BIOS is in the decision path to enable LMCE, so it is possible that some systems could still broadcast if the BIOS writer decides to not allow

RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-11 Thread Luck, Tony
>> But a machine check safe copy_from_user() would be useful >> current generation cpus that broadcast all the time. > > Fair enough. Thanks for spending the time to look at this. Coaxing me to re-write the tail of do_machine_check() has made that code much better. Too many years of one patch on

Re: [PATCH -v2] x86: Add an archinfo dumper module

2016-02-04 Thread Luck, Tony
On Thu, Feb 04, 2016 at 04:22:35PM +0100, Borislav Petkov wrote: > Here's v2 with the stuff we talked about, implemented. I've added > 'control_regs' file too so that you can do: > > $ cat /sys/kernel/debug/x86/archinfo/control_regs > CR4: [-|-|SMEP|OSXSAVE|-|-|-|-|OSXMMEXCPT|OSFXSR|-|PGE|MCE|PAE|

Re: [PATCH -v2] x86: Add an archinfo dumper module

2016-02-05 Thread Luck, Tony
Patch on top of your v2 that defines a register printing function based on a single string format descriptor. CR4 changed to use it. arch/x86/kernel/archinfo.c | 135 ++--- 1 file changed, 114 insertions(+), 21 deletions(-) We could break even on code lin

RE: [PATCH -v2] x86: Add an archinfo dumper module

2016-02-05 Thread Luck, Tony
+ show_reg_bits(m, "CR4", cr4_format, cr4); Can %pXX formats use more than one argument? If so, we might be able to move all my code to lib/vsprintf.c and just write: seq_printf(m, "CR4: %pBITS: 0x%lx\n", cr4_format, cr4, cr4); If they can't, we could bundle the format and value

RE: [PATCH v9 2/4] x86, mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries

2016-02-03 Thread Luck, Tony
>> which is used to indicate that the machine check was triggered by code >> in the kernel with a EXTABLE_CLASS_FAULT fixup entry. > > I think that the EXTABLE_CLASS_FAULT references no longer match the code. You'd think that checkpatch could have spotted that the commit comment mentions an identi

Re: [PATCH v3 05/17] ia64: Set System RAM type and descriptor

2016-01-05 Thread Luck, Tony
On Tue, Jan 05, 2016 at 11:54:29AM -0700, Toshi Kani wrote: > Change efi_initialize_iomem_resources() to set 'flags' and 'desc' > from EFI memory types. IORESOURCE_SYSRAM, a modifier bit, is > set to 'flags' for System RAM as IORESOURCE_MEM is already set. > IORESOURCE_SYSTEM_RAM is defined as (IO

Re: [PATCH v7 3/3] x86, mce: Add __mcsafe_copy()

2016-01-05 Thread Luck, Tony
You were heading towards: ld: undefined __mcsafe_copy since that is also inside the #ifdef. Weren't you going to "select" this? I'm seriously wondering whether the ifdef still makes sense. Now I don't have an extra exception table and routines to sort/search/fixup, it doesn't seem as useful

RE: [PATCH v7 3/3] x86, mce: Add __mcsafe_copy()

2016-01-06 Thread Luck, Tony
>> I do select it, but by randconfig I still need to handle the >> CONFIG_X86_MCE=n case. >> >>> I'm seriously wondering whether the ifdef still makes sense. Now I don't >>> have an extra exception table and routines to sort/search/fixup, it doesn't >>> seem as useful as it was a few iterations a

Re: [PATCH v7 1/3] x86: Add classes to exception tables

2016-01-06 Thread Luck, Tony
On Wed, Jan 06, 2016 at 01:33:46PM +0100, Borislav Petkov wrote: > On Wed, Dec 30, 2015 at 09:59:29AM -0800, Tony Luck wrote: > > Starting with a patch from Andy Lutomirski > > that used linker relocation trickery to free up a couple of bits > > in the "fixup" field of the exception table (and gen

Re: [PATCH] EDAC, pnd2: set MCE_PRIO_EDAC priority for pnd2_mce_dec notifier

2020-06-10 Thread Luck, Tony
On Wed, Jun 10, 2020 at 02:58:45PM +0800, Zhenzhong Duan wrote: > ...or else it has MCE_PRIO_LOWEST priority by default. > > Signed-off-by: Zhenzhong Duan > --- > drivers/edac/pnd2_edac.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/edac/pnd2_edac.c b/drivers/edac/pnd2_edac.

Re: [PATCH] EDAC/mc: call edac_inc_ue_error() before panic

2020-06-10 Thread Luck, Tony
On Wed, Jun 10, 2020 at 02:58:46PM +0800, Zhenzhong Duan wrote: > By calling edac_inc_ue_error() before panic, we get a correct UE error > count for core dump analysis. Looks accurate, and I'll add the patch to be applied. But I wonder how big a problem it is. Isn't most of the information derivea

Re: [PATCH] x86/mce: fix a wrong assignment of i_mce.status

2020-06-11 Thread Luck, Tony
+Yazen On Thu, Jun 11, 2020 at 10:32:38AM +0800, Zhenzhong Duan wrote: > The original code is a nop as i_mce.status is or'ed with part of itself, > fix it. > > Signed-off-by: Zhenzhong Duan > --- > arch/x86/kernel/cpu/mce/inject.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > dif

RE: [PATCH v3 09/10] kallsyms: Hide layout

2020-07-07 Thread Luck, Tony
> Signed-off-by: Kristen Carlson Accardi > Reviewed-by: Tony Luck > Tested-by: Tony Luck I'll happily review and test again ... but since you've made substantive changes, you should drop these tags until I do. FWIW I think random order is a good idea. Do you shuffle once? Or every time somebo

RE: [PATCH v2 0/4] Expose new features for intel processor

2020-07-07 Thread Luck, Tony
>Cathy Zhang (2): > x86: Expose SERIALIZE for supported cpuid > x86: Expose TSX Suspend Load Address Tracking Having separate patches for adding the X86_FEATURE bits is fine (provides space in the commit log to document what each is for). In this case it also preserves the "Author" of each. But

RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
>> The heuristic always initializes the MSR with the per mm PASID IIF the >> mm has a valid PASID but the MSR doesn't have one. This heuristic usually >> happens only once on the first #GP in a thread. > > But it doesn't guarantee the PASID is the right one. Suppose both the mm > has a PASID and th

Re: [PATCH v2 6/7] x86/mce: Recover from poison found while copying from user space

2020-10-05 Thread Luck, Tony
On Mon, Oct 05, 2020 at 06:32:47PM +0200, Borislav Petkov wrote: > On Wed, Sep 30, 2020 at 04:26:10PM -0700, Tony Luck wrote: > > arch/x86/kernel/cpu/mce/core.c | 33 + > > include/linux/sched.h | 2 ++ > > 2 files changed, 23 insertions(+), 12 deletions(-

RE: [RFC PATCH 0/7] RAS/CEC: Extend CEC for errors count check on short time period

2020-10-02 Thread Luck, Tony
> Because from my x86 CPUs limited experience, the cache arrays are mostly > fine and errors reported there are not something that happens very > frequently so we don't even need to collect and count those. On Intel X86 we leave the counting and threshold decisions about cache health to the hardwa

Re: [PATCH 0/3] x86: Add initial support to discover Intel hybrid CPUs

2020-10-02 Thread Luck, Tony
On Sat, Oct 03, 2020 at 03:39:29AM +0200, Thomas Gleixner wrote: > On Fri, Oct 02 2020 at 13:19, Ricardo Neri wrote: > > Add support to discover and enumerate CPUs in Intel hybrid parts. A hybrid > > part has CPUs with more than one type of micro-architecture. Thus, certain > > features may only be

RE: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-25 Thread Luck, Tony
> These 2 variables are accessed in 2 hot call stacks (for this 288 CPU > Xeon Phi platform): This might be the key element of "weirdness" for this system. It has 288 CPUs ... cache alignment problems are often not too bad on "small" systems. Then as you scale up to bigger machines you suddenly hit

RE: TDX #VE in SYSCALL gap (was: [RFD] x86: Curing the exception and syscall trainwreck in hardware)

2020-08-25 Thread Luck, Tony
> > Or malicious hypervisor action, and that's a problem. > > > > Suppose the hypervisor remaps a GPA used in the SYSCALL gap (e.g. the > > actual SYSCALL text or the first memory it accesses -- I don't have a > > TDX spec so I don't know the details). Is it feasible to defend against a malicious

RE: [PATCH v3 4/6] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-10-07 Thread Luck, Tony
>> Machine checks are more serious. Just give up at the point where the >> main copy loop triggered the #MC and return from the copy code as if >> the copy succeeded. The machine check handler will use task_work_add() to >> make sure that the task is sent a SIGBUS. > > Isn't that just plain wrong?

[PLEASE PULL] ia64 for v5.10

2020-10-12 Thread Luck, Tony
The following changes since commit f4d51dffc6c01a9e94650d95ce0104964f8ae822: Linux 5.9-rc4 (2020-09-06 17:11:40 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/ia64_for_5.10 for you to fetch changes up to c331649e637152788b0c
