On Wed, 3 Mar 2021, Liang, Kan wrote:
> We never use bit 58. It should be a new issue.
> Is it repeatable?
yes, it's repeatable.
(which I'm glad to see because it looks suspiciously like a memory bit
flip)
Though since it's a WARN_ONCE I have to reboot each time I want to test.
If I get a c
Hello
on my Haswell machine the perf_fuzzer managed to trigger this message:
[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write
0x0400) at rIP: 0x8106e4f4 (native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685] intel_pmu_pebs_enable_al
On Mon, 1 Mar 2021, Liang, Kan wrote:
> https://lore.kernel.org/lkml/tip-01330d7288e0050c5aaabc558059ff91589e6...@git.kernel.org/
> The patch is an SW workaround for some old CPUs (HSW and earlier), which may
> set 0 to the PEBS status. It adds a check in the intel_pmu_drain_pebs_nhm().
> It tries
On Thu, 11 Feb 2021, Liang, Kan wrote:
> > On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
> I'd like to reproduce it on my machine.
> Is this issue only found in a Haswell client machine?
>
> To reproduce the issue, can I use ./perf_fuzzer under perf_eve
On Thu, 28 Jan 2021, Vince Weaver wrote:
> the perf_fuzzer has turned up a repeatable crash on my haswell system.
>
> addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
> I'll investigate more when I have the chance.
so I poked around some more.
This seem
Hello
the perf_fuzzer has turned up a repeatable crash on my haswell system.
addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
I'll investigate more when I have the chance.
Vince
[96289.009646] BUG: kernel NULL pointer dereference, address: 0150
[96289.017094]
On Fri, 26 Jul 2019, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jul 23, 2019 at 04:42:30PM -0400, Vince Weaver escreveu:
> > my perf_tool_fuzzer has found another issue, this one a buffer overflow
> > in perf_header__read_build_ids. The build id filename is read in with a
>
Commit-ID: 3143906c2770778d89b730e0342b745d1b4a8303
Gitweb: https://git.kernel.org/tip/3143906c2770778d89b730e0342b745d1b4a8303
Author: Vince Weaver
AuthorDate: Thu, 1 Aug 2019 14:30:43 -0400
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 14 Aug 2019 10:59:59 -0300
perf.data
The perf.data file format documentation for HEADER_SAMPLE_TOPOLOGY
specifies the layout in a confusing manner that doesn't match the rest of
the document. This patch attempts to describe things consistent with the
rest of the file.
Signed-off-by: Vince Weaver
diff --git a/tools
Commit-ID: 2e9a06dda10aea81a17c623f08534dac6735434a
Gitweb: https://git.kernel.org/tip/2e9a06dda10aea81a17c623f08534dac6735434a
Author: Vince Weaver
AuthorDate: Thu, 25 Jul 2019 11:57:43 -0400
Committer: Arnaldo Carvalho de Melo
CommitDate: Mon, 29 Jul 2019 09:03:43 -0300
perf tools
Commit-ID: 7622236ceb167aa3857395f9bdaf871442aa467e
Gitweb: https://git.kernel.org/tip/7622236ceb167aa3857395f9bdaf871442aa467e
Author: Vince Weaver
AuthorDate: Tue, 23 Jul 2019 11:06:01 -0400
Committer: Arnaldo Carvalho de Melo
CommitDate: Mon, 29 Jul 2019 09:03:43 -0300
perf header
On Fri, 26 Jul 2019, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jul 26, 2019 at 04:46:51PM -0400, Vince Weaver escreveu:
> >
> > Currently the perf_data_fuzzer causes perf report to get stuck in an
> > infinite loop.
> >
> > From what I can tell, the iss
Currently the perf_data_fuzzer causes perf report to get stuck in an
infinite loop.
From what I can tell, the issue happens in reader__process_events()
when an event is mapped using mmap(), but when it goes to process the
event finds out the internal event header has the size (invalidly) set t
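A minimal sketch of the kind of guard that avoids this sort of loop, assuming the invalid size is smaller than the record header itself (zero, for example), so the read offset never advances. The names rec_header and walk_records are hypothetical and are not taken from the perf source; the real reader works on mmap()ed file windows rather than a flat buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header, laid out like a type/misc/size triple. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/*
 * Walk length-prefixed records in buf.
 * Returns the number of records seen, or -1 if a record claims a size
 * smaller than its own header or larger than the remaining data.
 * Trailing bytes too short to hold a header are ignored.
 */
long walk_records(const unsigned char *buf, size_t len)
{
	size_t off = 0;
	long count = 0;

	while (len - off >= sizeof(struct rec_header)) {
		struct rec_header h;

		memcpy(&h, buf + off, sizeof(h));

		/* A size below the header size would stall the loop. */
		if (h.size < sizeof(h) || h.size > len - off)
			return -1;

		off += h.size;
		count++;
	}
	return count;
}
```

The point of the check is only that every iteration must move the offset forward; what the caller does on -1 (skip, abort, report) is a separate policy decision.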
probably all perf_header_strings are affected by this. The fuzzer just
tripped up cmdline now, which needs this fix.
Signed-off-by: Vince Weaver
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..631aa1911f3a 100644
--- a/tools/perf/util/header.c
+++ b
Hello,
the perf_data_fuzzer found an issue when strings have size 0.
malloc() in do_read_string() is happy to allocate a string of
size 0 but when code (in this case the pmu parser) tries to work with
those it will segfault.
Signed-off-by: Vince Weaver
diff --git a/tools/perf/util/header.c b
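For illustration only, here is a sketch of rejecting a zero-length string up front, assuming a layout where a 32-bit length precedes the string bytes. read_len_string is a hypothetical stand-in that reads from a memory buffer; it is not the do_read_string() from the patch, which is truncated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical reader for a length-prefixed string held in buf.
 * Returns a freshly allocated NUL-terminated copy, or NULL if the
 * length is zero, does not fit in the buffer, or allocation fails.
 * The caller frees the result.
 */
char *read_len_string(const unsigned char *buf, size_t buf_len)
{
	uint32_t len;
	char *s;

	if (buf_len < sizeof(len))
		return NULL;
	memcpy(&len, buf, sizeof(len));

	/* Refuse size 0 so later parsers never see an empty allocation. */
	if (len == 0 || len > buf_len - sizeof(len))
		return NULL;

	s = malloc((size_t)len + 1);
	if (!s)
		return NULL;

	memcpy(s, buf + sizeof(len), len);
	s[len] = '\0';
	return s;
}
```

Returning NULL for a zero length makes the failure visible at the read site instead of in whichever parser touches the string later.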
The perf.data-file-format documentation incorrectly says the
HEADER_TOTAL_MEM results are in bytes. The results are in kilobytes
(perf reads the value from /proc/meminfo)
Signed-off-by: Vince Weaver
diff --git a/tools/perf/Documentation/perf.data-file-format.txt
b/tools/perf/Documentation
, not sure if filename should be NUL
terminated or not.
Signed-off-by: Vince Weaver
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..9a893a26e678 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2001,6 +2001,9 @@ static int
Hello
so I have been having lots of trouble with hand-crafted perf.data files
causing segfaults and the like, so I have started fuzzing the perf tool.
First issue found:
If f_header.attr_size is 0 in the perf.data file, then perf will crash
with a divide-by-zero error.
Signed-off-by: Vince
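A sketch of the general shape of such a check, under the assumption that the crash comes from dividing a section size by attr_size to get an entry count. count_attrs and its parameters are hypothetical names, not the code from the actual patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute how many fixed-size attribute entries
 * fit in a section. Returns 0 and stores the count on success, or -1
 * if attr_size is zero, which would otherwise divide by zero.
 */
int count_attrs(uint64_t section_size, uint64_t attr_size,
		uint64_t *nr_attrs)
{
	if (attr_size == 0)
		return -1;

	*nr_attrs = section_size / attr_size;
	return 0;
}
```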
On Wed, 19 Jun 2019, syzbot wrote:
> syzbot found the following crash on:
>
> HEAD commit:0011572c Merge branch 'for-5.2-fixes' of git://git.kernel...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12c38d66a0
> kernel config: https://syzkaller.apps
On Tue, 28 May 2019, Peter Zijlstra wrote:
> On Tue, May 28, 2019 at 09:33:40AM -0400, Liang, Kan wrote:
> > Uncore PMU doesn't support sampling. It will return -EINVAL.
> > There is no regs support for counting. The request will be ignored.
> >
> > I think current check for uncore is good enough
I've run the fuzzer overnight with both patches applied and have not seen
any issues.
Vince
On Wed, 22 May 2019, Liang, Kan wrote:
> XMM registers can only be collected by hardware PEBS events. We should disable it
> for all software/probe events.
>
> Could you please try the patch as below?
I tested the patch (it was whitespace damaged for some reason, not
sure if that was on my end tho
The perf fuzzer caused my skylake machine to crash hard with the trace at
the end here. (this is with current git)
It appears to be happening in new code introduced by:
commit 878068ea270ea82767ff1d26c91583263c81fba0
Author: Kan Liang
Date: Tue Apr 2 12:44:59 2019 -0700
perf/x86: Supp
On Tue, 16 Apr 2019, tip-bot for Stephane Eranian wrote:
> Commit-ID: f447e4eb3ad1e60d173ca997fcb2ef2a66f12574
> Gitweb:
> https://git.kernel.org/tip/f447e4eb3ad1e60d173ca997fcb2ef2a66f12574
> Author: Stephane Eranian
> AuthorDate: Mon, 8 Apr 2019 10:32:52 -0700
> Committer: Ingo Molna
On Sun, 7 Apr 2019, Cyrill Gorcunov wrote:
> Vince, could you please disable alias events and see if it changes
> anything, once you have time? Note once we've aliases disabled the
> counter for cpu cycles get used for NMI watchdog so PERF_COUNT_HW_CPU_CYCLES
> won't be available in "perf" tool its
On Thu, 4 Apr 2019, Cyrill Gorcunov wrote:
> On Thu, Apr 04, 2019 at 12:37:18PM -0400, Vince Weaver wrote:
>
> Oh, Vince, I suspect such kind of bisection might consume a lot of your
> time :( Maybe we could update perf fuzzer so that it would send events
> to some net-storage f
On Thu, 4 Apr 2019, Cyrill Gorcunov wrote:
> On Thu, Apr 04, 2019 at 09:25:47AM -0400, Vince Weaver wrote:
> >
> > It looks like there are at least two bugs here, one that's a full
> > hardlockup with nothing on serial console. The other is the NULL
> > derefer
On Wed, 3 Apr 2019, Cyrill Gorcunov wrote:
> On Wed, Apr 03, 2019 at 10:19:44PM +0300, Cyrill Gorcunov wrote:
> >
> > You know, seems I got what happened -- p4_general_events do
> > not cover all general events, they stop at PERF_COUNT_HW_BUS_CYCLES,
> > while 3 more general events are left. This is '
so moving this to its own thread.
There was a two-part question asked.
1. Can the perf-fuzzer crash a Pentium 4 system
2. Does anyone care anymore?
The answer to #1 turns out to be "yes"
I'm not sure about #2 (but it's telling my p4 test system hadn't been
turned on in over 3 y
On Wed, 3 Apr 2019, Cyrill Gorcunov wrote:
> > Shame on Intel though for not providing perf JSON files for the
> > Pentium 4 event names.
>
> Mind to point me where json events should lay, I could try to convert
> names.
I was mostly joking about that. But the event lists are in the kernel
tr
> > > Cc: Arnaldo Carvalho de Melo
> > Cc: Jiri Olsa
> > Cc: Linus Torvalds
> > Cc: Peter Zijlstra
> > Cc: Thomas Gleixner
> > Cc: Vince Weaver
> > Cc: to...@suse.com
> > Link:
> > https://lkml.kernel.org/r/20190321123849.gn6...@hirez.programmi
On Tue, 2 Apr 2019, Cyrill Gorcunov wrote:
> You know, running the fuzzer on p4 might be worth it in any case. As to potential
> problems to fix -- i could try find some time slot for, still quite
> limited too 'cause of many other duties :(
Well I fired up the Pentium 4
/dev/sda1 has gone 1457 day
On Tue, 2 Apr 2019, Cyrill Gorcunov wrote:
> On Tue, Apr 02, 2019 at 03:03:02PM +0200, Peter Zijlstra wrote:
> > I have vague memories of the P4 thing crashing with Vince's perf_fuzzer,
> > but maybe I'm wrong.
>
> No, you're correct. p4 was crashing many times before we managed to make
> it more-l
On Fri, 1 Feb 2019, Jiri Olsa wrote:
> >
> > I've just started fuzzing with the patch applied. Often it takes a few
> > hours to trigger the bug.
>
> cool, thanks
I let it run overnight and no crash.
> > Added question about this bug. It appeared that the crash was triggered
> > by the BTS
t the crash was triggered
by the BTS driver over-writing kernel memory. The data being written, was
this user controllable? Meaning, is this a security issue being fixed, or
just a crashing issue?
Vince Weaver
vincent.wea...@maine.edu
On Fri, 25 Jan 2019, Ravi Bangoria wrote:
> I'm seeing a system crash while running perf_fuzzer with upstream kernel
> on an Intel machine. I hit the crash twice (out of which I don't have log
> of first crash so don't know if the reason is same for both the crashes).
> I've attached my .config wi
On Fri, 18 Jan 2019, Peter Zijlstra wrote:
>
> You can actually use rdpmc when you attach to a CPU, but you have to
> ensure that the userspace component is guaranteed to run on that very
> CPU (sched_setaffinity(2) comes to mind).
unfortunately the HPC people using PAPI would probably be annoyed
On Fri, 18 Jan 2019, Peter Zijlstra wrote:
> On Fri, Jan 11, 2019 at 04:52:22PM -0500, Vince Weaver wrote:
> > On Thu, 10 Jan 2019, Vince Weaver wrote:
> >
> > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > >
> > > > On Thu, 10 Jan 2019, Vince We
On Thu, 10 Jan 2019, Vince Weaver wrote:
> On Thu, 10 Jan 2019, Vince Weaver wrote:
>
> > On Thu, 10 Jan 2019, Vince Weaver wrote:
> >
> > > However if you create an all-process attached to CPU event:
> > > perf_event_open(attr, -1, X, -1, 0);
> > >
On Thu, 10 Jan 2019, Vince Weaver wrote:
> On Thu, 10 Jan 2019, Vince Weaver wrote:
>
> > However if you create an all-process attached to CPU event:
> > perf_event_open(attr, -1, X, -1, 0);
> > the mmap event index is set as if this were a valid event and so the
On Thu, 10 Jan 2019, Vince Weaver wrote:
> However if you create an all-process attached to CPU event:
> perf_event_open(attr, -1, X, -1, 0);
> the mmap event index is set as if this were a valid event and so the rdpmc
> succeeds even though it shouldn't (we're trying
Hello
I think this is a bug turned up by PAPI. I've been trying to track down
where this happens in the perf_event code myself, but it might be faster
to just report it.
If you create a per-process attached to CPU event:
perf_event_open(attr, 0, X, -1, 0);
the mmap event index is set t
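As a reading aid for the two call forms in this thread, perf_event_open(attr, 0, X, -1, 0) here and perf_event_open(attr, -1, X, -1, 0) in the replies, the sketch below classifies the pid and cpu arguments the way the perf_event_open(2) man page describes them. The enum and classify_target are hypothetical names for illustration; treating pid below -1 or cpu below -1 as invalid is an assumption of this sketch, and nothing here calls the real syscall.

```c
#include <sys/types.h>

/* Hypothetical labels for what a (pid, cpu) pair asks to measure. */
enum scope {
	SCOPE_INVALID,		/* e.g. pid == -1 and cpu == -1 */
	SCOPE_TASK_ANY_CPU,	/* one task, wherever it runs */
	SCOPE_TASK_ONE_CPU,	/* one task, only while on the given CPU */
	SCOPE_ALL_ONE_CPU	/* every task on the given CPU */
};

enum scope classify_target(pid_t pid, int cpu)
{
	if (pid < -1 || cpu < -1)
		return SCOPE_INVALID;

	if (pid == -1)
		return cpu >= 0 ? SCOPE_ALL_ONE_CPU : SCOPE_INVALID;

	/* pid == 0 means the calling task, pid > 0 a specific task. */
	return cpu == -1 ? SCOPE_TASK_ANY_CPU : SCOPE_TASK_ONE_CPU;
}
```

With X standing for a CPU number, the first form is a task event restricted to one CPU and the second is a CPU-wide event.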
On Thu, 6 Dec 2018, Jiri Olsa wrote:
> On Thu, Dec 06, 2018 at 10:35:28AM -0500, Vince Weaver wrote:
> > On Wed, 5 Dec 2018, Jiri Olsa wrote:
> > Maybe it is a corruption issue. I had applied my own debug patch that
> > would dump some info if data->callchain was NUL
On Wed, 5 Dec 2018, Jiri Olsa wrote:
> On Wed, Dec 05, 2018 at 12:11:19PM -0500, Vince Weaver wrote:
> > On Wed, 5 Dec 2018, Jiri Olsa wrote:
> >
> > > On Wed, Dec 05, 2018 at 01:45:38PM +0100, Jiri Olsa wrote:
> > > > On Tue, Dec 04, 2018 at 10:54:55AM -0500,
On Wed, 5 Dec 2018, Jiri Olsa wrote:
> On Wed, Dec 05, 2018 at 01:45:38PM +0100, Jiri Olsa wrote:
> > On Tue, Dec 04, 2018 at 10:54:55AM -0500, Vince Weaver wrote:
> > > Hello,
> > >
> > > I was able to trigger another oops with the perf_fuzzer with curren
Hello,
I was able to trigger another oops with the perf_fuzzer with current git.
This is 4.20-rc5 after the fix for the very similar oops I previously
reported got committed.
It seems to be pointing to the same location in the source as
before, I guess maybe triggered a different way?
Unfortu
On Thu, 8 Nov 2018, Alexander Shishkin wrote:
> Vince Weaver writes:
>
> > On Thu, 8 Nov 2018, Vince Weaver wrote:
> >
> >> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at
> >>
> >> [91760.334876] PGD 0 P4D 0
On Thu, 8 Nov 2018, Vince Weaver wrote:
> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at
>
> [91760.334876] PGD 0 P4D 0
> [91760.337596] Oops: [#1] SMP PTI
> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW
2-rc0 *** by Vince Weaver
Linux version 4.20.0-rc1+ x86_64
Processor: Intel 6/60/3
Stopping after 3
Watchdog enabled with timeout 60s
Will auto-exit if signal storm detected
Seeding RNG from time 1541627285
To reproduce
On Mon, 27 Aug 2018, Peter Zijlstra wrote:
> Something like so then?
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index eeb787b1c53c..f35eb72739c0 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -144,7 +144,7 @@ e
On Fri, 24 Aug 2018, Peter Zijlstra wrote:
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -143,6 +143,8 @@ enum perf_event_sample_format {
> > PERF_SAMPLE_PHYS_ADDR = 1U << 19,
> >
> > PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
> > +
> > + __P
I notice that Linux 4.18 has the following changeset which changes the
user visible perf_event.h file
commit 6cbc304f2f360f25cc8607817239d6f4a2fd3dc5
Author: Peter Zijlstra
Date: Thu May 10 15:48:41 2018 +0200
perf/x86/intel: Fix unwind errors from PEBS entries (m
interrupt-parent = <&local_intc>;
interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
};
Tested-by: Vince Weaver
Vince
On Thu, 17 May 2018, Vince Weaver wrote:
> On Thu, 17 May 2018, Peter Zijlstra wrote:
> with cortex-a7 now, would it be possible to later drop that if proper
> cortex-a53 support is added to the armv7 pmu driver? Or would that lead
> to all kinds of backward-compatibility mess?
F
On Thu, 17 May 2018, Peter Zijlstra wrote:
> On Thu, May 17, 2018 at 06:55:26PM +0200, Stefan Wahren wrote:
> > > Vince Weaver hat am 17. Mai 2018 um 18:34
> > > geschrieben:
> > > On Thu, 17 May 2018, Stefan Wahren wrote:
> > > > &
On Thu, 17 May 2018, Stefan Wahren wrote:
>
> > Eric Anholt hat am 17. Mai 2018 um 15:17 geschrieben:
> >
> >
> > The a53 and a7 counters seem to match up, so we advertise a7 so that
> > arm32 can probe.
so how closely did you look at the a53/a7 differences? I see some major
differences, es
On Thu, 17 May 2018, Eric Anholt wrote:
>
> Is that better than a53? I'm happy to switch to that. The important
> part to me is the a7, since basically everyone with this hw is running
> arm32.
no, on further investigation it looks like a53 is more proper to use than
the generic armv8.
Is the
On Thu, 17 May 2018, Eric Anholt wrote:
> diff --git a/arch/arm/boot/dts/bcm2837.dtsi b/arch/arm/boot/dts/bcm2837.dtsi
> index 7704bb029605..1f5e5c782835 100644
> --- a/arch/arm/boot/dts/bcm2837.dtsi
> +++ b/arch/arm/boot/dts/bcm2837.dtsi
> @@ -17,6 +17,12 @@
> };
> };
>
> +
On Fri, 4 May 2018, Josh Poimboeuf wrote:
>
> The 'nmi_restore' warning points to a bug in my patch, but the others
> are head scratchers. Here's a patch which combines the first two
> patches, plus improves the existing warnings a bit. Can you try it?
with that updated patch I hit
May 4 21:5
On Fri, 4 May 2018, Josh Poimboeuf wrote:
> Also, any tips for reproducing this locally? I cloned the perf fuzzer
> github. Is it as simple as just "make" and "./run_tests.sh"?
run_tests only runs the perf_event regression tests.
To run the fuzzer, enter the "fuzzer" directory and either run
On Wed, 2 May 2018, Josh Poimboeuf wrote:
> After looking closer, I realized that at least some of these warnings
> are due to bad unwind hints in the entry code. Can you try this patch
> instead of the last one?
with just this new patch applied I still get warnings such as this:
[ 469.436218]
On Tue, 1 May 2018, Josh Poimboeuf wrote:
> Can you try the following patch?
I applied the patch, but the warnings don't really look that different.
[ 62.220322] WARNING: stack recursion on stack type 4
[ 62.220326] WARNING: can't dereference registers at 9ca2e86d for ip
swapgs_rest
Hello
I reported this back in January, but I think it got lost since everyone
was busy with other more pressing matters.
But in any case, the perf_fuzzer still can trigger these types of messages
and just wanted to see if they were a cause for concern, or just noise.
[66620.496076] WARNING: can
On Fri, 20 Apr 2018, Vince Weaver wrote:
> > AFAICT it works on Power and possibly ARM.
>
> at least some ARMs are a bit more honest about it than x86
>
> ivybridge:
> Performance counter stats for '/bin/ls':
> 1,368,162 instructions
>
On Fri, 20 Apr 2018, Peter Zijlstra wrote:
> On Wed, Apr 18, 2018 at 11:10:20AM -0400, Vince Weaver wrote:
> > On Tue, 17 Apr 2018, Jiri Olsa wrote:
> >
> > > On Mon, Apr 16, 2018 at 10:04:53PM +, Stephane Eranian wrote:
> > > > Hi,
> > >
On Tue, 17 Apr 2018, Jiri Olsa wrote:
> On Mon, Apr 16, 2018 at 10:04:53PM +, Stephane Eranian wrote:
> > Hi,
> >
> > I am trying to understand what the exclude_idle event attribute is supposed
> > to accomplish.
> > As per the definition in the header file:
> >
> > exclude_idle : 1,
Author: Song Liu
Date: Wed Dec 6 14:45:15 2017 -0800
When running the perf_fuzzer on a current git checkout my logs are flooded
with messages such as this:
[71487.869077] trace_kprobe: Could not insert probe at unknown+0: -22
[71488.174479] trace_kprobe: Could not insert probe at unknown+0: -2
On Fri, 9 Mar 2018, Peter Zijlstra wrote:
> On Fri, Mar 09, 2018 at 09:31:11AM -0500, Vince Weaver wrote:
> > On Fri, 9 Mar 2018, tip-bot for Kan Liang wrote:
> >
> > > Commit-ID: 1af22eba248efe2de25658041a80a3d40fb3e92e
> > > Gitweb:
t threshold is 1, in which
> case user-space RDPMC works well even with auto-reload events.
>
> Signed-off-by: Kan Liang
> Signed-off-by: Peter Zijlstra (Intel)
> Cc: Alexander Shishkin
> Cc: Arnaldo Carvalho de Melo
> Cc: Jiri Olsa
> Cc: Linus Torvalds
> Cc: Peter Zij
On Thu, 11 Jan 2018, Peter Zijlstra wrote:
> It makes my IVB very ill, starts spewing RCU stall warnings, but is
> otherwise very unresponsive.
>
> Awesome... I'll prod at it when my brain works again.
>
Not sure if it's related, but I hit this on the core2 machine fuzzing
overnight with "pti=
On Thu, 11 Jan 2018, Vince Weaver wrote:
> On Thu, 11 Jan 2018, Peter Zijlstra wrote:
>
> > On Thu, Jan 11, 2018 at 01:21:12PM -0600, Josh Poimboeuf wrote:
> > > Yuck. This time it was stack recursion on the entry stack. In the
> > > previous error, recursion
On Thu, 11 Jan 2018, Vince Weaver wrote:
> Not sure if this info helps, but if I make perf_fuzzer *not* create AUX
> mmap() buffers, I'm unable to reproduce the hangs both on core2 and
> haswell.
Confirmed, I can crash the system without the fuzzer, just by doing
per
Not sure if this info helps, but if I make perf_fuzzer *not* create AUX
mmap() buffers, I'm unable to reproduce the hangs both on core2 and
haswell.
Vince
On Thu, 11 Jan 2018, Peter Zijlstra wrote:
> On Thu, Jan 11, 2018 at 01:21:12PM -0600, Josh Poimboeuf wrote:
> > Yuck. This time it was stack recursion on the entry stack. In the
> > previous error, recursion was detected on the IRQ stack. Otherwise they
> > look quite similar.
> >
> > Was tha
On Wed, 10 Jan 2018, Josh Poimboeuf wrote:
> For the crash, you might try enabling CONFIG_DEBUG_ENTRY and seeing if
> that gives you any output.
I did enable that, didn't seem to help on the haswell machine at least.
> > > > WARNING: can't dereference iret registers at 0783fea8 for ip
>
On Thu, 11 Jan 2018, Vince Weaver wrote:
> on the same core2 machine I got this which didn't crash the machine (but
> the perf_fuzzer process is stuck)
also got this one:
Cannot open /sys/kernel/tracing/kprobe_events
[ 408.159243] watchdog: BUG: soft lockup - CPU#1 st
on the same core2 machine I got this which didn't crash the machine (but
the perf_fuzzer process is stuck)
[ 4592.608066] INFO: task systemd-logind:488 blocked for more than 120 seconds.
[ 4592.615159] Not tainted 4.15.0-rc7+ #211
[ 4592.619648] "echo 0 > /proc/sys/kernel/hung_task_timeout
On Thu, 11 Jan 2018, Vince Weaver wrote:
> On Thu, 11 Jan 2018, Peter Zijlstra wrote:
>
> > OK, I'm going to try fuzzing as a user with paranoid=0, and if that
> > doesn't help, I'm going to switch to linus' tree with my patches on.
>
> OK, I'
On Thu, 11 Jan 2018, Peter Zijlstra wrote:
> OK, I'm going to try fuzzing as a user with paranoid=0, and if that
> doesn't help, I'm going to switch to linus' tree with my patches on.
OK, I'm fuzzing on a core2 machine and it locks up too.
It did give the following first (but it kept going for a
On Thu, 11 Jan 2018, Peter Zijlstra wrote:
> I'm seeing things like:
>
> Cannot open /sys/kernel/tracing/kprobe_events
>
> this is likely caused by me not having anything mounted there. Rostedt
> provided the magic incantation to make that work, I'll go try now.
The perf_fuzzer kprobe code is p
On Tue, 9 Jan 2018, Vince Weaver wrote:
> Also I managed to hit (presumably) the same bug on a skylake machine.
> That one doesn't have a serial cable hooked up to it, I'll try to see if I
> can find one.
>
> I am running debian-unstable with gcc 7.2 if it makes a
I tried it again with your patch and force_early_printk, no luck.
I can start dropping printks around the NMI code but I feel like I don't
really know what I'm doing.
Also I managed to hit (presumably) the same bug on a skylake machine.
That one doesn't have a serial cable hooked up to it, I'
On Tue, 9 Jan 2018, Peter Zijlstra wrote:
> So CONFIG_PAGE_TABLE_ISOLATION=y and booting with "pti=off" makes it
> 'work', right?
yes. Previously I was changing CONFIG_PAGE_TABLE_ISOLATION and
recompiling, but just now I booted with it set to yes and pti=off and the
fuzzer has been running fin
On Tue, 9 Jan 2018, Peter Zijlstra wrote:
> > I'll try your patch and see if it makes a difference.
>
> I suspect not, it shouldn't be PTI specific.
yes, applying your patch didn't help, still locks up on the Haswell
machine.
Is there any debugging I could turn on that would help? I tried KAS
On Tue, 9 Jan 2018, Peter Zijlstra wrote:
> So remind me again, how are you running that fuzzer? I'm running
> ./fast_repro99.sh as root.
I'm running ./fast_repro98.sh on a regular haswell machine with paranoid
set to "0".
I'll try your patch and see if it makes a difference.
I can also try on
On Mon, 8 Jan 2018, Ingo Molnar wrote:
>
> Note that the page table isolation (PTI) feature has a number of effects on
> perf
> and on NMI handlers, so one of the things to try would be to disable PTI.
Yes, it seems to be a KPTI issue.
With KPTI disabled I can fuzz for a few hours, no problems
Hello
Was trying out current git (4.15-rc7) and the perf_fuzzer very quickly
will lock up my Haswell test machine so solidly that I don't get any debug
info, even with a serial console.
I'll try enabling various debug options to see if I can get a more useful
bug report.
Vince
On Tue, 19 Sep 2017, Eric W. Biederman wrote:
> When sorting out the si_code ambiguity fcntl I accidentally overshot and
> included SIGPOLL as well. Ooops! This is my trivial fix for that.
>
> Vince Weaver caught this when it landed in your tree with his
> perf_event_test
On Wed, 13 Sep 2017, Vince Weaver wrote:
> I just compiled up a fresh git kernel and all of the perf_event_test
> overflow tests are failing.
>
> The reason is that instead of getting POLL_IN or POLL_HUP sources as
> expected, they are getting weird results in si_code of "
Hello
I just compiled up a fresh git kernel and all of the perf_event_test
overflow tests are failing.
The reason is that instead of getting POLL_IN or POLL_HUP sources as
expected, they are getting weird results in si_code of "-5".
I haven't had time to bisect this, but I do notice that some
On Thu, 31 Aug 2017, Peter Zijlstra wrote:
> So the below completely rewrites timekeeping (and probably breaks
> world) but does away with the need to touch events that don't get
> scheduled.
>
> Esp the cgroup stuff is entirely untested since I simply don't know how
> to operate that. I did run
On Fri, 11 Aug 2017, Mark Rutland wrote:
> > This isn't some key thing that needs to be fixed, I was just curious about
> > the behavior difference between x86 and ARM.
>
> Sure; likewise I'm curious.
well I finally got a current git 64-bit kernel booted on the pi3.
Challenge: USB known to be
On Fri, 11 Aug 2017, Mark Rutland wrote:
> IIRC, patches were sent back in 2014, but as I mentioned above, those
> were far from suitable for upstream, even ignoring cases like
> big.LITTLE. Said patches were never reworked and reposted.
Here's the commit message in the perf_event_tests tree, hav
On Fri, 11 Aug 2017, Mark Rutland wrote:
> IIUC by 'rdpmc' you mean direct userspace counter access?
>
> Patches for that never made it upstream. Last I saw, there were no
> patches in a suitable state for review.
yes, someone from Linaro sent me some code a while back that implemented
the user
So I was working on my perf_event_tests on ARM/ARM64 (the end goal was to
get ARM64 rdpmc support working, but apparently those patches never made
it upstream?)
anyway one test was failing due to an x86/arm difference, which is
possibly only tangentially perf related.
On x86 you can mmap() a
On Fri, 4 Aug 2017, Peter Zijlstra wrote:
> Testing if userspace rdpmc reads are supported... NEW BEHAVIOR
> Testing if rdpmc fallback works on sw events...PASSED
> Testing if userspace rdpmc reads give expected results... PASSED
>
> is that 'NEW BEHAVIOR' thing someth
On Wed, 2 Aug 2017, Peter Zijlstra wrote:
> Playing with that test it really is the IOC_DISABLE while STOP'ed that
> messes things up.
>
> Ah.. magic.. the below seems to fix things, hopefully it doesn't break
> anything else.
yes, I've tested this and it seems to fix things.
With both this and
On Wed, 2 Aug 2017, Peter Zijlstra wrote:
> On Wed, Jul 26, 2017 at 03:39:01PM -0400, Vince Weaver wrote:
> > In fact, current->mm->context.perf_rdpmc_allowed goes negative which seems
> > like it shouldn't happen?
>
> Good find that...
>
> The below
On Tue, 1 Aug 2017, Naveen N. Rao wrote:
> Add a new option 'signal_on_wakeup' to request for a signal to be
> delivered on ring buffer wakeup controlled through watermark and
> {wakeup_events, wakeup_watermark}. HUP is signaled on exit.
>
> Setting signal_on_wakeup disables use of IOC_REFRESH to
Hello
so one last bug found by the PAPI testsuite.
This one involves the rdpmc auto-disable on last unmap of an event
feature.
Failing test case:
fd=perf_event_open();
addr=mmap(fd);
exec() // without closing or unmapping the event
fd=perf_event_open();