Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-17 Thread Bell, Bryan J
ehalf Of Daniel Vetter
Sent: Tuesday, September 17, 2013 12:28 AM
To: Widawsky, Benjamin
Cc: Bell, Bryan J; intel-gfx@lists.freedesktop.org; Venkatesh, Vishnu
Subject: Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

On Tue, Sep 17, 2013 at 6:15 AM, Ben Widawsky wrote:
>

Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-17 Thread Daniel Vetter
On Tue, Sep 17, 2013 at 6:15 AM, Ben Widawsky wrote:
> I see. I had thought the hang bit was part of the test injection, when
> it's actually modifying the behavior of L3 errors. Any opinions on
> what the default should be (agreed that policy should be controlled by
> user space, but we can contr

Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-17 Thread Ben Widawsky
I see. I had thought the hang bit was part of the test injection, when it's actually modifying the behavior of L3 errors. Any opinions on what the default should be (agreed that policy should be controlled by user space, but we can control the default)? What does a "hang" mean exactly, is the rest

Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-16 Thread Bell, Bryan J
The "hang" injection is for scenarios like:
(1) L3 error occurs
(2) Workload completion, reported to user mode driver, e.g. OpenCL
(3) L3 error interrupt, handled.
If (2) occurs before (3), it's possible to report that a GPGPU workload successfully completed when in fact it did not due to the
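
The ordering hazard in steps (1) to (3) can be sketched as a toy model in C. This is only an illustration under stated assumptions, not i915 code: the enum `l3_event`, its values, and the function `completion_reported_before_error_handled` are hypothetical names made up for this example. It simply checks whether a completion is reported to user mode while an L3 error has occurred but its interrupt has not yet been handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event model for illustration only; not taken from the driver. */
enum l3_event {
	EV_L3_ERROR,            /* (1) L3 error occurs */
	EV_COMPLETION_REPORTED, /* (2) completion reported to the user mode driver */
	EV_L3_IRQ_HANDLED       /* (3) L3 error interrupt handled */
};

/*
 * Walk an event sequence in order and return true if a completion is
 * reported while an L3 error is still unhandled, i.e. (2) happens
 * before (3) after (1) has occurred.
 */
static bool completion_reported_before_error_handled(const enum l3_event *ev,
						     size_t n)
{
	bool error_pending = false;

	for (size_t i = 0; i < n; i++) {
		switch (ev[i]) {
		case EV_L3_ERROR:
			error_pending = true;
			break;
		case EV_L3_IRQ_HANDLED:
			error_pending = false;
			break;
		case EV_COMPLETION_REPORTED:
			if (error_pending)
				return true; /* user mode may see a false "success" */
			break;
		}
	}
	return false;
}
```

With the sequence error, completion, interrupt handled, the function returns true; with error, interrupt handled, completion, it returns false.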

Re: [Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-13 Thread Ville Syrjälä
For patches 1, 2, 3, 6, 7: Reviewed-by: Ville Syrjälä
-- 
Ville Syrjälä
Intel OTC

[Intel-gfx] [PATCH 0/8] DPF (GPU l3 parity detection) improvements

2013-09-12 Thread Ben Widawsky
Since IVB, our driver has supported GPU L3 cacheline remapping for parity errors. This is known as "DPF", for Dynamic Parity Feature. I am told such an error is a good predictor for a subsequent error in the same part of the cache. To address this possible issue for workloads requiring precise and