Francisco, > -----Original Message----- > From: Francisco Jerez [mailto:curroje...@riseup.net] > Sent: Friday, July 21, 2017 12:21 AM > To: Marathe, Yogesh <yogesh.mara...@intel.com>; Muthukumar, Aravindan > <aravindan.muthuku...@intel.com>; mesa-dev@lists.freedesktop.org > Cc: Muthukumar, Aravindan <aravindan.muthuku...@intel.com> > Subject: RE: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks > > "Marathe, Yogesh" <yogesh.mara...@intel.com> writes: > > > Francisco, > > > >> -----Original Message----- > >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > >> Behalf Of Francisco Jerez > >> Sent: Thursday, July 20, 2017 10:51 PM > >> To: Muthukumar, Aravindan <aravindan.muthuku...@intel.com>; mesa- > >> d...@lists.freedesktop.org > >> Cc: Muthukumar, Aravindan <aravindan.muthuku...@intel.com> > >> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag > >> checks > >> > >> aravindan.muthuku...@intel.com writes: > >> > >> > From: Aravindan Muthukumar <aravindan.muthuku...@intel.com> > >> > > >> > This patch improves CPI Rate(Cycles per Instruction) and branch > >> > mispredict for i965. The function check_state() was showing CPI > >> > retired rate. > >> > > >> > Performance stats with android: > >> > CPI retired lowered by 28% (lower is better) Branch missprediction > >> > lowered by 13% (lower is better) 3DMark improved by 2% > >> > > >> > The dissassembly doesn't show difference, although above results > >> > were observed with patch. > >> > > >> > >> How did you determine that your results are not just statistical noise? > > > > No its not statistical noise. As commit msg mentions, we used metrics > > CPI retired rate, utilization, branch miss predict as metrics, these can be > measured using SEP on IA. > > It essentially enables event based sampling and we can measure these through > counters. > > > > How much variance do these metrics have? (particularly the overall score of > the > benchmark) How many times did you run the benchmark? >
2% to be exact, other stats are also present in commit message, the benchmark was run at least 5 times before concluding and more than that during experimenting. > > When we did the analysis of tests we were running, we found > > brw_upload_pipeline_state->check_state functions having bad CPI rates > > and hence we made changed there. The intention was always to reduce > > driver overhead, although this is miniscule effort. > > > >> Did you do some sort of significance testing? Which test, > >> significance level and sample size did you use? > > > > Sorry this is not something we have done, we tested on android > > functionality and perf only. Performance benchmark 3dmark and overall > > stability of the android system were used as tests. Kindly let us know > > if you have specific tests to be run and we would be happy to run that. > > > > What CPU did you get the numbers on? > > > > Broxton. > > > >> > >> Thank you. > >> > >> > Signed-off-by: Aravindan Muthukumar > >> > <aravindan.muthuku...@intel.com> > >> > Signedd-off-by: Yogesh Marathe <yogesh.mara...@intel.com> > >> > Tested-by: Asish <as...@intel.com> > >> > --- > >> > > >> > Changes since V1: > >> > - Removed memset() change > >> > - Changed commit message as per review comments > >> > > >> > src/mesa/drivers/dri/i965/brw_defines.h | 4 ++++ > >> > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 ++++++++---- > >> > 2 files changed, 12 insertions(+), 4 deletions(-) > >> > > >> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > >> > b/src/mesa/drivers/dri/i965/brw_defines.h > >> > index 2a8dbf8..8c9a510 100644 > >> > --- a/src/mesa/drivers/dri/i965/brw_defines.h > >> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > >> > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode > { # > >> > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4) > >> > > >> > #endif > >> > + > >> > +/* Checking the state of mesa and brw before emitting atoms */ > >> > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw)) > >> > + > >> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > b/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > index acaa97e..1c8b969 100644 > >> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > >> > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw, > >> > struct brw_state_flags *state, > >> > const struct brw_tracked_state *atom) { > >> > - if (check_state(state, &atom->dirty)) { > >> > atom->emit(brw); > >> > merge_ctx_state(brw, state); > >> > - } > >> > } > >> > > >> > static inline void > >> > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context > *brw, > >> > const struct brw_tracked_state *atom = &atoms[i]; > >> > struct brw_state_flags generated; > >> > > >> > - check_and_emit_atom(brw, &state, atom); > >> > + /* Checking the state and emitting atoms */ > >> > + if (CHECK_BRW_STATE(state, atom->dirty)) { > >> > + check_and_emit_atom(brw, &state, atom); > >> > + } > >> > > >> > accumulate_state(&examined, &atom->dirty); > >> > > >> > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context > *brw, > >> > for (i = 0; i < num_atoms; i++) { > >> > const struct brw_tracked_state *atom = &atoms[i]; > >> > > >> > - check_and_emit_atom(brw, &state, atom); > >> > + /* Checking the state and emitting atoms */ > >> > + if (CHECK_BRW_STATE(state, atom->dirty)) { > >> > + check_and_emit_atom(brw, &state, atom); > >> > + } > >> > } > >> > } > >> > > >> > -- > >> > 2.7.4 > >> > > >> > _______________________________________________ > >> > mesa-dev mailing list > >> > mesa-dev@lists.freedesktop.org > >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev