On Fri, Mar 08, 2019 at 11:19:46AM +0100, Martin Liška wrote:
> PING^1
Collected Boris's Acked-by and the original cset commit log and the patch at the end of the message, next time please resubmit with a:

[PATCH v2] proper summary

proper description

collect acks

s-o-b

Thanks,

- Arnaldo

> On 2/13/19 12:19 PM, Martin Liška wrote:
> > Hi.
> >
> > I'm sending an updated version of the patch. I'm going to document particular
> > changes in quotes below:
> >
> > On 8/23/18 6:16 PM, William Cohen wrote:
> >> On 08/23/2018 10:31 AM, Arnaldo Carvalho de Melo wrote:
> >>> On Thu, Aug 23, 2018 at 01:21:45PM +0200, Martin Liška wrote:
> >>>> May I please ping this.
> >>> I was waiting for someone to give some ack, perhaps Will Cohen can take
> >>> a brief look and provide that? Will?
> >>>
> >>> Thanks,
> >>>
> >>> - Arnaldo
> >>>
> >>>> Thanks,
> >>>> Martin
> >>>>
> >>>> On 08/06/2018 10:42 AM, Martin Liška wrote:
> >>>>> Hello.
> >>>>>
> >>>>> Following patch adds PMC events for AMD Family 17 CPUs as defined in
> >>>>> [1].
> >>>>> It covers events described in section: 2.1.13. Regex pattern in
> >>>>> mapfile.csv
> >>>>> covers all CPUs of the family.
> >>>>>
> >>>>> Thanks,
> >>>>> Martin
> >>>>>
> >>>>> [1]
> >>>>> https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
> >>>>>
> >>>>> Signed-off-by: Martin Liška <mli...@suse.cz>
> >>>>>
> >>>>> ---
> >>>>>  .../pmu-events/arch/x86/amdfam17h/cache.json  | 332 ++++++++++++++++++
> >>>>>  .../pmu-events/arch/x86/amdfam17h/core.json   | 124 +++++++
> >>>>>  .../arch/x86/amdfam17h/floating-point.json    | 196 +++++++++++
> >>>>>  .../pmu-events/arch/x86/amdfam17h/memory.json | 225 ++++++++++++
> >>>>>  .../pmu-events/arch/x86/amdfam17h/other.json  |  51 +++
> >>>>>  tools/perf/pmu-events/arch/x86/mapfile.csv    |   1 +
> >>>>>  6 files changed, 929 insertions(+)
> >>>>>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
> >>>>>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/core.json
> >>>>>  create mode 100644
> >>>>> tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
> >>>>>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/memory.json
> >>>>>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/other.json
> >>>>>
> >>>>>
> >> Hi,
> >>
> >> I had already deleted the patch from my mailbox earlier, so I downloaded
> >> the patch from the archive and added some inline comments to the attached
> >> patch.
> >>
> >> -Will
> >>
> >>
> >>
> >> Hello.
> >>
> >> Following patch adds PMC events for AMD Family 17 CPUs as defined in [1].
> >> It covers events described in section: 2.1.13. Regex pattern in mapfile.csv
> >> covers all CPUs of the family.
> >>
> >> Thanks,
> >> Martin
> >>
> >> [1]
> >> https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
> >>
> >> Signed-off-by: Martin Liška <mli...@suse.cz>
> >>
> >> ---
> >>  .../pmu-events/arch/x86/amdfam17h/cache.json  | 332 ++++++++++++++++++
> >>  .../pmu-events/arch/x86/amdfam17h/core.json   | 124 +++++++
> >>  .../arch/x86/amdfam17h/floating-point.json    | 196 +++++++++++
> >>  .../pmu-events/arch/x86/amdfam17h/memory.json | 225 ++++++++++++
> >>  .../pmu-events/arch/x86/amdfam17h/other.json  |  51 +++
> >>  tools/perf/pmu-events/arch/x86/mapfile.csv    |   1 +
> >>  6 files changed, 929 insertions(+)
> >>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
> >>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/core.json
> >>  create mode 100644
> >> tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
> >>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/memory.json
> >>  create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/other.json
> >>
> >>
> >>
> >> --------------DD285E7CC6B09B0E203385F4
> >> Content-Type: text/x-patch;
> >> name="0001-AMD-perf-PMU-eventts-for-AMD-Family-17h.patch"
> >> Content-Transfer-Encoding: 7bit
> >> Content-Disposition: attachment;
> >> filename="0001-AMD-perf-PMU-eventts-for-AMD-Family-17h.patch"
> >>
> >> diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
> >> b/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
> >> new file mode 100644
> >> index 000000000000..6a41cc9d1d5e
> >> --- /dev/null
> >> +++ b/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
> >> @@ -0,0 +1,332 @@
> >> +[
> >> +  {
> >> +    "EventName": "ic_fw32",
> >> +    "EventCode": "0x80",
> >> +    "BriefDescription": "The number of 32B fetch windows transferred from
> >> IC pipe to DE instruction decoder (includes non-cacheable and cacheable
> >> fill responses)."
> >> +  },
> >> +  {
> >> +    "EventName": "ic_fw32_miss",
> >> +    "EventCode": "0x81",
> >> +    "BriefDescription": "The number of 32B fetch windows tried to read
> >> the L1 IC and missed in the full tag."
> >> +  },
> >> +  {
> >> +    "EventName": "ic_cache_fill_l2",
> >> +    "EventCode": "0x82",
> >> +    "BriefDescription": "The number of 64 byte instruction cache line was
> >> fulfilled from the L2 cache."
> >> +  },
> >> +  {
> >> +    "EventName": "ic_cache_fill_sys",
> >> +    "EventCode": "0x83",
> >> +    "BriefDescription": "The number of 64 byte instruction cache line
> >> fulfilled from system memory or another cache."
> >> +  },
> >> +  {
> >> +    "EventName": "bp_l1_tlb_miss_l2_hit",
> >> +    "EventCode": "0x84",
> >> +    "BriefDescription": "The number of instruction fetches that miss in
> >> the L1 ITLB but hit in the L2 ITLB."
> >> +  },
> >> +  {
> >> +    "EventName": "bp_l1_tlb_miss_l2_miss",
> >> +    "EventCode": "0x85",
> >> +    "BriefDescription": "The number of instruction fetches that miss in
> >> both the L1 and L2 TLBs."
> >> +  },
> >> +  {
> >> +    "EventName": "bp_snp_re_sync",
> >> +    "EventCode": "0x86",
> >> +    "BriefDescription": "The number of pipeline restarts caused by
> >> invalidating probes that hit on the instruction stream currently being
> >> executed. This would happen if the active instruction stream was being
> >> modified by another processor in an MP system - typically a highly
> >> unlikely event."
> >> +  },
> >> +  {
> >> +    "EventName": "ic_fetch_stall.ic_stall_any",
> >> +    "EventCode": "0x87",
> >> +    "BriefDescription": "IC pipe was stalled during this clock cycle for
> >> any reason (nothing valid in pipe ICM1).",
> >> +    "PublicDescription": "Instruction Pipe Stall.
IC pipe was stalled
> >> during this clock cycle for any reason (nothing valid in pipe ICM1).",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "ic_fetch_stall.ic_stall_dq_empty",
> >> +    "EventCode": "0x87",
> >> +    "BriefDescription": "IC pipe was stalled during this clock cycle
> >> (including IC to OC fetches) due to DQ empty.",
> >> +    "PublicDescription": "Instruction Pipe Stall. IC pipe was stalled
> >> during this clock cycle (including IC to OC fetches) due to DQ empty.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "ic_fetch_stall.ic_stall_back_pressure",
> >> +    "EventCode": "0x87",
> >> +    "BriefDescription": "IC pipe was stalled during this clock cycle
> >> (including IC to OC fetches) due to back-pressure.",
> >> +    "PublicDescription": "Instruction Pipe Stall. IC pipe was stalled
> >> during this clock cycle (including IC to OC fetches) due to
> >> back-pressure.",
> >> +    "UMask": "0x1"
> >> +  },
> >>
> >> Aren't the following bp_l1_btb_correct and bp_l2_btb_correct events branch
> >> prediction events? Shouldn't they be in a branch.json file rather than
> >> be lumped in with the cache perf events?
> >
> > Yes, moved there.
> >
> >>
> >> +  {
> >> +    "EventName": "bp_l1_btb_correct",
> >> +    "EventCode": "0x8a",
> >> +    "BriefDescription": "L1 BTB Correction."
> >> +  },
> >> +  {
> >> +    "EventName": "bp_l2_btb_correct",
> >> +    "EventCode": "0x8b",
> >> +    "BriefDescription": "L2 BTB Correction."
> >> +  },
> >> +  {
> >> +    "EventName": "ic_cache_inval.l2_invalidating_probe",
> >> +    "EventCode": "0x8c",
> >> +    "BriefDescription": "IC line invalidated due to L2 invalidating probe
> >> (external or LS).",
> >> +    "PublicDescription": "The number of instruction cache lines
> >> invalidated. A non-SMC event is CMC (cross modifying code), either from
> >> the other thread of the core or another core.
IC line invalidated due to
> >> L2 invalidating probe (external or LS).",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "ic_cache_inval.fill_invalidated",
> >> +    "EventCode": "0x8c",
> >> +    "BriefDescription": "IC line invalidated due to overwriting fill
> >> response.",
> >> +    "PublicDescription": "The number of instruction cache lines
> >> invalidated. A non-SMC event is CMC (cross modifying code), either from
> >> the other thread of the core or another core. IC line invalidated due to
> >> overwriting fill response.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "bp_tlb_rel",
> >> +    "EventCode": "0x99",
> >> +    "BriefDescription": "The number of ITLB reload requests."
> >> +  },
> >>
> >> The AMD documentation isn't really clear about what
> >> ic_oc_mode_switch.oc_ic_mode_switch and
> >> ic_oc_mode_switch.ic_oc_mode_switch do. Should these two events go into
> >> other.json?
> >
> > Yes, done.
> >
> >>
> >> +  {
> >> +    "EventName": "ic_oc_mode_switch.oc_ic_mode_switch",
> >> +    "EventCode": "0x28a",
> >> +    "BriefDescription": "OC to IC mode switch.",
> >> +    "PublicDescription": "OC Mode Switch. OC to IC mode switch.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "ic_oc_mode_switch.ic_oc_mode_switch",
> >> +    "EventCode": "0x28a",
> >> +    "BriefDescription": "IC to OC mode switch.",
> >> +    "PublicDescription": "OC Mode Switch.
IC to OC mode switch.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.rd_blk_l",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x80"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.rd_blk_x",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x40"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.ls_rd_blk_c_s",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x20"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.cacheable_ic_read",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x10"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.change_to_x",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x8"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.prefetch_l2",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.l2_hw_pf",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Requests to L2 Group1.",
> >> +    "PublicDescription": "Requests to L2 Group1.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g1.other_requests",
> >> +    "EventCode": "0x60",
> >> +    "BriefDescription": "Events covered by l2_request_g2.",
> >> +    "PublicDescription": "Requests to L2 Group1.
Events covered by
> >> l2_request_g2.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.group1",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "All Group 1 commands not in unit0.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous. All Group 1 commands not in unit0.",
> >> +    "UMask": "0x80"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.ls_rd_sized",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "RdSized, RdSized32, RdSized64.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous. RdSized, RdSized32, RdSized64.",
> >> +    "UMask": "0x40"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.ls_rd_sized_nc",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "RdSizedNC, RdSized32NC, RdSized64NC.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous. RdSizedNC, RdSized32NC, RdSized64NC.",
> >> +    "UMask": "0x20"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.ic_rd_sized",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "UMask": "0x10"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.ic_rd_sized_nc",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "UMask": "0x8"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.smc_inval",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.bus_locks_originator",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_request_g2.bus_locks_responses",
> >> +    "EventCode": "0x61",
> >> +    "BriefDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "PublicDescription": "Multi-events in that LS and IF requests can be
> >> received simultaneous.",
> >> +    "UMask": "0x1"
> >> +  },
> >>
> >> The following event brief description for l2_latency is too long. For
> >> this description, there is no way to program the l2_request_g1 unit mask
> >> to be FEh. The l2_request_g1 (and other events') configurations only
> >> allow setting a single bit.
> >
> > Simplified.
> >
> >>
> >> +  {
> >> +    "EventName": "l2_latency.l2_cycles_waiting_on_fills",
> >> +    "EventCode": "0x62",
> >> +    "BriefDescription": "Total cycles spent waiting for L2 fills to
> >> complete from L3 or memory, divided by four. This may be used to calculate
> >> average latency by multiplying this count by four and then dividing by the
> >> total number of L2 fills (unit mask l2_request_g1 == FEh). Event counts
> >> are for both threads. To calculate average latency, the number of fills
> >> from both threads must be used.",
> >> +    "PublicDescription": "Total cycles spent waiting for L2 fills to
> >> complete from L3 or memory, divided by four. This may be used to calculate
> >> average latency by multiplying this count by four and then dividing by the
> >> total number of L2 fills (unit mask l2_request_g1 == FEh). Event counts
> >> are for both threads.
To calculate average latency, the number of fills
> >> from both threads must be used.",
> >> +    "UMask": "0x1"
> >> +  },
> >>
> >> The AMD manual doesn't provide much detail, but are the following
> >> l2_wbc_req.* events supposed to have identical *Description sections?
> >
> > I reworded (and renamed slightly) that based on discussion with Linux
> > x86_64 port maintainer Boris Petkov.
> >
> >>
> >> +  {
> >> +    "EventName": "l2_wbc_req.wcb_write",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "LS to L2 WBC requests.",
> >> +    "PublicDescription": "LS to L2 WBC requests.",
> >> +    "UMask": "0x40"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.wcb_close",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "LS to L2 WBC requests.",
> >> +    "PublicDescription": "LS to L2 WBC requests.",
> >> +    "UMask": "0x20"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.cache_line_flush",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "LS to L2 WBC requests.",
> >> +    "PublicDescription": "LS to L2 WBC requests.",
> >> +    "UMask": "0x10"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.i_line_flush",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "LS to L2 WBC requests.",
> >> +    "PublicDescription": "LS to L2 WBC requests.",
> >> +    "UMask": "0x8"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.zero_byte_store",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "This becomes WriteNoData at SDP; this count does
> >> not include DVM Sync Ops and bus locks which are counted in
> >> l2_request_g2.",
> >> +    "PublicDescription": "LS to L2 WBC requests. This becomes WriteNoData
> >> at SDP; this count does not include DVM Sync Ops and bus locks which are
> >> counted in l2_request_g2.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.local_ic_clr",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "Local IC Clear.",
> >> +    "PublicDescription": "LS to L2 WBC requests.
Local IC Clear.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_wbc_req.cl_zero",
> >> +    "EventCode": "0x63",
> >> +    "BriefDescription": "Cache Line Zero.",
> >> +    "PublicDescription": "LS to L2 WBC requests. Cache Line Zero.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ls_rd_blk_cs",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "LS ReadBlock C/S Hit.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. LS ReadBlock C/S Hit.",
> >> +    "UMask": "0x80"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_x",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "LS Read Block L Hit X.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. LS Read Block L Hit X.",
> >> +    "UMask": "0x40"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_s",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "LsRdBlkL Hit Shared.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. LsRdBlkL Hit Shared.",
> >> +    "UMask": "0x20"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ls_rd_blk_x",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "LsRdBlkX/ChgToX Hit X. Count RdBlkX finding
> >> Shared as a Miss.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. LsRdBlkX/ChgToX Hit X.
Count RdBlkX finding Shared as a
> >> Miss.",
> >> +    "UMask": "0x10"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ls_rd_blk_c",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "LS Read Block C S L X Change to X Miss.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. LS Read Block C S L X Change to X Miss.",
> >> +    "UMask": "0x8"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ic_fill_hit_x",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "IC Fill Hit Exclusive Stale.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. IC Fill Hit Exclusive Stale.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ic_fill_hit_s",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "IC Fill Hit Shared.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher. IC Fill Hit Shared.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_cache_req_stat.ic_fill_miss",
> >> +    "EventCode": "0x64",
> >> +    "BriefDescription": "IC Fill Miss.",
> >> +    "PublicDescription": "This event does not count accesses to the L2
> >> cache by the L2 prefetcher, but it does count accesses by the L1
> >> prefetcher.
IC Fill Miss.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "l2_fill_pending.l2_fill_busy",
> >> +    "EventCode": "0x6d",
> >> +    "BriefDescription": "Total cycles spent with one or more fill
> >> requests in flight from L2.",
> >> +    "PublicDescription": "Total cycles spent with one or more fill
> >> requests in flight from L2.",
> >> +    "UMask": "0x1"
> >> +  }
> >> +]
> >> \ No newline at end of file
> >> diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/core.json
> >> b/tools/perf/pmu-events/arch/x86/amdfam17h/core.json
> >> new file mode 100644
> >> index 000000000000..79754a187fe5
> >> --- /dev/null
> >> +++ b/tools/perf/pmu-events/arch/x86/amdfam17h/core.json
> >> @@ -0,0 +1,124 @@
> >> +[
> >> +  {
> >> +    "EventName": "ex_ret_instr",
> >> +    "EventCode": "0xc0",
> >> +    "BriefDescription": "Retired Instructions."
> >> +  },
> >>
> >> For the following ex_ret_* events, make the BriefDescription take a form
> >> like that of ex_ret_instr above, and move the existing BriefDescription to
> >> the long description.
> >
> > Done.
> >
> >>
> >> +  {
> >> +    "EventName": "ex_ret_cops",
> >> +    "EventCode": "0xc1",
> >> +    "BriefDescription": "The number of uOps retired. This includes all
> >> processor activity (instructions, exceptions, interrupts, microcode
> >> assists, etc.). The number of events logged per cycle can vary from 0 to
> >> 4."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn",
> >> +    "EventCode": "0xc2",
> >> +    "BriefDescription": "The number of branch instructions retired. This
> >> includes all types of architectural control flow changes, including
> >> exceptions and interrupts."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_misp",
> >> +    "EventCode": "0xc3",
> >> +    "BriefDescription": "The number of branch instructions retired, of
> >> any type, that were not correctly predicted. This includes those for which
> >> prediction is not attempted (far control transfers, exceptions and
> >> interrupts)."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_tkn",
> >> +    "EventCode": "0xc4",
> >> +    "BriefDescription": "The number of taken branches that were retired.
> >> This includes all types of architectural control flow changes, including
> >> exceptions and interrupts."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_tkn_misp",
> >> +    "EventCode": "0xc5",
> >> +    "BriefDescription": "The number of retired taken branch instructions
> >> that were mispredicted."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_far",
> >> +    "EventCode": "0xc6",
> >> +    "BriefDescription": "The number of far control transfers retired
> >> including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions
> >> and interrupts. Far control transfers are not subject to branch
> >> prediction."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_resync",
> >> +    "EventCode": "0xc7",
> >> +    "BriefDescription": "The number of resync branches. These reflect
> >> pipeline restarts due to certain microcode assists and events such as
> >> writes to the active instruction stream, among other things. Each
> >> occurrence reflects a restart penalty similar to a branch mispredict. This
> >> is relatively rare."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_near_ret",
> >> +    "EventCode": "0xc8",
> >> +    "BriefDescription": "The number of near return instructions (RET or
> >> RET Iw) retired."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_near_ret_mispred",
> >> +    "EventCode": "0xc9",
> >> +    "BriefDescription": "The number of near returns retired that were not
> >> correctly predicted by the return address predictor. Each such mispredict
> >> incurs the same penalty as a mispredicted conditional branch instruction."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_brn_ind_misp",
> >> +    "EventCode": "0xca",
> >> +    "BriefDescription": "Retired Indirect Branch Instructions
> >> Mispredicted."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_mmx_fp_instr.sse_instr",
> >> +    "EventCode": "0xcb",
> >> +    "BriefDescription": "SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A,
> >> SSE41, SSE42, AVX).",
> >> +    "PublicDescription": "The number of MMX, SSE or x87 instructions
> >> retired. The UnitMask allows the selection of the individual classes of
> >> instructions as given in the table. Each increment represents one complete
> >> instruction. Since this event includes non-numeric instructions it is not
> >> suitable for measuring MFLOPS. SSE instructions (SSE, SSE2, SSE3, SSSE3,
> >> SSE4A, SSE41, SSE42, AVX).",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_mmx_fp_instr.mmx_instr",
> >> +    "EventCode": "0xcb",
> >> +    "BriefDescription": "MMX instructions.",
> >> +    "PublicDescription": "The number of MMX, SSE or x87 instructions
> >> retired. The UnitMask allows the selection of the individual classes of
> >> instructions as given in the table. Each increment represents one complete
> >> instruction. Since this event includes non-numeric instructions it is not
> >> suitable for measuring MFLOPS. MMX instructions.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_mmx_fp_instr.x87_instr",
> >> +    "EventCode": "0xcb",
> >> +    "BriefDescription": "x87 instructions.",
> >> +    "PublicDescription": "The number of MMX, SSE or x87 instructions
> >> retired. The UnitMask allows the selection of the individual classes of
> >> instructions as given in the table. Each increment represents one complete
> >> instruction. Since this event includes non-numeric instructions it is not
> >> suitable for measuring MFLOPS. x87 instructions.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_cond",
> >> +    "EventCode": "0xd1",
> >> +    "BriefDescription": "Retired Conditional Branch Instructions."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_cond_misp",
> >> +    "EventCode": "0xd2",
> >> +    "BriefDescription": "Retired Conditional Branch Instructions
> >> Mispredicted."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_div_busy",
> >> +    "EventCode": "0xd3",
> >> +    "BriefDescription": "Div Cycles Busy count."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_div_count",
> >> +    "EventCode": "0xd4",
> >> +    "BriefDescription": "Div Op Count."
> >> +  },
> >> +  {
> >> +    "EventName": "ex_tagged_ibs_ops.ibs_count_rollover",
> >> +    "EventCode": "0x1cf",
> >> +    "BriefDescription": "Number of times an op could not be tagged by IBS
> >> because of a previous tagged op that has not retired.",
> >> +    "PublicDescription": "Tagged IBS Ops. Number of times an op could not
> >> be tagged by IBS because of a previous tagged op that has not retired.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops_ret",
> >> +    "EventCode": "0x1cf",
> >> +    "BriefDescription": "Number of Ops tagged by IBS that retired.",
> >> +    "PublicDescription": "Tagged IBS Ops. Number of Ops tagged by IBS
> >> that retired.",
> >> +    "UMask": "0x2"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops",
> >> +    "EventCode": "0x1cf",
> >> +    "BriefDescription": "Number of Ops tagged by IBS.",
> >> +    "PublicDescription": "Tagged IBS Ops. Number of Ops tagged by IBS.",
> >> +    "UMask": "0x1"
> >> +  },
> >> +  {
> >> +    "EventName": "ex_ret_fus_brnch_inst",
> >> +    "EventCode": "0x1d0",
> >> +    "BriefDescription": "The number of fused retired branch instructions
> >> retired per cycle. The number of events logged per cycle can vary from 0
> >> to 3."
> >> +  }
> >> +]
> >> \ No newline at end of file
> >> diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
> >> b/tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
> >> new file mode 100644
> >> index 000000000000..529e95c2d4bb
> >> --- /dev/null
> >> +++ b/tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
> >> @@ -0,0 +1,196 @@
> >>
> >> For the fpu_pipe_assignment.* events, does it make sense to just allow
> >> measurement of one pipe at a time? Seems like the likely use cases would
> >> be 0xf0 (dual, all multi-pipe uOps) and 0x0f (total, total number of
> >> uOps). Are people really going to care about the number of uOps to Pipe3 vs
> >> Pipe0?
> >
> > Done.
> >
> >>
> >> +[
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.dual3",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number multi-pipe uOps assigned to Pipe
> >> 3.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization. This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number multi-pipe uOps assigned to Pipe 3.",
> >> +    "UMask": "0x80"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.dual2",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number multi-pipe uOps assigned to Pipe
> >> 2.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization.
This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number multi-pipe uOps assigned to Pipe 2.",
> >> +    "UMask": "0x40"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.dual1",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number multi-pipe uOps assigned to Pipe
> >> 1.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization. This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number multi-pipe uOps assigned to Pipe 1.",
> >> +    "UMask": "0x20"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.dual0",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number multi-pipe uOps assigned to Pipe
> >> 0.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization. This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number multi-pipe uOps assigned to Pipe 0.",
> >> +    "UMask": "0x10"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.total3",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number uOps assigned to Pipe 3.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization. This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number uOps assigned to Pipe 3.",
> >> +    "UMask": "0x8"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.total2",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number uOps assigned to Pipe 2.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines. This event
> >> reflects how busy the FPU pipelines are and may be used for workload
> >> characterization. This includes all operations performed by x87, MMXTM,
> >> and SSE instructions, including moves. Each increment represents a one-
> >> cycle dispatch event. This event is a speculative event. Since this event
> >> includes non-numeric operations it is not suitable for measuring MFLOPS.
> >> Total number uOps assigned to Pipe 2.",
> >> +    "UMask": "0x4"
> >> +  },
> >> +  {
> >> +    "EventName": "fpu_pipe_assignment.total1",
> >> +    "EventCode": "0x00",
> >> +    "BriefDescription": "Total number uOps assigned to Pipe 1.",
> >> +    "PublicDescription": "The number of operations (uOps) and dual-pipe
> >> uOps dispatched to each of the 4 FPU execution pipelines.
This event > >> reflects how busy the FPU pipelines are and may be used for workload > >> characterization. This includes all operations performed by x87, MMXTM, > >> and SSE instructions, including moves. Each increment represents a one- > >> cycle dispatch event. This event is a speculative event. Since this event > >> includes non-numeric operations it is not suitable for measuring MFLOPS. > >> Total number uOps assigned to Pipe 1.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "fpu_pipe_assignment.total0", > >> + "EventCode": "0x00", > >> + "BriefDescription": "Total number uOps assigned to Pipe 0.", > >> + "PublicDescription": "The number of operations (uOps) and dual-pipe > >> uOps dispatched to each of the 4 FPU execution pipelines. This event > >> reflects how busy the FPU pipelines are and may be used for workload > >> characterization. This includes all operations performed by x87, MMXTM, > >> and SSE instructions, including moves. Each increment represents a one- > >> cycle dispatch event. This event is a speculative event. Since this event > >> includes non-numeric operations it is not suitable for measuring MFLOPS. > >> Total number uOps assigned to Pipe 0.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "fp_sched_empty", > >> + "EventCode": "0x01", > >> + "BriefDescription": "This is a speculative event. The number of > >> cycles in which the FPU scheduler is empty. Note that some Ops like FP > >> loads bypass the scheduler." > >> + }, > >> > >> For fp_retx86_fp_ops, would it be possible to have a setting for all event > >> in addition to the individual flags? > > > > Likewise done. > > > >> > >> + { > >> + "EventName": "fp_retx87_fp_ops.div_sqr_r_ops", > >> + "EventCode": "0x02", > >> + "BriefDescription": "Divide and square root Ops.", > >> + "PublicDescription": "The number of x87 floating-point Ops that have > >> retired. The number of events logged per cycle can vary from 0 to 8. 
> >> Divide and square root Ops.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "fp_retx87_fp_ops.mul_ops", > >> + "EventCode": "0x02", > >> + "BriefDescription": "Multiply Ops.", > >> + "PublicDescription": "The number of x87 floating-point Ops that have > >> retired. The number of events logged per cycle can vary from 0 to 8. > >> Multiply Ops.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "fp_retx87_fp_ops.add_sub_ops", > >> + "EventCode": "0x02", > >> + "BriefDescription": "Add/subtract Ops.", > >> + "PublicDescription": "The number of x87 floating-point Ops that have > >> retired. The number of events logged per cycle can vary from 0 to 8. > >> Add/subtract Ops.", > >> + "UMask": "0x1" > >> + }, > >> > >> For fp_ret_sse_avx_ops, would like to have a umask setting for all the > >> events sub events it can measure. > > > > Likewise done. > > > >> > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.dp_mult_add_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Double precision multiply-add FLOPS. > >> Multiply-add counts as 2 FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Double precision multiply-add > >> FLOPS. Multiply-add counts as 2 FLOPS.", > >> + "UMask": "0x80" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.dp_div_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Double precision divide/square root FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. 
Double precision divide/square > >> root FLOPS.", > >> + "UMask": "0x40" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.dp_mult_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Double precision multiply FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Double precision multiply FLOPS.", > >> + "UMask": "0x20" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.dp_add_sub_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Double precision add/subtract FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Double precision add/subtract > >> FLOPS.", > >> + "UMask": "0x10" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.sp_mult_add_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Single precision multiply-add FLOPS. > >> Multiply-add counts as 2 FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Single precision multiply-add > >> FLOPS. Multiply-add counts as 2 FLOPS.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.sp_div_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Single-precision divide/square root FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. 
Single-precision divide/square > >> root FLOPS.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.sp_mult_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Single-precision multiply FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Single-precision multiply FLOPS.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "fp_ret_sse_avx_ops.sp_add_sub_flops", > >> + "EventCode": "0x03", > >> + "BriefDescription": "Single-precision add/subtract FLOPS.", > >> + "PublicDescription": "This is a retire-based event. The number of > >> retired SSE/AVX FLOPS. The number of events logged per cycle can vary from > >> 0 to 64. This event can count above 15. Single-precision add/subtract > >> FLOPS.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "fp_num_mov_elim_scal_op.optimized", > >> + "EventCode": "0x04", > >> + "BriefDescription": "Number of Scalar Ops optimized.", > >> + "PublicDescription": "This is a dispatch based speculative event, and > >> is useful for measuring the effectiveness of the Move elimination and > >> Scalar code optimization schemes. Number of Scalar Ops optimized.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "fp_num_mov_elim_scal_op.opt_potential", > >> + "EventCode": "0x04", > >> + "BriefDescription": "Number of Ops that are candidates for > >> optimization (have Z-bit either set or pass).", > >> + "PublicDescription": "This is a dispatch based speculative event, and > >> is useful for measuring the effectiveness of the Move elimination and > >> Scalar code optimization schemes. 
Number of Ops that are candidates for > >> optimization (have Z-bit either set or pass).", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops_elim", > >> + "EventCode": "0x04", > >> + "BriefDescription": "Number of SSE Move Ops eliminated.", > >> + "PublicDescription": "This is a dispatch based speculative event, and > >> is useful for measuring the effectiveness of the Move elimination and > >> Scalar code optimization schemes. Number of SSE Move Ops eliminated.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops", > >> + "EventCode": "0x04", > >> + "BriefDescription": "Number of SSE Move Ops.", > >> + "PublicDescription": "This is a dispatch based speculative event, and > >> is useful for measuring the effectiveness of the Move elimination and > >> Scalar code optimization schemes. Number of SSE Move Ops.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "fp_retired_ser_ops.x87_ctrl_ret", > >> + "EventCode": "0x05", > >> + "BriefDescription": "x87 control word mispredict traps due to > >> mispredictions in RC or PC, or changes in mask bits.", > >> + "PublicDescription": "The number of serializing Ops retired. x87 > >> control word mispredict traps due to mispredictions in RC or PC, or > >> changes in mask bits.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "fp_retired_ser_ops.x87_bot_ret", > >> + "EventCode": "0x05", > >> + "BriefDescription": "x87 bottom-executing uOps retired.", > >> + "PublicDescription": "The number of serializing Ops retired. x87 > >> bottom-executing uOps retired.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "fp_retired_ser_ops.sse_ctrl_ret", > >> + "EventCode": "0x05", > >> + "BriefDescription": "SSE control word mispredict traps due to > >> mispredictions in RC, FTZ or DAZ, or changes in mask bits.", > >> + "PublicDescription": "The number of serializing Ops retired. 
SSE > >> control word mispredict traps due to mispredictions in RC, FTZ or DAZ, or > >> changes in mask bits.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "fp_retired_ser_ops.sse_bot_ret", > >> + "EventCode": "0x05", > >> + "BriefDescription": "SSE bottom-executing uOps retired.", > >> + "PublicDescription": "The number of serializing Ops retired. SSE > >> bottom-executing uOps retired.", > >> + "UMask": "0x1" > >> + } > >> +] > >> \ No newline at end of file > >> diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/memory.json > >> b/tools/perf/pmu-events/arch/x86/amdfam17h/memory.json > >> new file mode 100644 > >> index 000000000000..15678880f90b > >> --- /dev/null > >> +++ b/tools/perf/pmu-events/arch/x86/amdfam17h/memory.json > >> @@ -0,0 +1,225 @@ > >> +[ > >> > >> Is "Unit Masks ORed." really the description for ls_locks.*? That looks > >> like a documentation error in the AMD manual. > > > > Boris recommended removing all except the bus_lock sub-event. > > > >> > >> + { > >> + "EventName": "ls_locks.spec_lock_map_commit", > >> + "EventCode": "0x25", > >> + "BriefDescription": "Unit Masks ORed.", > >> + "PublicDescription": "Unit Masks ORed.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "ls_locks.spec_lock", > >> + "EventCode": "0x25", > >> + "BriefDescription": "Unit Masks ORed.", > >> + "PublicDescription": "Unit Masks ORed.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_locks.non_spec_lock", > >> + "EventCode": "0x25", > >> + "BriefDescription": "Unit Masks ORed.", > >> + "PublicDescription": "Unit Masks ORed.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_locks.bus_lock", > >> + "EventCode": "0x25", > >> + "BriefDescription": "Unit Masks ORed.", > >> + "PublicDescription": "Unit Masks ORed.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "ls_dispatch.ld_st_dispatch", > >> + "EventCode": "0x29", > >> + "BriefDescription": "Load-op-Stores.", > >> + "PublicDescription": "Counts
the number of operations dispatched to > >> the LS unit. Unit Masks ADDed. Load-op-Stores.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_dispatch.store_dispatch", > >> + "EventCode": "0x29", > >> + "BriefDescription": "Counts the number of operations dispatched to > >> the LS unit. Unit Masks ADDed.", > >> + "PublicDescription": "Counts the number of operations dispatched to > >> the LS unit. Unit Masks ADDed.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_dispatch.ld_dispatch", > >> + "EventCode": "0x29", > >> + "BriefDescription": "Counts the number of operations dispatched to > >> the LS unit. Unit Masks ADDed.", > >> + "PublicDescription": "Counts the number of operations dispatched to > >> the LS unit. Unit Masks ADDed.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "ls_stlf", > >> + "EventCode": "0x35", > >> + "BriefDescription": "Number of STLF hits." > >> + }, > >> + { > >> + "EventName": "ls_dc_accesses", > >> + "EventCode": "0x40", > >> + "BriefDescription": "The number of accesses to the data cache for > >> load and store references. This may include certain microcode scratchpad > >> accesses, although these are generally rare. Each increment represents an > >> eight-byte access, although the instruction may only be accessing a > >> portion of that. This event is a speculative event." > >> + }, > >> > >> Shouldn't there be some variation in the description of the > >> ls_mab_alloc_pipe.* events with the different unit masks? > > > > Boris recommended to remove these. 
> > > >> > >> + { > >> + "EventName": "ls_mab_alloc_pipe.tlb_pipe_early", > >> + "EventCode": "0x41", > >> + "BriefDescription": "MAB Allocation by Pipe.", > >> + "PublicDescription": "MAB Allocation by Pipe.", > >> + "UMask": "0x10" > >> + }, > >> + { > >> + "EventName": "ls_mab_alloc_pipe.hw_pf", > >> + "EventCode": "0x41", > >> + "BriefDescription": "MAB Allocation by Pipe.", > >> + "PublicDescription": "MAB Allocation by Pipe.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "ls_mab_alloc_pipe.tlb_pipe_late", > >> + "EventCode": "0x41", > >> + "BriefDescription": "MAB Allocation by Pipe.", > >> + "PublicDescription": "MAB Allocation by Pipe.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_mab_alloc_pipe.st_pipe", > >> + "EventCode": "0x41", > >> + "BriefDescription": "MAB Allocation by Pipe.", > >> + "PublicDescription": "MAB Allocation by Pipe.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_mab_alloc_pipe.data_pipe", > >> + "EventCode": "0x41", > >> + "BriefDescription": "MAB Allocation by Pipe.", > >> + "PublicDescription": "MAB Allocation by Pipe.", > >> + "UMask": "0x1" > >> + }, > >> > >> Shouldn't the descriptions ls_l1_d_tlb_miss.* mention the different page > >> sizes that the different unit masks refer to? Also would it be possible > >> to have an entry count all variations of ls_l1_d_tlb_miss? > > > > Description is enhanced here and new *.all event is added. 
> > > >> > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload1_gl2_miss", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x80" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload2_ml2_miss", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x40" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload32_kl2_miss", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x20" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload4_kl2_miss", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x10" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload1_gl2_hit", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload2_ml2_hit", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload32_kl2_hit", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_l1_d_tlb_miss.tlb_reload4_kl2_hit", > >> + "EventCode": "0x45", > >> + "BriefDescription": "L1 DTLB Miss.", > >> + "PublicDescription": "L1 DTLB Miss.", > >> + "UMask": "0x1" > >> + }, > >> > >> Would it be possible to have a setting for ls_tablewalker.*iside* and > >> another setting for *dside*? > > > > Yes. 
> > > >> > >> + { > >> + "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_iside1", > >> + "EventCode": "0x46", > >> + "BriefDescription": "Tablewalker allocation.", > >> + "PublicDescription": "Tablewalker allocation.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_iside0", > >> + "EventCode": "0x46", > >> + "BriefDescription": "Tablewalker allocation.", > >> + "PublicDescription": "Tablewalker allocation.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_dside1", > >> + "EventCode": "0x46", > >> + "BriefDescription": "Tablewalker allocation.", > >> + "PublicDescription": "Tablewalker allocation.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_dside0", > >> + "EventCode": "0x46", > >> + "BriefDescription": "Tablewalker allocation.", > >> + "PublicDescription": "Tablewalker allocation.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "ls_misal_accesses", > >> + "EventCode": "0x47", > >> + "BriefDescription": "Misaligned loads." > >> + }, > >> > >> > >> The descriptions for ls_pref_instr_disp.prefetch_nta and store_prefetch_w > >> should have some differences. > > > > Fixed. > > > > That's all I modified.
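Several of the review comments above ask for one entry that counts every sub-event of a group (ls_l1_d_tlb_miss) or a subset of it (the iside and dside halves of ls_tablewalker). Because the sub-event unit masks quoted above are single, distinct bits, a combined mask can be sketched as the bitwise OR of the individual values. The Python below is only an illustration under that assumption: the helper name `combined_umask` is hypothetical, the table is copied from the quoted hunks, and the values it computes are not taken from the updated patch.

```python
from functools import reduce
from operator import or_

# (EventName, UMask) pairs copied from the quoted v1 hunks above.
SUB_EVENTS = [
    ("ls_l1_d_tlb_miss.tlb_reload1_gl2_miss", "0x80"),
    ("ls_l1_d_tlb_miss.tlb_reload2_ml2_miss", "0x40"),
    ("ls_l1_d_tlb_miss.tlb_reload32_kl2_miss", "0x20"),
    ("ls_l1_d_tlb_miss.tlb_reload4_kl2_miss", "0x10"),
    ("ls_l1_d_tlb_miss.tlb_reload1_gl2_hit", "0x8"),
    ("ls_l1_d_tlb_miss.tlb_reload2_ml2_hit", "0x4"),
    ("ls_l1_d_tlb_miss.tlb_reload32_kl2_hit", "0x2"),
    ("ls_l1_d_tlb_miss.tlb_reload4_kl2_hit", "0x1"),
    ("ls_tablewalker.perf_mon_tablewalk_alloc_iside1", "0x8"),
    ("ls_tablewalker.perf_mon_tablewalk_alloc_iside0", "0x4"),
    ("ls_tablewalker.perf_mon_tablewalk_alloc_dside1", "0x2"),
    ("ls_tablewalker.perf_mon_tablewalk_alloc_dside0", "0x1"),
]


def combined_umask(sub_events, prefix):
    """OR together the unit masks of all entries whose name starts with prefix.

    Hypothetical helper for illustration; it is not part of perf.
    """
    masks = [int(umask, 16) for name, umask in sub_events
             if name.startswith(prefix)]
    return reduce(or_, masks, 0)


if __name__ == "__main__":
    iside = "ls_tablewalker.perf_mon_tablewalk_alloc_iside"
    dside = "ls_tablewalker.perf_mon_tablewalk_alloc_dside"
    for prefix in ("ls_l1_d_tlb_miss.", iside, dside):
        print(prefix, hex(combined_umask(SUB_EVENTS, prefix)))
```

Matching on a name prefix keeps the sketch short; a real generator would more likely group entries by EventCode, since that is what ties the sub-events to one counter.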
> > > > Martin > > > >> > >> + { > >> + "EventName": "ls_pref_instr_disp.prefetch_nta", > >> + "EventCode": "0x4b", > >> + "BriefDescription": "Software Prefetch Instructions Dispatched.", > >> + "PublicDescription": "Software Prefetch Instructions Dispatched.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "ls_pref_instr_disp.store_prefetch_w", > >> + "EventCode": "0x4b", > >> + "BriefDescription": "Software Prefetch Instructions Dispatched.", > >> + "PublicDescription": "Software Prefetch Instructions Dispatched.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_pref_instr_disp.load_prefetch_w", > >> + "EventCode": "0x4b", > >> + "BriefDescription": "Prefetch, Prefetch_T0_T1_T2.", > >> + "PublicDescription": "Software Prefetch Instructions Dispatched. > >> Prefetch, Prefetch_T0_T1_T2.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "ls_inef_sw_pref.mab_mch_cnt", > >> + "EventCode": "0x52", > >> + "BriefDescription": "The number of software prefetches that did not > >> fetch data outside of the processor core.", > >> + "PublicDescription": "The number of software prefetches that did not > >> fetch data outside of the processor core.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "ls_inef_sw_pref.data_pipe_sw_pf_dc_hit", > >> + "EventCode": "0x52", > >> + "BriefDescription": "The number of software prefetches that did not > >> fetch data outside of the processor core.", > >> + "PublicDescription": "The number of software prefetches that did not > >> fetch data outside of the processor core.", > >> + "UMask": "0x1" > >> + }, > >> + { > >> + "EventName": "ls_not_halted_cyc", > >> + "EventCode": "0x76", > >> + "BriefDescription": "Cycles not in Halt." 
> >> + } > >> +] > >> \ No newline at end of file > >> diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/other.json > >> b/tools/perf/pmu-events/arch/x86/amdfam17h/other.json > >> new file mode 100644 > >> index 000000000000..03fa0d97ad3d > >> --- /dev/null > >> +++ b/tools/perf/pmu-events/arch/x86/amdfam17h/other.json > >> @@ -0,0 +1,51 @@ > >> +[ > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.retire_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "RETIRE Tokens unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. RETIRE Tokens unavailable.", > >> + "UMask": "0x40" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.agsq_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "AGSQ Tokens unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. AGSQ Tokens unavailable.", > >> + "UMask": "0x20" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.alu_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "ALU tokens total unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. 
ALU tokens total unavailable.", > >> + "UMask": "0x10" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.alsq3_0_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall.", > >> + "UMask": "0x8" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.alsq3_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "ALSQ 3 Tokens unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. ALSQ 3 Tokens unavailable.", > >> + "UMask": "0x4" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.alsq2_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "ALSQ 2 Tokens unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. ALSQ 2 Tokens unavailable.", > >> + "UMask": "0x2" > >> + }, > >> + { > >> + "EventName": "de_dis_dispatch_token_stalls0.alsq1_token_stall", > >> + "EventCode": "0xaf", > >> + "BriefDescription": "ALSQ 1 Tokens unavailable.", > >> + "PublicDescription": "Cycles where a dispatch group is valid but does > >> not get dispatched due to a token stall. 
ALSQ 1 Tokens unavailable.", > >> + "UMask": "0x1" > >> + } > >> +] > >> \ No newline at end of file > >> diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv > >> b/tools/perf/pmu-events/arch/x86/mapfile.csv > >> index 7e3cce3bcf3b..4e0973c08a52 100644 > >> --- a/tools/perf/pmu-events/arch/x86/mapfile.csv > >> +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv > >> @@ -32,3 +32,4 @@ GenuineIntel-6-2C,v2,westmereep-dp,core > >> GenuineIntel-6-25,v2,westmereep-sp,core > >> GenuineIntel-6-2F,v2,westmereex,core > >> GenuineIntel-6-55,v1,skylakex,core > >> +AuthenticAMD-23-[[:xdigit:]]+,v1,amdfam17h,core > >> > > -- - Arnaldo
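
The mapfile.csv line in the patch uses the POSIX pattern AuthenticAMD-23-[[:xdigit:]]+ to cover every model of the family (23 decimal is 17h). A rough way to check which CPU identifier strings such a pattern accepts is sketched below in Python. Two assumptions are made here and are not confirmed by the thread: the identifier has the form vendor-family-model with the model in hex, and the pattern must match the whole string. Python's re module has no POSIX bracket classes, so the sketch rewrites [[:xdigit:]] first; both function names are hypothetical.

```python
import re

# Pattern copied from the mapfile.csv hunk in the patch.
MAPFILE_PATTERN = "AuthenticAMD-23-[[:xdigit:]]+"


def posix_to_python(pattern):
    """Rewrite the one POSIX class used here into a plain character set.

    Python's re does not understand [[:xdigit:]], so it is replaced by
    the equivalent [0-9A-Fa-f]. Other POSIX classes are not handled.
    """
    return pattern.replace("[[:xdigit:]]", "[0-9A-Fa-f]")


def cpuid_matches(pattern, cpuid):
    """Return True if the whole cpuid string matches the mapfile pattern.

    Whole-string matching is an assumption of this sketch.
    """
    return re.fullmatch(posix_to_python(pattern), cpuid) is not None


if __name__ == "__main__":
    for cpuid in ("AuthenticAMD-23-1", "AuthenticAMD-23-1F",
                  "AuthenticAMD-21-2", "GenuineIntel-6-55"):
        print(cpuid, cpuid_matches(MAPFILE_PATTERN, cpuid))
```

The family field is a literal in the pattern, so only the model part varies; any string with a different vendor or family is rejected regardless of model.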