When I run perf list, I see a number of Hardware cache events, like so:
    $ perf list | grep 'cache event'
      L1-dcache-load-misses                              [Hardware cache event]
      L1-dcache-loads                                    [Hardware cache event]
      L1-dcache-stores                                   [Hardware cache event]
      L1-icache-load-misses                              [Hardware cache event]
      LLC-load-misses                                    [Hardware cache event]
      LLC-loads                                          [Hardware cache event]
      LLC-store-misses                                   [Hardware cache event]
      LLC-stores                                         [Hardware cache event]
      branch-load-misses                                 [Hardware cache event]
      branch-loads                                       [Hardware cache event]
      dTLB-load-misses                                   [Hardware cache event]
      dTLB-loads                                         [Hardware cache event]
      dTLB-store-misses                                  [Hardware cache event]
      dTLB-stores                                        [Hardware cache event]
      iTLB-load-misses                                   [Hardware cache event]
      iTLB-loads                                         [Hardware cache event]
      node-load-misses                                   [Hardware cache event]
      node-loads                                         [Hardware cache event]
      node-store-misses                                  [Hardware cache event]
      node-stores                                        [Hardware cache event]
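These events mostly seem to return plausible values in my tests, but how can I find out how they map to the hardware performance counter events on my system?
That is, these events are surely implemented with one or more of the underlying x86 PMU counters on my Skylake CPU, but how do I know which ones?
You can look in /sys/devices/cpu/events to find the encodings of the other hardware events, but the "Hardware cache events" are not listed there.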
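As a rough sketch of that sysfs lookup, the function below prints each file under the events directory together with its contents. The name `list_pmu_events` and its optional directory argument are made up for illustration; only the default path comes from the text above, and the directory may be missing or empty on some systems.

```shell
# list_pmu_events is a hypothetical helper, not part of perf.
# With no argument it reads the sysfs directory named in the question.
list_pmu_events() {
    dir="${1:-/sys/devices/cpu/events}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue            # skip when the directory is missing or empty
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}

list_pmu_events
```

Each output line pairs an event name with the encoding string the kernel exports for it. None of the cache events from the perf list output appear there, which is the gap the question is about.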
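User @Margaret points out a reasonable answer in the comments: read the kernel source to see how the PMU events are mapped.
We can check arch/x86/events/intel/core.c for the event definitions. I don't actually know whether "core" here refers to the Core architecture or just means that this is the core file holding most of the definitions, but in any case it is the file you want to look at.
The key part is this section, which defines skl_hw_cache_event_ids: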
    static __initconst const u64 skl_hw_cache_event_ids
                    [PERF_COUNT_HW_CACHE_MAX]
                    [PERF_COUNT_HW_CACHE_OP_MAX]
                    [PERF_COUNT_HW_CACHE_RESULT_MAX] =
    {
     [ C(L1D ) ] = {
        [ C(OP_READ) ] = {
            [ C(RESULT_ACCESS) ] = 0x81d0,  /* MEM_INST_RETIRED.ALL_LOADS */
            [ C(RESULT_MISS)   ] = 0x151,   /* L1D.REPLACEMENT */
        },
        [ C(OP_WRITE) ] = {
            [ C(RESULT_ACCESS) ] = 0x82d0,  /* MEM_INST_RETIRED.ALL_STORES */
            [ C(RESULT_MISS)   ] = 0x0,
        },
        [ C(OP_PREFETCH) ] = {
            [ C(RESULT_ACCESS) ] = 0x0,
            [ C(RESULT_MISS)   ] = 0x0,
        },
     },
    ...
Decoding the nested initializers, you find that L1-dcache-loads corresponds to MEM_INST_RETIRED.ALL_LOADS and L1-dcache-load-misses to L1D.REPLACEMENT.
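The hex values in that table can be split by hand: on Intel PMUs the low byte of the config is the event select and the next byte is the unit mask. Here is a minimal sketch of that split, assuming only those two fields are set, as they are for the entries shown. The helper name `decode_hw_cache_id` is made up for illustration and is not part of perf or the kernel.

```shell
# decode_hw_cache_id is a hypothetical helper, not part of perf.
# It splits a config value from skl_hw_cache_event_ids into its two low bytes.
decode_hw_cache_id() {
    config=$(( $1 ))                     # accepts hex input such as 0x81d0
    event=$(( config & 0xff ))           # bits 0-7: event select
    umask=$(( (config >> 8) & 0xff ))    # bits 8-15: unit mask
    printf 'event=0x%02x umask=0x%02x raw=r%x\n' "$event" "$umask" "$config"
}

decode_hw_cache_id 0x81d0   # table entry commented MEM_INST_RETIRED.ALL_LOADS
decode_hw_cache_id 0x151    # table entry commented L1D.REPLACEMENT
```

The `raw=` field is the same value written in perf's raw-event syntax, so something like `perf stat -e r81d0` should count the same event as L1-dcache-loads on this CPU.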
We can double-check this with perf (here through the ocperf wrapper, which accepts Intel's symbolic event names):
    $ ocperf stat -e mem_inst_retired.all_loads,L1-dcache-loads,l1d.replacement,L1-dcache-load-misses,L1-dcache-loads,mem_load_retired.l1_hit head -c100M /dev/zero > /dev/null

     Performance counter stats for 'head -c100M /dev/zero':

            11,587,793      mem_inst_retired_all_loads
            11,587,793      L1-dcache-loads
                20,233      l1d_replacement
                20,233      L1-dcache-load-misses     #    0.17% of all L1-dcache hits
            11,587,793      L1-dcache-loads
            11,495,053      mem_load_retired_l1_hit

           0.024322360 seconds time elapsed
The "Hardware cache" events show exactly the same values as the underlying PMU events we guessed from reading the source.