perf, tools, script: Add brstackasm output for branch stacks
Implement printing of full disassembled sequences for branch stacks in perf
script. This makes it possible to print hot paths for individual samples
directly, together with branch misprediction and cycle count information.
% perf record -b ...
% perf script -F brstackasm
...
00007f0668d54e88 movsx (%rsi), %ecx
00007f0668d54e8b lea -0x30(%rcx), %eax
00007f0668d54e8e cmp $0x9, %al
00007f0668d54e90 jbe 0x68d54eaf
00007f0668d54e92 cmp %cl, %dl
00007f0668d54e94 jnz 0x68d54eb5
00007f0668d54e96 add $0x1, %rdi
00007f0668d54e9a movsx (%rdi), %edx
00007f0668d54e9d add $0x1, %rsi
00007f0668d54ea1 test %dl, %dl
00007f0668d54ea3 jnz _dl_cache_libcmp+11 # PRED 21 cycles
00007f0668d54dfb lea -0x30(%rdx), %eax
00007f0668d54dfe cmp $0x9, %al
00007f0668d54e00 ja _dl_cache_libcmp+152 # PRED 2 cycles
00007f0668d54e88 movsx (%rsi), %ecx
00007f0668d54e8b lea -0x30(%rcx), %eax
00007f0668d54e8e cmp $0x9, %al
00007f0668d54e90 jbe 0x68d54eaf
00007f0668d54e92 cmp %cl, %dl
00007f0668d54e94 jnz 0x68d54eb5 # PRED 3 cycles
00007f0668d54eb5 movsx %dl, %eax
00007f0668d54eb8 sub %ecx, %eax
00007f0668d54eba ret # PRED 1 cycles
00007f0668d54fae test %eax, %eax
00007f0668d54fb0 jz _dl_load_cache_lookup+688
00007f0668d54fb6 jns 0x68d54f70
00007f0668d54fb8 lea 0x1(%r14), %ebx
00007f0668d54fbc cmp %r15d, %ebx
00007f0668d54fbf nop
00007f0668d54fc0 jle 0x68d54f79 # PRED 2 cycles
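The per-branch annotations in the output above are easy to post-process. The following is a minimal sketch, not part of this patch, that sums the cycle counts and counts annotated branches. It assumes the "<address> <instruction> # PRED N cycles" layout shown in the sample; the "MISPRED" tag and the summarize() helper are assumptions made for illustration, and the real output format may differ.

```python
import re

# Assumed line layout, taken from the sample output above:
#   00007f0668d54ea3 jnz _dl_cache_libcmp+11 # PRED 21 cycles
# "MISPRED" is an assumption; it does not appear in the sample.
ANNOTATED = re.compile(
    r"^(?P<addr>[0-9a-f]{16})\s+(?P<insn>.*?)\s+#\s+"
    r"(?P<pred>PRED|MISPRED)\s+(?P<cycles>\d+) cycles$"
)


def summarize(lines):
    """Return (total_cycles, branches, mispredicted) for annotated lines."""
    total_cycles = 0
    branches = 0
    mispredicted = 0
    for line in lines:
        match = ANNOTATED.match(line.strip())
        if match is None:
            continue  # instruction line without a branch annotation
        branches += 1
        total_cycles += int(match.group("cycles"))
        if match.group("pred") == "MISPRED":
            mispredicted += 1
    return total_cycles, branches, mispredicted


if __name__ == "__main__":
    sample = [
        "00007f0668d54ea1 test %dl, %dl",
        "00007f0668d54ea3 jnz _dl_cache_libcmp+11 # PRED 21 cycles",
        "00007f0668d54e00 ja _dl_cache_libcmp+152 # PRED 2 cycles",
    ]
    print(summarize(sample))
```

Lines without a trailing annotation are skipped, so the script can be fed the raw brstackasm text as long as it follows the layout assumed here.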
Open issues:
- Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen earlier. Use precise events to avoid that.
v2: Remove bogus hunk. Document --max-blocks. Fix some printfs.
Port to latest tree.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2 files changed