106c076d23cca67c959a6fd1ccadb5b3ef01ddc9 - pub/scm/linux/kernel/git/zanussi/linux-trace

commit	106c076d23cca67c959a6fd1ccadb5b3ef01ddc9
author	Axel Rasmussen <axelrasmussen@google.com>	Thu Sep 17 11:13:47 2020 -0700
committer	Tom Zanussi <zanussi@kernel.org>	Wed Sep 23 08:48:08 2020 -0500
tree	7967afa37268f3ac05ed668741995d13db21ac13
parent	fd264ce96c382bc2e36eb1f49ac45c5980650244

mmap_lock: add tracepoints around lock acquisition

The goal of these tracepoints is to help debug lock contention
issues. This lock is acquired on most (all?) mmap / munmap / page fault
operations, so a multi-threaded process that performs many of them can
experience significant contention.

We trace just before we start acquisition, when the acquisition returns
(whether it succeeded or not), and when the lock is released (or
downgraded). The events are broken out by lock type (read / write).
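
The three trace points described above can be illustrated with a small
user-space sketch. It is not the code from this patch: the lock is a toy
single-threaded stand-in, and every name in it (ev_kind, emit,
traced_write_trylock, and so on) is hypothetical. It only shows the shape
of the wrapper: one event before the attempt, one when the attempt
returns (carrying the result), and one on release. Emitting the release
event after the unlock rather than before it is an arbitrary choice made
here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds, one per trace point described above. */
enum ev_kind { EV_START_LOCKING, EV_ACQUIRE_RETURNED, EV_RELEASED };

struct ev {
	enum ev_kind kind;
	bool write;	/* lock type: write (true) or read (false) */
	bool success;	/* only meaningful for EV_ACQUIRE_RETURNED */
};

#define EV_MAX 16
static struct ev ev_log[EV_MAX];
static size_t ev_count;

/* Stand-in for a tracepoint: append the event to an in-memory log. */
static void emit(enum ev_kind kind, bool write, bool success)
{
	if (ev_count < EV_MAX)
		ev_log[ev_count++] = (struct ev){ kind, write, success };
}

/* Toy, single-threaded stand-in for a reader/writer lock. */
struct toy_lock {
	int readers;
	bool writer;
};

static bool toy_write_trylock(struct toy_lock *l)
{
	if (l->writer || l->readers > 0)
		return false;
	l->writer = true;
	return true;
}

static void toy_write_unlock(struct toy_lock *l)
{
	l->writer = false;
}

/* Trace just before the attempt and again when it returns. */
static bool traced_write_trylock(struct toy_lock *l)
{
	bool ok;

	emit(EV_START_LOCKING, true, false);
	ok = toy_write_trylock(l);
	emit(EV_ACQUIRE_RETURNED, true, ok);
	return ok;
}

/* Trace the release. */
static void traced_write_unlock(struct toy_lock *l)
{
	toy_write_unlock(l);
	emit(EV_RELEASED, true, false);
}
```

Because the "returned" event fires whether or not the attempt succeeded,
a consumer can count failed attempts as well as measure how long
successful ones took.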

The events are also broken out by memcg path. For container-based
workloads, users often think of several processes in a memcg as a single
logical "task", so collecting statistics at this level is useful.

These events *do not* include latency bucket information, which means
for a proper latency histogram users will need to use BPF instead of
event histograms. The benefit we get from this is simpler code.
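
As a rough illustration of what such a consumer has to do for itself,
the sketch below pairs a "start" timestamp with the matching "returned"
timestamp and counts the difference in a power-of-two bucket. It is
plain C rather than BPF, and the names (latency_bucket, lat_hist,
lat_hist_record) and the nanosecond unit are assumptions made for the
example, not anything defined by this patch.

```c
#include <stdint.h>

#define NBUCKETS 64

/* Bucket index: 0 for 0..1 ns, otherwise floor(log2(ns)). */
static unsigned latency_bucket(uint64_t ns)
{
	unsigned b = 0;

	while (ns > 1) {
		ns >>= 1;
		b++;
	}
	return b;
}

struct lat_hist {
	uint64_t counts[NBUCKETS];
};

/*
 * Record one acquisition: start_ns is taken at the "start locking"
 * event, returned_ns at the matching "acquire returned" event.
 */
static void lat_hist_record(struct lat_hist *h, uint64_t start_ns,
			    uint64_t returned_ns)
{
	if (returned_ns < start_ns)
		return;	/* ignore inconsistent timestamp pairs */
	h->counts[latency_bucket(returned_ns - start_ns)]++;
}
```

A real consumer would also need to key the pending start timestamp by
thread so that concurrent acquisitions are not mixed up; that
bookkeeping is omitted here.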

This patch is a no-op if the Kconfig option is not enabled. If it is,
tracepoints are still disabled by default (configurable at runtime);
the only fixed cost here is un-inlining a few functions. As best as
I've been able to measure, the overhead this introduces is a small
fraction of 1%. Actually hooking up the tracepoints to BPF introduces
additional overhead, depending on exactly what the BPF program is
collecting.
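
The two levels of "off" described above (compiled out entirely, or
compiled in but disabled at runtime) follow a common pattern, sketched
below in user-space C. The option name CONFIG_EXAMPLE_LOCK_TRACE and all
function and variable names are hypothetical; this is not the Kconfig
symbol or the code added by this patch, and it does not model the
un-inlining cost mentioned above.

```c
#include <stdbool.h>

#ifdef CONFIG_EXAMPLE_LOCK_TRACE
/* Compiled in: the hook exists but does nothing until enabled at runtime. */
static bool example_trace_enabled;	/* defaults to false */
static unsigned long example_trace_hits;

static inline void example_trace_lock_event(void)
{
	if (example_trace_enabled)
		example_trace_hits++;
}
#else
/* Compiled out: an empty inline stub that the compiler can drop. */
static inline void example_trace_lock_event(void)
{
}
#endif

static int example_lock_calls;

/* The lock path calls the hook unconditionally; the #ifdef hides the cost. */
static void example_lock(void)
{
	example_trace_lock_event();
	example_lock_calls++;
}
```

Building with -DCONFIG_EXAMPLE_LOCK_TRACE selects the first variant;
without it, example_lock() behaves exactly as if the hook were absent.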

include/linux/mmap_lock.h
include/trace/events/mmap_lock.h (added)
mm/Kconfig
mm/Makefile
mm/mmap_lock.c (added)

5 files changed
