slab: Allocate and use per-call-site caches

Use separate per-call-site kmem_cache or kmem_buckets. These are
allocated on demand to avoid wasting memory for unused caches.

A few caches need to be allocated very early to support allocating the
caches themselves: kstrdup(), kvasprintf(), and pcpu_mem_zalloc(). Any
GFP_ATOMIC allocations are currently left to be allocated from
KMALLOC_NORMAL.

With a distro config, /proc/slabinfo grows from ~400 entries to ~2200.

Since this feature (CONFIG_SLAB_PER_SITE) is redundant to
CONFIG_RANDOM_KMALLOC_CACHES, mark it a incompatible. Add Kconfig help
text that compares the features.

Improvements needed:
- Retain call site gfp flags in alloc_tag meta field to:
  - pre-allocate all GFP_ATOMIC caches (since their caches cannot
    be allocated on demand unless we want them to be GFP_ATOMIC
    themselves...)
  - Separate MEMCG allocations as well
- Allocate individual caches within kmem_buckets on demand to
  further reduce memory usage overhead.

Signed-off-by: Kees Cook <kees@kernel.org>
5 files changed