cgroup: implement cgroup v2 thread support
This patch implements cgroup v2 thread support. The goal of the
thread mode is supporting hierarchical accounting and control at
thread granularity while staying inside the resource domain model
which allows coordination across different resource controllers and
handling of anonymous resource consumptions.
Once thread mode is enabled on a cgroup, the threads of the processes
which are in its subtree can be placed inside the subtree without
being restricted by process granularity or no-internal-process
constraint. Note that the threads aren't allowed to escape to a
different threaded subtree. To be used inside a threaded subtree, a
controller should explicitly support threaded mode and be able to
handle internal competition in the way which is appropriate for the
resource.
The root of a threaded subtree, where thread mode is enabled in the
first place, is called the thread root and serves as the resource
domain for the whole subtree. This is the last cgroup where
non-threaded controllers are operational and where all the
domain-level resource consumptions in the subtree are accounted. This
allows threaded controllers to operate at thread granularity when
requested while staying inside the scope of system-level resource
distribution.
Internally, in a threaded subtree, each css_set has its ->proc_cset
pointing to a matching css_set which belongs to the thread root. This
ensures that thread root level cgroup_subsys_state for all threaded
controllers are readily accessible for domain-level operations.
This patch enables threaded mode for the pids and perf_events
controllers. Neither has to worry about domain-level resource
consumptions and it's enough to simply set the flag.
For more details on the interface and behavior of the thread mode,
please refer to the section 2-2-2 in Documentation/cgroup-v2.txt added
by this patch. Note that the documentation update is not complete as
the rest of the documentation needs to be updated accordingly.
Rolling those updates into this patch can be confusing so that will be
separate patches.
Signed-off-by: Tejun Heo <tj@kernel.org>
5 files changed