 The Linux Kernel threat model
 =============================

 There are a lot of assumptions regarding what the kernel does and does not
 protect against. These assumptions tend to cause confusion for bug reports
 (:doc:`security-related ones <security-bugs>` vs :doc:`non-security ones
 <../admin-guide/reporting-issues>`), and can complicate security enforcement
 when the responsibilities for some boundaries are not clear between the kernel,
 distros, administrators and users.

 This document tries to clarify the responsibilities of the kernel in this
 domain.

 The kernel's responsibilities
 -----------------------------

 The kernel abstracts access to local hardware resources and to remote systems
 in a way that allows multiple local users to get a fair share of the available
 resources granted to them, and, when the underlying hardware permits, to assign
 a level of confidentiality to their communications and to the data they are
 processing or storing.

 The kernel assumes that the underlying hardware behaves according to its
 specifications. This includes the integrity of the CPU's instruction set, the
 transparency of the branch prediction unit and the cache units, the consistency
 of the Memory Management Unit (MMU), the isolation of DMA-capable peripherals
 (e.g., via IOMMU), state transitions in controllers, ranges of values read from
 registers, the respect of documented hardware limitations, etc.

 When hardware fails to maintain its specified isolation (e.g., CPU bugs,
 side-channels, hardware response to unexpected inputs), the kernel will usually
 attempt to implement reasonable mitigations. These are best-effort measures
 intended to reduce the attack surface or elevate the cost of an attack within
 the limits of the hardware's facilities; they do not constitute a
 kernel-provided safety guarantee.

 Users always perform their activities under the authority of an administrator
 who is able to grant or deny various types of permissions that may affect how
 users benefit from available resources, or the level of confidentiality of
 their activities. Administrators may also delegate all or part of their own
 permissions to some users, notably (though not exclusively) via capabilities.
 All this is performed via configuration (sysctl, file-system permissions, etc.).

 The Linux Kernel applies a certain collection of default settings that match
 its threat model. Distros have their own threat model and will come with their
 own configuration presets, which the administrator may have to adjust (relax or
 restrict) to better suit their expectations.

 By default, the Linux Kernel guarantees the following protections when running
 on common processors featuring privilege levels and memory management units:

 * **User-based isolation**: an unprivileged user may restrict access to their
   own data from other unprivileged users running on the same system. This
   includes:

   * stored data, via file system permissions
   * in-memory data (pages are not accessible by default to other users)
   * process activity (ptrace is not permitted to other users)
   * inter-process communication (other users may not observe data exchanged via
     UNIX domain sockets or other IPC mechanisms).
   * network communications within the same or with other systems

 * **Capability-based protection**:

   * users not having elevated capabilities (including but not limited to
     ``CAP_SYS_ADMIN``) may not alter the kernel's configuration, memory or
     state, change other users' view of the file system layout, grant any user
     capabilities they do not have, or affect the system's availability
     (shutdown, reboot, panic, hang, or making the system unresponsive via
     unbounded resource exhaustion).
   * users not having the ``CAP_NET_ADMIN`` capability may not alter the network
     configuration, or intercept or spoof network communications from other
     users or systems.
   * users not having ``CAP_SYS_PTRACE`` may not observe the activity of other
     users' processes.
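
 As an illustration of two mechanisms named in the list above, the following
 Python sketch creates a file that only its owner may read or write, and decodes
 the effective capability mask of a process from the ``CapEff`` line of
 ``/proc/<pid>/status``. The helper names (``effective_caps``, ``has_cap``,
 ``make_private_file``) are hypothetical and chosen for this example; the
 capability numbers are those of ``include/uapi/linux/capability.h``. It is a
 minimal sketch assuming a POSIX system with procfs mounted, not a complete
 audit tool.

```python
import os
import stat
import tempfile

# Capability bit numbers as defined in include/uapi/linux/capability.h.
CAP_NET_ADMIN = 12
CAP_SYS_PTRACE = 19
CAP_SYS_ADMIN = 21


def effective_caps(status_text):
    """Return the effective capability mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap):
    """Tell whether capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def make_private_file(directory):
    """Create a file that grants no access to group or others (mode 0600)."""
    path = os.path.join(directory, "private.txt")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    os.close(fd)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        mode = stat.S_IMODE(os.stat(make_private_file(tmp)).st_mode)
        print("private file mode:", oct(mode))
    try:
        with open("/proc/self/status") as f:
            mask = effective_caps(f.read())
        print("CAP_SYS_ADMIN held:", has_cap(mask, CAP_SYS_ADMIN))
    except OSError:
        print("/proc/self/status is not available on this system")
```

 A regular user would normally see an empty effective mask here, which is the
 situation the guarantees above are written for.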

 When ``CONFIG_USER_NS`` is set, the kernel also permits unprivileged users to
 create their own user namespace in which they have all capabilities, but with a
 number of restrictions (they may not perform actions that have impacts on the
 initial user namespace, such as changing time, loading modules or mounting
 block devices). Please refer to ``user_namespaces(7)`` for more details; the
 possibilities of user namespaces are not covered in this document.
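
 When triaging a report, it can help to know whether a process runs in the
 initial user namespace or in a nested one. The sketch below parses the
 ``/proc/<pid>/uid_map`` format described in ``user_namespaces(7)`` (ID inside
 the namespace, ID outside, range length). The function names are hypothetical,
 and the check is only a heuristic: a privileged process could give a child
 namespace the same full-range identity mapping.

```python
FULL_RANGE = 4294967295  # length of the dummy mapping shown for the initial namespace


def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def looks_like_initial_userns(entries):
    """Heuristic: a single identity mapping that covers the whole ID range."""
    return entries == [(0, 0, FULL_RANGE)]


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as f:
            entries = parse_id_map(f.read())
        print("uid_map:", entries)
        print("initial user namespace (heuristic):",
              looks_like_initial_userns(entries))
    except OSError:
        print("/proc/self/uid_map is not available on this system")
```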

 The kernel also offers a lot of troubleshooting and debugging facilities, which
 can constitute attack vectors when placed in the wrong hands. While some of them
 are designed to be accessible to regular local users with a low risk (e.g.
 kernel logs via ``/proc/kmsg``), some would expose enough information to
 represent a risk in most places and the decision to expose them is under the
 administrator's responsibility (perf events, traces), and others are not
 designed to be accessed by non-privileged users (e.g. debugfs). Access to these
 facilities by a user who has been explicitly granted permission by an
 administrator does not constitute a security breach.
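
 An administrator's choices about such facilities are usually visible as
 sysctls. The following sketch reads a few of them from ``/proc/sys``. The
 selection (``kernel.dmesg_restrict``, ``kernel.kptr_restrict``,
 ``kernel.perf_event_paranoid``) is an assumption made for this example and is
 not taken from this document; ``read_sysctl`` is a hypothetical helper that
 returns ``None`` when a knob is absent or unreadable.

```python
from pathlib import Path

# Example knobs governing access to kernel logs, kernel pointers and perf events.
SYSCTLS = (
    "kernel/dmesg_restrict",
    "kernel/kptr_restrict",
    "kernel/perf_event_paranoid",
)


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a sysctl as a string, or None if it cannot be read."""
    try:
        return Path(root, name).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in SYSCTLS:
        value = read_sysctl(name)
        shown = value if value is not None else "unavailable"
        print(f"{name.replace('/', '.')} = {shown}")
```

 Comparing these values against the kernel defaults is one way to tell whether
 a reported exposure stems from the kernel itself or from a local configuration
 choice.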

 Bugs that allow the principles above to be violated constitute security
 breaches. However, bugs that permit one violation only once another one was
 already achieved are only weaknesses. The kernel applies a number of
 self-protection measures whose purpose is to avoid crossing a security boundary
 when certain classes of bugs are found, but a failure of these extra
 protections does not constitute a vulnerability on its own.
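
 The distinction drawn in the paragraph above can be summarized as a small
 decision function. This is only an illustrative sketch: the ``Finding`` fields
 and the returned labels are hypothetical names for this example and do not
 describe any official triage tool or process.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    violates_guarantee: bool     # breaks one of the protections listed above
    needs_prior_violation: bool  # only reachable once another boundary was crossed
    hardening_only: bool         # merely defeats a self-protection measure


def classify(finding):
    """Map a finding to the category suggested by the paragraph above."""
    if finding.hardening_only or finding.needs_prior_violation:
        return "weakness"
    if finding.violates_guarantee:
        return "security bug"
    return "regular bug"


if __name__ == "__main__":
    direct = Finding(violates_guarantee=True, needs_prior_violation=False,
                     hardening_only=False)
    chained = Finding(violates_guarantee=True, needs_prior_violation=True,
                      hardening_only=False)
    print(classify(direct))
    print(classify(chained))
```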

 What does not constitute a security bug
 ---------------------------------------

 In the Linux kernel's threat model, the following classes of problems are
 **NOT** considered Linux Kernel security bugs. When it is believed that the
 kernel could do better, they should still be reported so that they can be
 reviewed and fixed where reasonably possible, but they will be handled like any
 regular bug:

 * **Configuration**:

   * outdated kernels and particularly end-of-life branches are out of the scope
     of the kernel's threat model: administrators are responsible for keeping
     their system up to date. For a bug to qualify as a security bug, it must be
     demonstrated that it affects actively maintained versions.

   * build-level: changes to the kernel configuration that are explicitly
     documented as lowering the security level (e.g. ``CONFIG_NOMMU``), or
     targeted at developers only.

   * OS-level: changes to command line parameters, sysctls, filesystem
     permissions, user capabilities, or exposure of privileged interfaces that
     explicitly increase exposure, either by offering non-default access to
     unprivileged users or by reducing the kernel's ability to enforce some
     protections or mitigations. Example: write access to procfs or debugfs.

   * issues triggered only when using features intended for development or
     debugging (e.g., LOCKDEP, KASAN, FAULT_INJECTION): these features are known
     to introduce overhead and potential instability and are not intended for
     production use.

   * issues affecting drivers exposed under CONFIG_STAGING, as well as features
     marked EXPERIMENTAL in the configuration.

   * loading of explicitly insecure/broken/staging modules, and generally the
     use of any subsystem marked as experimental or not intended for production
     use.

   * running out-of-tree modules or unofficial kernel forks; these should be
     reported to the relevant vendor.

 * **Excess of initial privileges**:

   * actions performed by a user already possessing the privileges required to
     perform that action or modify that state (e.g. ``CAP_SYS_ADMIN``,
     ``CAP_NET_ADMIN``, ``CAP_SYS_RAWIO``, ``CAP_SYS_MODULE`` with no further
     boundary being crossed).

   * actions performed in a user namespace that do not bypass the restrictions
     imposed on the initial user (e.g. ptrace usage, signal delivery, resource
     usage, access to FS/device/sysctl/memory, network binding, system/network
     configuration, etc.).

   * anything performed by the root user in the initial namespace (e.g. kernel
     oops when writing to a privileged device).

 * **Out of production use**:

   This covers theoretical/probabilistic attacks that rely on laboratory
   conditions with zero system noise, or those requiring an unrealistic number
   of attempts (e.g., billions of trials) that would be detected by standard
   system monitoring long before success, such as:

   * prediction of random numbers that only works in a totally silent
     environment (such as IP ID, TCP ports or sequence numbers that can only be
     guessed in a lab).

   * activity observation and information leaks based on probabilistic
     approaches that are prone to measurement noise and not realistically
     reproducible on a production system.

   * issues that can only be triggered by heavy attacks (e.g. brute force) whose
     impact on the system makes it unlikely or impossible to remain undetected
     before they succeed (e.g. consuming all memory before succeeding).

   * problems seen only under development simulators, emulators, or combinations
     that do not exist on real systems at the time of reporting (issues
     involving tens of millions of threads, tens of thousands of CPUs, or
     unrealistic CPU frequencies, RAM sizes, disk capacities or network speeds).

   * issues whose reproduction requires hardware modification or emulation,
     including fake USB devices that pretend to be another one.

   * issues that can be triggered only at a cost that is orders of magnitude
     higher than the expected benefits (e.g. a fully functional keyboard
     emulator only to retrieve 7 uninitialized bytes in a structure, or a
     brute-force method involving millions of connection attempts to guess a
     port number).

 * **Hardening failures**:

   * ability to bypass some of the kernel's hardening measures with no
     demonstrable exploit path (e.g. ASLR bypass, event timing or probing with
     no demonstrable consequence). These are just weaknesses, not
     vulnerabilities.

   * missing argument checks and failure to report certain errors with no
     immediate consequence.

 * **Random information leaks**:

   This concerns leaks of small pieces of data that happen to be there and that
   cannot be chosen by the attacker, or that are subject to access restrictions:

   * structure padding reported by syscalls or other interfaces.

   * identifiers, partial data, non-terminated strings reported in error
     messages.

   * leaks of kernel memory addresses/pointers do not constitute an immediately
     exploitable vector and are not security bugs, though they must be reported
     and fixed.

 * **Crafted file system images**:

   * bugs triggered by mounting a corrupted or maliciously crafted file system
     image are generally not security bugs, as the kernel assumes the underlying
     storage media is under the administrator's control, unless the filesystem
     driver is specifically documented as being hardened against untrusted media.

   * issues that are resolved, mitigated, or detected by running a filesystem
     consistency check (fsck) on the image prior to mounting.

 * **Physical access**:

   Issues that require physical access to the machine, hardware modification, or
   the use of specialized hardware (e.g., logic analyzers, DMA-attack tools over
   PCI-E/Thunderbolt) are out of scope unless the system is explicitly
   configured with technologies meant to defend against such attacks
   (e.g. IOMMU).

 * **Functional and performance regressions**:

   Any issue that can be mitigated by setting proper permissions and limits
   doesn't qualify as a security bug.