refs/heads/2016-11-18/capsetfile - pub/scm/linux/kernel/git/sergeh/linux-security

commit	645a7b2b2128d86da725934c91928ade6dc64e9e
author	Serge Hallyn <serge.hallyn@ubuntu.com>	Tue Mar 01 00:09:35 2016 +0000
committer	Serge Hallyn <serge@hallyn.com>	Fri Nov 18 15:52:41 2016 -0600
tree	63c798092069c235a478e69ab935f96ed3711e86
parent	c1717701be2f0639e5f817385a524131dbd3ff38

user-namespaced file capabilities - now with even more magic

Root in a user ns cannot be trusted to write a traditional
security.capability xattr. If it were allowed to do so, then any
unprivileged user on the host could map his own uid to root in a
namespace, write the xattr, and execute the file with privilege on the
host.

This patch introduces v3 of the security.capability xattr. It builds a
vfs_ns_cap_data struct by appending a uid_t rootid to struct
vfs_cap_data. This is the absolute uid_t (i.e. the uid_t in
init_user_ns) of the root id (uid 0 in a namespace) in whose namespaces
the file capabilities may take effect.

When a task in a user ns (which is privileged with CAP_SETFCAP toward
that user_ns) asks to write v2 security.capability, the kernel will
transparently rewrite the xattr as a v3 with the appropriate rootid.
Subsequently, any task executing the file which has the noted kuid as
its root uid, or which is in a descendent user_ns of such a user_ns,
will run the file with capabilities.

If a task writes a v3 security.capability, then it can provide a uid
(valid within its own user namespace, over which it has CAP_SETFCAP)
for the xattr. The kernel will translate that to the absolute uid, and
write that to disk. After this, a task in the writer's namespace will
not be able to use those capabilities, but a task in a namespace where
the given uid is root will.

Only a single security.capability xattr may be written. A task may
overwrite the existing one so long as it was written by a user mapped
into his own user_ns over which he has CAP_SETFCAP. This allows a
simple setxattr to work, allows tar/untar to work, and allows us to
tar in one namespace and untar in another while preserving the
capability, without risking leaking privilege into a parent namespace.

Changelog:
   Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
   Nov 07 2016: convert rootid from and to fs user_ns
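
The v2-to-v3 rewrite and the namespace check described above can be
sketched in userspace C. This is only an illustrative model, not the
patch's code: the names cap_v2, cap_v3, toy_userns, v2_to_v3() and
caps_apply() are hypothetical, the revision constants are assumed to
follow the existing magic_etc encoding (revision in the top byte), and
a namespace is reduced to a parent pointer plus the init_user_ns uid of
its root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: revision in the top byte of magic_etc, flags below. */
#define CAP_REV_MASK 0xFF000000u
#define CAP_REV_2    0x02000000u
#define CAP_REV_3    0x03000000u
#define CAP_U32      2   /* 64 capability bits as two 32-bit words */

/* Hypothetical model of the v2 payload (struct vfs_cap_data). */
struct cap_v2 {
    uint32_t magic_etc;
    struct { uint32_t permitted, inheritable; } data[CAP_U32];
};

/* Hypothetical model of the v3 payload: v2 with a rootid appended. */
struct cap_v3 {
    uint32_t magic_etc;
    struct { uint32_t permitted, inheritable; } data[CAP_U32];
    uint32_t rootid;     /* uid, in init_user_ns, of the namespace root */
};

static_assert(sizeof(struct cap_v3) == sizeof(struct cap_v2) + sizeof(uint32_t),
              "v3 is v2 plus one trailing uid");

/* Toy user namespace: parent link and its uid 0 expressed in init_user_ns. */
struct toy_userns {
    const struct toy_userns *parent;   /* NULL for the initial namespace */
    uint32_t root_kuid;
};

/* Rewrite a v2 payload as v3, recording the writer's namespace root. */
static struct cap_v3 v2_to_v3(const struct cap_v2 *in, uint32_t writer_root_kuid)
{
    struct cap_v3 out;

    /* Keep the flag bits, replace only the revision. */
    out.magic_etc = CAP_REV_3 | (in->magic_etc & ~CAP_REV_MASK);
    for (size_t i = 0; i < CAP_U32; i++) {
        out.data[i].permitted = in->data[i].permitted;
        out.data[i].inheritable = in->data[i].inheritable;
    }
    out.rootid = writer_root_kuid;
    return out;
}

/*
 * The capabilities apply if the task's namespace, or any ancestor of it,
 * has the recorded rootid as its root. A parent of the writer's namespace
 * never matches, so privilege does not leak upward.
 */
static bool caps_apply(const struct cap_v3 *cap, const struct toy_userns *task_ns)
{
    for (const struct toy_userns *ns = task_ns; ns != NULL; ns = ns->parent)
        if (ns->root_kuid == cap->rootid)
            return true;
    return false;
}
```

In this model, a payload written from a namespace whose root maps to
host uid 100000 is honoured for tasks in that namespace and in
namespaces nested below it, but not for tasks in the initial namespace.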