| .\" Copyright (c) 2016 by Michael Kerrisk <mtk.manpages@gmail.com> |
| .\" |
| .\" %%%LICENSE_START(VERBATIM) |
| .\" Permission is granted to make and distribute verbatim copies of this |
| .\" manual provided the copyright notice and this permission notice are |
| .\" preserved on all copies. |
| .\" |
| .\" Permission is granted to copy and distribute modified versions of this |
| .\" manual under the conditions for verbatim copying, provided that the |
| .\" entire resulting derived work is distributed under the terms of a |
| .\" permission notice identical to this one. |
| .\" |
| .\" Since the Linux kernel and libraries are constantly changing, this |
| .\" manual page may be incorrect or out-of-date. The author(s) assume no |
| .\" responsibility for errors or omissions, or for damages resulting from |
| .\" the use of the information contained herein. The author(s) may not |
| .\" have taken the same level of care in the production of this manual, |
| .\" which is licensed free of charge, as they might when working |
| .\" professionally. |
| .\" |
| .\" Formatted or processed versions of this manual, if unaccompanied by |
| .\" the source, must acknowledge the copyright and authors of this work. |
| .\" %%%LICENSE_END |
| .\" |
| .\" |
| .TH MOUNT_NAMESPACES 7 2016-12-12 "Linux" "Linux Programmer's Manual" |
| .SH NAME |
| mount_namespaces \- overview of Linux mount namespaces |
| .SH DESCRIPTION |
| For an overview of namespaces, see |
| .BR namespaces (7). |
| |
| Mount namespaces provide isolation of the list of mount points seen |
| by the processes in each namespace instance. |
| Thus, the processes in each of the mount namespace instances |
| will see distinct single-directory hierarchies. |
| |
| The views provided by the |
| .IR /proc/[pid]/mounts , |
| .IR /proc/[pid]/mountinfo , |
| and |
| .IR /proc/[pid]/mountstats |
| files (all described in |
| .BR proc (5)) |
| correspond to the mount namespace in which the process with the PID |
| .IR [pid] |
| resides. |
| (All of the processes that reside in the same mount namespace |
| will see the same view in these files.) |
| |
| When a process creates a new mount namespace using |
| .BR clone (2) |
| or |
| .BR unshare (2) |
| with the |
| .BR CLONE_NEWNS |
| flag, the mount point list for the new namespace is a |
| .I copy |
| of the caller's mount point list. |
| Subsequent modifications to the mount point list |
| .RB ( mount (2) |
| and |
| .BR umount (2)) |
| in either mount namespace will not (by default) affect the |
| mount point list seen in the other namespace |
| (but see the following discussion of shared subtrees). |
| .\" |
| .\" ============================================================ |
| .\" |
| .SS Restrictions on mount namespaces |
| Note the following points with respect to mount namespaces: |
| .IP * 3 |
| A mount namespace has an owner user namespace. |
| A mount namespace whose owner user namespace is different from |
| the owner user namespace of its parent mount namespace is |
| considered a less privileged mount namespace. |
| .IP * |
| When creating a less privileged mount namespace, |
| shared mounts are reduced to slave mounts. |
| (Shared and slave mounts are discussed below.) |
| This ensures that mappings performed in less |
| privileged mount namespaces will not propagate to more privileged |
| mount namespaces. |
| .IP * |
| .\" FIXME . |
| .\" What does "come as a single unit from more privileged mount" mean? |
| Mounts that come as a single unit from a more privileged mount namespace are |
| locked together and may not be separated in a less privileged mount |
| namespace. |
| (The |
| .BR unshare (2) |
| .B CLONE_NEWNS |
| operation brings across all of the mounts from the original |
| mount namespace as a single unit, |
| and recursive mounts that propagate between |
| mount namespaces propagate as a single unit.) |
| .IP * |
| The settings of the |
| .BR mount (2) |
| flags |
| .BR MS_RDONLY , |
| .BR MS_NOSUID , |
| .BR MS_NOEXEC , |
| and the "atime" flags |
| .RB ( MS_NOATIME , |
| .BR MS_NODIRATIME , |
| .BR MS_RELATIME ) |
| become locked |
| .\" commit 9566d6742852c527bf5af38af5cbb878dad75705 |
| .\" Author: Eric W. Biederman <ebiederm@xmission.com> |
| .\" Date: Mon Jul 28 17:26:07 2014 -0700 |
| .\" |
| .\" mnt: Correct permission checks in do_remount |
| .\" |
| when propagated from a more privileged to |
| a less privileged mount namespace, |
| and may not be changed in the less privileged mount namespace. |
| .IP * |
| .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree)) |
| A file or directory that is a mount point in one namespace but is not |
| a mount point in another namespace may be renamed, unlinked, or removed |
| .RB ( rmdir (2)) |
| in the mount namespace in which it is not a mount point |
| (subject to the usual permission checks). |
| .IP |
| Previously, attempting to unlink, rename, or remove a file or directory |
| that was a mount point in another mount namespace would result in the error |
| .BR EBUSY . |
| That behavior had technical problems of enforcement (e.g., for NFS) |
| and permitted denial-of-service attacks against more privileged users |
| (i.e., preventing individual files from being updated |
| by bind mounting on top of them). |
| .\" |
| .SH SHARED SUBTREES |
| After the implementation of mount namespaces was completed, |
| experience showed that the isolation that they provided was, |
| in some cases, too great. |
| For example, in order to make a newly loaded optical disk |
| available in all mount namespaces, |
| a mount operation was required in each namespace. |
| For this use case, and others, |
| the shared subtree feature was introduced in Linux 2.6.15. |
| This feature allows for automatic, controlled propagation of mount and unmount |
| .I events |
| between namespaces |
| (or, more precisely, between the members of a |
| .IR "peer group" |
| that are propagating events to one another). |
| |
| Each mount point is marked (via |
| .BR mount (2)) |
| as having one of the following |
| .IR "propagation types" : |
| .TP |
| .BR MS_SHARED |
| This mount point shares events with members of a peer group. |
| Mount and unmount events immediately under this mount point will propagate |
| to the other mount points that are members of the peer group. |
| .I Propagation |
| here means that the same mount or unmount will automatically occur |
| under all of the other mount points in the peer group. |
| Conversely, mount and unmount events that take place under |
| peer mount points will propagate to this mount point. |
| .TP |
| .BR MS_PRIVATE |
| This mount point is private; it does not have a peer group. |
| Mount and unmount events do not propagate into or out of this mount point. |
| .TP |
| .BR MS_SLAVE |
| Mount and unmount events propagate into this mount point from |
| a (master) shared peer group. |
| Mount and unmount events under this mount point do not propagate to any peer. |
| |
| Note that a mount point can be the slave of another peer group |
| while at the same time sharing mount and unmount events |
| with a peer group of which it is a member. |
| (More precisely, one peer group can be the slave of another peer group.) |
| .TP |
| .BR MS_UNBINDABLE |
| This is like a private mount, |
| and in addition this mount can't be bind mounted. |
| Attempts to bind mount this mount |
| .RB ( mount (2) |
| with the |
| .BR MS_BIND |
| flag) will fail. |
| |
| When a recursive bind mount |
| .RB ( mount (2) |
| with the |
| .BR MS_BIND |
| and |
| .BR MS_REC |
| flags) is performed on a directory subtree, |
| any unbindable mounts within the subtree are automatically pruned |
| (i.e., not replicated) |
| when replicating that subtree to produce the target subtree. |
| .PP |
| For a discussion of the propagation type assigned to a new mount, |
| see NOTES. |
| |
| The propagation type is a per-mount-point setting; |
| some mount points may be marked as shared |
| (with each shared mount point being a member of a distinct peer group), |
| while others are private |
| (or slaved or unbindable). |
| |
| Note that a mount's propagation type determines whether |
| mounts and unmounts of mount points |
| .I "immediately under" |
| the mount point are propagated. |
| Thus, the propagation type does not affect propagation of events for |
| grandchildren and further removed descendant mount points. |
| What happens if the mount point itself is unmounted is determined by |
| the propagation type that is in effect for the |
| .I parent |
| of the mount point. |
| |
| Members are added to a |
| .IR "peer group" |
| when a mount point is marked as shared and either: |
| .IP * 3 |
| the mount point is replicated during the creation of a new mount namespace; or |
| .IP * |
| a new bind mount is created from the mount point. |
| .PP |
| In both of these cases, the new mount point joins the peer group |
| of which the existing mount point is a member. |
| A mount ceases to be a member of a peer group when either |
| the mount is explicitly unmounted, |
| or when the mount is implicitly unmounted because a mount namespace is removed |
| (because it has no more member processes). |
| |
| The propagation type of the mount points in a mount namespace |
| can be discovered via the "optional fields" exposed in |
| .IR /proc/[pid]/mountinfo . |
| (See |
| .BR proc (5) |
| for details of this file.) |
| The following tags can appear in the optional fields |
| for a record in that file: |
| .TP |
| .I shared:X |
| This mount point is shared in peer group |
| .IR X . |
| Each peer group has a unique ID that is automatically |
| generated by the kernel, |
| and all mount points in the same peer group will show the same ID. |
| (These IDs are assigned starting from the value 1, |
| and may be recycled when a peer group ceases to have any members.) |
| .TP |
| .I master:X |
| This mount is a slave to shared peer group |
| .IR X . |
| .TP |
| .IR propagate_from:X " (since Linux 2.6.26)" |
| .\" commit 97e7e0f71d6d948c25f11f0a33878d9356d9579e |
| This mount is a slave and receives propagation from shared peer group |
| .IR X . |
| This tag will always appear in conjunction with a |
| .IR master:X |
| tag. |
| Here, |
| .IR X |
| is the closest dominant peer group under the process's root directory. |
| If |
| .IR X |
| is the immediate master of the mount, |
| or if there is no dominant peer group under the same root, |
| then only the |
| .IR master:X |
| field is present and not the |
| .IR propagate_from:X |
| field. |
| For further details, see below. |
| .TP |
| .IR unbindable |
| This is an unbindable mount. |
| .PP |
| If none of the above tags is present, then this is a private mount. |
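| |
| As a rough illustration of the tags just described, the following shell |
| sketch classifies each record of a mountinfo-format file read from |
| standard input. |
| The function name |
| .I prop_types |
| is hypothetical, and the sketch assumes that the optional fields begin at |
| field 7 and end at the "\-" separator, as described in |
| .BR proc (5). |

```shell
# prop_types: hypothetical helper, not part of any standard tool.
# Reads mountinfo-format records on stdin and prints
# "<mount point> <propagation type>" for each record.
prop_types() {
    awk '{
        shared = 0; slave = 0; unbind = 0
        # Optional fields start at field 7 and stop at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)          shared = 1
            else if ($i ~ /^master:/)     slave = 1
            else if ($i == "unbindable")  unbind = 1
        }
        if (unbind)                type = "unbindable"
        else if (slave && shared)  type = "slave+shared"
        else if (slave)            type = "slave"
        else if (shared)           type = "shared"
        else                       type = "private"
        print $5, type
    }'
}
```

| The function could be applied to the current namespace with |
| .IR "prop_types < /proc/self/mountinfo" . |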
| .SS MS_SHARED and MS_PRIVATE example |
| Suppose that on a terminal in the initial mount namespace, |
| we mark one mount point as shared and another as private, |
| and then view the mounts in |
| .IR /proc/self/mountinfo : |
| |
| .nf |
| .in +4n |
| sh1# \fBmount \-\-make\-shared /mntS\fP |
| sh1# \fBmount \-\-make\-private /mntP\fP |
| sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 77 61 8:17 / /mntS rw,relatime shared:1 |
| 83 61 8:15 / /mntP rw,relatime |
| .in |
| .fi |
| |
| From the |
| .IR /proc/self/mountinfo |
| output, we see that |
| .IR /mntS |
| is a shared mount in peer group 1, and that |
| .IR /mntP |
| has no optional tags, indicating that it is a private mount. |
| The first two fields in each record in this file are the unique |
| ID for this mount, and the mount ID of the parent mount. |
| We can further inspect this file to see that the parent mount point of |
| .IR /mntS |
| and |
| .IR /mntP |
| is the root directory, |
| .IR / , |
| which is mounted as private: |
| |
| .nf |
| .in +4n |
| sh1# \fBcat /proc/self/mountinfo | awk \(aq$1 == 61\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 61 0 8:2 / / rw,relatime |
| .in |
| .fi |
| |
| On a second terminal, |
| we create a new mount namespace where we run a second shell |
| and inspect the mounts: |
| |
| .nf |
| .in +4n |
| $ \fBPS1=\(aqsh2# \(aq sudo unshare \-m \-\-propagation unchanged sh\fP |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 222 145 8:17 / /mntS rw,relatime shared:1 |
| 225 145 8:15 / /mntP rw,relatime |
| .in |
| .fi |
| |
| The new mount namespace received a copy of the initial mount namespace's |
| mount points. |
| These new mount points maintain the same propagation types, |
| but have unique mount IDs. |
| (The |
| .IR \-\-propagation\ unchanged |
| option prevents |
| .BR unshare (1) |
| from marking all mounts as private when creating a new mount namespace, |
| .\" Since util-linux 2.27 |
| which it does by default.) |
| |
| In the second terminal, we then create submounts under each of |
| .IR /mntS |
| and |
| .IR /mntP |
| and inspect the set-up: |
| |
| .nf |
| .in +4n |
| sh2# \fBmkdir /mntS/a\fP |
| sh2# \fBmount /dev/sdb6 /mntS/a\fP |
| sh2# \fBmkdir /mntP/b\fP |
| sh2# \fBmount /dev/sdb7 /mntP/b\fP |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 222 145 8:17 / /mntS rw,relatime shared:1 |
| 225 145 8:15 / /mntP rw,relatime |
| 178 222 8:22 / /mntS/a rw,relatime shared:2 |
| 230 225 8:23 / /mntP/b rw,relatime |
| .in |
| .fi |
| |
| From the above, it can be seen that |
| .IR /mntS/a |
| was created as shared (inheriting this setting from its parent mount) and |
| .IR /mntP/b |
| was created as a private mount. |
| |
| Returning to the first terminal and inspecting the set-up, |
| we see that the new mount created under the shared mount point |
| .IR /mntS |
| propagated to its peer mount (in the initial mount namespace), |
| but the new mount created under the private mount point |
| .IR /mntP |
| did not propagate: |
| |
| .nf |
| .in +4n |
| sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 77 61 8:17 / /mntS rw,relatime shared:1 |
| 83 61 8:15 / /mntP rw,relatime |
| 179 77 8:22 / /mntS/a rw,relatime shared:2 |
| .in |
| .fi |
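| |
| Mounts that show the same peer group ID, even in different mount |
| namespaces, are peers of one another. |
| The following sketch lists the shared mounts in a mountinfo-format file |
| by peer group ID, which makes such pairs easier to spot. |
| The name |
| .I peer_groups |
| is hypothetical, and the field positions are assumed to be those described in |
| .BR proc (5). |

```shell
# peer_groups: hypothetical helper.
# Reads mountinfo-format records on stdin and prints
# "<peer group ID> <mount point>" for every shared mount,
# sorted numerically by peer group ID.
peer_groups() {
    awk '{
        for (i = 7; i <= NF && $i != "-"; i++)
            if ($i ~ /^shared:/)
                print substr($i, 8), $5
    }' | sort -k1,1n -k2,2
}
```

| Running |
| .I "peer_groups < /proc/self/mountinfo" |
| in each of the two shells of this example would be expected to list |
| .IR /mntS |
| under ID 1 and |
| .IR /mntS/a |
| under ID 2 in both namespaces. |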
| .\" |
| .SS MS_SLAVE example |
| Making a mount point a slave allows it to receive propagated |
| mount and unmount events from a master shared peer group, |
| while preventing it from propagating events to that master. |
| This is useful if we want to (say) receive a mount event when |
| an optical disk is mounted in the master shared peer group |
| (in another mount namespace), |
| but want to prevent mount and unmount events under the slave mount |
| from having side effects in other namespaces. |
| |
| We can demonstrate the effect of slaving by first marking |
| two mount points as shared in the initial mount namespace: |
| |
| .nf |
| .in +4n |
| sh1# \fBmount \-\-make\-shared /mntX\fP |
| sh1# \fBmount \-\-make\-shared /mntY\fP |
| sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 132 83 8:23 / /mntX rw,relatime shared:1 |
| 133 83 8:22 / /mntY rw,relatime shared:2 |
| .in |
| .fi |
| |
| On a second terminal, |
| we create a new mount namespace and inspect the mount points: |
| |
| .nf |
| .in +4n |
| sh2# \fBunshare \-m \-\-propagation unchanged sh\fP |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 168 167 8:23 / /mntX rw,relatime shared:1 |
| 169 167 8:22 / /mntY rw,relatime shared:2 |
| .in |
| .fi |
| |
| In the new mount namespace, we then mark one of the mount points as a slave: |
| |
| .nf |
| .in +4n |
| sh2# \fBmount \-\-make\-slave /mntY\fP |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 168 167 8:23 / /mntX rw,relatime shared:1 |
| 169 167 8:22 / /mntY rw,relatime master:2 |
| .in |
| .fi |
| |
| From the above output, we see that |
| .IR /mntY |
| is now a slave mount that is receiving propagation events from |
| the shared peer group with the ID 2. |
| |
| Continuing in the new namespace, we create submounts under each of |
| .IR /mntX |
| and |
| .IR /mntY : |
| |
| .nf |
| .in +4n |
| sh2# \fBmkdir /mntX/a\fP |
| sh2# \fBmount /dev/sda3 /mntX/a\fP |
| sh2# \fBmkdir /mntY/b\fP |
| sh2# \fBmount /dev/sda5 /mntY/b\fP |
| .in |
| .fi |
| |
| When we inspect the state of the mount points in the new mount namespace, |
| we see that |
| .IR /mntX/a |
| was created as a new shared mount |
| (inheriting the "shared" setting from its parent mount) and |
| .IR /mntY/b |
| was created as a private mount: |
| |
| .nf |
| .in +4n |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 168 167 8:23 / /mntX rw,relatime shared:1 |
| 169 167 8:22 / /mntY rw,relatime master:2 |
| 173 168 8:3 / /mntX/a rw,relatime shared:3 |
| 175 169 8:5 / /mntY/b rw,relatime |
| .in |
| .fi |
| |
| Returning to the first terminal (in the initial mount namespace), |
| we see that the mount |
| .IR /mntX/a |
| propagated to the peer (the shared |
| .IR /mntX ), |
| but the mount |
| .IR /mntY/b |
| was not propagated: |
| |
| .nf |
| .in +4n |
| sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 132 83 8:23 / /mntX rw,relatime shared:1 |
| 133 83 8:22 / /mntY rw,relatime shared:2 |
| 174 132 8:3 / /mntX/a rw,relatime shared:3 |
| .in |
| .fi |
| |
| Now we create a new mount point under |
| .IR /mntY |
| in the first shell: |
| |
| .nf |
| .in +4n |
| sh1# \fBmkdir /mntY/c\fP |
| sh1# \fBmount /dev/sda1 /mntY/c\fP |
| sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 132 83 8:23 / /mntX rw,relatime shared:1 |
| 133 83 8:22 / /mntY rw,relatime shared:2 |
| 174 132 8:3 / /mntX/a rw,relatime shared:3 |
| 178 133 8:1 / /mntY/c rw,relatime shared:4 |
| .in |
| .fi |
| |
| When we examine the mount points in the second mount namespace, |
| we see that in this case the new mount has been propagated |
| to the slave mount point, |
| and that the new mount is itself a slave mount (to peer group 4): |
| |
| .nf |
| .in +4n |
| sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 168 167 8:23 / /mntX rw,relatime shared:1 |
| 169 167 8:22 / /mntY rw,relatime master:2 |
| 173 168 8:3 / /mntX/a rw,relatime shared:3 |
| 175 169 8:5 / /mntY/b rw,relatime |
| 179 169 8:1 / /mntY/c rw,relatime master:4 |
| .in |
| .fi |
| .\" |
| .SS MS_UNBINDABLE example |
| One of the primary purposes of unbindable mounts is to avoid |
| the "mount point explosion" problem when repeatedly performing bind mounts |
| of a higher-level subtree at a lower-level mount point. |
| The problem is illustrated by the following shell session. |
| |
| Suppose we have a system with the following mount points: |
| |
| .nf |
| .in +4n |
| # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
| /dev/sda1 on / |
| /dev/sdb6 on /mntX |
| /dev/sdb7 on /mntY |
| .in |
| .fi |
| |
| Suppose furthermore that we wish to recursively bind mount |
| the root directory under several users' home directories. |
| We do this for the first user, and inspect the mount points: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-rbind / /home/cecilia/\fP |
| # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
| /dev/sda1 on / |
| /dev/sdb6 on /mntX |
| /dev/sdb7 on /mntY |
| /dev/sda1 on /home/cecilia |
| /dev/sdb6 on /home/cecilia/mntX |
| /dev/sdb7 on /home/cecilia/mntY |
| .in |
| .fi |
| |
| When we repeat this operation for the second user, |
| we start to see the explosion problem: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-rbind / /home/henry\fP |
| # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
| /dev/sda1 on / |
| /dev/sdb6 on /mntX |
| /dev/sdb7 on /mntY |
| /dev/sda1 on /home/cecilia |
| /dev/sdb6 on /home/cecilia/mntX |
| /dev/sdb7 on /home/cecilia/mntY |
| /dev/sda1 on /home/henry |
| /dev/sdb6 on /home/henry/mntX |
| /dev/sdb7 on /home/henry/mntY |
| /dev/sda1 on /home/henry/home/cecilia |
| /dev/sdb6 on /home/henry/home/cecilia/mntX |
| /dev/sdb7 on /home/henry/home/cecilia/mntY |
| .in |
| .fi |
| |
| Under |
| .IR /home/henry , |
| we have not only recursively added the |
| .IR /mntX |
| and |
| .IR /mntY |
| mounts, but also the recursive mounts of those directories under |
| .IR /home/cecilia |
| that were created in the previous step. |
| Upon repeating the step for a third user, |
| it becomes obvious that the explosion is exponential in nature: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-rbind / /home/otto\fP |
| # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
| /dev/sda1 on / |
| /dev/sdb6 on /mntX |
| /dev/sdb7 on /mntY |
| /dev/sda1 on /home/cecilia |
| /dev/sdb6 on /home/cecilia/mntX |
| /dev/sdb7 on /home/cecilia/mntY |
| /dev/sda1 on /home/henry |
| /dev/sdb6 on /home/henry/mntX |
| /dev/sdb7 on /home/henry/mntY |
| /dev/sda1 on /home/henry/home/cecilia |
| /dev/sdb6 on /home/henry/home/cecilia/mntX |
| /dev/sdb7 on /home/henry/home/cecilia/mntY |
| /dev/sda1 on /home/otto |
| /dev/sdb6 on /home/otto/mntX |
| /dev/sdb7 on /home/otto/mntY |
| /dev/sda1 on /home/otto/home/cecilia |
| /dev/sdb6 on /home/otto/home/cecilia/mntX |
| /dev/sdb7 on /home/otto/home/cecilia/mntY |
| /dev/sda1 on /home/otto/home/henry |
| /dev/sdb6 on /home/otto/home/henry/mntX |
| /dev/sdb7 on /home/otto/home/henry/mntY |
| /dev/sda1 on /home/otto/home/henry/home/cecilia |
| /dev/sdb6 on /home/otto/home/henry/home/cecilia/mntX |
| /dev/sdb7 on /home/otto/home/henry/home/cecilia/mntY |
| .in |
| .fi |
| |
| The mount explosion problem in the above scenario can be avoided |
| by making each of the new mounts unbindable. |
| The effect of doing this is that recursive mounts of the root |
| directory will not replicate the unbindable mounts. |
| We make such a mount for the first user: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-rbind \-\-make\-unbindable / /home/cecilia\fP |
| .in |
| .fi |
| |
| Before going further, we show that unbindable mounts are indeed unbindable: |
| |
| .nf |
| .in +4n |
| # \fBmkdir /mntZ\fP |
| # \fBmount \-\-bind /home/cecilia /mntZ\fP |
| mount: wrong fs type, bad option, bad superblock on /home/cecilia, |
| missing codepage or helper program, or other error |
| |
| In some cases useful info is found in syslog \- try |
| dmesg | tail or so. |
| .in |
| .fi |
| |
| Now we create unbindable recursive bind mounts for the other two users: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-rbind \-\-make\-unbindable / /home/henry\fP |
| # \fBmount \-\-rbind \-\-make\-unbindable / /home/otto\fP |
| .in |
| .fi |
| |
| Upon examining the list of mount points, |
| we see there has been no explosion of mount points, |
| because the unbindable mounts were not replicated |
| under each user's directory: |
| |
| .nf |
| .in +4n |
| # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
| /dev/sda1 on / |
| /dev/sdb6 on /mntX |
| /dev/sdb7 on /mntY |
| /dev/sda1 on /home/cecilia |
| /dev/sdb6 on /home/cecilia/mntX |
| /dev/sdb7 on /home/cecilia/mntY |
| /dev/sda1 on /home/henry |
| /dev/sdb6 on /home/henry/mntX |
| /dev/sdb7 on /home/henry/mntY |
| /dev/sda1 on /home/otto |
| /dev/sdb6 on /home/otto/mntX |
| /dev/sdb7 on /home/otto/mntY |
| .in |
| .fi |
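| |
| The growth seen in the two sessions above can be checked with a little |
| arithmetic. |
| The sketch below assumes that each plain recursive bind of the root |
| directory replicates every mount present at that time, |
| while in the unbindable case only the original mounts are replicated. |
| The function name and its arguments are hypothetical. |

```shell
# mounts_after_rbinds BASE USERS [MODE]
# Hypothetical helper: prints the number of mounts after USERS successive
# "mount --rbind / /home/<user>" operations, starting from BASE mounts.
# MODE is "plain" (the default) or "unbindable".
mounts_after_rbinds() {
    base=$1
    users=$2
    mode=${3:-plain}
    total=$base
    i=0
    while [ "$i" -lt "$users" ]; do
        if [ "$mode" = unbindable ]; then
            # Earlier unbindable copies are pruned, so only the
            # BASE original mounts are replicated.
            total=$((total + base))
        else
            # Every existing mount is replicated: the total doubles.
            total=$((total * 2))
        fi
        i=$((i + 1))
    done
    echo "$total"
}
```

| With three initial mounts and three users, this gives 24 mounts in the |
| plain case and 12 in the unbindable case, |
| matching the listings shown above. |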
| .\" |
| .SS Propagation type transitions |
| The following table shows the effect that applying a new propagation type |
| (i.e., |
| .IR "mount \-\-make\-xxxx") |
| has on the existing propagation type of a mount point. |
| The rows correspond to existing propagation types, |
| and the columns are the new propagation settings. |
| For reasons of space, "private" is abbreviated as "priv" and |
| "unbindable" as "unbind". |
| .TS |
| tab(@); |
| lb2 lb2 lb2 lb2 lb1 |
| lb l l l l l. |
| @make-shared@make-slave@make-priv@make-unbind |
| shared@shared@slave/priv [1]@priv@unbind |
| slave@slave+shared@slave [2]@priv@unbind |
| slave+shared@slave+shared@slave@priv@unbind |
| private@shared@priv [2]@priv@unbind |
| unbindable@shared@unbind [2]@priv@unbind |
| .TE |
| |
| Note the following details to the table: |
| .IP [1] 4 |
| If a shared mount is the only mount in its peer group, |
| making it a slave automatically makes it private. |
| .IP [2] |
| Slaving a nonshared mount has no effect on the mount. |
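| .PP |
| The transition table can also be written as a small lookup function. |
| The following is only a sketch of the table and its two notes; |
| the function name, its argument spelling, and the optional third argument |
| (which says whether a shared mount is alone in its peer group) |
| are hypothetical. |

```shell
# new_prop_type CURRENT REQUEST [ALONE]
# Hypothetical helper: prints the propagation type that results from
# applying "mount --make-REQUEST" to a mount whose type is CURRENT.
#   CURRENT: shared | slave | slave+shared | private | unbindable
#   REQUEST: shared | slave | private | unbindable
#   ALONE:   "yes" if a shared mount is the only member of its peer
#            group (note [1]); defaults to "no"
new_prop_type() {
    current=$1
    request=$2
    alone=${3:-no}
    case $request in
    private | unbindable)
        echo "$request" ;;
    shared)
        case $current in
        slave | slave+shared) echo slave+shared ;;
        *)                    echo shared ;;
        esac ;;
    slave)
        case $current in
        shared)
            # Note [1]: a lone shared mount becomes private.
            if [ "$alone" = yes ]; then echo private; else echo slave; fi ;;
        slave+shared)
            echo slave ;;
        *)
            # Note [2]: slaving a nonshared mount has no effect.
            echo "$current" ;;
        esac ;;
    *)
        echo "new_prop_type: unknown request: $request" >&2
        return 1 ;;
    esac
}
```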
| .\" |
| .SS Bind (MS_BIND) semantics |
| Suppose that the following command is performed: |
| |
| mount \-\-bind A/a B/b |
| |
| Here, |
| .I A |
| is the source mount point, |
| .I B |
| is the destination mount point, |
| .I a |
| is a subdirectory path under the mount point |
| .IR A , |
| and |
| .I b |
| is a subdirectory path under the mount point |
| .IR B . |
| The propagation type of the resulting mount, |
| .IR B/b , |
| depends on the propagation types of the mount points |
| .IR A |
| and |
| .IR B , |
| and is summarized in the following table. |
| |
| .TS |
| tab(@); |
| lb2 lb1 lb2 lb2 lb2 lb0 |
| lb2 lb1 lb2 lb2 lb2 lb0 |
| lb lb l l l l l. |
| @@@source(A) |
| @@shared@private@slave@unbind |
| _ |
| dest(B)  shared@|@shared@shared@slave+shared@invalid |
| nonshared@|@shared@private@slave@invalid |
| .TE |
| |
| Note that a recursive bind of a subtree follows the same semantics |
| as for a bind operation on each mount in the subtree. |
| (Unbindable mounts are automatically pruned at the target mount point.) |
| |
| For further details, see |
| .I Documentation/filesystems/sharedsubtree.txt |
| in the kernel source tree. |
| .\" |
| .SS Move (MS_MOVE) semantics |
| Suppose that the following command is performed: |
| |
| mount \-\-move A B/b |
| |
| Here, |
| .I A |
| is the source mount point, |
| .I B |
| is the destination mount point, and |
| .I b |
| is a subdirectory path under the mount point |
| .IR B . |
| The propagation type of the resulting mount, |
| .IR B/b , |
| depends on the propagation types of the mount points |
| .IR A |
| and |
| .IR B , |
| and is summarized in the following table. |
| |
| .TS |
| tab(@); |
| lb2 lb1 lb2 lb2 lb2 lb0 |
| lb2 lb1 lb2 lb2 lb2 lb0 |
| lb lb l l l l l. |
| @@@source(A) |
| @@shared@private@slave@unbind |
| _ |
| dest(B)  shared@|@shared@shared@slave+shared@invalid |
| nonshared@|@shared@private@slave@unbindable |
| .TE |
| |
| Note: moving a mount that resides under a shared mount is invalid. |
| |
| For further details, see |
| .I Documentation/filesystems/sharedsubtree.txt |
| in the kernel source tree. |
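| |
| The bind and move tables differ in a single cell, |
| so both can be captured by one lookup function. |
| The following sketch encodes the two tables as given in this page; |
| the function name and argument spelling are hypothetical. |

```shell
# resulting_type OP SRC DEST
# Hypothetical helper encoding the bind and move tables.
#   OP:   bind | move
#   SRC:  shared | private | slave | unbindable   (type of A)
#   DEST: shared | nonshared                      (type of B)
# Prints the propagation type of the resulting mount B/b, or "invalid".
resulting_type() {
    op=$1
    src=$2
    dest=$3
    case $op in
    bind | move) ;;
    *)  echo "resulting_type: unknown operation: $op" >&2
        return 1 ;;
    esac
    case $dest,$src in
    shared,shared | shared,private | nonshared,shared)
        echo shared ;;
    shared,slave)
        echo slave+shared ;;
    nonshared,private)
        echo private ;;
    nonshared,slave)
        echo slave ;;
    shared,unbindable)
        echo invalid ;;
    nonshared,unbindable)
        # The only cell in which the two tables differ.
        if [ "$op" = move ]; then echo unbindable; else echo invalid; fi ;;
    *)
        echo "resulting_type: bad arguments" >&2
        return 1 ;;
    esac
}
```

| For example, |
| .I "resulting_type bind slave shared" |
| prints "slave+shared", while |
| .I "resulting_type move unbindable nonshared" |
| prints "unbindable". |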
| .\" |
| .SS Mount semantics |
| Suppose that we use the following command to create a mount point: |
| |
| mount device B/b |
| |
| Here, |
| .I B |
| is the destination mount point, and |
| .I b |
| is a subdirectory path under the mount point |
| .IR B . |
| The propagation type of the resulting mount, |
| .IR B/b , |
| follows the same rules as for a bind mount, |
| where the propagation type of the source mount |
| is considered always to be private. |
| .\" |
| .SS Unmount semantics |
| Suppose that we use the following command to tear down a mount point: |
| |
| umount A |
| |
| Here, |
| .I A |
| is a mount point on |
| .IR B/b , |
| where |
| .I B |
| is the parent mount and |
| .I b |
| is a subdirectory path under the mount point |
| .IR B . |
| If |
| .I B |
| is shared, then all most-recently-mounted mounts at |
| .I b |
| on mounts that receive propagation from mount |
| .I B |
| and do not have submounts under them are unmounted. |
| .\" |
| .SS The /proc/[pid]/mountinfo "propagate_from" tag |
| The |
| .I propagate_from:X |
| tag is shown in the optional fields of a |
| .IR /proc/[pid]/mountinfo |
| record in cases where a process can't see a slave's immediate master |
| (i.e., the pathname of the master is not reachable from |
| the filesystem root directory) |
| and so cannot determine the |
| chain of propagation between the mounts it can see. |
| |
| In the following example, we first create a two-link master-slave chain |
| between the mounts |
| .IR /mnt , |
| .IR /tmp/etc , |
| and |
| .IR /mnt/tmp/etc . |
| Then the |
| .BR chroot (1) |
| command is used to make the |
| .IR /tmp/etc |
| mount point unreachable from the root directory, |
| creating a situation where the master of |
| .IR /mnt/tmp/etc |
| is not reachable from the (new) root directory of the process. |
| |
| First, we bind mount the root directory onto |
| .IR /mnt |
| and then bind mount |
| .IR /proc |
| at |
| .IR /mnt/proc |
| so that after the later |
| .BR chroot (1) |
| the |
| .BR proc (5) |
| filesystem remains visible at the correct location |
| in the chroot-ed environment. |
| |
| .nf |
| .in +4n |
| # \fBmkdir \-p /mnt/proc\fP |
| # \fBmount \-\-bind / /mnt\fP |
| # \fBmount \-\-bind /proc /mnt/proc\fP |
| .in |
| .fi |
| |
| Next, we ensure that the |
| .IR /mnt |
| mount is a shared mount in a new peer group (with no peers): |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-make\-private /mnt\fP # Isolate from any previous peer group |
| # \fBmount \-\-make\-shared /mnt\fP |
| # \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 239 61 8:2 / /mnt ... shared:102 |
| 248 239 0:4 / /mnt/proc ... shared:5 |
| .in |
| .fi |
| |
| Next, we bind mount |
| .IR /mnt/etc |
| onto |
| .IR /tmp/etc : |
| |
| .nf |
| .in +4n |
| # \fBmkdir \-p /tmp/etc\fP |
| # \fBmount \-\-bind /mnt/etc /tmp/etc\fP |
| # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 239 61 8:2 / /mnt ... shared:102 |
| 248 239 0:4 / /mnt/proc ... shared:5 |
| 267 40 8:2 /etc /tmp/etc ... shared:102 |
| .in |
| .fi |
| |
| Initially, these two mount points are in the same peer group, |
| but we then make |
| .IR /tmp/etc |
| a slave of |
| .IR /mnt/etc , |
| and then make |
| .IR /tmp/etc |
| shared as well, |
| so that it can propagate events to the next slave in the chain: |
| |
| .nf |
| .in +4n |
| # \fBmount \-\-make\-slave /tmp/etc\fP |
| # \fBmount \-\-make\-shared /tmp/etc\fP |
| # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 239 61 8:2 / /mnt ... shared:102 |
| 248 239 0:4 / /mnt/proc ... shared:5 |
| 267 40 8:2 /etc /tmp/etc ... shared:105 master:102 |
| .in |
| .fi |
| |
| Then we bind mount |
| .IR /tmp/etc |
| onto |
| .IR /mnt/tmp/etc . |
| Again, the two mount points are initially in the same peer group, |
| but we then make |
| .IR /mnt/tmp/etc |
| a slave of |
| .IR /tmp/etc : |
| |
| .nf |
| .in +4n |
| # \fBmkdir \-p /mnt/tmp/etc\fP |
| # \fBmount \-\-bind /tmp/etc /mnt/tmp/etc\fP |
| # \fBmount \-\-make\-slave /mnt/tmp/etc\fP |
| # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP |
| 239 61 8:2 / /mnt ... shared:102 |
| 248 239 0:4 / /mnt/proc ... shared:5 |
| 267 40 8:2 /etc /tmp/etc ... shared:105 master:102 |
| 273 239 8:2 /etc /mnt/tmp/etc ... master:105 |
| .in |
| .fi |
| |
| From the above, we see that |
| .IR /mnt |
| is the master of the slave |
| .IR /tmp/etc , |
| which in turn is the master of the slave |
| .IR /mnt/tmp/etc . |
| |
| We then |
| .BR chroot (1) |
| to the |
| .IR /mnt |
| directory, which renders the mount with ID 267 unreachable |
| from the (new) root directory: |
| |
| .nf |
| .in +4n |
| # \fBchroot /mnt\fP |
| .in |
| .fi |
| |
| When we examine the state of the mounts inside the chroot-ed environment, |
| we see the following: |
| |
| .nf |
| .in +4n |
| # \fBcat /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP |
| 239 61 8:2 / / ... shared:102 |
| 248 239 0:4 / /proc ... shared:5 |
| 273 239 8:2 /etc /tmp/etc ... master:105 propagate_from:102 |
| .in |
| .fi |
| |
| Above, we see that the mount with ID 273 |
| is a slave whose master is the peer group 105. |
| The mount point for that master is unreachable, and so a |
| .IR propagate_from |
| tag is displayed, indicating that the closest dominant peer group |
| (i.e., the nearest reachable mount in the slave chain) |
| is the peer group with the ID 102 (corresponding to the |
| .IR /mnt |
| mount point before the |
| .BR chroot (1) |
| was performed). |
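| .PP |
| To pick out the slave mounts and their masters from a mountinfo-format |
| file, a filter along the following lines could be used. |
| It is only a sketch: the name |
| .I slave_info |
| is hypothetical, and a "\-" is printed where no |
| .I propagate_from |
| tag is present. |

```shell
# slave_info: hypothetical helper.
# Reads mountinfo-format records on stdin and, for every slave mount,
# prints its mount point, the ID of its master peer group, and the ID
# given in the propagate_from tag ("-" if that tag is absent).
slave_info() {
    awk '{
        master = "-"; pfrom = "-"
        # Optional fields start at field 7 and stop at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^master:/)
                master = substr($i, 8)
            else if ($i ~ /^propagate_from:/)
                pfrom = substr($i, 16)
        }
        if (master != "-")
            print $5, "master=" master, "propagate_from=" pfrom
    }'
}
```

| Applied to the records shown inside the chroot-ed environment above, |
| the filter would report |
| .IR /tmp/etc |
| with master 105 and propagate_from 102. |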
| .\" |
| .SH VERSIONS |
| Mount namespaces first appeared in Linux 2.4.19. |
| .SH CONFORMING TO |
| Namespaces are a Linux-specific feature. |
| .\" |
| .SH NOTES |
| The propagation type assigned to a new mount point depends |
| on the propagation type of the parent mount. |
| If the mount point has a parent (i.e., it is a non-root mount |
| point) and the propagation type of the parent is |
| .BR MS_SHARED , |
| then the propagation type of the new mount is also |
| .BR MS_SHARED . |
| Otherwise, the propagation type of the new mount is |
| .BR MS_PRIVATE . |
| |
| Notwithstanding the fact that the default propagation type |
| for new mount points is in many cases |
| .BR MS_PRIVATE , |
| .BR MS_SHARED |
| is typically more useful. |
| For this reason, |
| .BR systemd (1) |
| automatically remounts all mount points as |
| .BR MS_SHARED |
| on system startup. |
| Thus, on most modern systems, the default propagation type is in practice |
| .BR MS_SHARED . |
| |
| Since, when one uses |
| .BR unshare (1) |
| to create a mount namespace, |
| the goal is commonly to provide full isolation of the mount points |
| in the new namespace, |
| .BR unshare (1) |
| (since |
| .IR util-linux |
| version 2.27) in turn reverses the step performed by |
| .BR systemd (1), |
| by making all mount points private in the new namespace. |
| That is, |
| .BR unshare (1) |
| performs the equivalent of the following in the new mount namespace: |
| |
| mount \-\-make\-rprivate / |
| |
| To prevent this, one can use the |
| .IR "\-\-propagation\ unchanged" |
| option to |
| .BR unshare (1). |
| |
| For a discussion of propagation types when moving mounts |
| .RB ( MS_MOVE ) |
| and creating bind mounts |
| .RB ( MS_BIND ), |
| see |
| .IR Documentation/filesystems/sharedsubtree.txt . |
| .SH SEE ALSO |
| .BR unshare (1), |
| .BR clone (2), |
| .BR mount (2), |
| .BR setns (2), |
| .BR umount (2), |
| .BR unshare (2), |
| .BR proc (5), |
| .BR namespaces (7), |
| .BR user_namespaces (7) |
| |
| .IR Documentation/filesystems/sharedsubtree.txt |
| in the kernel source tree. |