| From 69a91c237ab0ebe4e9fdeaf6d0090c85275594ec Mon Sep 17 00:00:00 2001 |
| From: Eric Rannaud <e@nanocritical.com> |
| Date: Thu, 30 Oct 2014 01:51:01 -0700 |
| Subject: fs: allow open(dir, O_TMPFILE|..., 0) with mode 0 |
| |
| From: Eric Rannaud <e@nanocritical.com> |
| |
| commit 69a91c237ab0ebe4e9fdeaf6d0090c85275594ec upstream. |
| |
| The man page for open(2) indicates that when O_CREAT is specified, the |
| 'mode' argument applies only to future accesses to the file: |
| |
| Note that this mode applies only to future accesses of the newly |
| created file; the open() call that creates a read-only file |
| may well return a read/write file descriptor. |
| |
| The man page for open(2) implies that 'mode' is treated identically by |
| O_CREAT and O_TMPFILE. |
| |
| O_TMPFILE, however, behaves differently: |
| |
| int fd = open("/tmp", O_TMPFILE | O_RDWR, 0); |
| assert(fd == -1); |
| assert(errno == EACCES); |
| |
| fd = open("/tmp", O_TMPFILE | O_RDWR, 0600); |
| assert(fd >= 0); |
| |
| For O_CREAT, do_last() sets acc_mode to MAY_OPEN only: |
| |
| if (*opened & FILE_CREATED) { |
| /* Don't check for write permission, don't truncate */ |
| open_flag &= ~O_TRUNC; |
| will_truncate = false; |
| acc_mode = MAY_OPEN; |
| path_to_nameidata(path, nd); |
| goto finish_open_created; |
| } |
| |
| But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to |
| may_open(). |
| |
| This patch aligns the behavior of O_TMPFILE with that of O_CREAT: once |
| the inode is created, do_tmpfile() calls may_open() with acc_mode = |
| MAY_OPEN. |
| |
| A different, but related glibc bug revealed the discrepancy: |
| https://sourceware.org/bugzilla/show_bug.cgi?id=17523 |
| |
| glibc reads the 'mode' argument of open() and openat() lazily, calling |
| va_arg() only if O_CREAT is present in 'flags' (to support both the 2 |
| argument and the 3 argument forms of open(); same idea for openat()). |
| As a result, glibc drops the 'mode' argument when O_TMPFILE is in |
| 'flags' without O_CREAT. |
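| |
| The lazy va_arg() pattern described above can be sketched as follows. |
| This is an illustration only, not glibc's code: needs_mode() and |
| open_sketch() are hypothetical names, and the sketch assumes Linux, |
| where mode_t is an unsigned int and O_TMPFILE includes O_DIRECTORY. |

```c
#define _GNU_SOURCE /* for O_TMPFILE */
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * On Linux, O_TMPFILE carries the O_DIRECTORY bit, so a plain bit test
 * would also match O_DIRECTORY alone. Compare against the full value.
 */
static_assert((O_TMPFILE & O_DIRECTORY) == O_DIRECTORY,
	      "O_TMPFILE is expected to include O_DIRECTORY");

/* Hypothetical helper: does this 'flags' value require a 'mode' argument? */
static int needs_mode(int flags)
{
	return (flags & O_CREAT) != 0 || (flags & O_TMPFILE) == O_TMPFILE;
}

/*
 * Hypothetical variadic wrapper: read 'mode' with va_arg() only when the
 * flags call for it, then forward it explicitly to the openat syscall.
 * Returns the new file descriptor, or -1 with errno set.
 */
static int open_sketch(const char *path, int flags, ...)
{
	mode_t mode = 0;

	if (needs_mode(flags)) {
		va_list ap;

		va_start(ap, flags);
		mode = va_arg(ap, mode_t); /* assumes mode_t is not narrower than int */
		va_end(ap);
	}
	return (int)syscall(SYS_openat, AT_FDCWD, path, flags, mode);
}
```

| A wrapper that tests only O_CREAT skips the va_arg() step for |
| open_sketch("/tmp", O_TMPFILE | O_RDWR, 0600) and so forwards whatever |
| happens to be in the argument register. |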
| |
| On x86_64, open() happens to work anyway: 'mode' is in RDX when |
| entering open(), and is still in RDX on SYSCALL, which is where the |
| kernel looks for the 3rd argument of a syscall. |
| |
| But openat() is not quite so lucky: 'mode' is in RCX when entering the |
| glibc wrapper for openat(), while the kernel looks for the 4th argument |
| of a syscall in R10. Indeed, the syscall calling convention differs from |
| the regular calling convention in this respect on x86_64. So the kernel |
| sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and |
| fails with EACCES. |
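| |
| One way to check the kernel's behavior independently of the libc |
| wrapper is to issue the syscall directly, so that 'mode' is always |
| passed as the 4th argument. A minimal sketch, assuming Linux and |
| glibc's syscall(); raw_openat_tmpfile() is a hypothetical helper, not |
| part of this patch: |

```c
#define _GNU_SOURCE /* for O_TMPFILE and syscall() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unnamed temporary file in 'dir' through
 * the raw openat syscall. syscall() forwards every argument it is given,
 * so 'mode' reaches the kernel no matter what the openat() wrapper does.
 * Returns the new file descriptor, or -1 with errno set.
 */
static int raw_openat_tmpfile(int dirfd, const char *dir, mode_t mode)
{
	assert(dir != NULL);
	return (int)syscall(SYS_openat, dirfd, dir, O_TMPFILE | O_RDWR, mode);
}
```

| Calling raw_openat_tmpfile(AT_FDCWD, "/tmp", 0) on a kernel without |
| this patch is expected to fail with EACCES, as in the example at the |
| top of this message, while a mode of 0600 is expected to succeed if the |
| filesystem supports O_TMPFILE. |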
| |
| Signed-off-by: Eric Rannaud <e@nanocritical.com> |
| Acked-by: Andy Lutomirski <luto@amacapital.net> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| fs/namei.c | 3 ++- |
| 1 file changed, 2 insertions(+), 1 deletion(-) |
| |
| --- a/fs/namei.c |
| +++ b/fs/namei.c |
| @@ -3154,7 +3154,8 @@ static int do_tmpfile(int dfd, struct fi |
| if (error) |
| goto out2; |
| audit_inode(pathname, nd->path.dentry, 0); |
| - error = may_open(&nd->path, op->acc_mode, op->open_flag); |
| + /* Don't check for other permissions, the inode was just created */ |
| + error = may_open(&nd->path, MAY_OPEN, op->open_flag); |
| if (error) |
| goto out2; |
| file->f_path.mnt = nd->path.mnt; |