| From f66066bc5136f25e36a2daff4896c768f18c211e Mon Sep 17 00:00:00 2001 |
| From: Linus Torvalds <torvalds@linux-foundation.org> |
| Date: Sun, 2 Jul 2023 23:20:17 -0700 |
| Subject: execve: always mark stack as growing down during early stack setup |
| |
| From: Linus Torvalds <torvalds@linux-foundation.org> |
| |
| commit f66066bc5136f25e36a2daff4896c768f18c211e upstream. |
| |
| While our user stacks can grow either down (all common architectures) or |
| up (parisc and the ia64 register stack), the initial stack setup when we |
| copy the argument and environment strings to the new stack at execve() |
| time is always done by extending the stack downwards. |
| |
| But it turns out that in commit 8d7071af8907 ("mm: always expand the |
| stack with the mmap write lock held"), as part of making the stack |
| growing code more robust, 'expand_downwards()' was now made to actually |
| check the vma flags: |
| |
| 	if (!(vma->vm_flags & VM_GROWSDOWN)) |
| 		return -EFAULT; |
| |
| and that meant that this execve-time stack expansion started failing on |
| parisc, because on that architecture, the stack flags do not contain the |
| VM_GROWSDOWN bit. |
| |
| At the same time the new check in expand_downwards() is clearly correct, |
| and simplified the callers, so let's not remove it. |
| |
| The solution is instead to just codify the fact that yes, during |
| execve(), the stack grows down. This not only matches reality, it ends |
| up being particularly simple: we already have special execve-time flags |
| for the stack (VM_STACK_INCOMPLETE_SETUP) and use those flags to avoid |
| page migration during this setup time (see vma_is_temporary_stack() and |
| invalid_migration_vma()). |
| |
| So just add VM_GROWSDOWN to that set of temporary flags, and now our |
| stack flags automatically match reality, and the parisc stack expansion |
| works again. |
| |
| Note that the VM_STACK_INCOMPLETE_SETUP bits will be cleared when the |
| stack is finalized, so we only add the extra VM_GROWSDOWN bit on |
| CONFIG_STACK_GROWSUP architectures (i.e. parisc) rather than adding it in |
| general. |
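
As a rough illustration of the flag logic described above, here is a small
user-space model in C. It is only a sketch, not kernel code: every model_*
function is hypothetical, the 'patched' and 'stack_growsup' parameters merely
stand in for this change and for CONFIG_STACK_GROWSUP, and the bit values are
illustrative (the authoritative definitions are in include/linux/mm.h).

```c
#include <errno.h>

/* Illustrative bit values only; see include/linux/mm.h for the real ones. */
#define VM_GROWSDOWN 0x00000100UL
#define VM_GROWSUP   0x00000200UL
#define VM_SEQ_READ  0x00008000UL
#define VM_RAND_READ 0x00010000UL

/* Hypothetical stand-in for the flag check quoted in the commit message. */
static int model_expand_downwards(unsigned long vm_flags)
{
	if (!(vm_flags & VM_GROWSDOWN))
		return -EFAULT;
	return 0;
}

/*
 * Models VM_STACK_INCOMPLETE_SETUP. With the patch applied, grows-up
 * configurations contribute VM_GROWSDOWN (VM_STACK_EARLY); all other
 * configurations contribute nothing extra.
 */
static unsigned long model_incomplete_setup(int stack_growsup, int patched)
{
	unsigned long early = (stack_growsup && patched) ? VM_GROWSDOWN : 0;

	return VM_RAND_READ | VM_SEQ_READ | early;
}

/* Flags of the temporary stack while execve() copies the strings. */
static unsigned long model_early_stack_flags(int stack_growsup, int patched)
{
	unsigned long vm_stack = stack_growsup ? VM_GROWSUP : VM_GROWSDOWN;

	return vm_stack | model_incomplete_setup(stack_growsup, patched);
}

/* Finalizing the stack clears exactly the incomplete-setup bits. */
static unsigned long model_finalize(unsigned long vm_flags,
				    int stack_growsup, int patched)
{
	return vm_flags & ~model_incomplete_setup(stack_growsup, patched);
}
```

In this model, a grows-up configuration fails the downward expansion check
without the change and passes it with the change, and the finalized flags
keep only VM_GROWSUP. On a grows-down configuration VM_STACK_EARLY is 0, so
finalizing the stack does not strip the VM_GROWSDOWN bit it needs permanently.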
| |
| Link: https://lore.kernel.org/all/612eaa53-6904-6e16-67fc-394f4faa0e16@bell.net/ |
| Link: https://lore.kernel.org/all/5fd98a09-4792-1433-752d-029ae3545168@gmx.de/ |
| Fixes: 8d7071af8907 ("mm: always expand the stack with the mmap write lock held") |
| Reported-by: John David Anglin <dave.anglin@bell.net> |
| Reported-and-tested-by: Helge Deller <deller@gmx.de> |
| Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| include/linux/mm.h | 4 +++- |
| 1 file changed, 3 insertions(+), 1 deletion(-) |
| |
| --- a/include/linux/mm.h |
| +++ b/include/linux/mm.h |
| @@ -377,7 +377,7 @@ extern unsigned int kobjsize(const void |
| #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ |
| |
| /* Bits set in the VMA until the stack is in its final location */ |
| -#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ) |
| +#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY) |
| |
| #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) |
| |
| @@ -399,8 +399,10 @@ extern unsigned int kobjsize(const void |
| |
| #ifdef CONFIG_STACK_GROWSUP |
| #define VM_STACK VM_GROWSUP |
| +#define VM_STACK_EARLY VM_GROWSDOWN |
| #else |
| #define VM_STACK VM_GROWSDOWN |
| +#define VM_STACK_EARLY 0 |
| #endif |
| |
| #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT) |