| From: Aleksa Sarai <cyphar@cyphar.com> |
| Subject: selftests: memfd: error out test process when child test fails |
| Date: Mon, 14 Aug 2023 18:40:57 +1000 |
| |
| Patch series "memfd: cleanups for vm.memfd_noexec", v2. |
| |
| The most critical issue with vm.memfd_noexec=2 (the fact that passing |
| MFD_EXEC would bypass it entirely[1]) has been fixed in Andrew's |
| tree[2], but there are still some outstanding issues that need to be |
| addressed: |
| |
| * vm.memfd_noexec=2 shouldn't reject old-style memfd_create(2) syscalls |
| because it will make it far to difficult to ever migrate. Instead it |
| should imply MFD_EXEC. |
| |
| * The dmesg warnings are pr_warn_once(), which on most systems means |
| that they will be used up by systemd or some other boot process and |
| userspace developers will never see it. |
| |
| - For the !(flags & (MFD_EXEC | MFD_NOEXEC_SEAL)) case, outputting a |
| rate-limited message to the kernel log is necessary to tell |
| userspace that they should add the new flags. |
| |
| Arguably the most ideal way to deal with the spam concern[3,4] |
| while still prompting userspace to switch to the new flags would be |
| to only log the warning once per task or something similar. |
| However, adding something to task_struct for tracking this would be |
| needless bloat for a single pr_warn_ratelimited(). |
| |
| So just switch to pr_info_ratelimited() to avoid spamming the log |
| with something that isn't a real warning. There's lots of |
| info-level stuff in dmesg, it seems really unlikely that this |
| should be an actual problem. Most programs are already switching to |
| the new flags anyway. |
| |
| - For the vm.memfd_noexec=2 case, we need to log a warning for every |
| failure because otherwise userspace will have no idea why their |
| previously working program started returning -EACCES (previously |
| -EINVAL) from memfd_create(2). pr_warn_once() is simply wrong here. |
| |
| * The racheting mechanism for vm.memfd_noexec makes it incredibly |
| unappealing for most users to enable the sysctl because enabling it |
| on &init_pid_ns means you need a system reboot to unset it. Given the |
| actual security threat being protected against, CAP_SYS_ADMIN users |
| being restricted in this way makes little sense. |
| |
| The argument for this ratcheting by the original author was that it |
| allows you to have a hierarchical setting that cannot be unset by |
| child pidnses, but this is not accurate -- changing the parent |
| pidns's vm.memfd_noexec setting to be more restrictive didn't affect |
| children. |
| |
| Instead, switch the vm.memfd_noexec sysctl to be properly |
| hierarchical and allow CAP_SYS_ADMIN users (in the pidns's owning |
| userns) to lower the setting as long as it is not lower than the |
| parent's effective setting. This change also makes it so that |
| changing a parent pidns's vm.memfd_noexec will affect all |
| descendants, providing a properly hierarchical setting. The |
| performance impact of this is incredibly minimal since the maximum |
| depth of pidns is 32 and it is only checked during memfd_create(2) |
| and unshare(CLONE_NEWPID). |
| |
| * The memfd selftests would not exit with a non-zero error code when |
| certain tests that ran in a forked process (specifically the ones |
| related to MFD_EXEC and MFD_NOEXEC_SEAL) failed. |
| |
| [1]: https://lore.kernel.org/all/ZJwcsU0vI-nzgOB_@codewreck.org/ |
| [2]: https://lore.kernel.org/all/20230705063315.3680666-1-jeffxu@google.com/ |
| [3]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/ |
| [4]: https://lore.kernel.org/f185bb42-b29c-977e-312e-3349eea15383@linuxfoundation.org/ |
| |
| |
| This patch (of 5): |
| |
| Before this change, a test runner using this self test would see a return |
| code of 0 when the tests using a child process (namely the MFD_NOEXEC_SEAL |
| and MFD_EXEC tests) failed, masking test failures. |
| |
| Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-0-7ff9e3e10ba6@cyphar.com |
| Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-1-7ff9e3e10ba6@cyphar.com |
| Fixes: 11f75a01448f ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC") |
| Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> |
| Reviewed-by: Jeff Xu <jeffxu@google.com> |
| Cc: "Christian Brauner (Microsoft)" <brauner@kernel.org> |
| Cc: Daniel Verkamp <dverkamp@chromium.org> |
| Cc: Dominique Martinet <asmadeus@codewreck.org> |
| Cc: Kees Cook <keescook@chromium.org> |
| Cc: Shuah Khan <shuah@kernel.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| tools/testing/selftests/memfd/memfd_test.c | 19 ++++++++++++++++++- |
| 1 file changed, 18 insertions(+), 1 deletion(-) |
| |
| --- a/tools/testing/selftests/memfd/memfd_test.c~selftests-memfd-error-out-test-process-when-child-test-fails |
| +++ a/tools/testing/selftests/memfd/memfd_test.c |
| @@ -1207,7 +1207,24 @@ static pid_t spawn_newpid_thread(unsigne |
| |
| static void join_newpid_thread(pid_t pid) |
| { |
| - waitpid(pid, NULL, 0); |
| + int wstatus; |
| + |
| + if (waitpid(pid, &wstatus, 0) < 0) { |
| + printf("newpid thread: waitpid() failed: %m\n"); |
| + abort(); |
| + } |
| + |
| + if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) { |
| + printf("newpid thread: exited with non-zero error code %d\n", |
| + WEXITSTATUS(wstatus)); |
| + abort(); |
| + } |
| + |
| + if (WIFSIGNALED(wstatus)) { |
| + printf("newpid thread: killed by signal %d\n", |
| + WTERMSIG(wstatus)); |
| + abort(); |
| + } |
| } |
| |
| /* |
| _ |