| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-53105: mm: page_alloc: move mlocked flag clearance into free_pages_prepare() |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| mm: page_alloc: move mlocked flag clearance into free_pages_prepare() |
| |
| Syzbot reported a bad page state problem caused by a page that was freed |
| using free_page() while still having the mlocked flag set at the |
| free_pages_prepare() stage: |
| |
| BUG: Bad page state in process syz.5.504 pfn:61f45 |
| page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x61f45 |
| flags: 0xfff00000080204(referenced|workingset|mlocked|node=0|zone=1|lastcpupid=0x7ff) |
| raw: 00fff00000080204 0000000000000000 dead000000000122 0000000000000000 |
| raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 |
| page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set |
| page_owner tracks the page as allocated |
| page last allocated via order 0, migratetype Unmovable, gfp_mask 0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 8443, tgid 8442 (syz.5.504), ts 201884660643, free_ts 201499827394 |
| set_page_owner include/linux/page_owner.h:32 [inline] |
| post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537 |
| prep_new_page mm/page_alloc.c:1545 [inline] |
| get_page_from_freelist+0x303f/0x3190 mm/page_alloc.c:3457 |
| __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4733 |
| alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 |
| kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99 |
| kvm_create_vm virt/kvm/kvm_main.c:1235 [inline] |
| kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5488 [inline] |
| kvm_dev_ioctl+0x12dc/0x2240 virt/kvm/kvm_main.c:5530 |
| __do_compat_sys_ioctl fs/ioctl.c:1007 [inline] |
| __se_compat_sys_ioctl+0x510/0xc90 fs/ioctl.c:950 |
| do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] |
| __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386 |
| do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411 |
| entry_SYSENTER_compat_after_hwframe+0x84/0x8e |
| page last free pid 8399 tgid 8399 stack trace: |
| reset_page_owner include/linux/page_owner.h:25 [inline] |
| free_pages_prepare mm/page_alloc.c:1108 [inline] |
| free_unref_folios+0xf12/0x18d0 mm/page_alloc.c:2686 |
| folios_put_refs+0x76c/0x860 mm/swap.c:1007 |
| free_pages_and_swap_cache+0x5c8/0x690 mm/swap_state.c:335 |
| __tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline] |
| tlb_batch_pages_flush mm/mmu_gather.c:149 [inline] |
| tlb_flush_mmu_free mm/mmu_gather.c:366 [inline] |
| tlb_flush_mmu+0x3a3/0x680 mm/mmu_gather.c:373 |
| tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:465 |
| exit_mmap+0x496/0xc40 mm/mmap.c:1926 |
| __mmput+0x115/0x390 kernel/fork.c:1348 |
| exit_mm+0x220/0x310 kernel/exit.c:571 |
| do_exit+0x9b2/0x28e0 kernel/exit.c:926 |
| do_group_exit+0x207/0x2c0 kernel/exit.c:1088 |
| __do_sys_exit_group kernel/exit.c:1099 [inline] |
| __se_sys_exit_group kernel/exit.c:1097 [inline] |
| __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097 |
| x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232 |
| do_syscall_x64 arch/x86/entry/common.c:52 [inline] |
| do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 |
| entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| Modules linked in: |
| CPU: 0 UID: 0 PID: 8442 Comm: syz.5.504 Not tainted 6.12.0-rc6-syzkaller #0 |
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 |
| Call Trace: |
| <TASK> |
| __dump_stack lib/dump_stack.c:94 [inline] |
| dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 |
| bad_page+0x176/0x1d0 mm/page_alloc.c:501 |
| free_page_is_bad mm/page_alloc.c:918 [inline] |
| free_pages_prepare mm/page_alloc.c:1100 [inline] |
| free_unref_page+0xed0/0xf20 mm/page_alloc.c:2638 |
| kvm_destroy_vm virt/kvm/kvm_main.c:1327 [inline] |
| kvm_put_kvm+0xc75/0x1350 virt/kvm/kvm_main.c:1386 |
| kvm_vcpu_release+0x54/0x60 virt/kvm/kvm_main.c:4143 |
| __fput+0x23f/0x880 fs/file_table.c:431 |
| task_work_run+0x24f/0x310 kernel/task_work.c:239 |
| exit_task_work include/linux/task_work.h:43 [inline] |
| do_exit+0xa2f/0x28e0 kernel/exit.c:939 |
| do_group_exit+0x207/0x2c0 kernel/exit.c:1088 |
| __do_sys_exit_group kernel/exit.c:1099 [inline] |
| __se_sys_exit_group kernel/exit.c:1097 [inline] |
| __ia32_sys_exit_group+0x3f/0x40 kernel/exit.c:1097 |
| ia32_sys_call+0x2624/0x2630 arch/x86/include/generated/asm/syscalls_32.h:253 |
| do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] |
| __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386 |
| do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411 |
| entry_SYSENTER_compat_after_hwframe+0x84/0x8e |
| RIP: 0023:0xf745d579 |
| Code: Unable to access opcode bytes at 0xf745d54f. |
| RSP: 002b:00000000f75afd6c EFLAGS: 00000206 ORIG_RAX: 00000000000000fc |
| RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000000000 |
| RDX: 0000000000000000 RSI: 00000000ffffff9c RDI: 00000000f744cff4 |
| RBP: 00000000f717ae61 R08: 0000000000000000 R09: 0000000000000000 |
| R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 |
| R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 |
| </TASK> |
| |
| The problem was originally introduced by commit b109b87050df ("mm/munlock: |
| replace clear_page_mlock() by final clearance"): it focused on handling |
| pagecache and anonymous memory and wasn't suitable for the lower-level |
| get_page()/free_page() APIs used, for example, by KVM, as with this |
| reproducer. |
| |
| Fix it by moving the mlocked flag clearance down to free_pages_prepare(). |
| |
| The bug itself is fairly old and mostly harmless: aside from generating |
| these warnings, it causes only a small memory leak, because "bad" pages |
| are stopped from being allocated again. |
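
The behaviour described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the kernel patch:
the model_* names, the flag bit positions and the counters are hypothetical
and exist only for this example. It shows a free path that rejects (and so
leaks) a page with the mlocked flag still set, and a variant that clears the
flag at the start of the free path before the check runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this model only; the real kernel derives
 * page flag positions from enum pageflags and the build configuration. */
#define MODEL_PG_REFERENCED (1u << 0)
#define MODEL_PG_MLOCKED    (1u << 1)
#define MODEL_PG_LOCKED     (1u << 2)

/* Flags that must be clear when a page reaches the free path
 * (loosely modelled on PAGE_FLAGS_CHECK_AT_FREE). */
#define MODEL_FLAGS_CHECK_AT_FREE (MODEL_PG_MLOCKED | MODEL_PG_LOCKED)

struct model_page {
    uint32_t flags;
};

struct model_stats {
    unsigned long freed;     /* pages handed back to the allocator */
    unsigned long leaked;    /* "bad" pages withheld from reuse */
    unsigned long munlocked; /* mlocked flags cleared at free time */
};

/* Free path without the fix: a page that still carries the mlocked flag
 * fails the check, is reported as bad and is never reused. */
bool model_free_prepare_old(struct model_page *page, struct model_stats *st)
{
    if (page->flags & MODEL_FLAGS_CHECK_AT_FREE) {
        st->leaked++;
        return false;
    }
    page->flags = 0;
    st->freed++;
    return true;
}

/* Free path with the fix: the mlocked flag is cleared first, so every
 * caller of the low-level free path is covered, then the usual check runs. */
bool model_free_prepare_fixed(struct model_page *page, struct model_stats *st)
{
    if (page->flags & MODEL_PG_MLOCKED) {
        page->flags &= ~MODEL_PG_MLOCKED;
        st->munlocked++;
    }
    return model_free_prepare_old(page, st);
}
```

In this model a page with only the referenced and mlocked bits set is leaked
by model_free_prepare_old() and freed normally by model_free_prepare_fixed().
Other forbidden flags, such as the model's locked bit, are still caught by
the check in both variants.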
| |
| The Linux kernel CVE team has assigned CVE-2024-53105 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.18 with commit b109b87050df5438ee745b2bddfa3587970025bb and fixed in 6.1.120 with commit 2521664c1fc0fcea825ef0b4d8e2dfb622bc0f9a |
| Issue introduced in 5.18 with commit b109b87050df5438ee745b2bddfa3587970025bb and fixed in 6.6.66 with commit 81ad32b87eb91b627a4b0d8760434e5fac4b993a |
| Issue introduced in 5.18 with commit b109b87050df5438ee745b2bddfa3587970025bb and fixed in 6.11.10 with commit 7873d11911cd1d21e25c354eb130d8c3b5cb3ca5 |
| Issue introduced in 5.18 with commit b109b87050df5438ee745b2bddfa3587970025bb and fixed in 6.12 with commit 66edc3a5894c74f8887c8af23b97593a0dd0df4d |
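
The version ranges listed above can be turned into a simple check. The
following is a minimal sketch, assuming upstream stable version numbers; the
function name is hypothetical. It also assumes that release branches between
5.18 and 6.12 with no fix listed above remain affected, and it cannot
account for vendor or distribution kernels that backport the fix under a
different version number.

```c
#include <stdbool.h>

/* Returns true if an upstream kernel major.minor.patch falls inside the
 * affected ranges listed in this announcement. Hypothetical helper. */
bool cve_2024_53105_affected(int major, int minor, int patch)
{
    /* Issue introduced in 5.18: anything older is not affected. */
    if (major < 5 || (major == 5 && minor < 18))
        return false;

    /* Fixed in mainline 6.12: that release and later are not affected. */
    if (major > 6 || (major == 6 && minor >= 12))
        return false;

    /* Stable branches with a listed fix. */
    if (major == 6) {
        switch (minor) {
        case 1:
            return patch < 120; /* fixed in 6.1.120 */
        case 6:
            return patch < 66;  /* fixed in 6.6.66 */
        case 11:
            return patch < 10;  /* fixed in 6.11.10 */
        default:
            break;
        }
    }

    /* Assumption: branches in [5.18, 6.12) without a listed fix
     * are treated as still affected. */
    return true;
}
```

For example, cve_2024_53105_affected(6, 1, 119) reports an affected kernel,
while cve_2024_53105_affected(6, 1, 120) does not.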
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-53105 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| mm/page_alloc.c |
| mm/swap.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/2521664c1fc0fcea825ef0b4d8e2dfb622bc0f9a |
| https://git.kernel.org/stable/c/81ad32b87eb91b627a4b0d8760434e5fac4b993a |
| https://git.kernel.org/stable/c/7873d11911cd1d21e25c354eb130d8c3b5cb3ca5 |
| https://git.kernel.org/stable/c/66edc3a5894c74f8887c8af23b97593a0dd0df4d |