 From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001
 From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 To: <linux-cve-announce@vger.kernel.org>
 Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org>
 Subject: CVE-2022-49760: mm/hugetlb: fix PTE marker handling in hugetlb_change_protection()

 Description
 ===========

 In the Linux kernel, the following vulnerability has been resolved:

 mm/hugetlb: fix PTE marker handling in hugetlb_change_protection()

 Patch series "mm/hugetlb: uffd-wp fixes for hugetlb_change_protection()".

 Playing with virtio-mem and background snapshots (using uffd-wp) on
 hugetlb in QEMU, I managed to trigger a VM_BUG_ON().  Looking into the
 details, hugetlb_change_protection() seems to not handle uffd-wp correctly
 in all cases.

 Patch #1 fixes my test case.  I don't have reproducers for patch #2, as it
 requires running into migration entries.

 I did not yet check in detail if !hugetlb code requires similar care.


 This patch (of 2):

 There are two problematic cases when stumbling over a PTE marker in
 hugetlb_change_protection():

 (1) We protect a uffd-wp PTE marker a second time using uffd-wp: we will
     end up in the "!huge_pte_none(pte)" case and mess up the PTE marker.

 (2) We unprotect a uffd-wp PTE marker: we will similarly end up in the
     "!huge_pte_none(pte)" case even though we cleared the PTE, because
     the "pte" variable is stale. We'll mess up the PTE marker.

 For example, if we later stumble over such a "wrongly modified" PTE marker,
 we'll treat it like a present PTE that maps some garbage page.
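
 The two cases above can be illustrated with a small user-space model. This
 is only a sketch, not the kernel code: the enum values, the request type and
 change_one_buggy() are names invented here, and "ENTRY_CORRUPT" merely stands
 in for "a marker that was rewritten as if it were a present PTE". The point
 is that the marker branch has no early exit, and that the later check looks
 at a snapshot taken before the slot was cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page-table entry states; these names are invented for this sketch. */
enum entry_kind {
    ENTRY_NONE,     /* empty slot */
    ENTRY_MARKER,   /* uffd-wp PTE marker (non-present) */
    ENTRY_PRESENT,  /* maps a real page */
    ENTRY_CORRUPT   /* marker that was rewritten as if it were present */
};

enum request { REQ_PROTECT, REQ_UNPROTECT };

/* Models the pre-fix control flow: the marker branch has no early exit. */
static enum entry_kind change_one_buggy(enum entry_kind *slot, enum request req)
{
    enum entry_kind snapshot;

    assert(slot != NULL);
    snapshot = *slot; /* plays the role of the "pte" variable */

    if (snapshot == ENTRY_MARKER) {
        if (req == REQ_UNPROTECT)
            *slot = ENTRY_NONE; /* slot cleared, but snapshot is now stale */
        /* no "continue" here, so we fall through to the check below */
    }

    if (snapshot != ENTRY_NONE) {
        /* Handles the snapshot as a present entry; a marker gets mangled. */
        *slot = (snapshot == ENTRY_PRESENT) ? ENTRY_PRESENT : ENTRY_CORRUPT;
    } else if (req == REQ_PROTECT) {
        *slot = ENTRY_MARKER; /* none -> marker */
    }
    return *slot;
}
```

 In this model, both protecting a marker again (case 1) and unprotecting it
 (case 2) leave the slot in the corrupt state, while an empty or present slot
 is handled as intended.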

 This can, for example, be triggered by mapping a memfd backed by huge
 pages, registering uffd-wp, uffd-wp'ing an unmapped page and (a)
 uffd-wp'ing it a second time; or (b) uffd-unprotecting it; or (c)
 unregistering uffd-wp. Then, if we trigger fallocate(FALLOC_FL_PUNCH_HOLE)
 on that file range, we will run into a VM_BUG_ON:

 [  195.039560] page:00000000ba1f2987 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x0
 [  195.039565] flags: 0x7ffffc0001000(reserved|node=0|zone=0|lastcpupid=0x1fffff)
 [  195.039568] raw: 0007ffffc0001000 ffffe742c0000008 ffffe742c0000008 0000000000000000
 [  195.039569] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
 [  195.039569] page dumped because: VM_BUG_ON_PAGE(compound && !PageHead(page))
 [  195.039573] ------------[ cut here ]------------
 [  195.039574] kernel BUG at mm/rmap.c:1346!
 [  195.039579] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 [  195.039581] CPU: 7 PID: 4777 Comm: qemu-system-x86 Not tainted 6.0.12-200.fc36.x86_64 #1
 [  195.039583] Hardware name: LENOVO 20WNS1F81N/20WNS1F81N, BIOS N35ET50W (1.50 ) 09/15/2022
 [  195.039584] RIP: 0010:page_remove_rmap+0x45b/0x550
 [  195.039588] Code: [...]
 [  195.039589] RSP: 0018:ffffbc03c3633ba8 EFLAGS: 00010292
 [  195.039591] RAX: 0000000000000040 RBX: ffffe742c0000000 RCX: 0000000000000000
 [  195.039592] RDX: 0000000000000002 RSI: ffffffff8e7aac1a RDI: 00000000ffffffff
 [  195.039592] RBP: 0000000000000001 R08: 0000000000000000 R09: ffffbc03c3633a08
 [  195.039593] R10: 0000000000000003 R11: ffffffff8f146328 R12: ffff9b04c42754b0
 [  195.039594] R13: ffffffff8fcc6328 R14: ffffbc03c3633c80 R15: ffff9b0484ab9100
 [  195.039595] FS:  00007fc7aaf68640(0000) GS:ffff9b0bbf7c0000(0000) knlGS:0000000000000000
 [  195.039596] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  195.039597] CR2: 000055d402c49110 CR3: 0000000159392003 CR4: 0000000000772ee0
 [  195.039598] PKRU: 55555554
 [  195.039599] Call Trace:
 [  195.039600]  <TASK>
 [  195.039602]  __unmap_hugepage_range+0x33b/0x7d0
 [  195.039605]  unmap_hugepage_range+0x55/0x70
 [  195.039608]  hugetlb_vmdelete_list+0x77/0xa0
 [  195.039611]  hugetlbfs_fallocate+0x410/0x550
 [  195.039612]  ? _raw_spin_unlock_irqrestore+0x23/0x40
 [  195.039616]  vfs_fallocate+0x12e/0x360
 [  195.039618]  __x64_sys_fallocate+0x40/0x70
 [  195.039620]  do_syscall_64+0x58/0x80
 [  195.039623]  ? syscall_exit_to_user_mode+0x17/0x40
 [  195.039624]  ? do_syscall_64+0x67/0x80
 [  195.039626]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [  195.039628] RIP: 0033:0x7fc7b590651f
 [  195.039653] Code: [...]
 [  195.039654] RSP: 002b:00007fc7aaf66e70 EFLAGS: 00000293 ORIG_RAX: 000000000000011d
 [  195.039655] RAX: ffffffffffffffda RBX: 0000558ef4b7f370 RCX: 00007fc7b590651f
 [  195.039656] RDX: 0000000018000000 RSI: 0000000000000003 RDI: 000000000000000c
 [  195.039657] RBP: 0000000008000000 R08: 0000000000000000 R09: 0000000000000073
 [  195.039658] R10: 0000000008000000 R11: 0000000000000293 R12: 0000000018000000
 [  195.039658] R13: 00007fb8bbe00000 R14: 000000000000000c R15: 0000000000001000
 [  195.039661]  </TASK>

 Fix it by not going into the "!huge_pte_none(pte)" case if we stumble over
 an exclusive marker.  spin_unlock() + continue would get the job done.

 However, instead, make it clearer that there are no fall-through
 statements: we process each case (hwpoison, migration, marker, !none,
 none) and then unlock the page table to continue with the next PTE.  Let's
 avoid "continue" statements and use a single spin_unlock() at the end.
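
 The restructuring described above can be sketched as a user-space model:
 one if/else-if chain that handles exactly one case per entry, followed by a
 single unlock. This is an illustration under assumptions, not the actual
 patch: the entry kinds, struct table_lock and change_range_fixed() are
 hypothetical names, and the hwpoison and migration branches are reduced to
 comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy entry states, one per case named in the text (invented names). */
enum entry_kind {
    ENTRY_NONE,
    ENTRY_HWPOISON,
    ENTRY_MIGRATION,
    ENTRY_MARKER,
    ENTRY_PRESENT
};

/* Stand-in for the page table lock; counts unlocks for inspection. */
struct table_lock {
    int held;
    int unlock_calls;
};

static void model_lock(struct table_lock *l)   { l->held = 1; }
static void model_unlock(struct table_lock *l) { l->held = 0; l->unlock_calls++; }

/* Returns how many present entries were modified. */
static long change_range_fixed(enum entry_kind *entries, size_t n,
                               bool protect, bool unprotect,
                               struct table_lock *lock)
{
    long pages = 0;

    assert(entries != NULL && lock != NULL);
    for (size_t i = 0; i < n; i++) {
        enum entry_kind e;

        model_lock(lock);
        e = entries[i];
        if (e == ENTRY_HWPOISON) {
            /* nothing to do for poisoned entries */
        } else if (e == ENTRY_MIGRATION) {
            /* the real code may adjust the migration entry; omitted here */
        } else if (e == ENTRY_MARKER) {
            if (unprotect)
                entries[i] = ENTRY_NONE; /* marker -> none */
            /* protecting again leaves the marker untouched */
        } else if (e != ENTRY_NONE) {
            pages++; /* present entry: protection bits would change here */
        } else if (protect) {
            entries[i] = ENTRY_MARKER; /* none -> marker */
        }
        model_unlock(lock); /* the only unlock, reached by every case */
    }
    return pages;
}
```

 Because every branch ends at the same unlock, a marker can no longer reach
 the branch meant for present entries, and the lock balance is easy to check
 by inspection.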

 The Linux kernel CVE team has assigned CVE-2022-49760 to this issue.


 Affected and fixed versions
 ===========================

 	Issue introduced in 5.19 with commit 60dfaad65aa97fb6755b9798a6b3c9e79bcd5930 and fixed in 6.1.8 with commit 6062c992e912df1eedad52cf64efb3d48e8d35c5
 	Issue introduced in 5.19 with commit 60dfaad65aa97fb6755b9798a6b3c9e79bcd5930 and fixed in 6.2 with commit 0e678153f5be7e6c8d28835f5a678618da4b7a9c
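
 As a rough aid for reading the two lines above, the following sketch checks
 an upstream-style release string against them. It assumes plain upstream
 numbering only: vendor kernels may carry the fix under an older version
 number, so a "true" result is a hint to investigate, not proof. The function
 name release_is_affected() is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true if the release falls in the range listed above:
 * introduced in 5.19, fixed in 6.1.8 (stable) and 6.2 (mainline).
 * Unparseable strings yield false, meaning "no claim made".
 */
static bool release_is_affected(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    assert(release != NULL);
    /* Accepts "6.2" as well as "6.0.12-200.fc36.x86_64". */
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return false;

    if (major < 5 || (major == 5 && minor < 19))
        return false;           /* predates the introducing commit */
    if (major == 5)
        return true;            /* 5.19.y: no fixed release listed */
    if (major == 6 && minor == 0)
        return true;            /* 6.0.y: no fixed release listed */
    if (major == 6 && minor == 1)
        return patch < 8;       /* fixed in 6.1.8 */
    return false;               /* 6.2 and later contain the fix */
}
```

 For instance, the 6.0.12 kernel shown in the trace above falls inside the
 affected range, whereas 6.1.8 does not.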

 Please see https://www.kernel.org for a full list of currently supported
 kernel versions by the kernel community.

 Unaffected versions might change over time as fixes are backported to
 older supported kernel versions.  The official CVE entry at
 	https://cve.org/CVERecord/?id=CVE-2022-49760
 will be updated if fixes are backported; please check that for the most
 up to date information about this issue.


 Affected files
 ==============

 The file(s) affected by this issue are:
 	mm/hugetlb.c


 Mitigation
 ==========

 The Linux kernel CVE team recommends that you update to the latest
 stable kernel version for this, and many other bugfixes.  Individual
 changes are never tested alone, but rather are part of a larger kernel
 release.  Cherry-picking individual commits is not recommended or
 supported by the Linux kernel community at all.  If, however, updating to
 the latest release is impossible, the individual changes to resolve this
 issue can be found at these commits:
 	https://git.kernel.org/stable/c/6062c992e912df1eedad52cf64efb3d48e8d35c5
 	https://git.kernel.org/stable/c/0e678153f5be7e6c8d28835f5a678618da4b7a9c