
 From bippy-7c5fe7eed585 Mon Sep 17 00:00:00 2001
 From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 To: <linux-cve-announce@vger.kernel.org>
 Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org>
 Subject: CVE-2025-21932: mm: abort vma_modify() on merge out of memory failure

 Description
 ===========

 In the Linux kernel, the following vulnerability has been resolved:

 mm: abort vma_modify() on merge out of memory failure

 The remainder of vma_modify() relies upon the vmg state remaining pristine
 after a merge attempt.

 Usually this is the case. However, in the one edge case where a merge
 attempt fails not because the specified range is unmergeable, but because
 an out of memory error arises when attempting to commit the merge, this
 assumption becomes untrue.

 This results in vmg->start, end being modified, and thus the subsequent
 attempts to split the VMA will be done with invalid start/end values.

 Thankfully, it is likely practically impossible for us to hit this in
 reality, as it would require a maple tree node pre-allocation failure that
 would likely never happen due to it being 'too small to fail', i.e.  the
 kernel would simply keep retrying reclaim until it succeeded.

 However, this scenario remains theoretically possible, and what we are
 doing here is wrong so we must correct it.

 The safest option is, when this scenario occurs, to simply give up the
 operation.  If we cannot allocate memory to merge, then we cannot allocate
 memory to split either (perhaps more so!).

 Any scenario where this would be happening would be under very extreme
 (likely fatal) memory pressure, so it's best we give up early.

 So there is no doubt it is appropriate to simply bail out in this
 scenario.

 However, in general we must if at all possible never assume VMG state is
 stable after a merge attempt, since merge operations update VMG fields.
 As a result, also make this clear by storing start, end in
 local variables.
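
The pattern described above (bail out when the merge attempt runs out of
memory, and keep the requested range in local variables) can be sketched in
a small userspace model. This is only an illustration and not the kernel
code: struct toy_vma, struct toy_vmg, toy_merge(), toy_modify() and the
simulate_oom knob are all hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_vma {
	unsigned long vm_start, vm_end;
};

struct toy_vmg {
	struct toy_vma *vma;
	unsigned long start, end;
};

enum toy_merge_result { TOY_MERGED, TOY_UNMERGEABLE, TOY_NOMEM };

/*
 * Model of a merge attempt. When simulate_oom is set it mimics a merge
 * that rewrites the range to [vma->vm_start, start) and then fails to
 * commit. Otherwise the range is reported unmergeable and left untouched.
 */
enum toy_merge_result toy_merge(struct toy_vmg *vmg, bool simulate_oom)
{
	if (!simulate_oom)
		return TOY_UNMERGEABLE;

	vmg->end = vmg->start;
	vmg->start = vmg->vma->vm_start;
	return TOY_NOMEM;
}

/*
 * Returns 0 and reports the range the split step would use, or -ENOMEM
 * if the merge attempt ran out of memory.
 */
int toy_modify(struct toy_vmg *vmg, bool simulate_oom,
	       unsigned long *split_start, unsigned long *split_end)
{
	/* Snapshot the request before the merge attempt can alter vmg. */
	const unsigned long start = vmg->start;
	const unsigned long end = vmg->end;

	assert(start < end);

	switch (toy_merge(vmg, simulate_oom)) {
	case TOY_MERGED:
		return 0;
	case TOY_NOMEM:
		return -ENOMEM;	/* give up early, do not attempt to split */
	case TOY_UNMERGEABLE:
		break;
	}

	/* Only the snapshotted values are used from here on. */
	*split_start = start;
	*split_end = end;
	return 0;
}
```

In this model, calling toy_modify() with simulate_oom set returns -ENOMEM
and never reaches the split step, while the unmergeable case proceeds with
the original start, end regardless of what toy_merge() did to the vmg.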

 The issue was reported originally by syzkaller, and by Brad Spengler (via
 an off-list discussion), and in both instances it manifested as a
 triggering of the assert:

 	VM_WARN_ON_VMG(start >= end, vmg);

 In vma_merge_existing_range().

 It seems at least one scenario in which this is occurring is one in which
 the merge being attempted is due to an madvise() across multiple VMAs
 which looks like this:

         start     end
           |<------>|
      |----------|------|
      |   vma    | next |
      |----------|------|

 When madvise_walk_vmas() is invoked, we first find vma in the above
 (determining prev to be equal to vma as we are offset into vma), and then
 enter the loop.

 We determine the end of vma that forms part of the range we are
 madvise()'ing by setting 'tmp' to this value:

 		/* Here vma->vm_start <= start < (end|vma->vm_end) */
 		tmp = vma->vm_end;

 We then invoke the madvise() operation via visit(), letting prev get
 updated to point to vma as part of the operation:

 		/* Here vma->vm_start <= start < tmp <= (end|vma->vm_end). */
 		error = visit(vma, &prev, start, tmp, arg);

 Where the visit() function pointer in this instance is
 madvise_vma_behavior().

 As observed in syzkaller reports, it is ultimately madvise_update_vma()
 that is invoked, calling vma_modify_flags_name() and vma_modify() in turn.

 Then, in vma_modify(), we attempt the merge:

 	merged = vma_merge_existing_range(vmg);
 	if (merged)
 		return merged;

 We invoke this with vmg->start, end set to start, tmp as such:

         start  tmp
           |<--->|
      |----------|------|
      |   vma    | next |
      |----------|------|

 We find ourselves in the merge right scenario, but the one in which we
 cannot remove the middle (we are offset into vma).

 Here we have a special case where vmg->start, end get set to perhaps
 unintuitive values - we intended to shrink the middle VMA and expand the
 next.

 This means vmg->start, end are set to...  vma->vm_start, start.

 Now the commit_merge() fails, and vmg->start, end are left like this.
 This means we return to the rest of vma_modify() with vmg->start, end
 (here denoted as start', end') set as:

   start' end'
      |<-->|
      |----------|------|
      |   vma    | next |
      |----------|------|

 So we now erroneously try to split accordingly.  This is where the
 unfortunate stuff begins.

 We start with:

 	/* Split any preceding portion of the VMA. */
 	if (vma->vm_start < vmg->start) {
 		...
 	}

 This doesn't trigger as we are no longer offset into vma at the start.

 But then we invoke:

 	/* Split any trailing portion of the VMA. */
 	if (vma->vm_end > vmg->end) {
 		...
 	}

 Which does get invoked. This leaves us with:

   start' end'
      |<-->|
      |----|-----|------|
      | vma| new | next |
      |----|-----|------|
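
To make the effect of the two split checks concrete, here is a minimal
userspace sketch of them operating on plain ranges. It is an assumption-laden
model rather than the kernel implementation: struct toy_vma,
struct toy_split_result and toy_split_for_range() are hypothetical names,
and the addresses in the usage note are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a VMA: the range [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start, vm_end;
};

struct toy_split_result {
	bool split_front;	/* the "preceding portion" check fired */
	bool split_tail;	/* the "trailing portion" check fired */
	struct toy_vma vma;	/* what remains of the original VMA */
	struct toy_vma new_vma;	/* piece split off at the tail, if any */
};

/* Apply the two split checks for the range [start, end) to a copy of vma. */
struct toy_split_result toy_split_for_range(struct toy_vma vma,
					    unsigned long start,
					    unsigned long end)
{
	struct toy_split_result r = { false, false, vma, { 0, 0 } };

	assert(start < end);

	/* Split any preceding portion of the VMA. */
	if (r.vma.vm_start < start) {
		r.split_front = true;
		r.vma.vm_start = start;	/* vma keeps [start, vm_end) */
	}

	/* Split any trailing portion of the VMA. */
	if (r.vma.vm_end > end) {
		r.split_tail = true;
		r.new_vma.vm_start = end;
		r.new_vma.vm_end = r.vma.vm_end;
		r.vma.vm_end = end;	/* vma keeps [vm_start, end) */
	}

	return r;
}
```

With a VMA spanning [0x1000, 0x5000) and the stale range [0x1000, 0x3000),
only the trailing check fires, leaving vma as [0x1000, 0x3000) and a new
piece [0x3000, 0x5000). With the intended range [0x3000, 0x5000), only the
preceding check fires instead.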

 We then return ultimately to madvise_walk_vmas().  Here 'new' is unknown,
 and putting back the values known in this function we are faced with:

         start tmp end
           |     |  |
      |----|-----|------|
      | vma| new | next |
      |----|-----|------|
       prev

 Then:

 		start = tmp;

 So:

              start end
                 |  |
      |----|-----|------|
      | vma| new | next |
      |----|-----|------|
       prev

 The following code does not cause anything to happen:

 		if (prev && start < prev->vm_end)
 			start = prev->vm_end;
 		if (start >= end)
 			break;

 And then we invoke:

 		if (prev)
 			vma = find_vma(mm, prev->vm_end);

 Which is where a problem occurs - we don't know about 'new' so we
 essentially look for the vma after prev, which is new, whereas we actually
 intended to discover next!

 So we end up with:

              start end
                 |  |
      |----|-----|------|
      |prev| vma | next |
      |----|-----|------|

 And we have successfully bypassed all of the checks madvise_walk_vmas()
 has to ensure early exit should we end up moving out of range.

 We loop around, and hit:

 		/* Here vma->vm_start <= start < (end|vma->vm_end) */
 		tmp = vma->vm_end;

 Oh dear. Now we have:

               tmp
              start end
                 |  |
      |----|-----|------|
      |prev| vma | next |
      |----|-----|------|

 We then invoke:

 		/* Here vma->vm_start <= start < tmp <= (end|vma->vm_end). */
 		error = visit(vma, &prev, start, tmp, arg);

 Where start == tmp. That is, a zero range. This is not good.
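
The walk described above can be replayed in a small userspace model to see
how the zero range arises. This is a sketch under stated assumptions, not
the kernel loop: toy_find_vma() and toy_next_visit() are hypothetical
helpers, the clamp of tmp to end is inferred from the quoted comment, and
handling of unmapped holes is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a VMA: the range [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start, vm_end;
};

/* Return the first VMA whose end lies above addr, or NULL if none. */
const struct toy_vma *toy_find_vma(const struct toy_vma *vmas, size_t n,
				   unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/*
 * Replay the tail of one loop iteration and the head of the next, after
 * a visit of [start, tmp) has returned with prev set. Returns false if
 * the loop would exit, otherwise stores the next range passed to visit().
 */
bool toy_next_visit(const struct toy_vma *vmas, size_t n,
		    const struct toy_vma *prev,
		    unsigned long tmp, unsigned long end,
		    unsigned long *visit_start, unsigned long *visit_end)
{
	const struct toy_vma *vma;
	unsigned long start = tmp;

	assert(tmp <= end);

	if (prev && start < prev->vm_end)
		start = prev->vm_end;
	if (start >= end)
		return false;

	vma = toy_find_vma(vmas, n, prev ? prev->vm_end : start);
	if (!vma)
		return false;

	/* Next iteration: tmp is the end of vma, clamped to end (assumed). */
	tmp = vma->vm_end;
	if (end < tmp)
		tmp = end;

	*visit_start = start;
	*visit_end = tmp;
	return true;
}
```

Take the layout [0x1000, 0x3000), [0x3000, 0x5000), [0x5000, 0x7000) with
tmp = 0x5000 and end = 0x6000. If prev points at the first VMA, as in the
erroneous split, the next visit is the zero range [0x5000, 0x5000). If prev
points at the second VMA, the next visit is [0x5000, 0x6000) as intended.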

 We invoke visit() which is madvise_vma_behavior() which does not check the
 range (for good reason, it assumes all checks have been done before it was
 called), which in turn finally calls madvise_update_vma().

 The madvise_update_vma() function calls vma_modify_flags_name() in turn,
 which ultimately invokes vma_modify() with...  start == end.

 vma_modify() calls vma_merge_existing_range() and finally we hit:

 	VM_WARN_ON_VMG(start >= end, vmg);

 Which triggers, as start == end.

 While it might be useful to add some CONFIG_DEBUG_VM asserts in these
 instances to catch this kind of error, since we have just eliminated any
 possibility of that happening, we will add such asserts separately so as
 to reduce churn and aid backporting.

 The Linux kernel CVE team has assigned CVE-2025-21932 to this issue.


 Affected and fixed versions
 ===========================

 	Issue introduced in 6.12 with commit 2f1c6611b0a89afcb8641471af5f223c9caa01e0 and fixed in 6.12.19 with commit 79636d2981b066acd945117387a9533f56411f6f
 	Issue introduced in 6.12 with commit 2f1c6611b0a89afcb8641471af5f223c9caa01e0 and fixed in 6.13.7 with commit 53fd215f7886a1e8dea5a9ca1391dbb697fff601
 	Issue introduced in 6.12 with commit 2f1c6611b0a89afcb8641471af5f223c9caa01e0 and fixed in 6.14 with commit 47b16d0462a460000b8f05dfb1292377ac48f3ca

 Please see https://www.kernel.org for a full list of currently supported
 kernel versions by the kernel community.

 Unaffected versions might change over time as fixes are backported to
 older supported kernel versions.  The official CVE entry at
 	https://cve.org/CVERecord/?id=CVE-2025-21932
 will be updated if fixes are backported; please check that for the most
 up to date information about this issue.
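
As a rough aid for reading the version list above, the ranges can be
expressed as a small check. This is a hypothetical helper (the function
name is invented here) and it only reasons about upstream x.y.z version
numbers as listed in this announcement. Vendor or distribution kernels may
carry the fix under a different version, so the official CVE entry remains
the reference.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if an upstream kernel version falls
 * in a range this announcement lists as affected, i.e. introduced in
 * 6.12 and fixed in 6.12.19, 6.13.7 and 6.14.
 */
bool toy_cve_2025_21932_affected(unsigned int major, unsigned int minor,
				 unsigned int patch)
{
	if (major != 6)
		return false;
	if (minor == 12)
		return patch < 19;	/* 6.12 up to 6.12.18 */
	if (minor == 13)
		return patch < 7;	/* 6.13 up to 6.13.6 */
	return false;			/* before 6.12, or 6.14 and later */
}
```

For example, the helper reports 6.12.18 and 6.13.6 as affected, and
6.12.19, 6.13.7, 6.14.0 and 6.11.5 as not affected.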


 Affected files
 ==============

 The file(s) affected by this issue are:
 	mm/vma.c


 Mitigation
 ==========

 The Linux kernel CVE team recommends that you update to the latest
 stable kernel version for this, and many other bugfixes.  Individual
 changes are never tested alone, but rather are part of a larger kernel
 release.  Cherry-picking individual commits is not recommended or
 supported by the Linux kernel community at all.  If, however, updating to
 the latest release is impossible, the individual changes to resolve this
 issue can be found at these commits:
 	https://git.kernel.org/stable/c/79636d2981b066acd945117387a9533f56411f6f
 	https://git.kernel.org/stable/c/53fd215f7886a1e8dea5a9ca1391dbb697fff601
 	https://git.kernel.org/stable/c/47b16d0462a460000b8f05dfb1292377ac48f3ca