| From: Yang Shi <yang@os.amperecomputing.com> |
| Subject: mm: hugetlb: avoid soft lockup when mprotect to large memory area |
| Date: Mon, 29 Sep 2025 13:24:02 -0700 |
| |
| When calling mprotect() on a large hugetlb memory area in our customer's |
| workload (~300GB hugetlb memory), a soft lockup was observed: |
| |
| watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [t2_new_sysv:126916] |
| |
| CPU: 98 PID: 126916 Comm: t2_new_sysv Kdump: loaded Not tainted 6.17-rc7 |
| Hardware name: GIGACOMPUTING R2A3-T40-AAV1/Jefferson CIO, BIOS 5.4.4.1 07/15/2025 |
| pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) |
| pc : mte_clear_page_tags+0x14/0x24 |
| lr : mte_sync_tags+0x1c0/0x240 |
| sp : ffff80003150bb80 |
| x29: ffff80003150bb80 x28: ffff00739e9705a8 x27: 0000ffd2d6a00000 |
| x26: 0000ff8e4bc00000 x25: 00e80046cde00f45 x24: 0000000000022458 |
| x23: 0000000000000000 x22: 0000000000000004 x21: 000000011b380000 |
| x20: ffff000000000000 x19: 000000011b379f40 x18: 0000000000000000 |
| x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 |
| x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 |
| x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc875e0aa5e2c |
| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 |
| x5 : fffffc01ce7a5c00 x4 : 00000000046cde00 x3 : fffffc0000000000 |
| x2 : 0000000000000004 x1 : 0000000000000040 x0 : ffff0046cde7c000 |
| |
| Call trace: |
| mte_clear_page_tags+0x14/0x24 |
| set_huge_pte_at+0x25c/0x280 |
| hugetlb_change_protection+0x220/0x430 |
| change_protection+0x5c/0x8c |
| mprotect_fixup+0x10c/0x294 |
| do_mprotect_pkey.constprop.0+0x2e0/0x3d4 |
| __arm64_sys_mprotect+0x24/0x44 |
| invoke_syscall+0x50/0x160 |
| el0_svc_common+0x48/0x144 |
| do_el0_svc+0x30/0xe0 |
| el0_svc+0x30/0xf0 |
| el0t_64_sync_handler+0xc4/0x148 |
| el0t_64_sync+0x1a4/0x1a8 |
| |
| A soft lockup is not triggered with THP or base pages because |
| cond_resched() is called for each PMD-sized range. |
| |
| Although the soft lockup was triggered by MTE, it should not be MTE |
| specific. Any other processing that takes a long time in the loop may |
| trigger a soft lockup too. |
| |
| So add cond_resched() to the hugetlb_change_protection() loop, after the |
| page table lock is dropped, to avoid the soft lockup. |
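
The pattern can be illustrated outside the kernel with a minimal user-space sketch. It is not the kernel code: walk_range(), cond_resched_stub() and resched_points are hypothetical names, and sched_yield() merely stands in for cond_resched(). The sketch only shows the shape of the loop, one rescheduling point per huge-page-sized step, taken after the per-page work is done.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical counter: how many rescheduling points were offered. */
unsigned long resched_points;

/* Hypothetical user-space stand-in for the kernel's cond_resched(). */
void cond_resched_stub(void)
{
	resched_points++;
	sched_yield();
}

/*
 * Hypothetical analogue of the per-huge-page loop: visit [start, end)
 * in steps of psize and offer to reschedule after every step.
 * Returns the number of steps taken.
 */
unsigned long walk_range(unsigned long start, unsigned long end,
			 unsigned long psize)
{
	unsigned long addr, pages = 0;

	assert(psize != 0);	/* a zero step would never terminate */

	for (addr = start; addr < end; addr += psize) {
		/* Per-page work (PTE update, tag clearing, ...) goes here. */
		pages++;

		/* No lock is held at this point, so yielding is safe. */
		cond_resched_stub();
	}
	return pages;
}
```

Calling walk_range(0, 64 * psize, psize) with an assumed psize of 2 MiB takes 64 steps and offers 64 rescheduling points, so the time spent between two points is bounded by the work done for a single page rather than for the whole range.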
| |
| Link: https://lkml.kernel.org/r/20250929202402.1663290-1-yang@os.amperecomputing.com |
| Fixes: 8f860591ffb2 ("[PATCH] Enable mprotect on huge pages") |
| Signed-off-by: Yang Shi <yang@os.amperecomputing.com> |
| Tested-by: Carl Worth <carl@os.amperecomputing.com> |
| Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org> |
| Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> |
| Acked-by: David Hildenbrand <david@redhat.com> |
| Acked-by: Oscar Salvador <osalvador@suse.de> |
| Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> |
| Reviewed-by: Dev Jain <dev.jain@arm.com> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Cc: Will Deacon <will@kernel.org> |
| Cc: <stable@vger.kernel.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| mm/hugetlb.c | 2 ++ |
| 1 file changed, 2 insertions(+) |
| |
| --- a/mm/hugetlb.c~mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area |
| +++ a/mm/hugetlb.c |
| @@ -7222,6 +7222,8 @@ long hugetlb_change_protection(struct vm |
| psize); |
| } |
| spin_unlock(ptl); |
| + |
| + cond_resched(); |
| } |
| /* |
| * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare |
| _ |