From 66ed013764d8f9577b2958033cb4ac827adf04d8 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Wed, 13 Nov 2019 11:30:32 -0800
Subject: [PATCH] KVM: x86/mmu: Take slots_lock when using
 kvm_mmu_zap_all_fast()

commit ed69a6cb700880d052a0d085ff2e5bfc108ce238 upstream.

Acquire the per-VM slots_lock when zapping all shadow pages as part of
toggling nx_huge_pages. The fast zap algorithm relies on exclusivity
(via slots_lock) to identify obsolete vs. valid shadow pages, because it
uses a single bit for its generation number. Holding slots_lock also
obviates the need to acquire a read lock on the VM's srcu.
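
To make the single-bit constraint concrete, here is a toy userspace
model (names are illustrative only; the real state is
kvm->arch.mmu_valid_gen and sp->mmu_valid_gen, and zapping is done
while walking the page lists, not shown here):

  #include <stdio.h>

  /* Toy model of the single-bit generation scheme; an illustration,
   * not the kernel code.
   */
  struct vm_model   { unsigned int valid_gen : 1; };
  struct page_model { unsigned int valid_gen : 1; };

  static int is_obsolete(struct vm_model *vm, struct page_model *sp)
  {
          return sp->valid_gen != vm->valid_gen;
  }

  static void fast_zap_begin(struct vm_model *vm)
  {
          vm->valid_gen ^= 1;  /* every existing page becomes obsolete */
          /* ... then walk and zap, possibly dropping mmu_lock ... */
  }

  int main(void)
  {
          struct vm_model vm = { 0 };
          struct page_model sp = { 0 };  /* created in generation 0 */

          fast_zap_begin(&vm);
          printf("after one zap:  obsolete=%d\n", is_obsolete(&vm, &sp));
          fast_zap_begin(&vm);  /* a racing second zap wraps the bit */
          printf("after two zaps: obsolete=%d\n", is_obsolete(&vm, &sp));
          return 0;
  }

The second flip wraps the bit back to the page's generation, so a page
the first zap has not yet reached reads as valid again; slots_lock
exists precisely to serialize the flips.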

Failing to take slots_lock when toggling nx_huge_pages allows multiple
instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
(kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be
temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough
to enforce exclusivity).

Concurrent fast zap instances cause obsolete shadow pages to be
incorrectly identified as valid due to the single bit generation number
wrapping, which results in stale shadow pages being left in KVM's MMU
and leads to all sorts of undesirable behavior.

The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
toggling nx_huge_pages via its module param.
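
For reference, a minimal userspace sketch of that toggle (a best-effort
reproducer, assuming writes to the standard module param path reach
set_nx_huge_pages(); a shell loop with echo works just as well):

  #include <fcntl.h>
  #include <unistd.h>

  /* Best-effort toggle; reopening each time avoids sysfs offset
   * quirks.  Needs root, with KVM loaded and VMs running.
   */
  static void set_nx(const char *val)
  {
          int fd = open("/sys/module/kvm/parameters/nx_huge_pages",
                        O_WRONLY);

          if (fd >= 0) {
                  write(fd, val, 1);  /* invokes set_nx_huge_pages() */
                  close(fd);
          }
  }

  int main(void)
  {
          for (int i = 0; i < 100; i++) {
                  set_nx("0");
                  set_nx("1");
          }
          return 0;
  }

Each successful write funnels into the set_nx_huge_pages() path patched
below.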

Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when
using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used
a ulong-sized generation instead of relying on exclusivity for
correctness, but all callers except the recently added set_nx_huge_pages()
needed to hold slots_lock anyway. Therefore, this patch does not have
to be backported to stable kernels.

Given that toggling nx_huge_pages is by no means a fast path, force it
to conform to the current approach instead of reintroducing the previous
generation count.

Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE)
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 35348302a6e0..767248c1fb97 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -6189,14 +6189,13 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 
 	if (new_val != old_val) {
 		struct kvm *kvm;
-		int idx;
 
 		spin_lock(&kvm_lock);
 
 		list_for_each_entry(kvm, &vm_list, vm_list) {
-			idx = srcu_read_lock(&kvm->srcu);
+			mutex_lock(&kvm->slots_lock);
 			kvm_mmu_zap_all_fast(kvm);
-			srcu_read_unlock(&kvm->srcu, idx);
+			mutex_unlock(&kvm->slots_lock);
 
 			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
 		}
--
2.7.4