From 66ed013764d8f9577b2958033cb4ac827adf04d8 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Wed, 13 Nov 2019 11:30:32 -0800
Subject: [PATCH] KVM: x86/mmu: Take slots_lock when using
 kvm_mmu_zap_all_fast()

commit ed69a6cb700880d052a0d085ff2e5bfc108ce238 upstream.

Acquire the per-VM slots_lock when zapping all shadow pages as part of
toggling nx_huge_pages. The fast zap algorithm relies on exclusivity
(via slots_lock) to identify obsolete vs. valid shadow pages, because it
uses a single bit for its generation number. Holding slots_lock also
obviates the need to acquire a read lock on the VM's srcu.
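
To make the single-bit constraint concrete, here is a toy userspace
model (names are illustrative only; the real state is
kvm->arch.mmu_valid_gen and sp->mmu_valid_gen, and zapping is done
while walking the page lists, not shown here):

  #include <stdio.h>

  /* Toy model of the single-bit generation scheme; an illustration,
   * not the kernel code.
   */
  struct vm_model   { unsigned int valid_gen : 1; };
  struct page_model { unsigned int valid_gen : 1; };

  static int is_obsolete(struct vm_model *vm, struct page_model *sp)
  {
          return sp->valid_gen != vm->valid_gen;
  }

  static void fast_zap_begin(struct vm_model *vm)
  {
          vm->valid_gen ^= 1;  /* every existing page becomes obsolete */
          /* ... then walk and zap, possibly dropping mmu_lock ... */
  }

  int main(void)
  {
          struct vm_model vm = { 0 };
          struct page_model sp = { 0 };  /* created in generation 0 */

          fast_zap_begin(&vm);
          printf("after one zap:  obsolete=%d\n", is_obsolete(&vm, &sp));
          fast_zap_begin(&vm);  /* a racing second zap wraps the bit */
          printf("after two zaps: obsolete=%d\n", is_obsolete(&vm, &sp));
          return 0;
  }

The second flip wraps the bit back to the page's generation, so a page
the first zap has not yet reached reads as valid again; slots_lock
exists precisely to serialize the flips.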

Failing to take slots_lock when toggling nx_huge_pages allows multiple
instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
(kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be
temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough
to enforce exclusivity).

Concurrent fast zap instances cause obsolete shadow pages to be
incorrectly identified as valid due to the single bit generation number
wrapping, which results in stale shadow pages being left in KVM's MMU
and leads to all sorts of undesirable behavior.

The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
toggling nx_huge_pages via its module param.
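
For reference, a minimal userspace sketch of that toggle (a best-effort
reproducer, assuming writes to the standard module param path reach
set_nx_huge_pages(); a shell loop with echo works just as well):

  #include <fcntl.h>
  #include <unistd.h>

  /* Best-effort toggle; reopening each time avoids sysfs offset
   * quirks.  Needs root, with KVM loaded and VMs running.
   */
  static void set_nx(const char *val)
  {
          int fd = open("/sys/module/kvm/parameters/nx_huge_pages",
                        O_WRONLY);

          if (fd >= 0) {
                  write(fd, val, 1);  /* invokes set_nx_huge_pages() */
                  close(fd);
          }
  }

  int main(void)
  {
          for (int i = 0; i < 100; i++) {
                  set_nx("0");
                  set_nx("1");
          }
          return 0;
  }

Each successful write funnels into the set_nx_huge_pages() path patched
below.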

Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when
using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used
a ulong-sized generation instead of relying on exclusivity for
correctness, but all callers except the recently added set_nx_huge_pages()
needed to hold slots_lock anyway. Therefore, this patch does not have
to be backported to stable kernels.

Given that toggling nx_huge_pages is by no means a fast path, force it
to conform to the current approach instead of reintroducing the previous
generation count.

Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE)
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 35348302a6e0..767248c1fb97 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -6189,14 +6189,13 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 
 	if (new_val != old_val) {
 		struct kvm *kvm;
-		int idx;
 
 		spin_lock(&kvm_lock);
 
 		list_for_each_entry(kvm, &vm_list, vm_list) {
-			idx = srcu_read_lock(&kvm->srcu);
+			mutex_lock(&kvm->slots_lock);
 			kvm_mmu_zap_all_fast(kvm);
-			srcu_read_unlock(&kvm->srcu, idx);
+			mutex_unlock(&kvm->slots_lock);
 
 			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
 		}
--
2.7.4