kvm/x86: Avoid async PF ending RCU read-side critical section early in PREEMPT=n kernel

Currently, in a PREEMPT=n kernel, kvm_async_pf_task_wait() can call
schedule() in some cases, which can accidentally end the current RCU
read-side critical section early. This can lead to random memory
corruption in the guest.

The difficulty in handling this is that, in a PREEMPT_COUNT=n kernel,
we cannot tell whether an async PF was delivered inside an RCU
read-side critical section, since rcu_read_lock/unlock() are just
no-ops in that case.

To cure this, we treat any async PF that interrupts a kernel context as
one delivered in an RCU read-side critical section, and for a
PREEMPT_COUNT=n kernel we do not allow kvm_async_pf_task_wait() to take
the schedule path in that case, because doing so would introduce
involuntary context switches and break the assumption that RCU relies
on to work properly.

To do so, a second parameter is added to kvm_async_pf_task_wait() that
tells it whether it is called from a context interrupting the kernel,
and all call sites are updated to set that parameter accordingly.
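
The decision described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions, not the
code in this patch: apf_must_not_schedule() and its three parameters are
hypothetical names, and the "known atomic/RCU state" input stands in for
whatever the kernel can actually observe when PREEMPT_COUNT=y.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the choice made in the async PF wait path.
 * None of these names are kernel APIs; they only mirror the reasoning
 * in this commit message.
 *
 * preempt_count_enabled: models a PREEMPT_COUNT=y kernel.
 * in_atomic_or_rcu:      only meaningful when preempt_count_enabled is
 *                        true, since otherwise the kernel cannot tell.
 * interrupt_kernel:      models the new second parameter, true when the
 *                        async PF interrupted a kernel context.
 *
 * Returns true when the waiter must not take the schedule path.
 */
static bool apf_must_not_schedule(bool preempt_count_enabled,
                                  bool in_atomic_or_rcu,
                                  bool interrupt_kernel)
{
    if (preempt_count_enabled) {
        /* The state is observable, so trust it. */
        return in_atomic_or_rcu;
    }

    /*
     * PREEMPT_COUNT=n: RCU read-side sections are invisible, so assume
     * the worst whenever a kernel context was interrupted.
     */
    return interrupt_kernel;
}
```

In this model, a fault taken from user context on a PREEMPT_COUNT=n
kernel may still schedule, while any fault that interrupts the kernel is
conservatively kept off the schedule path.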

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
3 files changed