| From d391f1207067268261add0485f0f34503539c5b0 Mon Sep 17 00:00:00 2001 |
| From: Vitaly Kuznetsov <vkuznets@redhat.com> |
| Date: Thu, 25 Jan 2018 16:37:07 +0100 |
| Subject: x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested |
| MIME-Version: 1.0 |
| Content-Type: text/plain; charset=UTF-8 |
| Content-Transfer-Encoding: 8bit |
| |
| From: Vitaly Kuznetsov <vkuznets@redhat.com> |
| |
| commit d391f1207067268261add0485f0f34503539c5b0 upstream. |
| |
| I was investigating an issue with SeaBIOS >= 1.10, which stopped working |
| for nested KVM on Hyper-V. The problem appears to be in the |
| handle_ept_misconfig() function: when we do fast MMIO we need to skip |
| the instruction, so we call skip_emulated_instruction(). This, however, |
| depends on the VM_EXIT_INSTRUCTION_LEN field being set correctly in the |
| VMCS, which is not guaranteed. |
| |
| Intel's manual doesn't mandate that VM_EXIT_INSTRUCTION_LEN be set when |
| an EPT MISCONFIG occurs. While on real hardware it was observed to be |
| set, some hypervisors follow the spec and don't set it, so we end up |
| advancing the IP by an arbitrary value. |
| |
| I checked with Microsoft and they confirmed they don't fill |
| VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG. |
| |
| Fix the issue by skipping the instruction through the emulator when |
| running nested. |
| |
| Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae |
| Suggested-by: Radim Krčmář <rkrcmar@redhat.com> |
| Suggested-by: Paolo Bonzini <pbonzini@redhat.com> |
| Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> |
| Acked-by: Michael S. Tsirkin <mst@redhat.com> |
| Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> |
| Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> |
| [mhaboustak: backport to 4.9.y] |
| Signed-off-by: Mike Haboustak <haboustak@gmail.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| arch/x86/kvm/vmx.c | 19 +++++++++++++++++-- |
| arch/x86/kvm/x86.c | 3 ++- |
| 2 files changed, 19 insertions(+), 3 deletions(-) |
| |
| --- a/arch/x86/kvm/vmx.c |
| +++ b/arch/x86/kvm/vmx.c |
| @@ -6548,9 +6548,24 @@ static int handle_ept_misconfig(struct k |
| |
| gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); |
| if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) { |
| - skip_emulated_instruction(vcpu); |
| trace_kvm_fast_mmio(gpa); |
| - return 1; |
| + /*
| + * skip_emulated_instruction() relies on undefined behavior:
| + * Intel's manual doesn't mandate that VM_EXIT_INSTRUCTION_LEN
| + * be set in the VMCS on EPT MISCONFIG. Real hardware was
| + * observed to set it, but other hypervisors (namely Hyper-V)
| + * don't, so we end up advancing the IP by an arbitrary value.
| + * When running nested, skip the instruction through the
| + * emulator; keep the fast skip on real hardware in the hope
| + * that VM_EXIT_INSTRUCTION_LEN will always be set there.
| + */
| + if (!static_cpu_has(X86_FEATURE_HYPERVISOR)) { |
| + skip_emulated_instruction(vcpu); |
| + return 1; |
| + } |
| + else |
| + return x86_emulate_instruction(vcpu, gpa, EMULTYPE_SKIP, |
| + NULL, 0) == EMULATE_DONE; |
| } |
| |
| ret = handle_mmio_page_fault(vcpu, gpa, true); |
| --- a/arch/x86/kvm/x86.c |
| +++ b/arch/x86/kvm/x86.c |
| @@ -5707,7 +5707,8 @@ int x86_emulate_instruction(struct kvm_v |
| * handle watchpoints yet, those would be handled in |
| * the emulate_ops. |
| */ |
| - if (kvm_vcpu_check_breakpoint(vcpu, &r)) |
| + if (!(emulation_type & EMULTYPE_SKIP) && |
| + kvm_vcpu_check_breakpoint(vcpu, &r)) |
| return r; |
| |
| ctxt->interruptibility = 0; |