| From ff30ef40deca4658e27b0c596e7baf39115e858f Mon Sep 17 00:00:00 2001 |
| From: Quentin Casasnovas <quentin.casasnovas@oracle.com> |
| Date: Sat, 18 Jun 2016 11:01:05 +0200 |
| Subject: KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode. |
| MIME-Version: 1.0 |
| Content-Type: text/plain; charset=UTF-8 |
| Content-Transfer-Encoding: 8bit |
| |
| From: Quentin Casasnovas <quentin.casasnovas@oracle.com> |
| |
| commit ff30ef40deca4658e27b0c596e7baf39115e858f upstream. |
| |
| I couldn't get Xen to boot an L2 HVM guest when it was nested under KVM - it |
| was getting a GP(0) on a rather unremarkable vmread from Xen: |
| |
| (XEN) ----[ Xen-4.7.0-rc x86_64 debug=n Not tainted ]---- |
| (XEN) CPU: 1 |
| (XEN) RIP: e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450 |
| (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (d1v0) |
| (XEN) rax: ffff82d0801e6288 rbx: ffff83003ffbfb7c rcx: fffffffffffab928 |
| (XEN) rdx: 0000000000000000 rsi: 0000000000000000 rdi: ffff83000bdd0000 |
| (XEN) rbp: ffff83000bdd0000 rsp: ffff83003ffbfab0 r8: ffff830038813910 |
| (XEN) r9: ffff83003faf3958 r10: 0000000a3b9f7640 r11: ffff83003f82d418 |
| (XEN) r12: 0000000000000000 r13: ffff83003ffbffff r14: 0000000000004802 |
| (XEN) r15: 0000000000000008 cr0: 0000000080050033 cr4: 00000000001526e0 |
| (XEN) cr3: 000000003fc79000 cr2: 0000000000000000 |
| (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 |
| (XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450): |
| (XEN) 00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00 |
| (XEN) Xen stack trace from rsp=ffff83003ffbfab0: |
| |
| ... |
| |
| (XEN) Xen call trace: |
| (XEN) [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450 |
| (XEN) [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300 |
| (XEN) [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60 |
| (XEN) [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70 |
| (XEN) [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0 |
| (XEN) [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0 |
| (XEN) [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0 |
| (XEN) [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270 |
| (XEN) [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0 |
| (XEN) [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90 |
| (XEN) [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140 |
| (XEN) [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140 |
| (XEN) [<ffff82d080164aeb>] context_switch+0x13b/0xe40 |
| (XEN) [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570 |
| (XEN) [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90 |
| (XEN) [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50 |
| (XEN) |
| (XEN) |
| (XEN) **************************************** |
| (XEN) Panic on CPU 1: |
| (XEN) GENERAL PROTECTION FAULT |
| (XEN) [error_code=0000] |
| (XEN) **************************************** |
| |
| Tracing my host KVM showed it was the one injecting the GP(0) when |
| emulating the VMREAD and checking the destination segment permissions in |
| get_vmx_mem_address(): |
| |
| 3) | vmx_handle_exit() { |
| 3) | handle_vmread() { |
| 3) | nested_vmx_check_permission() { |
| 3) | vmx_get_segment() { |
| 3) 0.074 us | vmx_read_guest_seg_base(); |
| 3) 0.065 us | vmx_read_guest_seg_selector(); |
| 3) 0.066 us | vmx_read_guest_seg_ar(); |
| 3) 1.636 us | } |
| 3) 0.058 us | vmx_get_rflags(); |
| 3) 0.062 us | vmx_read_guest_seg_ar(); |
| 3) 3.469 us | } |
| 3) | vmx_get_cs_db_l_bits() { |
| 3) 0.058 us | vmx_read_guest_seg_ar(); |
| 3) 0.662 us | } |
| 3) | get_vmx_mem_address() { |
| 3) 0.068 us | vmx_cache_reg(); |
| 3) | vmx_get_segment() { |
| 3) 0.074 us | vmx_read_guest_seg_base(); |
| 3) 0.068 us | vmx_read_guest_seg_selector(); |
| 3) 0.071 us | vmx_read_guest_seg_ar(); |
| 3) 1.756 us | } |
| 3) | kvm_queue_exception_e() { |
| 3) 0.066 us | kvm_multiple_exception(); |
| 3) 0.684 us | } |
| 3) 4.085 us | } |
| 3) 9.833 us | } |
| 3) + 10.366 us | } |
| |
| Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software |
| Developer's Manual, Volume 3C - "VMREAD - Read Field from Virtual-Machine |
| Control Structure", I found that we enforce that the destination operand is |
| NOT located in a read-only data segment or any code segment even when L1 is |
| in long mode - BUT that check should only happen in protected mode. |
| |
| Shuffling the code a bit to make our emulation follow the specification |
| allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests |
| without problems. |
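
For illustration only, the intended check order can be modelled outside the
kernel roughly as below. This is a simplified sketch, not the KVM code: the
names cpu_mode, seg, is_noncanonical() and operand_faults() are hypothetical,
48-bit virtual addresses are assumed, and the write-side type mask is my own
encoding of "read-only data segment or any code segment" from the x86 segment
type bits. Any further protected-mode checks are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vCPU state; these are not KVM's names. */
enum cpu_mode { MODE_REAL, MODE_PROTECTED, MODE_LONG };

struct seg {
	uint8_t type;	/* 4-bit segment type from the access rights */
	bool unusable;
};

/* Assumes 48-bit virtual addresses: bits 63..47 must all be equal. */
static bool is_noncanonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top != 0 && top != 0x1ffff;
}

/* Returns true if the memory operand must raise #GP(0)/#SS(0). */
static bool operand_faults(enum cpu_mode mode, const struct seg *s,
			   uint64_t addr, bool is_write)
{
	/* Long mode: the canonical-address check is the only check. */
	if (mode == MODE_LONG)
		return is_noncanonical(addr);

	if (mode == MODE_PROTECTED) {
		if (is_write) {
			/* Read-only data segment or any code segment. */
			if ((s->type & 0xa) == 0 || (s->type & 8))
				return true;
		} else {
			/* Execute-only code segment. */
			if ((s->type & 0xa) == 8)
				return true;
		}
		/* Segment type is fine; now reject unusable segments. */
		return s->unusable;
	}

	return false;
}
```

With this ordering, a write through a segment whose type would fail the
protected-mode test no longer faults in long mode as long as the address is
canonical, which is the behaviour the hunks below restore.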
| |
| Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions") |
| Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> |
| Cc: Eugene Korenevsky <ekorenevsky@gmail.com> |
| Cc: Paolo Bonzini <pbonzini@redhat.com> |
| Cc: Radim Krčmář <rkrcmar@redhat.com> |
| Cc: Thomas Gleixner <tglx@linutronix.de> |
| Cc: Ingo Molnar <mingo@redhat.com> |
| Cc: H. Peter Anvin <hpa@zytor.com> |
| Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| arch/x86/kvm/vmx.c | 23 +++++++++++------------ |
| 1 file changed, 11 insertions(+), 12 deletions(-) |
| |
| --- a/arch/x86/kvm/vmx.c |
| +++ b/arch/x86/kvm/vmx.c |
| @@ -6579,7 +6579,13 @@ static int get_vmx_mem_address(struct kv |
| |
| /* Checks for #GP/#SS exceptions. */ |
| exn = false; |
| - if (is_protmode(vcpu)) { |
| + if (is_long_mode(vcpu)) { |
| + /* Long mode: #GP(0)/#SS(0) if the memory address is in a |
| + * non-canonical form. This is the only check on the memory |
| + * destination for long mode! |
| + */ |
| + exn = is_noncanonical_address(*ret); |
| + } else if (is_protmode(vcpu)) { |
| /* Protected mode: apply checks for segment validity in the |
| * following order: |
| * - segment type check (#GP(0) may be thrown) |
| @@ -6596,17 +6602,10 @@ static int get_vmx_mem_address(struct kv |
| * execute-only code segment |
| */ |
| exn = ((s.type & 0xa) == 8); |
| - } |
| - if (exn) { |
| - kvm_queue_exception_e(vcpu, GP_VECTOR, 0); |
| - return 1; |
| - } |
| - if (is_long_mode(vcpu)) { |
| - /* Long mode: #GP(0)/#SS(0) if the memory address is in a |
| - * non-canonical form. This is an only check for long mode. |
| - */ |
| - exn = is_noncanonical_address(*ret); |
| - } else if (is_protmode(vcpu)) { |
| + if (exn) { |
| + kvm_queue_exception_e(vcpu, GP_VECTOR, 0); |
| + return 1; |
| + } |
| /* Protected mode: #GP(0)/#SS(0) if the segment is unusable. |
| */ |
| exn = (s.unusable != 0); |