| From 7f1fc268c47491fd5e63548f6415fc8604e13003 Mon Sep 17 00:00:00 2001 |
| From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
| Date: Sun, 5 May 2013 09:30:09 -0400 |
| Subject: xen/vcpu/pvhvm: Fix vcpu hotplugging hanging. |
| |
| From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
| |
| commit 7f1fc268c47491fd5e63548f6415fc8604e13003 upstream. |
| |
| If a user did: |
| |
| echo 0 > /sys/devices/system/cpu/cpu1/online |
| echo 1 > /sys/devices/system/cpu/cpu1/online |
| |
| we would (this a build with DEBUG enabled) get to: |
| smpboot: ++++++++++++++++++++=_---CPU UP 1 |
| .. snip.. |
| smpboot: Stack at about ffff880074c0ff44 |
| smpboot: CPU1: has booted. |
| |
| and hang. The RCU mechanism would kick in an try to IPI the CPU1 |
| but the IPIs (and all other interrupts) would never arrive at the |
| CPU1. At first glance at least. A bit digging in the hypervisor |
| trace shows that (using xenanalyze): |
| |
| [vla] d4v1 vec 243 injecting |
| 0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 |
| ] 0.043163639 --|x d4v1 vmentry cycles 1468 |
| ] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 |
| 0.043164913 --|x d4v1 inj_virq vec 243 real |
| [vla] d4v1 vec 243 injecting |
| 0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 |
| ] 0.043165526 --|x d4v1 vmentry cycles 1472 |
| ] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 |
| 0.043166800 --|x d4v1 inj_virq vec 243 real |
| [vla] d4v1 vec 243 injecting |
| |
| there is a pending event (subsequent debugging shows it is the IPI |
| from the VCPU0 when smpboot.c on VCPU1 has done |
| "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is |
| interrupted with the callback IPI (0xf3 aka 243) which ends up calling |
| __xen_evtchn_do_upcall. |
| |
| The __xen_evtchn_do_upcall seems to do *something* but not acknowledge |
| the pending events. And the moment the guest does a 'cli' (that is the |
| ffffffff81673254 in the log above) the hypervisor is invoked again to |
| inject the IPI (0xf3) to tell the guest it has pending interrupts. |
| This repeats itself forever. |
| |
| The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup |
| we set each per_cpu(xen_vcpu, cpu) to point to the |
| shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info |
| to register per-CPU structures (xen_vcpu_setup). |
| This is used to allow events for more than 32 VCPUs and for performance |
| optimizations reasons. |
| |
| When the user performs the VCPU hotplug we end up calling the |
| the xen_vcpu_setup once more. We make the hypercall which returns |
| -EINVAL as it does not allow multiple registration calls (and |
| already has re-assigned where the events are being set). We pick |
| the fallback case and set per_cpu(xen_vcpu, cpu) to point to the |
| shared_info->vcpu_info[vcpu] (which is a good fallback during bootup). |
| However the hypervisor is still setting events in the register |
| per-cpu structure (per_cpu(xen_vcpu_info, cpu)). |
| |
| As such when the events are set by the hypervisor (such as timer one), |
| and when we iterate in __xen_evtchn_do_upcall we end up reading stale |
| events from the shared_info->vcpu_info[vcpu] instead of the |
| per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the |
| events that the hypervisor has set and the hypervisor keeps on reminding |
| us to ack the events which we never do. |
| |
| The fix is simple. Don't on the second time when xen_vcpu_setup is |
| called over-write the per_cpu(xen_vcpu, cpu) if it points to |
| per_cpu(xen_vcpu_info). |
| |
| Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> |
| Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| arch/x86/xen/enlighten.c | 15 +++++++++++++++ |
| 1 file changed, 15 insertions(+) |
| |
| --- a/arch/x86/xen/enlighten.c |
| +++ b/arch/x86/xen/enlighten.c |
| @@ -156,6 +156,21 @@ static void xen_vcpu_setup(int cpu) |
| |
| BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info); |
| |
| + /* |
| + * This path is called twice on PVHVM - first during bootup via |
| + * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being |
| + * hotplugged: cpu_up -> xen_hvm_cpu_notify. |
| + * As we can only do the VCPUOP_register_vcpu_info once lets |
| + * not over-write its result. |
| + * |
| + * For PV it is called during restore (xen_vcpu_restore) and bootup |
| + * (xen_setup_vcpu_info_placement). The hotplug mechanism does not |
| + * use this function. |
| + */ |
| + if (xen_hvm_domain()) { |
| + if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu)) |
| + return; |
| + } |
| if (cpu < MAX_VIRT_CPUS) |
| per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; |
| |