| From 7899891c7d161752f29abcc9bc0a9c6c3a3af26c Mon Sep 17 00:00:00 2001 |
| From: "Tian, Kevin" <kevin.tian@intel.com> |
| Date: Thu, 12 May 2011 10:56:08 +0800 |
| Subject: xen mmu: fix a race window causing leave_mm BUG() |
| |
| From: "Tian, Kevin" <kevin.tian@intel.com> |
| |
| commit 7899891c7d161752f29abcc9bc0a9c6c3a3af26c upstream. |
| |
| There's a race window in xen_drop_mm_ref, where the remote cpu may exit the |
| dirty bitmap between the check on this cpu and the point where the remote |
| cpu handles the drop request. So in drop_other_mm_ref we need to check |
| whether the TLB state is still lazy before calling into leave_mm. This |
| bug is rarely observed in earlier kernels, but is exacerbated by |
| commit 831d52bc153971b70e64eccfbed2b232394f22f8 |
| ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm") |
| which clears the bitmap after changing the TLB state. The call trace is as below: |
| |
| --------------------------------- |
| kernel BUG at arch/x86/mm/tlb.c:61! |
| invalid opcode: 0000 [#1] SMP |
| last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb |
| CPU 1 |
| Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table] |
| Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285 |
| RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46 |
| RSP: e02b:ffff88002805be48 EFLAGS: 00010046 |
| RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0 |
| RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001 |
| RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200 |
| R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880 |
| R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000 |
| FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000 |
| CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b |
| CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660 |
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 |
| Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40) |
| Stack: |
| ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88 |
| <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78 |
| <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108 |
| Call Trace: |
| <IRQ> |
| [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53 |
| [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc |
| [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28 |
| [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120 |
| [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e |
| [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d |
| [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46 |
| [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30 |
| <EOI> |
| [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17 |
| [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 |
| [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500 |
| [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef |
| [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef |
| [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef |
| [<ffffffff81042fcf>] ? need_resched+0x23/0x2d |
| [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7 |
| [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef |
| [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255 |
| [<ffffffff81114362>] ? do_execve+0x1c3/0x29e |
| [<ffffffff8101155d>] ? sys_execve+0x43/0x5d |
| [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f |
| [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0 |
| [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f |
| [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 |
| [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e |
| [<ffffffff81013daa>] ? child_rip+0xa/0x20 |
| [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f |
| [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b |
| [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6 |
| [<ffffffff81013da0>] ? child_rip+0x0/0x20 |
| Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 |
| RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46 |
| RSP <ffff88002805be48> |
| ---[ end trace ce9cee6832a9c503 ]--- |
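
For illustration only (not part of the patch): a minimal user-space sketch of
the check described above. It assumes a simplified model in which the per-cpu
TLB state is a plain struct and leave_mm reports failure where the kernel
would hit BUG(). The names tlb_state_model, model_leave_mm, drop_ref_old and
drop_ref_fixed are hypothetical stand-ins, not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Names mirror the kernel's TLB states; this is a user-space model only. */
enum { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct mm_model { int id; };

/* Hypothetical stand-in for the per-cpu cpu_tlbstate. */
struct tlb_state_model {
	struct mm_model *active_mm;
	int state;
};

/*
 * Model of leave_mm(): it must only be called in lazy TLB mode.
 * Returns false where the kernel would BUG(); the actual work of
 * dropping the lazy mm is omitted from this sketch.
 */
bool model_leave_mm(const struct tlb_state_model *ts)
{
	return ts->state != TLBSTATE_OK;
}

/* Check before the patch: only compares active_mm. */
bool drop_ref_old(const struct tlb_state_model *ts, const struct mm_model *mm)
{
	if (ts->active_mm == mm)
		return model_leave_mm(ts);
	return true;
}

/* Check after the patch: also requires that the cpu is not in TLBSTATE_OK. */
bool drop_ref_fixed(const struct tlb_state_model *ts, const struct mm_model *mm)
{
	if (ts->active_mm == mm && ts->state != TLBSTATE_OK)
		return model_leave_mm(ts);
	return true;
}
```

In this model, a cpu whose active_mm still equals mm but whose state is
already TLBSTATE_OK makes drop_ref_old() return false (the BUG() case), while
drop_ref_fixed() skips the call and returns true. A cpu that is still lazy
passes through both variants unchanged.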
| |
| Tested-by: Maoxiaoyun<tinnycloud@hotmail.com> |
| Signed-off-by: Kevin Tian <kevin.tian@intel.com> |
| [v1: Fleshed out the git description a bit] |
| Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
| |
| --- |
| arch/x86/xen/mmu.c | 2 +- |
| 1 file changed, 1 insertion(+), 1 deletion(-) |
| |
| --- a/arch/x86/xen/mmu.c |
| +++ b/arch/x86/xen/mmu.c |
| @@ -1141,7 +1141,7 @@ static void drop_other_mm_ref(void *info |
| |
| active_mm = percpu_read(cpu_tlbstate.active_mm); |
| |
| - if (active_mm == mm) |
| + if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK) |
| leave_mm(smp_processor_id()); |
| |
| /* If this cpu still has a stale cr3 reference, then make sure |