| From: Borislav Petkov <bp@suse.de> |
| Date: Fri, 22 Jun 2018 11:54:28 +0200 |
| Subject: x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out() |
| |
| commit 1f74c8a64798e2c488f86efc97e308b85fb7d7aa upstream. |
| |
| mce_no_way_out() does a quick check during #MC to see whether some of |
| the MCEs logged would require the kernel to panic immediately. It is |
| passed a struct mce into which MCi_STATUS gets written. |
| |
| However, after a valid status value has been saved, the next iteration |
| of the loop over the MCA banks on the CPU overwrites that value, |
| because struct mce itself is used as storage instead of a temporary |
| variable. |
| |
| This leads to MCE records with an empty status value: |
| |
| mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000 |
| mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10} |
| |
| In order to prevent the loss of the status register value, return |
| immediately when the severity is a panic one, so that we can panic |
| with the first fatal MCE logged. This is also the intention of this |
| function: it should not keep iterating over the banks once a fatal MCE |
| has already been logged. |
| |
| Tony: read the rest of the MCA bank to populate the struct mce fully. |
| |
| Suggested-by: Tony Luck <tony.luck@intel.com> |
| Signed-off-by: Borislav Petkov <bp@suse.de> |
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
| Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de |
| [bwh: Backported to 3.16: adjust context] |
| Signed-off-by: Ben Hutchings <ben@decadent.org.uk> |
| --- |
| arch/x86/kernel/cpu/mcheck/mce.c | 18 ++++++++++-------- |
| 1 file changed, 10 insertions(+), 8 deletions(-) |
| |
| --- a/arch/x86/kernel/cpu/mcheck/mce.c |
| +++ b/arch/x86/kernel/cpu/mcheck/mce.c |
| @@ -666,23 +666,25 @@ EXPORT_SYMBOL_GPL(machine_check_poll); |
| static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp, |
| struct pt_regs *regs) |
| { |
| - int i, ret = 0; |
| char *tmp; |
| + int i; |
| |
| for (i = 0; i < mca_cfg.banks; i++) { |
| m->status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i)); |
| - if (m->status & MCI_STATUS_VAL) { |
| - __set_bit(i, validp); |
| - if (quirk_no_way_out) |
| - quirk_no_way_out(i, m, regs); |
| - } |
| + if (!(m->status & MCI_STATUS_VAL)) |
| + continue; |
| + |
| + __set_bit(i, validp); |
| + if (quirk_no_way_out) |
| + quirk_no_way_out(i, m, regs); |
| |
| if (mce_severity(m, mca_cfg.tolerant, &tmp) >= MCE_PANIC_SEVERITY) { |
| + mce_read_aux(m, i); |
| *msg = tmp; |
| - ret = 1; |
| + return 1; |
| } |
| } |
| - return ret; |
| + return 0; |
| } |
| |
| /* |
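
The overwrite described in the commit message can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct rec`, `scan_old()`, `scan_new()` and `is_fatal()` are hypothetical stand-ins for `struct mce`, the old and new `mce_no_way_out()` loops, and the `mce_severity()` check. The bank registers are modelled as a plain array instead of MSR reads. Only the bit positions of the valid (63) and PCC (57) flags mirror MCi_STATUS; treating "valid and PCC set" as fatal is a simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define STATUS_VAL (1ULL << 63) /* valid bit, same position as MCI_STATUS_VAL */
#define STATUS_PCC (1ULL << 57) /* processor context corrupt, used here as "fatal" */

/* Hypothetical stand-in for struct mce: only the fields this sketch needs. */
struct rec {
	uint64_t status;
	int bank;
};

/* Simplified stand-in for the severity check: valid and PCC set means panic. */
static int is_fatal(uint64_t status)
{
	return (status & STATUS_VAL) && (status & STATUS_PCC);
}

/* Old behaviour: keeps scanning, so r->status ends up holding the last bank read. */
static int scan_old(const uint64_t *banks, size_t n, struct rec *r)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		r->status = banks[i];
		if (is_fatal(r->status)) {
			r->bank = (int)i;
			ret = 1;
		}
	}
	return ret;
}

/* New behaviour: skip invalid banks and return on the first fatal one. */
static int scan_new(const uint64_t *banks, size_t n, struct rec *r)
{
	size_t i;

	for (i = 0; i < n; i++) {
		r->status = banks[i];
		if (!(r->status & STATUS_VAL))
			continue;
		if (is_fatal(r->status)) {
			r->bank = (int)i;
			return 1; /* r->status still holds the fatal value */
		}
	}
	return 0;
}
```

With a fatal value in bank 0 and zeros in the remaining banks, `scan_old()` reports a fatal error but leaves `status` at 0, which matches the empty "Bank 0: 0000000000000000" record quoted in the commit message. `scan_new()` returns with the fatal status still in place.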