| From foo@baz Tue Aug 14 16:14:56 CEST 2018 |
| From: Andi Kleen <ak@linux.intel.com> |
| Date: Wed, 13 Jun 2018 15:48:21 -0700 |
| Subject: x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT |
| |
| From: Andi Kleen <ak@linux.intel.com> |
| |
| commit 50896e180c6aa3a9c61a26ced99e15d602666a4c upstream |
| |
| L1 Terminal Fault (L1TF) is a speculation related vulnerability. The CPU |
| speculates on PTE entries which do not have the PRESENT bit set, if the |
| content of the resulting physical address is available in the L1D cache. |
| |
| The OS side mitigation makes sure that a !PRESENT PTE entry points to a |
| physical address outside the actually existing and cacheable memory |
| space. This is achieved by inverting the upper bits of the PTE. Due to the |
| address space limitations this only works for 64bit and 32bit PAE kernels, |
| but not for 32bit non PAE. |
| |
| This mitigation applies to both host and guest kernels, but in case of a |
| 64bit host (hypervisor) and a 32bit PAE guest, inverting the upper bits of |
| the PAE address space (44bit) is not enough if the host has more than 43 |
| bits of populated memory address space, because the speculation treats the |
| PTE content as a physical host address bypassing EPT. |
| |
| The host (hypervisor) protects itself against the guest by flushing L1D as |
| needed, but pages inside the guest are not protected against attacks from |
| other processes inside the same guest. |
| |
| For the guest the inverted PTE mask has to match the host to provide the |
| full protection for all pages the host could possibly map into the |
| guest. The host's populated address space is not known to the guest, so the |
| mask must cover the possible maximal host address space, i.e. 52 bit. |
| |
| On 32bit PAE the maximum PTE mask is currently set to 44 bit because that |
| is the limit imposed by 32bit unsigned long PFNs in the VMs. This limits |
| the mask to be below what the host could possibly use for physical pages. |
| |
| The L1TF PROT_NONE protection code uses the PTE masks to determine which |
| bits to invert to make sure the higher bits are set for unmapped entries to |
| prevent L1TF speculation attacks against EPT inside guests. |
| |
| In order to invert all bits that could be used by the host, increase |
| __PHYSICAL_MASK_SHIFT to 52 to match 64bit. |
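| |
| As a rough illustration of why the mask width matters, the sketch below |
| flips the PFN bits of an all-zero PTE value for a given mask shift. The |
| helper names (phys_mask, pte_pfn_mask, invert_pfn_bits, top_bit) are |
| hypothetical and the inversion is deliberately simplified; this is not the |
| kernel's PROT_NONE code, only the mask arithmetic behind it. |

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Mask covering all physical address bits below 'shift' (hypothetical helper). */
static uint64_t phys_mask(unsigned shift)
{
	return (UINT64_C(1) << shift) - 1;
}

/* PFN bits of a PTE: the physical mask without the page-offset bits. */
static uint64_t pte_pfn_mask(unsigned shift)
{
	return phys_mask(shift) & ~((UINT64_C(1) << PAGE_SHIFT) - 1);
}

/* Simplified inversion of a !PRESENT PTE value: flip only the PFN bits. */
static uint64_t invert_pfn_bits(uint64_t pte, unsigned shift)
{
	return pte ^ pte_pfn_mask(shift);
}

/* Index of the highest set bit, or -1 if no bit is set. */
static int top_bit(uint64_t v)
{
	int bit = -1;

	while (v) {
		v >>= 1;
		bit++;
	}
	return bit;
}
```

| With a shift of 44 the highest bit this sketch can set is bit 43, so a |
| host address that uses bits 44 to 51 stays reachable. With a shift of 52 |
| the inversion reaches bit 51. |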
| |
| The real limit for a 32bit PAE kernel is still 44 bits because all Linux |
| PTEs are created from unsigned long PFNs, so they cannot be higher than 44 |
| bits on a 32bit kernel. So these extra PFN bits should never be set. The |
| only users of this macro are using it to look at PTEs, so it's safe. |
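| |
| The 44 bit limit follows from 32 PFN bits plus the 12 bit page offset. A |
| minimal sketch of that arithmetic, modelling the 32bit unsigned long PFN |
| as uint32_t (pfn_to_phys and fits_in_bits are hypothetical helpers, not |
| kernel functions): |

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Physical address of a page whose PFN fits a 32bit unsigned long. */
static uint64_t pfn_to_phys(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}

/* Returns 1 if 'addr' has no bits set at or above position 'bits'. */
static int fits_in_bits(uint64_t addr, unsigned bits)
{
	return (addr >> bits) == 0;
}
```

| Even the largest 32bit PFN yields an address below 2^44, so bits 44 to 51 |
| of the wider mask are never set by PTEs built this way. |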
| |
| [ tglx: Massaged changelog ] |
| |
| Signed-off-by: Andi Kleen <ak@linux.intel.com> |
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
| Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> |
| Acked-by: Michal Hocko <mhocko@suse.com> |
| Acked-by: Dave Hansen <dave.hansen@intel.com> |
| Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| arch/x86/include/asm/page_32_types.h | 9 +++++++-- |
| 1 file changed, 7 insertions(+), 2 deletions(-) |
| |
| --- a/arch/x86/include/asm/page_32_types.h |
| +++ b/arch/x86/include/asm/page_32_types.h |
| @@ -28,8 +28,13 @@ |
| #define N_EXCEPTION_STACKS 1 |
| |
| #ifdef CONFIG_X86_PAE |
| -/* 44=32+12, the limit we can fit into an unsigned long pfn */ |
| -#define __PHYSICAL_MASK_SHIFT 44 |
| +/* |
| + * This is beyond the 44 bit limit imposed by the 32bit long pfns, |
| + * but we need the full mask to make sure inverted PROT_NONE |
| + * entries have all the host bits set in a guest. |
| + * The real limit is still 44 bits. |
| + */ |
| +#define __PHYSICAL_MASK_SHIFT 52 |
| #define __VIRTUAL_MASK_SHIFT 32 |
| |
| #else /* !CONFIG_X86_PAE */ |