| From: Sean Christopherson <sean.j.christopherson@intel.com> |
| Date: Fri, 17 Aug 2018 10:27:36 -0700 |
| Subject: x86/speculation/l1tf: Exempt zeroed PTEs from inversion |
| |
| commit f19f5c49bbc3ffcc9126cc245fc1b24cc29f4a37 upstream. |
| |
| It turns out that we should *not* invert all not-present mappings, |
| because the all zeroes case is obviously special. |
| |
| clear_page() does not undergo the XOR logic to invert the address bits, |
| i.e. PTE, PMD and PUD entries that have not been individually written |
| will have val=0 and so will trigger __pte_needs_invert(). As a result, |
| {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones |
| (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok |
| because the page at physical address 0 is reserved early in boot |
| specifically to mitigate L1TF, so explicitly exempt zeroed entries from |
| the inversion when reading the PFN. |
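
As a rough illustration of the paragraph above, the following user-space
sketch models how the PFN is read back from an entry under the old and new
rules. It is not the kernel code: the names pfn_from_entry(),
needs_invert_old() and needs_invert_new() are hypothetical, and the
constants assume 4 KiB pages, a 52-bit physical address mask, and bit 8 as
the PROT_NONE marker.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch only: 4 KiB pages, 52 physical bits. */
#define SK_PAGE_SHIFT    12
#define SK_PAGE_PRESENT  (1ULL << 0)
#define SK_PAGE_PROTNONE (1ULL << 8)
#define SK_PFN_MASK \
	(((1ULL << 52) - 1) & ~((1ULL << SK_PAGE_SHIFT) - 1))

/* Rule before this patch: every not-present value is inverted. */
static bool needs_invert_old(uint64_t val)
{
	return !(val & SK_PAGE_PRESENT);
}

/* Rule after this patch: an all-zero value is exempt. */
static bool needs_invert_new(uint64_t val)
{
	return val && !(val & SK_PAGE_PRESENT);
}

/* Hypothetical helper: undo the inversion (if any) and extract the PFN. */
static uint64_t pfn_from_entry(uint64_t val, bool (*needs_invert)(uint64_t))
{
	uint64_t mask = needs_invert(val) ? ~0ULL : 0;

	return ((val ^ mask) & SK_PFN_MASK) >> SK_PAGE_SHIFT;
}

/* Hypothetical helper: build a PROT_NONE entry with inverted PFN bits. */
static uint64_t make_protnone_entry(uint64_t pfn)
{
	return (~(pfn << SK_PAGE_SHIFT) & SK_PFN_MASK) | SK_PAGE_PROTNONE;
}
```

Under these assumptions, pfn_from_entry(0, needs_invert_old) yields
0xffffffffff (all PFN bits set), whereas pfn_from_entry(0, needs_invert_new)
yields 0. A non-zero PROT_NONE entry built with make_protnone_entry() still
reads back its original PFN under both rules.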
| |
| The bug manifested as an unexpected mprotect(..., PROT_NONE) failure |
| when called on a VMA that has VM_PFNMAP and was mmap'd with a protection |
| other than PROT_NONE but never used. mprotect() sends the PROT_NONE |
| request down prot_none_walk(), which walks the PTEs to check the PFNs. |
| prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns |
| -EACCES because it thinks mprotect() is trying to adjust a high MMIO |
| address. |
| |
| [ This is a very modified version of Sean's original patch, but all |
| credit goes to Sean for doing this and also pointing out that |
| sometimes the __pte_needs_invert() function only gets the protection |
| bits, not the full eventual pte. But zero remains special even in |
| just protection bits, so that's ok. - Linus ] |
| |
| Fixes: f22cc87f6c1f ("x86/speculation/l1tf: Invert all not present mappings") |
| Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> |
| Acked-by: Andi Kleen <ak@linux.intel.com> |
| Cc: Thomas Gleixner <tglx@linutronix.de> |
| Cc: Josh Poimboeuf <jpoimboe@redhat.com> |
| Cc: Michal Hocko <mhocko@suse.com> |
| Cc: Vlastimil Babka <vbabka@suse.cz> |
| Cc: Dave Hansen <dave.hansen@intel.com> |
| Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Ben Hutchings <ben@decadent.org.uk> |
| --- |
| arch/x86/include/asm/pgtable-invert.h | 11 ++++++++++- |
| 1 file changed, 10 insertions(+), 1 deletion(-) |
| |
| --- a/arch/x86/include/asm/pgtable-invert.h |
| +++ b/arch/x86/include/asm/pgtable-invert.h |
| @@ -4,9 +4,18 @@ |
| |
| #ifndef __ASSEMBLY__ |
| |
| +/* |
| + * A clear pte value is special, and doesn't get inverted. |
| + * |
| + * Note that even users that only pass a pgprot_t (rather |
| + * than a full pte) won't trigger the special zero case, |
| + * because even PAGE_NONE has _PAGE_PROTNONE | _PAGE_ACCESSED |
| + * set. So the all zero case really is limited to just the |
| + * cleared page table entry case. |
| + */ |
| static inline bool __pte_needs_invert(u64 val) |
| { |
| - return !(val & _PAGE_PRESENT); |
| + return val && !(val & _PAGE_PRESENT); |
| } |
| |
| /* Get a mask to xor with the page table entry to get the correct pfn. */ |