| From 29b32839725f8c89a41cb6ee054c85f3116ea8b5 Mon Sep 17 00:00:00 2001 |
| From: Nadav Amit <namit@vmware.com> |
| Date: Wed, 27 Jan 2021 09:53:17 -0800 |
| Subject: iommu/vt-d: Do not use flush-queue when caching-mode is on |
| |
| From: Nadav Amit <namit@vmware.com> |
| |
| commit 29b32839725f8c89a41cb6ee054c85f3116ea8b5 upstream. |
| |
| When an Intel IOMMU is virtualized and a physical device is passed |
| through to the VM, changes to the virtual IOMMU need to be propagated |
| to the physical IOMMU. The hypervisor therefore needs to monitor PTE |
| mappings in the IOMMU page-tables. Intel specifications provide a |
| "caching-mode" capability that a virtual IOMMU uses to report that it |
| is virtualized and that a TLB flush is needed after mapping, so that |
| the hypervisor can propagate virtual IOMMU mappings to the physical |
| IOMMU. To the best of my knowledge, no real physical IOMMU reports |
| "caching-mode" as turned on. |
| |
| Synchronizing the virtual and the physical IOMMU tables is expensive if |
| the hypervisor does not know which PTEs have changed, as it must then |
| walk all the virtualized tables and look for changes. |
| Consequently, domain flushes are much more expensive than page-specific |
| flushes on virtualized IOMMUs with passthrough devices. The kernel |
| therefore exploited the "caching-mode" indication to avoid domain |
| flushing and use page-specific flushing in virtualized environments. See |
| commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching |
| mode.") |
| |
| This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use |
| of iova deferred flushing"). Now, when batched TLB flushing is used (the |
| default), full TLB domain flushes are performed frequently, requiring |
| the hypervisor to perform expensive synchronization between the virtual |
| TLB and the physical one. |
| |
| Getting batched TLB flushes to use page-specific invalidations again in |
| such circumstances is not easy, since the TLB invalidation scheme |
| assumes that "full" domain TLB flushes are performed for scalability. |
| |
| Disable batched TLB flushes when caching-mode is on, as the performance |
| benefit from using batched TLB invalidations is likely to be much |
| smaller than the overhead of the virtual-to-physical IOMMU page-table |
| synchronization. |
| |
| Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") |
| Signed-off-by: Nadav Amit <namit@vmware.com> |
| Cc: David Woodhouse <dwmw2@infradead.org> |
| Cc: Lu Baolu <baolu.lu@linux.intel.com> |
| Cc: Joerg Roedel <joro@8bytes.org> |
| Cc: Will Deacon <will@kernel.org> |
| Cc: stable@vger.kernel.org |
| Acked-by: Lu Baolu <baolu.lu@linux.intel.com> |
| Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com |
| Signed-off-by: Joerg Roedel <jroedel@suse.de> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| drivers/iommu/intel/iommu.c | 5 +++++ |
| 1 file changed, 5 insertions(+) |
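
Note (illustration only, not part of the patch): the check added to init_dmars() can be modelled in user space roughly as below. This is a minimal sketch under stated assumptions: CAP_CACHING_MODE mirrors the kernel's cap_caching_mode() macro (Caching Mode is bit 7 of the VT-d capability register), while struct flush_policy and apply_caching_mode_policy() are hypothetical names standing in for the global intel_iommu_strict flag and the new if-block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Caching Mode (CM) is bit 7 of the VT-d capability register; this mirrors
 * the kernel's cap_caching_mode() macro. */
#define CAP_CACHING_MODE(cap) (((cap) >> 7) & 1)

/* Hypothetical stand-in for the kernel's global intel_iommu_strict flag. */
struct flush_policy {
    bool strict; /* true: flush per unmap; false: batched (deferred) flush */
};

/*
 * Hypothetical helper modelling the new check: if batched flushing is
 * still enabled and the IOMMU reports caching mode (i.e. it is virtual),
 * switch to strict flushing. Returns true only when this call changed
 * the policy, which is when the kernel would print its warning.
 */
static bool apply_caching_mode_policy(struct flush_policy *p, uint64_t cap)
{
    if (!p->strict && CAP_CACHING_MODE(cap)) {
        p->strict = true;
        return true;
    }
    return false;
}
```

Because the flag is only ever set, never cleared, one IOMMU reporting caching mode is enough to make flushing strict for the whole system in this model, and later IOMMUs do not trigger the warning again.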
| |
| --- a/drivers/iommu/intel/iommu.c |
| +++ b/drivers/iommu/intel/iommu.c |
| @@ -3350,6 +3350,11 @@ static int __init init_dmars(void) |
| |
| if (!ecap_pass_through(iommu->ecap)) |
| hw_pass_through = 0; |
| + |
| + if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) { |
| + pr_warn("Disable batched IOTLB flush due to virtualization"); |
| + intel_iommu_strict = 1; |
| + } |
| intel_svm_check(iommu); |
| } |
| |