iommu/dma: Use a large flush queue and timeout for shadow_on_flush

Flush queues currently use a fixed compile-time size of 256 entries.
This being a power of 2 allows the compiler to use shift and mask
instead of more expensive modulo operations. With per-CPU flush queues,
larger queue sizes would hit per-CPU allocation limits. With a single
flush queue, however, these limits do not apply. Single queues are also
particularly suitable for virtualized environments with expensive IOTLB
flushes, and these environments benefit especially from larger queues
and thus fewer flushes.

To this end, re-order struct iova_fq so we can use a dynamic array and
introduce the flush queue size and timeouts as new options in the
iommu_dma_options struct. So as not to lose the shift and mask
optimization, use a power of 2 for the length and use explicit shift and
mask instead of letting the compiler optimize this.
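
As a rough illustration of the layout and index arithmetic described
above, here is a minimal userspace sketch. It is not the kernel code:
struct demo_fq, demo_fq_alloc(), demo_fq_full() and demo_fq_add() are
hypothetical names, and the sketch shows only the explicit mask part,
assuming the queue length is a power of 2 and that the array is the last
struct member so it can be sized at allocation time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a flush queue entry; not the kernel's struct. */
struct demo_fq_entry {
	unsigned long iova_pfn;
	unsigned long pages;
};

/* Fixed-size members first, flexible array member last. */
struct demo_fq {
	unsigned int head;
	unsigned int tail;
	unsigned int mod_mask;	/* size - 1, valid only for power-of-2 sizes */
	struct demo_fq_entry entries[];
};

static struct demo_fq *demo_fq_alloc(unsigned int size)
{
	struct demo_fq *fq;

	/* The mask trick only works when size is a power of 2 (and >= 2). */
	if (size < 2 || (size & (size - 1)))
		return NULL;

	fq = calloc(1, sizeof(*fq) + (size_t)size * sizeof(fq->entries[0]));
	if (!fq)
		return NULL;
	fq->mod_mask = size - 1;
	return fq;
}

/* One slot is kept free to distinguish a full ring from an empty one. */
static int demo_fq_full(const struct demo_fq *fq)
{
	return ((fq->tail + 1) & fq->mod_mask) == fq->head;
}

/* Returns the index of the claimed slot, or -1 if the ring is full. */
static int demo_fq_add(struct demo_fq *fq, unsigned long pfn,
		       unsigned long pages)
{
	unsigned int idx = fq->tail;

	if (demo_fq_full(fq))
		return -1;
	fq->entries[idx].iova_pfn = pfn;
	fq->entries[idx].pages = pages;
	/* Explicit mask instead of "% size" with a runtime size. */
	fq->tail = (idx + 1) & fq->mod_mask;
	return (int)idx;
}
```

With a runtime size the compiler can no longer turn "% size" into a
mask on its own, which is why the sketch stores the mask explicitly.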

A large queue size and a 1 second timeout are then configured for the
shadow_on_flush case set by s390 paged memory guests. This brings
performance on par with the previous s390-specific DMA API
implementation.

Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> #s390
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
1 file changed