| From e4b3efc06c0b1942ce9616a1fcbebfc5eaac99d5 Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Mon, 21 Jul 2025 09:34:54 +0000 |
| Subject: xen/netfront: Fix TX response spurious interrupts |
| |
| From: Anthoine Bourgeois <anthoine.bourgeois@vates.tech> |
| |
| [ Upstream commit 114a2de6fa86d99ed9546cc9113a3cad58beef79 ] |
| |
| We found at Vates that there are lot of spurious interrupts when |
| benchmarking the xen-net PV driver frontend. This issue appeared with a |
| patch that addresses security issue XSA-391 (b27d47950e48 "xen/netfront: |
| harden netfront against event channel storms"). On an iperf benchmark, |
| spurious interrupts can represent up to 50% of the interrupts. |
| |
| Spurious interrupts are interrupts that are rised for nothing, there is |
| no work to do. This appends because the function that handles the |
| interrupts ("xennet_tx_buf_gc") is also called at the end of the request |
| path to garbage collect the responses received during the transmission |
| load. |
| |
| The request path is doing the work that the interrupt handler should |
| have done otherwise. This is particurary true when there is more than |
| one vcpu and get worse linearly with the number of vcpu/queue. |
| |
| Moreover, this problem is amplifyed by the penalty imposed by a spurious |
| interrupt. When an interrupt is found spurious the interrupt chip will |
| delay the EOI to slowdown the backend. This delay will allow more |
| responses to be handled by the request path and then there will be more |
| chance the next interrupt will not find any work to do, creating a new |
| spurious interrupt. |
| |
| This causes performance issue. The solution here is to remove the calls |
| from the request path and let the interrupt handler do the processing of |
| the responses. This approch removes most of the spurious interrupts |
| (<0.05%) and also has the benefit of freeing up cycles in the request |
| path, allowing it to process more work, which improves performance |
| compared to masking the spurious interrupt one way or another. |
| |
| This optimization changes a part of the code that is present since the |
| net frontend driver was upstreamed. There is no similar pattern in the |
| other xen PV drivers. Since the first commit of xen-netfront is a blob |
| that doesn't explain all the design choices I can only guess why this |
| specific mecanism was here. This could have been introduce to compensate |
| a slow backend at the time (maybe the backend was fixed or optimize |
| later) or a small queue. In 18 years, both frontend and backend gain lot |
| of features and optimizations that could have obsolete the feature of |
| reaping completions from the TX path. |
| |
| Some vif throughput performance figures from a 8 vCPUs, 4GB of RAM HVM |
| guest(s): |
| |
| Without this patch on the : |
| vm -> dom0: 4.5Gb/s |
| vm -> vm: 7.0Gb/s |
| |
| Without XSA-391 patch (revert of b27d47950e48): |
| vm -> dom0: 8.3Gb/s |
| vm -> vm: 8.7Gb/s |
| |
| With XSA-391 and this patch: |
| vm -> dom0: 11.5Gb/s |
| vm -> vm: 12.6Gb/s |
| |
| v2: |
| - add revewed and tested by tags |
| - resend with the maintainers in the recipients list |
| |
| v3: |
| - remove Fixes tag but keep the commit ref in the explanation |
| - add a paragraph on why this code was here |
| |
| Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@vates.tech> |
| Reviewed-by: Juergen Gross <jgross@suse.com> |
| Tested-by: Elliott Mitchell <ehem+xen@m5p.com> |
| Signed-off-by: Juergen Gross <jgross@suse.com> |
| Message-ID: <20250721093316.23560-1-anthoine.bourgeois@vates.tech> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| drivers/net/xen-netfront.c | 5 ----- |
| 1 file changed, 5 deletions(-) |
| |
| diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c |
| index 69ef50fb2e1b..74925e166462 100644 |
| --- a/drivers/net/xen-netfront.c |
| +++ b/drivers/net/xen-netfront.c |
| @@ -637,8 +637,6 @@ static int xennet_xdp_xmit_one(struct net_device *dev, |
| tx_stats->packets++; |
| u64_stats_update_end(&tx_stats->syncp); |
| |
| - xennet_tx_buf_gc(queue); |
| - |
| return 0; |
| } |
| |
| @@ -848,9 +846,6 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev |
| tx_stats->packets++; |
| u64_stats_update_end(&tx_stats->syncp); |
| |
| - /* Note: It is not safe to access skb after xennet_tx_buf_gc()! */ |
| - xennet_tx_buf_gc(queue); |
| - |
| if (!netfront_tx_slot_available(queue)) |
| netif_tx_stop_queue(netdev_get_tx_queue(dev, queue->id)); |
| |
| -- |
| 2.39.5 |
| |