| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-55642: block: Prevent potential deadlocks in zone write plug error recovery |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| block: Prevent potential deadlocks in zone write plug error recovery |
| |
| Zone write plugging, which handles writes to zones of a zoned block |
| device, always executes a zone report whenever a write BIO to a zone |
| fails. The intent is to keep the tracking of a zone write pointer |
| correct, so that the alignment of write BIOs to the zone write pointer |
| can be checked on submission and zone append operations can always be |
| correctly emulated using regular write BIOs. |
| |
| However, this error recovery scheme introduces a potential deadlock if a |
| device queue freeze is initiated while BIOs are still plugged in a zone |
| write plug and one of these write operations fails. In such a case, the |
| disk zone write plug error recovery work is scheduled and executes a |
| report zones operation. This in turn can result in a request allocation |
| in the underlying driver to issue the report zones command to the |
| device. But with the device queue freeze already started, this |
| allocation will block, preventing the report zones execution and the |
| continued processing of the plugged BIOs. As plugged BIOs hold a queue |
| usage reference, the queue freeze itself will never complete, resulting |
| in a deadlock. |
| |
| Avoid this problem by completely removing the use of report zones |
| operations after a failed write operation from the zone write plugging |
| code, instead relying on the device user to either execute a report |
| zones operation, reset the zone, finish the zone, or give up writing to |
| the device (which is a fairly common pattern for file systems which |
| degrade to read-only after write failures). This is not an unreasonable |
| requirement, as all well-behaved applications, FSes and device mapper |
| already use report zones to recover from write errors whenever possible |
| by comparing the current position of a zone write pointer with the |
| position they assume it to be at. |
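| |
| The user-side recovery described above can be sketched as follows. This |
| is only an illustrative user-space model, not kernel code: the |
| user_zone_state structure and the user_zone_resync() helper are |
| hypothetical names, and the write pointer value is assumed to come from |
| a report zones operation issued by the device user after a failed write. |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-zone state kept by a device user (not a kernel type). */
struct user_zone_state {
	uint64_t start;      /* first sector of the zone */
	uint64_t len;        /* zone size in sectors */
	uint64_t assumed_wp; /* where the user believes the write pointer is */
	bool read_only;      /* set when the user gives up writing */
};

/*
 * Reconcile the user's view of a zone with the write pointer obtained
 * from a report zones operation after a failed write.
 *
 * Returns the number of sectors the user assumed were written but that
 * the device does not have (0 if nothing needs to be rewritten).
 */
static uint64_t user_zone_resync(struct user_zone_state *z,
				 uint64_t reported_wp)
{
	uint64_t lost = 0;

	assert(z != NULL);

	/* A write pointer outside of the zone cannot be trusted: give up. */
	if (reported_wp < z->start || reported_wp > z->start + z->len) {
		z->read_only = true;
		return 0;
	}

	/* The device is behind the assumed position: data must be rewritten. */
	if (reported_wp < z->assumed_wp)
		lost = z->assumed_wp - reported_wp;

	/* In all valid cases, adopt the position reported by the device. */
	z->assumed_wp = reported_wp;
	return lost;
}
```

| In this sketch, a reported position behind the assumed one yields the |
| number of sectors to rewrite, while an out-of-range report makes the |
| user degrade to read-only, mirroring the "give up writing" option. |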
| |
| The changes to remove the automatic error recovery are as follows: |
| - Completely remove the error recovery work and its associated |
| resources (zone write plug list head, disk error list, and disk |
| zone_wplugs_work work struct). This also removes the functions |
| disk_zone_wplug_set_error() and disk_zone_wplug_clear_error(). |
| |
| - Change the BLK_ZONE_WPLUG_ERROR zone write plug flag into |
| BLK_ZONE_WPLUG_NEED_WP_UPDATE. This new flag is set for a zone write |
| plug whenever a write operation targeting the zone of the zone write |
| plug fails. This flag indicates that the zone write pointer offset is |
| not reliable and that it must be updated when the next report zone, |
| reset zone, finish zone or disk revalidation is executed. |
| |
| - Modify blk_zone_write_plug_bio_endio() to set the |
| BLK_ZONE_WPLUG_NEED_WP_UPDATE flag for the target zone of a failed |
| write BIO. |
| |
| - Modify the function disk_zone_wplug_set_wp_offset() to clear this |
| new flag, thus implementing recovery of a correct write pointer |
| offset with the reset (all) zone and finish zone operations. |
| |
| - Modify blkdev_report_zones() to always use the disk_report_zones_cb() |
| callback so that disk_zone_wplug_sync_wp_offset() can be called for |
| any zone marked with the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag. |
| This implements recovery of a correct write pointer offset for zone |
| write plugs marked with BLK_ZONE_WPLUG_NEED_WP_UPDATE and within |
| the range of the report zones operation executed by the user. |
| |
| - Modify blk_revalidate_seq_zone() to call |
| disk_zone_wplug_sync_wp_offset() for all sequential write required |
| zones when a zoned block device is revalidated, thus always resolving |
| any inconsistency between the write pointer offset of zone write |
| plugs and the actual write pointer position of sequential zones. |
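| |
| The life cycle of the new flag listed above can be sketched with a |
| small user-space model. This is not the kernel implementation: only the |
| flag name BLK_ZONE_WPLUG_NEED_WP_UPDATE comes from the change |
| description, while the bit value, the model_wplug structure and the |
| model_*() helpers are assumptions made for illustration. The model also |
| assumes that a sync only touches plugs that carry the flag. |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag name from the change description; the bit value is an assumption. */
#define BLK_ZONE_WPLUG_NEED_WP_UPDATE (1u << 0)

/* Simplified stand-in for a zone write plug (not the kernel structure). */
struct model_wplug {
	unsigned int flags;
	uint32_t wp_offset; /* write pointer offset from the zone start */
};

/* Write completion: a failure makes the cached offset unreliable. */
static void model_write_endio(struct model_wplug *p, bool failed)
{
	assert(p != NULL);
	if (failed)
		p->flags |= BLK_ZONE_WPLUG_NEED_WP_UPDATE;
}

/* Reset or finish zone: the new offset is known, so clear the flag. */
static void model_set_wp_offset(struct model_wplug *p, uint32_t wp_offset)
{
	assert(p != NULL);
	p->wp_offset = wp_offset;
	p->flags &= ~BLK_ZONE_WPLUG_NEED_WP_UPDATE;
}

/* Report zones or revalidation: resync only plugs marked as stale. */
static void model_sync_wp_offset(struct model_wplug *p, uint32_t reported)
{
	assert(p != NULL);
	if (!(p->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE))
		return;
	model_set_wp_offset(p, reported);
}

/* True if write alignment can be checked against the cached offset. */
static bool model_wp_is_reliable(const struct model_wplug *p)
{
	assert(p != NULL);
	return !(p->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE);
}
```

| In this model, no step after a failed write allocates a request or |
| schedules work: the flag is simply left set until the device user |
| triggers one of the operations that provide a trustworthy offset. |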
| |
| The Linux kernel CVE team has assigned CVE-2024-55642 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.10 with commit dd291d77cc90eb6a86e9860ba8e6e38eebd57d12 and fixed in 6.12.6 with commit 7fa80134cf266325fa61139320091001c9b3c477 |
| Issue introduced in 6.10 with commit dd291d77cc90eb6a86e9860ba8e6e38eebd57d12 and fixed in 6.13 with commit fe0418eb9bd69a19a948b297c8de815e05f3cde1 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-55642 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| block/blk-zoned.c |
| include/linux/blkdev.h |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/7fa80134cf266325fa61139320091001c9b3c477 |
| https://git.kernel.org/stable/c/fe0418eb9bd69a19a948b297c8de815e05f3cde1 |