| From bippy-1.1.0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@kernel.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2022-49789: scsi: zfcp: Fix double free of FSF request when qdio send fails |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| scsi: zfcp: Fix double free of FSF request when qdio send fails |
| |
| We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache |
| the FSF request ID when sending a new FSF request. This is used in case the |
| sending fails and we need to remove the request from our internal hash |
| table again (so we don't keep an invalid reference and use it when we free |
| the request again). |
| |
| In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32 |
| bit wide), but the rest of the zfcp code (and the firmware specification) |
| handles the ID as 'unsigned long'/'u64' (unsigned and 64 bit wide [s390x |
| ELF ABI]). For one this has the obvious problem that when the ID grows |
| past 32 bit (this can happen reasonably fast) it is truncated to 32 bit |
| when storing it in the cache variable and so doesn't match the original ID |
| anymore. The second less obvious problem is that even when the original ID |
| has not yet grown past 32 bit, as soon as the 32nd bit is set in the |
| original ID (0x80000000 = 2'147'483'648) we will have a mismatch when we |
| cast it back to 'unsigned long'. As the cached variable is of a signed |
| type, the compiler will choose a sign-extending instruction to load the 32 |
| bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So once |
| we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove the |
| request again all the leading zeros will be flipped to ones to extend the |
| sign and won't match the original ID anymore (this has been observed in |
| practice). |
| |
| If we can't successfully remove the request from the hash table again after |
| 'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify |
| the adapter about new work because the adapter is already gone during |
| e.g. a ChpID toggle) we will end up with a double free. We unconditionally |
| free the request in the calling function when 'zfcp_fsf_req_send()' fails, |
| but because the request is still in the hash table we end up with a stale |
| memory reference, and once the zfcp adapter is either reset during recovery |
| or shutdown we end up freeing the same memory twice. |
| |
| The resulting stack traces vary depending on the kernel and have no direct |
| correlation to the place where the bug occurs. Here are three examples that |
| have been seen in practice: |
| |
| list_del corruption. next->prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00) |
| ------------[ cut here ]------------ |
| kernel BUG at lib/list_debug.c:62! |
| monitor event: 0040 ilc:2 [#1] PREEMPT SMP |
| Modules linked in: ... |
| CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded |
| Hardware name: ... |
| Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140) |
| R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 |
| Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6 |
| 0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8 |
| 00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800 |
| 00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70 |
| Krnl Code: 00000003cbeea1e8: c020004f68a7 larl %r2,00000003cc8d7336 |
| 00000003cbeea1ee: c0e50027fd65 brasl %r14,00000003cc3e9cb8 |
| #00000003cbeea1f4: af000000 mc 0,0 |
| >00000003cbeea1f8: c02000920440 larl %r2,00000003cd12aa78 |
| 00000003cbeea1fe: c0e500289c25 brasl %r14,00000003cc3fda48 |
| 00000003cbeea204: b9040043 lgr %r4,%r3 |
| 00000003cbeea208: b9040051 lgr %r5,%r1 |
| 00000003cbeea20c: b9040032 lgr %r3,%r2 |
| Call Trace: |
| [<00000003cbeea1f8>] __list_del_entry_valid+0x98/0x140 |
| ([<00000003cbeea1f4>] __list_del_entry_valid+0x94/0x140) |
| [<000003ff7ff502fe>] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp] |
| [<000003ff7ff49cd0>] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp] |
| [<000003ff7ff4a22e>] zfcp_erp_strategy+0x21e/0xca0 [zfcp] |
| [<000003ff7ff4ad34>] zfcp_erp_thread+0x84/0x1a0 [zfcp] |
| [<00000003cb5eece8>] kthread+0x138/0x150 |
| [<00000003cb557f3c>] __ret_from_fork+0x3c/0x60 |
| [<00000003cc4172ea>] ret_from_fork+0xa/0x40 |
| INFO: lockdep is turned off. |
| Last Breaking-Event-Address: |
| [<00000003cc3e9d04>] _printk+0x4c/0x58 |
| Kernel panic - not syncing: Fatal exception: panic_on_oops |
| |
| or: |
| |
| Unable to handle kernel pointer dereference in virtual kernel address space |
| Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 |
| Fault in home space mode while using kernel ASCE. |
| AS:0000000063b10007 R3:0000000000000024 |
| Oops: 0038 ilc:3 [#1] SMP |
| Modules linked in: ... |
| CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded |
| Hardware name: ... |
| Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp]) |
| R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 |
| Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8 |
| 0000000000000000 0000000000000055 0000000000000000 00000000a8515800 |
| 0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c |
| 000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48 |
| Krnl Code: 000003ff7febaf7e: a7f4003d brc 15,000003ff7febaff8 |
| 000003ff7febaf82: e32020000004 lg %r2,0(%r2) |
| #000003ff7febaf88: ec2100388064 cgrj %r2,%r1,8,000003ff7febaff8 |
| >000003ff7febaf8e: e3b020100020 cg %r11,16(%r2) |
| 000003ff7febaf94: a774fff7 brc 7,000003ff7febaf82 |
| 000003ff7febaf98: ec280030007c cgij %r2,0,8,000003ff7febaff8 |
| 000003ff7febaf9e: e31020080004 lg %r1,8(%r2) |
| 000003ff7febafa4: e33020000004 lg %r3,0(%r2) |
| Call Trace: |
| [<000003ff7febaf8e>] zfcp_fsf_reqid_check+0x86/0x158 [zfcp] |
| [<000003ff7febbdbc>] zfcp_qdio_int_resp+0x6c/0x170 [zfcp] |
| [<000003ff7febbf90>] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp] |
| [<0000000061d90a04>] tasklet_action_common.constprop.0+0xdc/0x128 |
| [<000000006292f300>] __do_softirq+0x130/0x3c0 |
| [<0000000061d906c6>] irq_exit_rcu+0xfe/0x118 |
| [<000000006291e818>] do_io_irq+0xc8/0x168 |
| [<000000006292d516>] io_int_handler+0xd6/0x110 |
| [<000000006292d596>] psw_idle_exit+0x0/0xa |
| ([<0000000061d3be50>] arch_cpu_idle+0x40/0xd0) |
| [<000000006292ceea>] default_idle_call+0x52/0xf8 |
| [<0000000061de4fa4>] do_idle+0xd4/0x168 |
| [<0000000061de51fe>] cpu_startup_entry+0x36/0x40 |
| [<0000000061d4faac>] smp_start_secondary+0x12c/0x138 |
| [<000000006292d88e>] restart_int_handler+0x6e/0x90 |
| Last Breaking-Event-Address: |
| [<000003ff7febaf94>] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp] |
| Kernel panic - not syncing: Fatal exception in interrupt |
| |
| or: |
| |
| Unable to handle kernel pointer dereference in virtual kernel address space |
| Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803 |
| Fault in home space mode while using kernel ASCE. |
| AS:0000000077c40007 R3:0000000000000024 |
| Oops: 0038 ilc:3 [#1] SMP |
| Modules linked in: ... |
| CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded |
| Hardware name: ... |
| Workqueue: kblockd blk_mq_run_work_fn |
| Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398) |
| R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 |
| Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20 |
| 0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20 |
| 00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6 |
| 00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0 |
| Krnl Code: 0000000076fc0302: c004000000d4 brcl 0,76fc04aa |
| 0000000076fc0308: b904001b lgr %r1,%r11 |
| #0000000076fc030c: e3106020001a algf %r1,32(%r6) |
| >0000000076fc0312: e31010000082 xg %r1,0(%r1) |
| 0000000076fc0318: b9040001 lgr %r0,%r1 |
| 0000000076fc031c: e30061700082 xg %r0,368(%r6) |
| 0000000076fc0322: ec59000100d9 aghik %r5,%r9,1 |
| 0000000076fc0328: e34003b80004 lg %r4,952 |
| Call Trace: |
| [<0000000076fc0312>] __kmalloc+0xd2/0x398 |
| [<0000000076f318f2>] mempool_alloc+0x72/0x1f8 |
| [<000003ff8027c5f8>] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp] |
| [<000003ff8027f1bc>] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp] |
| [<000003ff80280f1a>] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp] |
| [<000003ff800b4218>] scsi_queue_rq+0x778/0xa10 [scsi_mod] |
| [<00000000771782a0>] __blk_mq_try_issue_directly+0x130/0x208 |
| [<000000007717a124>] blk_mq_request_issue_directly+0x4c/0xa8 |
| [<000003ff801302e2>] dm_mq_queue_rq+0x2ea/0x468 [dm_mod] |
| [<0000000077178c12>] blk_mq_dispatch_rq_list+0x33a/0x818 |
| [<000000007717f064>] __blk_mq_do_dispatch_sched+0x284/0x2f0 |
| [<000000007717f44c>] __blk_mq_sched_dispatch_requests+0x1c4/0x218 |
| [<000000007717fa7a>] blk_mq_sched_dispatch_requests+0x52/0x90 |
| [<0000000077176d74>] __blk_mq_run_hw_queue+0x9c/0xc0 |
| [<0000000076da6d74>] process_one_work+0x274/0x4d0 |
| [<0000000076da7018>] worker_thread+0x48/0x560 |
| [<0000000076daef18>] kthread+0x140/0x160 |
| [<000000007751d144>] ret_from_fork+0x28/0x30 |
| Last Breaking-Event-Address: |
| [<0000000076fc0474>] __kmalloc+0x234/0x398 |
| Kernel panic - not syncing: Fatal exception: panic_on_oops |
| |
| To fix this, simply change the type of the cache variable to 'unsigned |
| long', like the rest of zfcp and also the argument for |
| 'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign extension |
| and so can successfully remove the request from the hash table. |
| |
| The Linux kernel CVE team has assigned CVE-2022-49789 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 2.6.34 with commit e60a6d69f1f84c2ef1cc63aefaadfe7ae9f12934 and fixed in 5.4.225 with commit 1bf8ed585501bb2dd0b5f67c824eab45adfbdccd |
| Issue introduced in 2.6.34 with commit e60a6d69f1f84c2ef1cc63aefaadfe7ae9f12934 and fixed in 5.10.156 with commit d2c7d8f58e9cde8ac8d1f75e9d66c2a813ffe0ab |
| Issue introduced in 2.6.34 with commit e60a6d69f1f84c2ef1cc63aefaadfe7ae9f12934 and fixed in 5.15.80 with commit 11edbdee4399401f533adda9bffe94567aa08b96 |
| Issue introduced in 2.6.34 with commit e60a6d69f1f84c2ef1cc63aefaadfe7ae9f12934 and fixed in 6.0.10 with commit 90a49a6b015fa439cd62e45121390284c125a91f |
| Issue introduced in 2.6.34 with commit e60a6d69f1f84c2ef1cc63aefaadfe7ae9f12934 and fixed in 6.1 with commit 0954256e970ecf371b03a6c9af2cf91b9c4085ff |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2022-49789 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/s390/scsi/zfcp_fsf.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/1bf8ed585501bb2dd0b5f67c824eab45adfbdccd |
| https://git.kernel.org/stable/c/d2c7d8f58e9cde8ac8d1f75e9d66c2a813ffe0ab |
| https://git.kernel.org/stable/c/11edbdee4399401f533adda9bffe94567aa08b96 |
| https://git.kernel.org/stable/c/90a49a6b015fa439cd62e45121390284c125a91f |
| https://git.kernel.org/stable/c/0954256e970ecf371b03a6c9af2cf91b9c4085ff |