| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2021-47041: nvmet-tcp: fix incorrect locking in state_change sk callback |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| nvmet-tcp: fix incorrect locking in state_change sk callback |
| |
| We are not changing anything in the TCP connection state so |
| we should not take a write_lock but rather a read lock. |
| |
| This caused a deadlock when running nvmet-tcp and nvme-tcp |
| on the same system, where state_change callbacks on the |
| host and on the controller side have causal relationship |
| and made lockdep report on this with blktests: |
| |
| ================================ |
| WARNING: inconsistent lock state |
| 5.12.0-rc3 #1 Tainted: G I |
| -------------------------------- |
| inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-R} usage. |
| nvme/1324 [HC0[0]:SC0[0]:HE1:SE1] takes: |
| ffff888363151000 (clock-AF_INET){++-?}-{2:2}, at: nvme_tcp_state_change+0x21/0x150 [nvme_tcp] |
| {IN-SOFTIRQ-W} state was registered at: |
| __lock_acquire+0x79b/0x18d0 |
| lock_acquire+0x1ca/0x480 |
| _raw_write_lock_bh+0x39/0x80 |
| nvmet_tcp_state_change+0x21/0x170 [nvmet_tcp] |
| tcp_fin+0x2a8/0x780 |
| tcp_data_queue+0xf94/0x1f20 |
| tcp_rcv_established+0x6ba/0x1f00 |
| tcp_v4_do_rcv+0x502/0x760 |
| tcp_v4_rcv+0x257e/0x3430 |
| ip_protocol_deliver_rcu+0x69/0x6a0 |
| ip_local_deliver_finish+0x1e2/0x2f0 |
| ip_local_deliver+0x1a2/0x420 |
| ip_rcv+0x4fb/0x6b0 |
| __netif_receive_skb_one_core+0x162/0x1b0 |
| process_backlog+0x1ff/0x770 |
| __napi_poll.constprop.0+0xa9/0x5c0 |
| net_rx_action+0x7b3/0xb30 |
| __do_softirq+0x1f0/0x940 |
| do_softirq+0xa1/0xd0 |
| __local_bh_enable_ip+0xd8/0x100 |
| ip_finish_output2+0x6b7/0x18a0 |
| __ip_queue_xmit+0x706/0x1aa0 |
| __tcp_transmit_skb+0x2068/0x2e20 |
| tcp_write_xmit+0xc9e/0x2bb0 |
| __tcp_push_pending_frames+0x92/0x310 |
| inet_shutdown+0x158/0x300 |
| __nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp] |
| nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp] |
| nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp] |
| nvme_do_delete_ctrl+0x100/0x10c [nvme_core] |
| nvme_sysfs_delete.cold+0x8/0xd [nvme_core] |
| kernfs_fop_write_iter+0x2c7/0x460 |
| new_sync_write+0x36c/0x610 |
| vfs_write+0x5c0/0x870 |
| ksys_write+0xf9/0x1d0 |
| do_syscall_64+0x33/0x40 |
| entry_SYSCALL_64_after_hwframe+0x44/0xae |
| irq event stamp: 10687 |
| hardirqs last enabled at (10687): [<ffffffff9ec376bd>] _raw_spin_unlock_irqrestore+0x2d/0x40 |
| hardirqs last disabled at (10686): [<ffffffff9ec374d8>] _raw_spin_lock_irqsave+0x68/0x90 |
| softirqs last enabled at (10684): [<ffffffff9f000608>] __do_softirq+0x608/0x940 |
| softirqs last disabled at (10649): [<ffffffff9cdedd31>] do_softirq+0xa1/0xd0 |
| |
| other info that might help us debug this: |
| Possible unsafe locking scenario: |
| |
| CPU0 |
| ---- |
| lock(clock-AF_INET); |
| <Interrupt> |
| lock(clock-AF_INET); |
| |
| *** DEADLOCK *** |
| |
| 5 locks held by nvme/1324: |
| #0: ffff8884a01fe470 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0 |
| #1: ffff8886e435c090 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460 |
| #2: ffff888104d90c38 (kn->active#255){++++}-{0:0}, at: kernfs_remove_self+0x22d/0x330 |
| #3: ffff8884634538d0 (&queue->queue_lock){+.+.}-{3:3}, at: nvme_tcp_stop_queue+0x52/0xb0 [nvme_tcp] |
| #4: ffff888363150d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_shutdown+0x59/0x300 |
| |
| stack backtrace: |
| CPU: 26 PID: 1324 Comm: nvme Tainted: G I 5.12.0-rc3 #1 |
| Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.10.0 11/12/2020 |
| Call Trace: |
| dump_stack+0x93/0xc2 |
| mark_lock_irq.cold+0x2c/0xb3 |
| ? verify_lock_unused+0x390/0x390 |
| ? stack_trace_consume_entry+0x160/0x160 |
| ? lock_downgrade+0x100/0x100 |
| ? save_trace+0x88/0x5e0 |
| ? _raw_spin_unlock_irqrestore+0x2d/0x40 |
| mark_lock+0x530/0x1470 |
| ? mark_lock_irq+0x1d10/0x1d10 |
| ? enqueue_timer+0x660/0x660 |
| mark_usage+0x215/0x2a0 |
| __lock_acquire+0x79b/0x18d0 |
| ? tcp_schedule_loss_probe.part.0+0x38c/0x520 |
| lock_acquire+0x1ca/0x480 |
| ? nvme_tcp_state_change+0x21/0x150 [nvme_tcp] |
| ? rcu_read_unlock+0x40/0x40 |
| ? tcp_mtu_probe+0x1ae0/0x1ae0 |
| ? kmalloc_reserve+0xa0/0xa0 |
| ? sysfs_file_ops+0x170/0x170 |
| _raw_read_lock+0x3d/0xa0 |
| ? nvme_tcp_state_change+0x21/0x150 [nvme_tcp] |
| nvme_tcp_state_change+0x21/0x150 [nvme_tcp] |
| ? sysfs_file_ops+0x170/0x170 |
| inet_shutdown+0x189/0x300 |
| __nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp] |
| nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp] |
| nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp] |
| nvme_do_delete_ctrl+0x100/0x10c [nvme_core] |
| nvme_sysfs_delete.cold+0x8/0xd [nvme_core] |
| kernfs_fop_write_iter+0x2c7/0x460 |
| new_sync_write+0x36c/0x610 |
| ? new_sync_read+0x600/0x600 |
| ? lock_acquire+0x1ca/0x480 |
| ? rcu_read_unlock+0x40/0x40 |
| ? lock_is_held_type+0x9a/0x110 |
| vfs_write+0x5c0/0x870 |
| ksys_write+0xf9/0x1d0 |
| ? __ia32_sys_read+0xa0/0xa0 |
| ? lockdep_hardirqs_on_prepare.part.0+0x198/0x340 |
| ? syscall_enter_from_user_mode+0x27/0x70 |
| do_syscall_64+0x33/0x40 |
| entry_SYSCALL_64_after_hwframe+0x44/0xae |
| |
| The Linux kernel CVE team has assigned CVE-2021-47041 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.0 with commit 872d26a391da92ed8f0c0f5cb5fef428067b7f30 and fixed in 5.4.119 with commit 999d606a820c36ae9b9e9611360c8b3d8d4bb777 |
| Issue introduced in 5.0 with commit 872d26a391da92ed8f0c0f5cb5fef428067b7f30 and fixed in 5.10.37 with commit 60ade0d56b06537a28884745059b3801c78e03bc |
| Issue introduced in 5.0 with commit 872d26a391da92ed8f0c0f5cb5fef428067b7f30 and fixed in 5.11.21 with commit 06beaa1a9f6e501213195e47c30416032fd2bbd5 |
| Issue introduced in 5.0 with commit 872d26a391da92ed8f0c0f5cb5fef428067b7f30 and fixed in 5.12.4 with commit 906c538340dde6d891df89fe7dac8eaa724e40da |
| Issue introduced in 5.0 with commit 872d26a391da92ed8f0c0f5cb5fef428067b7f30 and fixed in 5.13 with commit b5332a9f3f3d884a1b646ce155e664cc558c1722 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2021-47041 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/nvme/target/tcp.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/999d606a820c36ae9b9e9611360c8b3d8d4bb777 |
| https://git.kernel.org/stable/c/60ade0d56b06537a28884745059b3801c78e03bc |
| https://git.kernel.org/stable/c/06beaa1a9f6e501213195e47c30416032fd2bbd5 |
| https://git.kernel.org/stable/c/906c538340dde6d891df89fe7dac8eaa724e40da |
| https://git.kernel.org/stable/c/b5332a9f3f3d884a1b646ce155e664cc558c1722 |