{
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "f4215fc3-5b6b-47ff-a258-f7189bd81038"
      },
      "descriptions": [
        {
          "lang": "en",
| "value": "In the Linux kernel, the following vulnerability has been resolved:\n\ncgroup/bpf: use a dedicated workqueue for cgroup bpf destruction\n\nA hung_task problem shown below was found:\n\nINFO: task kworker/0:0:8 blocked for more than 327 seconds.\n\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\nWorkqueue: events cgroup_bpf_release\nCall Trace:\n <TASK>\n __schedule+0x5a2/0x2050\n ? find_held_lock+0x33/0x100\n ? wq_worker_sleeping+0x9e/0xe0\n schedule+0x9f/0x180\n schedule_preempt_disabled+0x25/0x50\n __mutex_lock+0x512/0x740\n ? cgroup_bpf_release+0x1e/0x4d0\n ? cgroup_bpf_release+0xcf/0x4d0\n ? process_scheduled_works+0x161/0x8a0\n ? cgroup_bpf_release+0x1e/0x4d0\n ? mutex_lock_nested+0x2b/0x40\n ? __pfx_delay_tsc+0x10/0x10\n mutex_lock_nested+0x2b/0x40\n cgroup_bpf_release+0xcf/0x4d0\n ? process_scheduled_works+0x161/0x8a0\n ? trace_event_raw_event_workqueue_execute_start+0x64/0xd0\n ? process_scheduled_works+0x161/0x8a0\n process_scheduled_works+0x23a/0x8a0\n worker_thread+0x231/0x5b0\n ? __pfx_worker_thread+0x10/0x10\n kthread+0x14d/0x1c0\n ? __pfx_kthread+0x10/0x10\n ret_from_fork+0x59/0x70\n ? __pfx_kthread+0x10/0x10\n ret_from_fork_asm+0x1b/0x30\n </TASK>\n\nThis issue can be reproduced by the following pressuse test:\n1. A large number of cpuset cgroups are deleted.\n2. Set cpu on and off repeatly.\n3. Set watchdog_thresh repeatly.\nThe scripts can be obtained at LINK mentioned above the signature.\n\nThe reason for this issue is cgroup_mutex and cpu_hotplug_lock are\nacquired in different tasks, which may lead to deadlock.\nIt can lead to a deadlock through the following steps:\n1. A large number of cpusets are deleted asynchronously, which puts a\n large number of cgroup_bpf_release works into system_wq. The max_active\n of system_wq is WQ_DFL_ACTIVE(256). Consequently, all active works are\n cgroup_bpf_release works, and many cgroup_bpf_release works will be put\n into inactive queue. As illustrated in the diagram, there are 256 (in\n the acvtive queue) + n (in the inactive queue) works.\n2. Setting watchdog_thresh will hold cpu_hotplug_lock.read and put\n smp_call_on_cpu work into system_wq. However step 1 has already filled\n system_wq, 'sscs.work' is put into inactive queue. 'sscs.work' has\n to wait until the works that were put into the inacvtive queue earlier\n have executed (n cgroup_bpf_release), so it will be blocked for a while.\n3. Cpu offline requires cpu_hotplug_lock.write, which is blocked by step 2.\n4. Cpusets that were deleted at step 1 put cgroup_release works into\n cgroup_destroy_wq. They are competing to get cgroup_mutex all the time.\n When cgroup_metux is acqured by work at css_killed_work_fn, it will\n call cpuset_css_offline, which needs to acqure cpu_hotplug_lock.read.\n However, cpuset_css_offline will be blocked for step 3.\n5. At this moment, there are 256 works in active queue that are\n cgroup_bpf_release, they are attempting to acquire cgroup_mutex, and as\n a result, all of them are blocked. Consequently, sscs.work can not be\n executed. 
Ultimately, this situation leads to four processes being\n blocked, forming a deadlock.\n\nsystem_wq(step1)\t\tWatchDog(step2)\t\t\tcpu offline(step3)\tcgroup_destroy_wq(step4)\n...\n2000+ cgroups deleted asyn\n256 actives + n inactives\n\t\t\t\t__lockup_detector_reconfigure\n\t\t\t\tP(cpu_hotplug_lock.read)\n\t\t\t\tput sscs.work into system_wq\n256 + n + 1(sscs.work)\nsscs.work wait to be executed\n\t\t\t\twarting sscs.work finish\n\t\t\t\t\t\t\t\tpercpu_down_write\n\t\t\t\t\t\t\t\tP(cpu_hotplug_lock.write)\n\t\t\t\t\t\t\t\t...blocking...\n\t\t\t\t\t\t\t\t\t\t\tcss_killed_work_fn\n\t\t\t\t\t\t\t\t\t\t\tP(cgroup_mutex)\n\t\t\t\t\t\t\t\t\t\t\tcpuset_css_offline\n\t\t\t\t\t\t\t\t\t\t\tP(cpu_hotplug_lock.read)\n\t\t\t\t\t\t\t\t\t\t\t...blocking...\n256 cgroup_bpf_release\nmutex_lock(&cgroup_mutex);\n..blocking...\n\nTo fix the problem, place cgroup_bpf_release works on a dedicated\nworkqueue which can break the loop and solve the problem. System wqs are\nfor misc things which shouldn't create a large number of concurrent work\nitems. If something is going to generate >\n---truncated---" |
        }
      ],
      "affected": [
        {
          "product": "Linux",
          "vendor": "Linux",
          "defaultStatus": "unaffected",
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "programFiles": [
            "kernel/bpf/cgroup.c"
          ],
          "versions": [
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "71f14a9f5c7d",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "0d86cd70fc6a",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "6dab3331523b",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "117932eea99b",
              "status": "affected",
              "versionType": "git"
            }
          ]
        },
        {
          "product": "Linux",
          "vendor": "Linux",
          "defaultStatus": "affected",
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "programFiles": [
            "kernel/bpf/cgroup.c"
          ],
          "versions": [
            {
              "version": "5.3",
              "status": "affected"
            },
            {
              "version": "0",
              "lessThan": "5.3",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.1.116",
              "lessThanOrEqual": "6.1.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.6.60",
              "lessThanOrEqual": "6.6.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.11.7",
              "lessThanOrEqual": "6.11.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.12",
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/71f14a9f5c7db72fdbc56e667d4ed42a1a760494"
        },
        {
          "url": "https://git.kernel.org/stable/c/0d86cd70fc6a7ba18becb52ad8334d5ad3eca530"
        },
        {
          "url": "https://git.kernel.org/stable/c/6dab3331523ba73db1345d19e6f586dcd5f6efb4"
        },
        {
          "url": "https://git.kernel.org/stable/c/117932eea99b729ee5d12783601a4f7f5fd58a23"
        }
      ],
      "title": "cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction",
      "x_generator": {
        "engine": "bippy-8e903de6a542"
      }
    }
  },
| "cveMetadata": { |
| "assignerOrgId": "f4215fc3-5b6b-47ff-a258-f7189bd81038", |
| "cveID": "CVE-2024-53054", |
| "requesterUserId": "gregkh@kernel.org", |
| "serial": "1", |
| "state": "PUBLISHED" |
| }, |
| "dataType": "CVE_RECORD", |
| "dataVersion": "5.0" |
| } |