{
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "f4215fc3-5b6b-47ff-a258-f7189bd81038"
      },
      "descriptions": [
        {
          "lang": "en",
| "value": "In the Linux kernel, the following vulnerability has been resolved:\n\ncgroup/bpf: use a dedicated workqueue for cgroup bpf destruction\n\nA hung_task problem shown below was found:\n\nINFO: task kworker/0:0:8 blocked for more than 327 seconds.\n\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\nWorkqueue: events cgroup_bpf_release\nCall Trace:\n <TASK>\n __schedule+0x5a2/0x2050\n ? find_held_lock+0x33/0x100\n ? wq_worker_sleeping+0x9e/0xe0\n schedule+0x9f/0x180\n schedule_preempt_disabled+0x25/0x50\n __mutex_lock+0x512/0x740\n ? cgroup_bpf_release+0x1e/0x4d0\n ? cgroup_bpf_release+0xcf/0x4d0\n ? process_scheduled_works+0x161/0x8a0\n ? cgroup_bpf_release+0x1e/0x4d0\n ? mutex_lock_nested+0x2b/0x40\n ? __pfx_delay_tsc+0x10/0x10\n mutex_lock_nested+0x2b/0x40\n cgroup_bpf_release+0xcf/0x4d0\n ? process_scheduled_works+0x161/0x8a0\n ? trace_event_raw_event_workqueue_execute_start+0x64/0xd0\n ? process_scheduled_works+0x161/0x8a0\n process_scheduled_works+0x23a/0x8a0\n worker_thread+0x231/0x5b0\n ? __pfx_worker_thread+0x10/0x10\n kthread+0x14d/0x1c0\n ? __pfx_kthread+0x10/0x10\n ret_from_fork+0x59/0x70\n ? __pfx_kthread+0x10/0x10\n ret_from_fork_asm+0x1b/0x30\n </TASK>\n\nThis issue can be reproduced by the following pressuse test:\n1. A large number of cpuset cgroups are deleted.\n2. Set cpu on and off repeatly.\n3. Set watchdog_thresh repeatly.\nThe scripts can be obtained at LINK mentioned above the signature.\n\nThe reason for this issue is cgroup_mutex and cpu_hotplug_lock are\nacquired in different tasks, which may lead to deadlock.\nIt can lead to a deadlock through the following steps:\n1. A large number of cpusets are deleted asynchronously, which puts a\n large number of cgroup_bpf_release works into system_wq. The max_active\n of system_wq is WQ_DFL_ACTIVE(256). Consequently, all active works are\n cgroup_bpf_release works, and many cgroup_bpf_release works will be put\n into inactive queue. As illustrated in the diagram, there are 256 (in\n the acvtive queue) + n (in the inactive queue) works.\n2. Setting watchdog_thresh will hold cpu_hotplug_lock.read and put\n smp_call_on_cpu work into system_wq. However step 1 has already filled\n system_wq, 'sscs.work' is put into inactive queue. 'sscs.work' has\n to wait until the works that were put into the inacvtive queue earlier\n have executed (n cgroup_bpf_release), so it will be blocked for a while.\n3. Cpu offline requires cpu_hotplug_lock.write, which is blocked by step 2.\n4. Cpusets that were deleted at step 1 put cgroup_release works into\n cgroup_destroy_wq. They are competing to get cgroup_mutex all the time.\n When cgroup_metux is acqured by work at css_killed_work_fn, it will\n call cpuset_css_offline, which needs to acqure cpu_hotplug_lock.read.\n However, cpuset_css_offline will be blocked for step 3.\n5. At this moment, there are 256 works in active queue that are\n cgroup_bpf_release, they are attempting to acquire cgroup_mutex, and as\n a result, all of them are blocked. Consequently, sscs.work can not be\n executed. 
Ultimately, this situation leads to four processes being\n blocked, forming a deadlock.\n\nsystem_wq(step1)\t\tWatchDog(step2)\t\t\tcpu offline(step3)\tcgroup_destroy_wq(step4)\n...\n2000+ cgroups deleted asyn\n256 actives + n inactives\n\t\t\t\t__lockup_detector_reconfigure\n\t\t\t\tP(cpu_hotplug_lock.read)\n\t\t\t\tput sscs.work into system_wq\n256 + n + 1(sscs.work)\nsscs.work wait to be executed\n\t\t\t\twarting sscs.work finish\n\t\t\t\t\t\t\t\tpercpu_down_write\n\t\t\t\t\t\t\t\tP(cpu_hotplug_lock.write)\n\t\t\t\t\t\t\t\t...blocking...\n\t\t\t\t\t\t\t\t\t\t\tcss_killed_work_fn\n\t\t\t\t\t\t\t\t\t\t\tP(cgroup_mutex)\n\t\t\t\t\t\t\t\t\t\t\tcpuset_css_offline\n\t\t\t\t\t\t\t\t\t\t\tP(cpu_hotplug_lock.read)\n\t\t\t\t\t\t\t\t\t\t\t...blocking...\n256 cgroup_bpf_release\nmutex_lock(&cgroup_mutex);\n..blocking...\n\nTo fix the problem, place cgroup_bpf_release works on a dedicated\nworkqueue which can break the loop and solve the problem. System wqs are\nfor misc things which shouldn't create a large number of concurrent work\nitems. If something is going to generate >\n---truncated---" |
        }
      ],
      "affected": [
        {
          "product": "Linux",
          "vendor": "Linux",
          "defaultStatus": "unaffected",
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "programFiles": [
            "kernel/bpf/cgroup.c"
          ],
          "versions": [
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "71f14a9f5c7d",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "0d86cd70fc6a",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "6dab3331523b",
              "status": "affected",
              "versionType": "git"
            },
            {
              "version": "4bfc0bb2c60e",
              "lessThan": "117932eea99b",
              "status": "affected",
              "versionType": "git"
            }
          ]
        },
        {
          "product": "Linux",
          "vendor": "Linux",
          "defaultStatus": "affected",
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "programFiles": [
            "kernel/bpf/cgroup.c"
          ],
          "versions": [
            {
              "version": "5.3",
              "status": "affected"
            },
            {
              "version": "0",
              "lessThan": "5.3",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.1.116",
              "lessThanOrEqual": "6.1.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.6.60",
              "lessThanOrEqual": "6.6.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.11.7",
              "lessThanOrEqual": "6.11.*",
              "status": "unaffected",
              "versionType": "semver"
            },
            {
              "version": "6.12",
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/71f14a9f5c7db72fdbc56e667d4ed42a1a760494"
        },
        {
          "url": "https://git.kernel.org/stable/c/0d86cd70fc6a7ba18becb52ad8334d5ad3eca530"
        },
        {
          "url": "https://git.kernel.org/stable/c/6dab3331523ba73db1345d19e6f586dcd5f6efb4"
        },
        {
          "url": "https://git.kernel.org/stable/c/117932eea99b729ee5d12783601a4f7f5fd58a23"
        }
      ],
      "title": "cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction",
      "x_generator": {
        "engine": "bippy-8e903de6a542"
      }
    }
  },
| "cveMetadata": { |
| "assignerOrgId": "f4215fc3-5b6b-47ff-a258-f7189bd81038", |
| "cveID": "CVE-2024-53054", |
| "requesterUserId": "gregkh@kernel.org", |
| "serial": "1", |
| "state": "PUBLISHED" |
| }, |
| "dataType": "CVE_RECORD", |
| "dataVersion": "5.0" |
| } |