releases/5.9.5/md-raid5-fix-oops-during-stripe-resizing.patch - pub/scm/linux/kernel/git/stable/stable-queue - Git at Google

 From b44c018cdf748b96b676ba09fdbc5b34fc443ada Mon Sep 17 00:00:00 2001
 From: Song Liu <songliubraving@fb.com>
 Date: Mon, 5 Oct 2020 09:35:21 -0700
 Subject: md/raid5: fix oops during stripe resizing

 From: Song Liu <songliubraving@fb.com>

 commit b44c018cdf748b96b676ba09fdbc5b34fc443ada upstream.

 KoWei reported crash during raid5 reshape:

 [ 1032.252932] Oops: 0002 [#1] SMP PTI
 [...]
 [ 1032.252943] RIP: 0010:memcpy_erms+0x6/0x10
 [...]
 [ 1032.252947] RSP: 0018:ffffba1ac0c03b78 EFLAGS: 00010286
 [ 1032.252949] RAX: 0000784ac0000000 RBX: ffff91bec3d09740 RCX: 0000000000001000
 [ 1032.252951] RDX: 0000000000001000 RSI: ffff91be6781c000 RDI: 0000784ac0000000
 [ 1032.252953] RBP: ffffba1ac0c03bd8 R08: 0000000000001000 R09: ffffba1ac0c03bf8
 [ 1032.252954] R10: 0000000000000000 R11: 0000000000000000 R12: ffffba1ac0c03bf8
 [ 1032.252955] R13: 0000000000001000 R14: 0000000000000000 R15: 0000000000000000
 [ 1032.252958] FS:  0000000000000000(0000) GS:ffff91becf500000(0000) knlGS:0000000000000000
 [ 1032.252959] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 1032.252961] CR2: 0000784ac0000000 CR3: 000000031780a002 CR4: 00000000001606e0
 [ 1032.252962] Call Trace:
 [ 1032.252969]  ? async_memcpy+0x179/0x1000 [async_memcpy]
 [ 1032.252977]  ? raid5_release_stripe+0x8e/0x110 [raid456]
 [ 1032.252982]  handle_stripe_expansion+0x15a/0x1f0 [raid456]
 [ 1032.252988]  handle_stripe+0x592/0x1270 [raid456]
 [ 1032.252993]  handle_active_stripes.isra.0+0x3cb/0x5a0 [raid456]
 [ 1032.252999]  raid5d+0x35c/0x550 [raid456]
 [ 1032.253002]  ? schedule+0x42/0xb0
 [ 1032.253006]  ? schedule_timeout+0x10e/0x160
 [ 1032.253011]  md_thread+0x97/0x160
 [ 1032.253015]  ? wait_woken+0x80/0x80
 [ 1032.253019]  kthread+0x104/0x140
 [ 1032.253022]  ? md_start_sync+0x60/0x60
 [ 1032.253024]  ? kthread_park+0x90/0x90
 [ 1032.253027]  ret_from_fork+0x35/0x40

 This is because cache_size_mutex was unlocked too early in resize_stripes(),
 which races with grow_one_stripe(): grow_one_stripe() can then allocate a
 stripe with the wrong pool_size.

 Fix this issue by unlocking cache_size_mutex after updating pool_size.

 Cc: <stable@vger.kernel.org> # v4.4+
 Reported-by: KoWei Sung <winders@amazon.com>
 Signed-off-by: Song Liu <songliubraving@fb.com>
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 ---
  drivers/md/raid5.c |    4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 --- a/drivers/md/raid5.c
 +++ b/drivers/md/raid5.c
 @@ -2429,8 +2429,6 @@ static int resize_stripes(struct r5conf
  	} else
  		err = -ENOMEM;

 -	mutex_unlock(&conf->cache_size_mutex);
 -
  	conf->slab_cache = sc;
  	conf->active_name = 1-conf->active_name;

 @@ -2453,6 +2451,8 @@ static int resize_stripes(struct r5conf

  	if (!err)
  		conf->pool_size = newsize;
 +	mutex_unlock(&conf->cache_size_mutex);
 +
  	return err;
  }
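
The locking rule behind this fix can be sketched in userspace. The following is only an illustration, not kernel code: `toy_conf`, `toy_stripe_slots`, `toy_grow_one_stripe()` and `toy_resize_stripes()` are hypothetical stand-ins, a pthread mutex plays the role of `cache_size_mutex`, and it assumes the growing path runs under that same mutex. The point it tries to show is that the new slab and `pool_size` are published inside one critical section, so a grower never sees one without the other.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct r5conf: only what the sketch needs. */
struct toy_conf {
	pthread_mutex_t cache_size_mutex;
	int pool_size;   /* device slots every stripe is expected to have */
	int slab_slots;  /* device slots provided by the active "slab" */
};

static void toy_conf_init(struct toy_conf *conf, int size)
{
	pthread_mutex_init(&conf->cache_size_mutex, NULL);
	conf->pool_size = size;
	conf->slab_slots = size;
}

/*
 * Analogue of the growing path (assumed to hold cache_size_mutex).
 * Returns the slot count of the new stripe, or -1 if pool_size and the
 * active slab disagree, which is the inconsistent state behind the oops.
 */
static int toy_grow_one_stripe(struct toy_conf *conf)
{
	int slots;

	pthread_mutex_lock(&conf->cache_size_mutex);
	slots = (conf->pool_size == conf->slab_slots) ? conf->pool_size : -1;
	pthread_mutex_unlock(&conf->cache_size_mutex);
	return slots;
}

/* Analogue of resize_stripes() with the unlock moved to the end. */
static int toy_resize_stripes(struct toy_conf *conf, int newsize)
{
	int err = 0;

	pthread_mutex_lock(&conf->cache_size_mutex);

	/* Switch to the new slab, as "conf->slab_cache = sc" does. */
	conf->slab_slots = newsize;

	/*
	 * Before the fix, the unlock sat roughly here. A grower that took
	 * the mutex at this point would see the new slab paired with the
	 * old pool_size.
	 */

	if (!err)
		conf->pool_size = newsize;

	/* After the fix: unlock only once pool_size matches the new slab. */
	pthread_mutex_unlock(&conf->cache_size_mutex);
	return err;
}
```

A caller would initialise a `struct toy_conf` with `toy_conf_init()`, resize it with `toy_resize_stripes()`, and could then check that `toy_grow_one_stripe()` reports the new size rather than -1.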