| From cefdc26e86728812aea54248a534fd4a5da2a43d Mon Sep 17 00:00:00 2001 |
| From: Martin Brandenburg <martin@omnibond.com> |
| Date: Thu, 6 Apr 2017 18:11:00 -0400 |
| Subject: orangefs: move features validation to fix filesystem hang |
| |
| From: Martin Brandenburg <martin@omnibond.com> |
| |
| commit cefdc26e86728812aea54248a534fd4a5da2a43d upstream. |
| |
| Without this fix (and another to the userspace component itself |
| described later), the kernel will be unable to process any OrangeFS |
| requests after the userspace component is restarted (due to a crash or |
| at the administrator's behest). |
| |
| The bug here is that inside orangefs_remount, the orangefs_request_mutex |
| is locked. When the userspace component restarts while the filesystem |
| is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device, |
| which causes the kernel to send it a few requests aimed at synchronizing |
| the state between the two. While this is happening the |
| orangefs_request_mutex is locked to prevent any other requests going |
| through. |
| |
| This is only half of the bugfix. The other half is in the userspace |
| component which outright ignores(!) requests made before it considers |
| the filesystem remounted, which is after the ioctl returns. Of course |
| the ioctl doesn't return until after the userspace component responds to |
| the request it ignores. The userspace component has been changed to |
| allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status. |
| |
| Mike Marshall says: |
| "I've tested this patch against the fixed userspace part. This patch is |
| real important, I hope it can make it into 4.11... |
| |
| Here's what happens when the userspace daemon is restarted, without |
| the patch: |
| |
| ============================================= |
| [ INFO: possible recursive locking detected ] |
| [ 4.10.0-00007-ge98bdb3 #1 Not tainted ] |
| --------------------------------------------- |
| pvfs2-client-co/29032 is trying to acquire lock: |
| (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs] |
| but task is already holding lock: |
| (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs] |
| |
| CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb3 #1 |
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 |
| Call Trace: |
| __lock_acquire+0x7eb/0x1290 |
| lock_acquire+0xe8/0x1d0 |
| mutex_lock_killable_nested+0x6f/0x6e0 |
| service_operation+0x3c7/0x7b0 [orangefs] |
| orangefs_remount+0xea/0x150 [orangefs] |
| dispatch_ioctl_command+0x227/0x330 [orangefs] |
| orangefs_devreq_ioctl+0x29/0x70 [orangefs] |
| do_vfs_ioctl+0xa3/0x6e0 |
| SyS_ioctl+0x79/0x90" |
| |
| Signed-off-by: Martin Brandenburg <martin@omnibond.com> |
| Acked-by: Mike Marshall <hubcap@omnibond.com> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| fs/orangefs/super.c | 9 +++++++-- |
| 1 file changed, 7 insertions(+), 2 deletions(-) |
| |
| --- a/fs/orangefs/super.c |
| +++ b/fs/orangefs/super.c |
| @@ -263,8 +263,13 @@ int orangefs_remount(struct orangefs_sb_ |
| if (!new_op) |
| return -ENOMEM; |
| new_op->upcall.req.features.features = 0; |
| - ret = service_operation(new_op, "orangefs_features", 0); |
| - orangefs_features = new_op->downcall.resp.features.features; |
| + ret = service_operation(new_op, "orangefs_features", |
| + ORANGEFS_OP_PRIORITY | ORANGEFS_OP_NO_MUTEX); |
| + if (!ret) |
| + orangefs_features = |
| + new_op->downcall.resp.features.features; |
| + else |
| + orangefs_features = 0; |
| op_release(new_op); |
| } else { |
| orangefs_features = 0; |