| From 3236c3e1adc0c7ec83eaff1de2d06746b7c5bb28 Mon Sep 17 00:00:00 2001 |
| From: Jeff Layton <jlayton@redhat.com> |
| Date: Tue, 11 Oct 2011 09:49:21 -0400 |
| Subject: nfs: don't redirty inode when ncommit == 0 in nfs_commit_unstable_pages |
| |
| From: Jeff Layton <jlayton@redhat.com> |
| |
| commit 3236c3e1adc0c7ec83eaff1de2d06746b7c5bb28 upstream. |
| |
| commit 420e3646 allowed the kernel to reduce the number of unnecessary |
| commit calls by skipping the commit when there are a large number of |
| outstanding pages. |
| |
| However, the current test in nfs_commit_unstable_pages does not handle |
| the edge condition properly. When ncommit == 0, then that means that the |
| kernel doesn't need to do anything more for the inode. The current test |
| though in the WB_SYNC_NONE case will return true, and the inode will end |
| up being marked dirty. Once that happens the inode will never be clean |
| until there's a WB_SYNC_ALL flush. |
| |
| Fix this by immediately returning from nfs_commit_unstable_pages when |
| ncommit == 0. |
| |
| Mike noticed this problem initially in RHEL5 (2.6.18-based kernel) which |
| has a backported version of 420e3646. The inode cache there was growing |
| very large. The inode cache was unable to be shrunk since the inodes |
| were all marked dirty. Calling sync() would essentially "fix" the |
| problem -- the WB_SYNC_ALL flush would result in the inodes all being |
| marked clean. |
| |
| What I'm not clear on is how big a problem this is in mainline kernels |
| as the writeback code there is very different. Either way, it seems |
| incorrect to re-mark the inode dirty in this case. |
| |
| Reported-by: Mike McLean <mikem@redhat.com> |
| Signed-off-by: Jeff Layton <jlayton@redhat.com> |
| Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
| |
| --- |
| fs/nfs/write.c | 4 ++++ |
| 1 file changed, 4 insertions(+) |
| |
| --- a/fs/nfs/write.c |
| +++ b/fs/nfs/write.c |
| @@ -1577,6 +1577,10 @@ static int nfs_commit_unstable_pages(str |
| int flags = FLUSH_SYNC; |
| int ret = 0; |
| |
| + /* no commits means nothing needs to be done */ |
| + if (!nfsi->ncommit) |
| + return ret; |
| + |
| if (wbc->sync_mode == WB_SYNC_NONE) { |
| /* Don't commit yet if this is a non-blocking flush and there |
| * are a lot of outstanding writes for this mapping. |