| From david@fromorbit.com Fri Apr 2 11:09:55 2010 |
| From: Dave Chinner <david@fromorbit.com> |
| Date: Fri, 12 Mar 2010 09:42:10 +1100 |
| Subject: xfs: reclaim all inodes by background tree walks |
| To: stable@kernel.org |
| Cc: xfs@oss.sgi.com |
| Message-ID: <1268347337-7160-13-git-send-email-david@fromorbit.com> |
| |
| From: Dave Chinner <david@fromorbit.com> |
| |
| commit 57817c68229984818fea9e614d6f95249c3fb098 upstream |
| |
| We cannot do direct inode reclaim without taking the flush lock to |
| ensure that we do not reclaim an inode under IO. We check the inode |
| is clean before doing direct reclaim, but this is not good enough |
| because the inode flush code marks the inode clean once it has |
| copied the in-core dirty state to the backing buffer. |
| |
| It is the flush lock that determines whether the inode is still |
| under IO, even though it is marked clean, and the inode is still |
| required at IO completion so we can't reclaim it even though it is |
| clean in core. Hence the requirement that we need to take the flush |
| lock even on clean inodes because this guarantees that the inode |
| writeback IO has completed and it is safe to reclaim the inode. |
| |
| With delayed write inode flushing, we could end up waiting a long |
| time on the flush lock even for a clean inode. The background |
| reclaim already handles this efficiently, so avoid all the problems |
| by killing the direct reclaim path altogether. |
| |
| Signed-off-by: Dave Chinner <david@fromorbit.com> |
| Reviewed-by: Christoph Hellwig <hch@lst.de> |
| Signed-off-by: Alex Elder <aelder@sgi.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
| |
| --- |
| fs/xfs/linux-2.6/xfs_super.c | 14 ++++++-------- |
| 1 file changed, 6 insertions(+), 8 deletions(-) |
| |
| --- a/fs/xfs/linux-2.6/xfs_super.c |
| +++ b/fs/xfs/linux-2.6/xfs_super.c |
| @@ -953,16 +953,14 @@ xfs_fs_destroy_inode( |
| ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM)); |
| |
| /* |
| - * If we have nothing to flush with this inode then complete the |
| - * teardown now, otherwise delay the flush operation. |
| + * We always use background reclaim here because even if the |
| + * inode is clean, it still may be under IO and hence we have |
| + * to take the flush lock. The background reclaim path handles |
| + * this more efficiently than we can here, so simply let background |
| + * reclaim tear down all inodes. |
| */ |
| - if (!xfs_inode_clean(ip)) { |
| - xfs_inode_set_reclaim_tag(ip); |
| - return; |
| - } |
| - |
| out_reclaim: |
| - xfs_ireclaim(ip); |
| + xfs_inode_set_reclaim_tag(ip); |
| } |
| |
| /* |