releases/2.6.32.12/xfs-reclaim-all-inodes-by-background-tree-walks.patch - pub/scm/linux/kernel/git/stable/stable-queue - Git at Google

 From david@fromorbit.com  Fri Apr  2 11:09:55 2010
 From: Dave Chinner <david@fromorbit.com>
 Date: Fri, 12 Mar 2010 09:42:10 +1100
 Subject: xfs: reclaim all inodes by background tree walks
 To: stable@kernel.org
 Cc: xfs@oss.sgi.com
 Message-ID: <1268347337-7160-13-git-send-email-david@fromorbit.com>

 From: Dave Chinner <david@fromorbit.com>

 commit 57817c68229984818fea9e614d6f95249c3fb098 upstream

 We cannot do direct inode reclaim without taking the flush lock to
 ensure that we do not reclaim an inode under IO. We check the inode
 is clean before doing direct reclaim, but this is not good enough
 because the inode flush code marks the inode clean once it has
 copied the in-core dirty state to the backing buffer.

 It is the flush lock that determines whether the inode is still
 under IO, even though it is marked clean, and the inode is still
 required at IO completion so we can't reclaim it even though it is
 clean in core. Hence the requirement that we need to take the flush
 lock even on clean inodes because this guarantees that the inode
 writeback IO has completed and it is safe to reclaim the inode.

 With delayed write inode flushing, we could end up waiting a long
 time on the flush lock even for a clean inode. The background
 reclaim already handles this efficiently, so avoid all the problems
 by killing the direct reclaim path altogether.

 Signed-off-by: Dave Chinner <david@fromorbit.com>
 Reviewed-by: Christoph Hellwig <hch@lst.de>
 Signed-off-by: Alex Elder <aelder@sgi.com>
 Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

 ---
  fs/xfs/linux-2.6/xfs_super.c |   14 ++++++--------
  1 file changed, 6 insertions(+), 8 deletions(-)

 --- a/fs/xfs/linux-2.6/xfs_super.c
 +++ b/fs/xfs/linux-2.6/xfs_super.c
 @@ -953,16 +953,14 @@ xfs_fs_destroy_inode(
  	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));

  	/*
 -	 * If we have nothing to flush with this inode then complete the
 -	 * teardown now, otherwise delay the flush operation.
 +	 * We always use background reclaim here because even if the
 +	 * inode is clean, it still may be under IO and hence we have
 +	 * to take the flush lock. The background reclaim path handles
 +	 * this more efficiently than we can here, so simply let background
 +	 * reclaim tear down all inodes.
  	 */
 -	if (!xfs_inode_clean(ip)) {
 -		xfs_inode_set_reclaim_tag(ip);
 -		return;
 -	}
 -
  out_reclaim:
 -	xfs_ireclaim(ip);
 +	xfs_inode_set_reclaim_tag(ip);
  }

  /*
	From david@fromorbit.com Fri Apr 2 11:09:55 2010
	From: Dave Chinner <david@fromorbit.com>
	Date: Fri, 12 Mar 2010 09:42:10 +1100
	Subject: xfs: reclaim all inodes by background tree walks
	To: stable@kernel.org
	Cc: xfs@oss.sgi.com
	Message-ID: <1268347337-7160-13-git-send-email-david@fromorbit.com>

	From: Dave Chinner <david@fromorbit.com>

	commit 57817c68229984818fea9e614d6f95249c3fb098 upstream

	We cannot do direct inode reclaim without taking the flush lock to
	ensure that we do not reclaim an inode under IO. We check the inode
	is clean before doing direct reclaim, but this is not good enough
	because the inode flush code marks the inode clean once it has
	copied the in-core dirty state to the backing buffer.

	It is the flush lock that determines whether the inode is still
	under IO, even though it is marked clean, and the inode is still
	required at IO completion so we can't reclaim it even though it is
	clean in core. Hence the requirement that we need to take the flush
	lock even on clean inodes because this guarantees that the inode
	writeback IO has completed and it is safe to reclaim the inode.

	With delayed write inode flushing, we could end up waiting a long
	time on the flush lock even for a clean inode. The background
	reclaim already handles this efficiently, so avoid all the problems
	by killing the direct reclaim path altogether.

	Signed-off-by: Dave Chinner <david@fromorbit.com>
	Reviewed-by: Christoph Hellwig <hch@lst.de>
	Signed-off-by: Alex Elder <aelder@sgi.com>
	Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

	---
	fs/xfs/linux-2.6/xfs_super.c \| 14 ++++++--------
	1 file changed, 6 insertions(+), 8 deletions(-)

	--- a/fs/xfs/linux-2.6/xfs_super.c
	+++ b/fs/xfs/linux-2.6/xfs_super.c
	@@ -953,16 +953,14 @@ xfs_fs_destroy_inode(
	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));

	/*
	- * If we have nothing to flush with this inode then complete the
	- * teardown now, otherwise delay the flush operation.
	+ * We always use background reclaim here because even if the
	+ * inode is clean, it still may be under IO and hence we have
	+ * to take the flush lock. The background reclaim path handles
	+ * this more efficiently than we can here, so simply let background
	+ * reclaim tear down all inodes.
	*/
	- if (!xfs_inode_clean(ip)) {
	- xfs_inode_set_reclaim_tag(ip);
	- return;
	- }
	-
	out_reclaim:
	- xfs_ireclaim(ip);
	+ xfs_inode_set_reclaim_tag(ip);
	}

	/*