| From e47da9553dd9b62b6785b33ce6c7566d547e4994 Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Mon, 31 May 2021 16:50:55 +0800 |
| Subject: btrfs: don't clear page extent mapped if we're not invalidating the |
| full page |
| |
| From: Qu Wenruo <wqu@suse.com> |
| |
| [ Upstream commit bcd77455d590eaa0422a5e84ae852007cfce574a ] |
| |
| [BUG] |
| With the current btrfs subpage rw support, the following script can |
| hang the filesystem: |
| |
| $ mkfs.btrfs -f -s 4k $dev |
| $ mount $dev -o nospace_cache $mnt |
| $ fsstress -w -n 100 -p 1 -s 1608140256 -v -d $mnt |
| |
| The fs will hang at btrfs_start_ordered_extent(). |
| |
| [CAUSE] |
| In the above test case, btrfs_invalidatepage() is called with the |
| following parameters: |
| |
| offset = 0, length = 53248, page dirty = 1, subpage dirty bitmap = 0x2000 |
| |
| Since @offset is 0, btrfs_invalidatepage() tries to invalidate the full |
| page and finally calls clear_page_extent_mapped(), which detaches the |
| subpage structure from the page. |
| |
| Once the page no longer has a subpage structure, the subpage dirty |
| bitmap is lost, so the dirty range is never written back and nothing |
| can wake up the ordered extent. |
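
The numbers above can be checked with a small userspace sketch. It assumes a 64K page with 4K sectors (one bitmap bit per sector), which this message does not state explicitly; the macros and first_dirty_sector() are hypothetical names for illustration, not btrfs code.

```c
#include <stdbool.h>

/* Assumed geometry: 64K page, 4K sectors (not stated in the commit message). */
#define SKETCH_PAGE_SIZE   65536u
#define SKETCH_SECTORSIZE  4096u

/* Hypothetical helper: index of the lowest set bit, or -1 if the bitmap is empty. */
static int first_dirty_sector(unsigned int bitmap)
{
	for (int i = 0; i < (int)(SKETCH_PAGE_SIZE / SKETCH_SECTORSIZE); i++)
		if (bitmap & (1u << i))
			return i;
	return -1;
}

/* Hypothetical helper: true if the given sector starts at or after the end
 * of the invalidated range [offset, offset + length). */
static bool sector_after_range(int sector, unsigned int offset,
			       unsigned int length)
{
	return (unsigned int)sector * SKETCH_SECTORSIZE >= offset + length;
}
```

Under these assumptions, bitmap 0x2000 has only bit 13 set, and sector 13 starts at byte 13 * 4096 = 53248, exactly where the invalidated range [0, 53248) ends. The dirty sector therefore lies outside the range being invalidated, yet its dirty bit is lost along with the subpage structure.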
| |
| [FIX] |
| Follow other filesystems and only invalidate the page if the range |
| covers the full page. |
| |
| There are cases like truncate_setsize() which can call |
| btrfs_invalidatepage() with offset == 0 and length != 0 for the last |
| page of an inode. |
| |
| The old code would try to invalidate the full page in such cases, but |
| it is still safe to just wait for the ordered extent to finish, so this |
| shouldn't cause extra problems. |
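
A minimal sketch of the change in the check, assuming a 64K page; SKETCH_PAGE_SIZE and both helper names are hypothetical and only mirror the condition in the hunk below, they are not kernel functions.

```c
#include <stdbool.h>

/* Assumed page size for illustration; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 65536u

/* Old check: any range starting at offset 0 was treated as a full-page
 * invalidation, regardless of its length. */
static bool old_is_full_page(unsigned int offset, unsigned int length)
{
	(void)length;
	return offset == 0;
}

/* New check: only a range that covers the whole page counts. */
static bool new_is_full_page(unsigned int offset, unsigned int length)
{
	return offset == 0 && length == SKETCH_PAGE_SIZE;
}
```

With the parameters from the failing case (offset 0, length 53248), the old check reports a full-page invalidation and the new one does not.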
| |
| Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> # [ppc64] |
| Tested-by: Anand Jain <anand.jain@oracle.com> # [aarch64] |
| Signed-off-by: Qu Wenruo <wqu@suse.com> |
| Signed-off-by: David Sterba <dsterba@suse.com> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| fs/btrfs/inode.c | 14 +++++++++++++- |
| 1 file changed, 13 insertions(+), 1 deletion(-) |
| |
| diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c |
| index a03d3bad2139..4f21b8fbfd4b 100644 |
| --- a/fs/btrfs/inode.c |
| +++ b/fs/btrfs/inode.c |
| @@ -8213,7 +8213,19 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, |
| */ |
| wait_on_page_writeback(page); |
| |
| - if (offset) { |
| + /* |
| + * For subpage case, we have call sites like |
| + * btrfs_punch_hole_lock_range() which passes range not aligned to |
| + * sectorsize. |
| + * If the range doesn't cover the full page, we don't need to and |
| + * shouldn't clear page extent mapped, as page->private can still |
| + * record subpage dirty bits for other part of the range. |
| + * |
| + * For cases that can invalidate the full page even if the range |
| + * doesn't cover it, like invalidating the last page, we're still |
| + * safe to wait for ordered extent to finish. |
| + */ |
| + if (!(offset == 0 && length == PAGE_SIZE)) { |
| btrfs_releasepage(page, GFP_NOFS); |
| return; |
| } |
| -- |
| 2.30.2 |
| |