| From 05e59fa65b785414ee788bcd91bbb8ca81c7bf59 Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Thu, 23 Jan 2020 15:33:02 -0500 |
| Subject: btrfs: flush write bio if we loop in extent_write_cache_pages |
| |
| From: Josef Bacik <josef@toxicpanda.com> |
| |
| [ Upstream commit 96bf313ecb33567af4cb53928b0c951254a02759 ] |
| |
| There is a long-standing deadlock with range_cyclic writeback. If we |
| loop around with a bio already built, we could deadlock with a writer |
| that has locked the page we're attempting to write, but is waiting on |
| a page in our bio to be written out. The task traces are as follows: |
| |
| PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" |
| #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f |
| #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 |
| #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 |
| #3 [ffffc900297bb708] __lock_page at ffffffff811f145b |
| #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 |
| #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 |
| #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff |
| #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 |
| #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 |
| #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd |
| |
| PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" |
| #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f |
| #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 |
| #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 |
| #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 |
| #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 |
| #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 |
| #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 |
| #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 |
| #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d |
| #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 |
| |
| I used drgn to find the respective pages we were stuck on: |
| |
| page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 |
| page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 |
| |
| As you can see, the kworker is waiting for bit 0 (PG_locked) on index |
| 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index |
| 8148. aio-dio-invalid holds the lock on 7680, and the kworker's epd |
| looks like the following: |
| |
| crash> struct extent_page_data ffffc900297bbbb0 |
| struct extent_page_data { |
|   bio = 0xffff889f747ed830, |
|   tree = 0xffff889eed6ba448, |
|   extent_locked = 0, |
|   sync_io = 0 |
| } |
| |
| It is also worth noting that aio-dio-invalid waits for writeback of a |
| page to complete while holding the lock on that page (in |
| prepare_pages()). |
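
The dependency between the two tasks forms a two-node cycle. As a rough illustration only (this is not kernel code), the sketch below models each task as owning some resources and waiting on one, then looks for a cycle. The resource labels and the `find_cycle` helper are hypothetical names chosen for this example; "owning" writeback of index 8148 stands for the page sitting in the kworker's unsubmitted bio.

```python
# Toy wait-for model of the two stuck tasks; illustrative only, not kernel code.
# Resource labels such as "lock:7680" are hypothetical names for the pages above.

def find_cycle(holds, waits_for):
    """Return the list of tasks that form a wait cycle, or None if there is none."""
    # Map each resource to the task that currently owns it.
    owner = {res: task for task, resources in holds.items() for res in resources}
    for start in waits_for:
        path = [start]
        task = start
        while True:
            blocker = owner.get(waits_for.get(task))
            if blocker is None:
                break                      # nobody owns what this task waits on
            if blocker == start:
                return path                # walked back to where we began
            if blocker in path:
                break                      # cycle that does not include `start`
            path.append(blocker)
            task = blocker
    return None

holds = {
    "kworker": {"writeback:8148"},         # page 8148 is in the unsubmitted bio
    "aio-dio-invalid": {"lock:7680"},      # writer holds the page lock
}
waits_for = {
    "kworker": "lock:7680",
    "aio-dio-invalid": "writeback:8148",
}

print(find_cycle(holds, waits_for))
```

Once the kworker no longer owns any pending writeback (its `holds` set is empty), the same call returns `None` in this model.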
| |
| Using drgn I walked the bio pages looking for page |
| 0xffffea00fbfc7500, which is the one we're waiting for writeback on: |
| |
| bio = Object(prog, 'struct bio', address=0xffff889f747ed830) |
| for i in range(0, bio.bi_vcnt.value_()): |
|     bv = bio.bi_io_vec[i] |
|     if bv.bv_page.value_() == 0xffffea00fbfc7500: |
|         print("FOUND IT") |
| |
| which validated what I suspected. |
| |
| The fix is simple: flush the epd's write bio before we loop back |
| around to the beginning of the file during writeout. |
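
To make the effect of flushing concrete, here is a minimal toy model of a cyclic scan that batches pages into a pending list and wraps to index 0. It is a sketch under stated assumptions, not the btrfs implementation: `ToyWriter`, its methods, and the start index 8000 are all hypothetical, and page locking is left out entirely. It only shows which pages are still queued but unsubmitted at the moment of the wrap.

```python
# Toy model of a cyclic writeback scan; illustrative only, not the btrfs code.

class ToyWriter:
    def __init__(self, flush_on_wrap):
        self.flush_on_wrap = flush_on_wrap
        self.bio = []          # pages queued but not yet submitted
        self.submitted = []    # pages whose I/O has been issued

    def flush(self):
        """Submit everything queued so far."""
        self.submitted.extend(self.bio)
        self.bio = []

    def scan(self, dirty_pages, start):
        """Scan from `start` to the end, then wrap to index 0.

        Returns the pages that were still queued (unsubmitted) at the wrap.
        """
        for index in sorted(p for p in dirty_pages if p >= start):
            self.bio.append(index)
        if self.flush_on_wrap:
            self.flush()                   # the behaviour this patch adds
        pending_at_wrap = list(self.bio)
        for index in sorted(p for p in dirty_pages if p < start):
            self.bio.append(index)
        self.flush()
        return pending_at_wrap

dirty = {7680, 8148}
print(ToyWriter(flush_on_wrap=False).scan(dirty, start=8000))
print(ToyWriter(flush_on_wrap=True).scan(dirty, start=8000))
```

In this model, without the flush, index 8148 is still queued when the scan wraps and reaches 7680; with the flush, nothing is pending at that point.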
| |
| Fixes: b293f02e1423 ("Btrfs: Add writepages support") |
| CC: stable@vger.kernel.org # 4.4+ |
| Reviewed-by: Filipe Manana <fdmanana@suse.com> |
| Signed-off-by: Josef Bacik <josef@toxicpanda.com> |
| Signed-off-by: David Sterba <dsterba@suse.com> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| fs/btrfs/extent_io.c | 8 ++++++++ |
| 1 file changed, 8 insertions(+) |
| |
| diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c |
| index fed44390c0492..11efb4f5041c7 100644 |
| --- a/fs/btrfs/extent_io.c |
| +++ b/fs/btrfs/extent_io.c |
| @@ -4014,6 +4014,14 @@ static int extent_write_cache_pages(struct address_space *mapping, |
| */ |
| scanned = 1; |
| index = 0; |
| + |
| + /* |
| + * If we're looping we could run into a page that is locked by a |
| + * writer and that writer could be waiting on writeback for a |
| + * page in our current bio, and thus deadlock, so flush the |
| + * write bio here. |
| + */ |
| + flush_write_bio(epd); |
| goto retry; |
| } |
| |
| -- |
| 2.20.1 |
| |