writeback: make DIRTY_PAGES tracking cgroup writeback aware

I_DIRTY_PAGES on inode->i_state tracks whether the inode's address_space
contains dirty pages.  When cgroup writeback is used, an address_space
can be dirtied against multiple wb's (bdi_writeback's), so we want to
be able to track dirty state per iwbl (inode_wb_link).

This patch adds IWBL_DIRTY_PAGES, which tracks whether an iwbl has
dirty pages.  It's set along with I_DIRTY_PAGES when an inode gets
dirtied.  However, because the radix tree tags can't record which iwbl
a page was dirtied against, whether an iwbl became clean can't be
decided by testing PAGECACHE_TAG_DIRTY.  Instead, the flag is
opportunistically cleared after a whole address_space writeback and
when I_DIRTY_PAGES is cleared.  This isn't ideal, as an iwbl may stay
marked dirty after its pages are already clean, but the cost of such
inaccuracies should be reasonable.  See the comment on top of the
I_DIRTY_PAGES definition for more info.
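
The set and opportunistic-clear rules can be illustrated with a small
userspace model.  This is only a sketch: every model_* name, the flag
values and the dirty-page counter standing in for PAGECACHE_TAG_DIRTY
are assumptions made for illustration, not the definitions added by
this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values are not the kernel's. */
#define MODEL_I_DIRTY_PAGES	(1u << 0)
#define MODEL_IWBL_DIRTY_PAGES	(1u << 0)

struct model_inode {
	unsigned int i_state;
	/* models PAGECACHE_TAG_DIRTY: a count with no per-iwbl owner */
	unsigned int nr_tagged_dirty;
};

struct model_iwbl {
	struct model_inode *inode;
	unsigned int flags;
};

/* Dirtying a page sets the inode-wide and the per-iwbl flag together. */
static void model_dirty_page(struct model_iwbl *iwbl)
{
	assert(iwbl && iwbl->inode);
	iwbl->inode->i_state |= MODEL_I_DIRTY_PAGES;
	iwbl->flags |= MODEL_IWBL_DIRTY_PAGES;
	iwbl->inode->nr_tagged_dirty++;
}

/*
 * Writeback issued through @iwbl cleaned @nr_cleaned pages.  The tag
 * count can't tell whose pages remain, so the per-iwbl flag is cleared
 * only when the whole mapping was covered or the inode became clean.
 */
static void model_writeback(struct model_iwbl *iwbl, unsigned int nr_cleaned,
			    bool whole_mapping)
{
	struct model_inode *inode = iwbl->inode;

	assert(nr_cleaned <= inode->nr_tagged_dirty);
	inode->nr_tagged_dirty -= nr_cleaned;

	if (whole_mapping)
		iwbl->flags &= ~MODEL_IWBL_DIRTY_PAGES;

	if (inode->nr_tagged_dirty == 0) {
		inode->i_state &= ~MODEL_I_DIRTY_PAGES;
		iwbl->flags &= ~MODEL_IWBL_DIRTY_PAGES;
	}
}
```

In this model a range-limited writeback that happens to clean the
iwbl's last page leaves MODEL_IWBL_DIRTY_PAGES set until the next
whole-mapping pass, which is the kind of inaccuracy described above.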

Note that non-root iwbl's are only attributed with dirty pages;
metadata dirtiness - I_DIRTY_SYNC and I_DIRTY_DATASYNC - is always
attributed to the root iwbl.  This means that when an inode gets
dirtied for both metadata and pages from a non-root cgroup, it dirties
both the root iwbl for the metadata and the matching cgroup iwbl for
the dirty pages.
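
As a rough illustration of this attribution rule, the sketch below
splits a dirty mask between the root iwbl and a cgroup iwbl.  The
attr_* types, helper and flag values are hypothetical and exist only
to show the split; they are not the structures used by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, named after the I_DIRTY_* bits. */
#define ATTR_DIRTY_SYNC		(1u << 0)
#define ATTR_DIRTY_DATASYNC	(1u << 1)
#define ATTR_DIRTY_PAGES	(1u << 2)
#define ATTR_DIRTY_METADATA	(ATTR_DIRTY_SYNC | ATTR_DIRTY_DATASYNC)

struct attr_iwbl {
	unsigned int dirty;
};

struct attr_inode {
	struct attr_iwbl root_iwbl;
};

/*
 * Metadata bits always land on the root iwbl; the pages bit lands on
 * the iwbl of the cgroup which dirtied the pages.
 */
static void attr_mark_dirty(struct attr_inode *inode,
			    struct attr_iwbl *cgroup_iwbl, unsigned int flags)
{
	assert(inode != NULL && cgroup_iwbl != NULL);

	if (flags & ATTR_DIRTY_METADATA)
		inode->root_iwbl.dirty |= flags & ATTR_DIRTY_METADATA;
	if (flags & ATTR_DIRTY_PAGES)
		cgroup_iwbl->dirty |= ATTR_DIRTY_PAGES;
}
```

Calling attr_mark_dirty() with ATTR_DIRTY_SYNC | ATTR_DIRTY_PAGES from
a non-root cgroup leaves both the root iwbl and the cgroup iwbl dirty,
each for its own reason.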

This patch encapsulates I_DIRTY_* manipulation and testing in new
functions - iwbl_has_enough_dirty(), iwbl_set_dirty() and
iwbl_still_has_dirty_pages() - and introduces another mb which is
paired with the one in __mark_inode_dirty_dctx() to interlock
IWBL_DIRTY_PAGES testing and clearing.  Comments for the mb's are
updated to reflect it.
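
The general shape of such a paired-barrier interlock can be sketched
with C11 atomics.  This is not the kernel code: the mb_* names are
made up, atomic_thread_fence(memory_order_seq_cst) stands in for
smp_mb(), and the real slow path would take a lock rather than simply
storing the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct mb_state {
	atomic_bool page_tagged_dirty;	/* models PAGECACHE_TAG_DIRTY */
	atomic_bool iwbl_dirty_pages;	/* models IWBL_DIRTY_PAGES */
};

/*
 * Dirtying side: publish the dirty page, full barrier, then test the
 * flag locklessly.  Returns true if the flag had to be set.
 */
static bool mb_dirtier(struct mb_state *s)
{
	assert(s);
	atomic_store_explicit(&s->page_tagged_dirty, true,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&s->iwbl_dirty_pages, memory_order_relaxed))
		return false;
	atomic_store_explicit(&s->iwbl_dirty_pages, true,
			      memory_order_relaxed);
	return true;
}

/*
 * Clearing side: clear the flag, full barrier, then re-check for dirty
 * pages.  Returns true if dirty pages were seen and the flag restored.
 */
static bool mb_cleaner(struct mb_state *s)
{
	assert(s);
	atomic_store_explicit(&s->iwbl_dirty_pages, false,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&s->page_tagged_dirty,
				 memory_order_relaxed)) {
		atomic_store_explicit(&s->iwbl_dirty_pages, true,
				      memory_order_relaxed);
		return true;
	}
	return false;
}
```

With both barriers in place, either the dirtying side sees the flag
cleared and sets it again, or the clearing side sees the dirty page
and restores the flag, so a dirty page is not left behind with the
flag clear.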

write_cache_pages() is updated to use
mapping_writeback_{maybe|confirm}_whole() to clear IWBL_DIRTY_PAGES
opportunistically.  Filesystems which implement custom ->writepages
should be updated similarly to support cgroup writeback.
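
A custom writepages loop might follow a pattern along these lines.
The signatures of mapping_writeback_{maybe|confirm}_whole() are not
shown in this message, so the wp_* helpers below are hypothetical
stand-ins with guessed semantics: note up front whether the request
covers the whole mapping, and clear the flag only if the walk then
reaches the end.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

struct wp_control {
	long long range_start, range_end;	/* page index range */
	long nr_to_write;			/* remaining page budget */
	bool maybe_whole;
};

struct wp_mapping {
	bool dirty[8];			/* toy page cache */
	bool iwbl_dirty_pages;		/* models IWBL_DIRTY_PAGES */
};

/* Stand-in: does the request cover the entire mapping? */
static void wp_maybe_whole(struct wp_control *wbc)
{
	wbc->maybe_whole = wbc->range_start == 0 &&
			   wbc->range_end == LLONG_MAX;
}

/* Stand-in: clear the flag only if the whole walk really completed. */
static void wp_confirm_whole(struct wp_mapping *m,
			     const struct wp_control *wbc, bool reached_end)
{
	if (wbc->maybe_whole && reached_end)
		m->iwbl_dirty_pages = false;
}

static int wp_writepages(struct wp_mapping *m, struct wp_control *wbc)
{
	size_t n = sizeof(m->dirty) / sizeof(m->dirty[0]);
	bool reached_end = true;
	size_t i;

	assert(m != NULL && wbc != NULL);
	wp_maybe_whole(wbc);

	for (i = 0; i < n; i++) {
		if ((long long)i < wbc->range_start ||
		    (long long)i > wbc->range_end)
			continue;
		if (!m->dirty[i])
			continue;
		if (wbc->nr_to_write <= 0) {
			/* budget exhausted with dirty pages left */
			reached_end = false;
			break;
		}
		m->dirty[i] = false;	/* "write" the page */
		wbc->nr_to_write--;
	}

	wp_confirm_whole(m, wbc, reached_end);
	return 0;
}
```

A pass that stops early because nr_to_write ran out leaves the flag
set; only a pass that walks the full range to completion clears it.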

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
4 files changed