| From 60d9f50308e5df19bc18c2fefab0eba4a843900a Mon Sep 17 00:00:00 2001 |
| From: Filipe Manana <fdmanana@suse.com> |
| Date: Thu, 16 May 2019 15:48:55 +0100 |
| Subject: Btrfs: fix fsync not persisting changed attributes of a directory |
| |
| From: Filipe Manana <fdmanana@suse.com> |
| |
| commit 60d9f50308e5df19bc18c2fefab0eba4a843900a upstream. |
| |
| While logging an inode we follow its ancestors and for each one we mark |
| it as logged in the current transaction, even if we have not logged it. |
| As a consequence if we change an attribute of an ancestor, such as the |
| UID or GID for example, and then explicitly fsync it, we end up not |
| logging the inode at all despite returning success to user space, which |
| results in the attribute being lost if a power failure happens after |
| the fsync. |
| |
| Sample reproducer: |
| |
| $ mkfs.btrfs -f /dev/sdb |
| $ mount /dev/sdb /mnt |
| |
| $ mkdir /mnt/dir |
| $ chown 6007:6007 /mnt/dir |
| |
| $ sync |
| |
| $ chown 9003:9003 /mnt/dir |
| $ touch /mnt/dir/file |
| $ xfs_io -c fsync /mnt/dir/file |
| |
| # fsync our directory after fsync'ing the new file, should persist the |
| # new values for the uid and gid. |
| $ xfs_io -c fsync /mnt/dir |
| |
| <power failure> |
| |
| $ mount /dev/sdb /mnt |
| $ stat -c %u:%g /mnt/dir |
| 6007:6007 |
| |
| --> should be 9003:9003, the uid and gid were not persisted, despite |
| the explicit fsync on the directory prior to the power failure |
| |
| Fix this by not updating the logged_trans field of ancestor inodes when |
| logging an inode, since we have not logged them. Let only future calls to |
| btrfs_log_inode() to mark inodes as logged. |
| |
| This could be triggered by my recent fsync fuzz tester for fstests, for |
| which an fstests patch exists titled "fstests: generic, fsync fuzz tester |
| with fsstress". |
| |
| Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes") |
| CC: stable@vger.kernel.org # 4.4+ |
| Signed-off-by: Filipe Manana <fdmanana@suse.com> |
| Signed-off-by: David Sterba <dsterba@suse.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| fs/btrfs/tree-log.c | 12 ------------ |
| 1 file changed, 12 deletions(-) |
| |
| --- a/fs/btrfs/tree-log.c |
| +++ b/fs/btrfs/tree-log.c |
| @@ -5312,7 +5312,6 @@ static noinline int check_parent_dirs_fo |
| { |
| int ret = 0; |
| struct dentry *old_parent = NULL; |
| - struct btrfs_inode *orig_inode = inode; |
| |
| /* |
| * for regular files, if its inode is already on disk, we don't |
| @@ -5332,16 +5331,6 @@ static noinline int check_parent_dirs_fo |
| } |
| |
| while (1) { |
| - /* |
| - * If we are logging a directory then we start with our inode, |
| - * not our parent's inode, so we need to skip setting the |
| - * logged_trans so that further down in the log code we don't |
| - * think this inode has already been logged. |
| - */ |
| - if (inode != orig_inode) |
| - inode->logged_trans = trans->transid; |
| - smp_mb(); |
| - |
| if (btrfs_must_commit_transaction(trans, inode)) { |
| ret = 1; |
| break; |
| @@ -6070,7 +6059,6 @@ void btrfs_record_unlink_dir(struct btrf |
| * if this directory was already logged any new |
| * names for this file/dir will get recorded |
| */ |
| - smp_mb(); |
| if (dir->logged_trans == trans->transid) |
| return; |
| |