| From 372a03e01853f860560eade508794dd274e9b390 Mon Sep 17 00:00:00 2001 |
| From: Lukas Czerner <lczerner@redhat.com> |
| Date: Thu, 14 Mar 2019 23:20:25 -0400 |
| Subject: ext4: fix data corruption caused by unaligned direct AIO |
| |
| From: Lukas Czerner <lczerner@redhat.com> |
| |
| commit 372a03e01853f860560eade508794dd274e9b390 upstream. |
| |
| Ext4 needs to serialize unaligned direct AIO because the zeroing of |
| partial blocks of two competing unaligned AIOs can result in data |
| corruption. |
| |
| However, it decides not to serialize if the potentially unaligned AIO is |
| past i_size, with the rationale that no pending writes are possible past |
| i_size. Unfortunately, if i_size is not block aligned and the second |
| unaligned write lands past i_size but still in the same block, it has |
| the potential of corrupting the previous unaligned write to the same |
| block. |
| |
| This is a (very simplified) reproducer from Frank: |
| |
| // 41472 = (10 * 4096) + 512 |
| // 37376 = 41472 - 4096 |
| |
| ftruncate(fd, 41472); |
| io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376); |
| io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472); |
| |
| io_submit(io_ctx, 1, &iocbs[0]); |
| io_submit(io_ctx, 1, &iocbs[1]); |
| |
| io_getevents(io_ctx, 2, 2, events, NULL); |
| |
| Without this patch, the 512B range from 40960 up to the start of the |
| second unaligned write (41472) is going to be zeroed, overwriting the |
| data written by the first write. This is data corruption. |
| |
| 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
| * |
| 00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 |
| * |
| 0000a000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
| * |
| 0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 |
| |
| With this patch, the data corruption is avoided because we will recognize |
| the unaligned_aio and wait for the unwritten extent conversion. |
| |
| 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
| * |
| 00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 |
| * |
| 0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 |
| * |
| 0000b200 |
| |
| Reported-by: Frank Sorenson <fsorenso@redhat.com> |
| Signed-off-by: Lukas Czerner <lczerner@redhat.com> |
| Signed-off-by: Theodore Ts'o <tytso@mit.edu> |
| Fixes: e9e3bcecf44c ("ext4: serialize unaligned asynchronous DIO") |
| Cc: stable@vger.kernel.org |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| fs/ext4/file.c | 2 +- |
| 1 file changed, 1 insertion(+), 1 deletion(-) |
| |
| --- a/fs/ext4/file.c |
| +++ b/fs/ext4/file.c |
| @@ -79,7 +79,7 @@ ext4_unaligned_aio(struct inode *inode, |
| struct super_block *sb = inode->i_sb; |
| int blockmask = sb->s_blocksize - 1; |
| |
| - if (pos >= i_size_read(inode)) |
| + if (pos >= ALIGN(i_size_read(inode), sb->s_blocksize)) |
| return 0; |
| |
| if ((pos | iov_iter_alignment(from)) & blockmask) |
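
The arithmetic behind the one-line change can be checked in userspace. The following is only a sketch under stated assumptions: align_up() is a hypothetical stand-in for the kernel's ALIGN() macro, the three helper names are invented for illustration, and pos_unaligned() models only the pos half of the blockmask test (the real function also folds in iov_iter_alignment(from)). A 4096-byte block size is assumed, matching the reproducer's comments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's ALIGN();
 * align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Pre-patch early exit: any write starting at or past i_size
 * skips the unaligned check entirely. */
static bool skips_check_old(uint64_t pos, uint64_t i_size)
{
	return pos >= i_size;
}

/* Post-patch early exit: only writes starting at or past the end of
 * the block that contains i_size skip the unaligned check. */
static bool skips_check_new(uint64_t pos, uint64_t i_size, uint64_t blocksize)
{
	return pos >= align_up(i_size, blocksize);
}

/* Simplified blockmask test: looks at pos only, not at the
 * alignment of the iov_iter as the kernel function does. */
static bool pos_unaligned(uint64_t pos, uint64_t blocksize)
{
	return (pos & (blocksize - 1)) != 0;
}
```

With the reproducer's values (i_size = 41472, second write at pos = 41472), skips_check_old() returns true, so the write is never treated as unaligned. skips_check_new() returns false because i_size rounds up to 45056, and pos_unaligned() then reports the 512-byte offset into the block, so the write gets serialized.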