Re: 2.5.60-BK reproducible oops, during LTP run

Andrew Morton (akpm@digeo.com)
Wed, 12 Feb 2003 23:07:10 -0800


Jeff Garzik <jgarzik@pobox.com> wrote:
>
> I have reproduced the BUG in fs/buffer.c:2533 twice now. Test
> conditions exactly the same, fsx-linux in one window, LTP in another window.
>

Thanks. It's a long-standing but benign bogon which was exposed by recent
ext3 simplifications. This needs lots of testing.

when ext3_writepage races with truncate, block_write_full_page() will see
that the page is outside i_size and will bale out with -EIO. But
ext3_writepage() will ignore this and will proceed to add the buffers to the
transaction.

Later, kjournald tries to write them out and goes BUG() because those buffers
are not mapped to disk.

The fix is to not attach the buffers to the transaction in ext3_writepage()
if block_write_full_page() failed.

So far so good, but that page now has dirty, unmapped buffers (the buffers
were attached in a dirty state by ext3_writepage()). So teach
block_write_full_page() to clean the buffers against the page if it is wholly
outside i_size.

(A simpler fix to all of this might be to just bale out of ext3_writepage()
if the page is outside i_size. But that is racy against
block_write_full_page()'s subsequent execution of the same comparison).

buffer.c | 6 ++++++
ext3/inode.c | 13 ++++++++++---
2 files changed, 16 insertions(+), 3 deletions(-)

diff -puN fs/ext3/inode.c~ext3-eio-fix fs/ext3/inode.c
--- 25/fs/ext3/inode.c~ext3-eio-fix 2003-02-12 22:32:07.000000000 -0800
+++ 25-akpm/fs/ext3/inode.c 2003-02-12 22:48:40.000000000 -0800
@@ -1357,10 +1357,17 @@ static int ext3_writepage(struct page *p
handle = ext3_journal_current_handle();
lock_kernel();

- /* And attach them to the current transaction */
+ /*
+ * And attach them to the current transaction. But only if
+ * block_write_full_page() succeeded. Otherwise they are unmapped,
+ * and generally junk.
+ */
if (order_data) {
- err = walk_page_buffers(handle, page_bufs,
- 0, PAGE_CACHE_SIZE, NULL, ext3_journal_dirty_data);
+ if (ret == 0) {
+ err = walk_page_buffers(handle, page_bufs,
+ 0, PAGE_CACHE_SIZE, NULL,
+ ext3_journal_dirty_data);
+ }
walk_page_buffers(handle, page_bufs, 0,
PAGE_CACHE_SIZE, NULL, bput_one);
if (!ret)
diff -puN fs/buffer.c~ext3-eio-fix fs/buffer.c
--- 25/fs/buffer.c~ext3-eio-fix 2003-02-12 22:55:03.000000000 -0800
+++ 25-akpm/fs/buffer.c 2003-02-12 22:59:23.000000000 -0800
@@ -2502,6 +2502,12 @@ int block_write_full_page(struct page *p
/* Is the page fully outside i_size? (truncate in progress) */
offset = inode->i_size & (PAGE_CACHE_SIZE-1);
if (page->index >= end_index+1 || !offset) {
+ /*
+ * The page may have dirty, unmapped buffers. For example,
+ * they may have been added in ext3_writepage(). Make them
+ * freeable here, so the page does not leak.
+ */
+ block_invalidatepage(page, 0);
unlock_page(page);
return -EIO;
}

_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/