Re: 2.6.19 file content corruption on ext3

2006-12-29 Thread Dave Jones
On Fri, Dec 29, 2006 at 07:52:15PM +0100, maximilian attems wrote: > > The only -mm stuff I recall being in the Fedora 2.6.18 is > > the inode-diet stuff which ended up in 2.6.19, though the xmas > > break has left my head somewhat empty so I may be forgetting something. > > What patch in par

Re: 2.6.19 file content corruption on ext3

2006-12-29 Thread maximilian attems
On Fri, Dec 29, 2006 at 10:02:53AM -0500, Dave Jones wrote: > On Fri, Dec 29, 2006 at 10:23:14AM +0100, maximilian attems wrote: > > > On Thu, Dec 28, 2006 at 11:21:21AM -0800, Linus Torvalds wrote: > > > > > That was a Fedora kernel. Has anyone seen the corruption in vanilla > 2.6.18 > > >

Re: 2.6.19 file content corruption on ext3

2006-12-29 Thread Guillaume Chazarain
Linus Torvalds wrote: > going back to Linux-2.6.5 at least, according to one tester). I apologize for the confusion, but it just occurred to me that I was actually experiencing a totally different problem: I set a root filesystem of 3 MiB for qemu, so the test program just didn't have eno

Re: 2.6.19 file content corruption on ext3

2006-12-29 Thread Dave Jones
On Fri, Dec 29, 2006 at 10:23:14AM +0100, maximilian attems wrote: > > On Thu, Dec 28, 2006 at 11:21:21AM -0800, Linus Torvalds wrote: > > > > > > > > > On Thu, 28 Dec 2006, Petri Kaukasoina wrote: > > > > > me up), and that seems to show the corruption going way way back > > (ie going

Re: 2.6.19 file content corruption on ext3

2006-12-29 Thread maximilian attems
> On Thu, Dec 28, 2006 at 11:21:21AM -0800, Linus Torvalds wrote: > > > > > > On Thu, 28 Dec 2006, Petri Kaukasoina wrote: > > > > me up), and that seems to show the corruption going way way back (ie > going > > > > back to Linux-2.6.5 at least, according to one tester). > > > > > > Tha

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Andrew Morton
On Thu, 28 Dec 2006 17:38:38 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > in > the hope that somebody else is working on this corruption issue and is > interested.. What corruption issue? ;) I'm finding that the corruption happens trivially with your test app, but apparently doesn'

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
Btw, much cleaned-up page tracing patch here, in case anybody cares (and "test.c" attached, although I don't think it changed since last time). The test.c output is a bit hard to read at times, since it will give offsets in bytes as hex (ie "00a77664" means page frame 0a77, and byte 664
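For readers following the trace output, the hex offsets quoted in this thread can be split into a page frame and a byte offset within that page. The sketch below is not part of Linus's test.c; it assumes 4 KiB pages (the thread mentions a 4kB filesystem with a single bh per page), and the name split_offset is made up for illustration.

```python
PAGE_SHIFT = 12                 # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def split_offset(hex_offset):
    """Split a hex byte offset, as printed in the test output,
    into (page_frame, byte_within_page)."""
    value = int(hex_offset, 16)
    return value >> PAGE_SHIFT, value & (PAGE_SIZE - 1)


page, byte = split_offset("00a77664")
print(f"page frame {page:04x}, byte {byte:03x}")  # prints "page frame 0a77, byte 664"
```

Applied to the range 01423fb4-01423fff reported elsewhere in the thread, the same split gives page frame 1423, bytes fb4 through fff, so that corrupted range ends on the last byte of a page.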

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Anton Altaparmakov
On Thu, 28 Dec 2006, Linus Torvalds wrote: > Ok, > with the ugly trace capture patch, I've actually captured this corruption > in action, I think. > > I did a full trace of all pages involved in one run, and picked one > corruption at random: > > Chunk 14465 corrupted (0-75) (01423fb4-0

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
On Thu, 28 Dec 2006, Anton Altaparmakov wrote: > > But are chunks 3 and 4 in separate buffer heads? Sorry could not see it > immediately from the output you showed... No, this is a 4kB filesystem. A single bh per page. > It is just that there may be a different cause rather than buffer dirty

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
On Thu, 28 Dec 2006, David Miller wrote: > > What happens when we writeback, to the PTEs? Not a damn thing. We clear the PTE's _before_ we even start the write. The writeback does nothing to them. If the user dirties the page while writeback is in progress, we'll take the page fault and re-d

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread David Miller
From: Linus Torvalds <[EMAIL PROTECTED]> Date: Thu, 28 Dec 2006 14:37:37 -0800 (PST) > So if we're not losing any dirty bits, what's going on? What happens when we writeback, to the PTEs? page_mkclean_file() iterates the VMAs and when it finds a shared one it goes: entry = ptep_

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
Ok, with the ugly trace capture patch, I've actually captured this corruption in action, I think. I did a full trace of all pages involved in one run, and picked one corruption at random: Chunk 14465 corrupted (0-75) (01423fb4-01423fff) Expected 129, got 0 Written as

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Russell King
On Thu, Dec 28, 2006 at 01:24:30PM -0800, Linus Torvalds wrote: > On Thu, 28 Dec 2006, Linus Torvalds wrote: > > > > What we need now is actually looking at the source code, and people who > > understand the VM, I'm afraid. I'm gathering traces now that I have a good > > test-case. I'll post my

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
On Thu, 28 Dec 2006, Linus Torvalds wrote: > > What we need now is actually looking at the source code, and people who > understand the VM, I'm afraid. I'm gathering traces now that I have a good > test-case. I'll post my trace tools once I've tested that they work, in > case others want to h

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Arjan van de Ven
On Thu, 2006-12-28 at 14:39 -0500, Dave Jones wrote: > On Thu, Dec 28, 2006 at 11:21:21AM -0800, Linus Torvalds wrote: > > > > > > On Thu, 28 Dec 2006, Petri Kaukasoina wrote: > > > > me up), and that seems to show the corruption going way way back (ie > going > > > > back to Linux-2.6.5 a

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Dave Jones
On Thu, Dec 28, 2006 at 11:21:21AM -0800, Linus Torvalds wrote: > > > On Thu, 28 Dec 2006, Petri Kaukasoina wrote: > > > me up), and that seems to show the corruption going way way back (ie > > > going > > > back to Linux-2.6.5 at least, according to one tester). > > > > That was a Fed

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
On Thu, 28 Dec 2006, Petri Kaukasoina wrote: > > me up), and that seems to show the corruption going way way back (ie going > > back to Linux-2.6.5 at least, according to one tester). > > That was a Fedora kernel. Has anyone seen the corruption in vanilla 2.6.18 > (or older)? Well, that was a

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Petri Kaukasoina
On Thu, Dec 28, 2006 at 11:00:46AM -0800, Linus Torvalds wrote: > And I have a test-program that shows the corruption _much_ easier (at > least according to my own testing, and that of several reporters that back > me up), and that seems to show the corruption going way way back (ie going > back

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Linus Torvalds
On Thu, 28 Dec 2006, Marc Haber wrote: > > After being up for ten days, I have now encountered the file > corruption of pkgcache.bin for the first time again. The 256 MB i386 > box is like 26M in swap, is under very moderate load. > > I am running plain vanilla 2.6.19.1. Is there a patch that I

Re: 2.6.19 file content corruption on ext3

2006-12-28 Thread Marc Haber
On Tue, Dec 19, 2006 at 09:51:49AM +0100, Marc Haber wrote: > On Sun, Dec 17, 2006 at 09:43:08PM -0800, Andrew Morton wrote: > > Six hours here of fsx-linux plus high memory pressure on SMP on 1k > > blocksize ext3, mainline. Zero failures. It's unlikely that this testing > > would pass, yet peop

Re: 2.6.19 file content corruption on ext3

2006-12-22 Thread Linus Torvalds
On Mon, 18 Dec 2006, Gene Heskett wrote: > > What about the mm/rmap.c one liner, in or out? The one that just removes the "pte_mkclean()"? That's definitely out, it was just a test-patch to verify that the pte dirty bits seemed to matter at all (and they do). Linus - To unsubs

Re: 2.6.19 file content corruption on ext3

2006-12-22 Thread Marc Haber
On Sat, Dec 16, 2006 at 06:43:10PM +, Martin Michlmayr wrote: > * Marc Haber <[EMAIL PROTECTED]> [2006-12-09 10:26]: > > Unfortunately, I am lacking the knowledge needed to do this in an > > informed way. I am neither familiar enough with git nor do I possess > > the necessary C powers. > > I

Re: 2.6.19 file content corruption on ext3

2006-12-22 Thread Marc Haber
On Fri, Dec 22, 2006 at 08:30:06AM -0500, Daniel Drake wrote: > Marc Haber wrote: > >After updating to 2.6.19, Debian's apt control file > >/var/cache/apt/pkgcache.bin corrupts pretty frequently - like in under > >six hours. In that situation, "aptitude update" segfaults. When I > >delete the file

Re: 2.6.19 file content corruption on ext3

2006-12-22 Thread Daniel Drake
Marc Haber wrote: After updating to 2.6.19, Debian's apt control file /var/cache/apt/pkgcache.bin corrupts pretty frequently - like in under six hours. In that situation, "aptitude update" segfaults. When I delete the file and have apt recreate it, things are fine again for a few hours before the

Re: 2.6.19 file content corruption on ext3

2006-12-21 Thread Andrew Morton
On Thu, 21 Dec 2006 14:03:20 +0100 Peter Zijlstra <[EMAIL PROTECTED]> wrote: > On Tue, 2006-12-19 at 09:43 -0800, Linus Torvalds wrote: > > > > Btw, > > here's a totally new tangent on this: it's possible that user code is > > simply BUGGY. > > depmod: BADNESS: written outside isize 22183 ak

Re: 2.6.19 file content corruption on ext3

2006-12-21 Thread Peter Zijlstra
On Tue, 2006-12-19 at 09:43 -0800, Linus Torvalds wrote: > > Btw, > here's a totally new tangent on this: it's possible that user code is > simply BUGGY. depmod: BADNESS: written outside isize 22183 --- diff --git a/fs/buffer.c b/fs/buffer.c index d1f1b54..5db9fd9 100644 --- a/fs/buffer.c +++

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Stephen Clark
Peter Zijlstra wrote: > On Tue, 2006-12-19 at 10:59 -0800, Linus Torvalds wrote: > > On Tue, 19 Dec 2006, Linus Torvalds wrote: > > > here's a totally new tangent on this: it's possible that user code is > > > simply BUGGY. > I'm sad to say this doesn't trigger :-(

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Peter Zijlstra
On Wed, 2006-12-20 at 18:30 +0200, Andrei Popa wrote: > On Wed, 2006-12-20 at 15:23 +0100, Peter Zijlstra wrote: > > On Wed, 2006-12-20 at 16:15 +0200, Andrei Popa wrote: > > > On Wed, 2006-12-20 at 00:42 +0100, Peter Zijlstra wrote: > > > > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote:

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Andrei Popa
On Wed, 2006-12-20 at 15:23 +0100, Peter Zijlstra wrote: > On Wed, 2006-12-20 at 16:15 +0200, Andrei Popa wrote: > > On Wed, 2006-12-20 at 00:42 +0100, Peter Zijlstra wrote: > > > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > > > > > > OR: > > > > > > > > - page_mkclean_one() is s

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Martin Schwidefsky
On Wed, 2006-12-20 at 10:01 +0100, Peter Zijlstra wrote: > Also, what is this page_test_and_clear_dirty() business, that seems to > be exclusively s390 btw. However they do seem to need this. > > > But the "ptep_get_and_clear() + flush_tlb_page()" sequence should > > hopefully also work. > > Yeah

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Peter Zijlstra
On Wed, 2006-12-20 at 16:15 +0200, Andrei Popa wrote: > On Wed, 2006-12-20 at 00:42 +0100, Peter Zijlstra wrote: > > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > > > > OR: > > > > > > - page_mkclean_one() is simply buggy. > > > > GOLD! > > > > it seems to work with all this (fu

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Andrei Popa
On Wed, 2006-12-20 at 00:42 +0100, Peter Zijlstra wrote: > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > > OR: > > > > - page_mkclean_one() is simply buggy. > > GOLD! > > it seems to work with all this (full diff against current git). > > /me rebuilds full kernel to make sure..

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Arjan van de Ven
> Hmm, should we not flush after clearing the dirty bit? That is, why does > ptep_clear_flush_dirty() need a flush after clearing that bit? does it > leak through in the tlb copy? afaics you need to 1) clear 2) flush 3) check and go to 1) if needed to be race free.

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Peter Zijlstra
On Tue, 2006-12-19 at 16:23 -0800, Linus Torvalds wrote: > Pls test. Is good. Only s390 remains a question. Another point, change_protection() also does a cache flush, should we too? > > diff --git a/mm/rmap.c b/mm/rmap.c > index d8a842a..eec8706 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Peter Zijlstra
On Wed, 2006-12-20 at 10:01 +0100, Peter Zijlstra wrote: > I will try, but I had a look around the different architectures > implementation of ptep_clear_flush_dirty() and saw that not all do the > actual flush. So if we go down this road perhaps we should introduce > another per arch function tha

Re: 2.6.19 file content corruption on ext3

2006-12-20 Thread Peter Zijlstra
On Tue, 2006-12-19 at 16:23 -0800, Linus Torvalds wrote: > > On Wed, 20 Dec 2006, Peter Zijlstra wrote: > > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > > OR: > > > > > > - page_mkclean_one() is simply buggy. > > > > GOLD! > > Ok. I was looking at that, and I wondered.. > > Ho

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Jari Sundell
On 12/20/06, Linus Torvalds <[EMAIL PROTECTED]> wrote: On Tue, 19 Dec 2006, Linus Torvalds wrote: > > here's a totally new tangent on this: it's possible that user code is > simply BUGGY. Btw, here's a simpler test-program that actually shows the difference between 2.6.18 and 2.6.19 in action,

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Wed, 20 Dec 2006, Peter Zijlstra wrote: > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > OR: > > > > - page_mkclean_one() is simply buggy. > > GOLD! Ok. I was looking at that, and I wondered.. However, if that works, then I _think_ the correct sequence is the following.. T

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrew Morton
On Tue, 19 Dec 2006 16:03:49 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Wed, 20 Dec 2006, Peter Zijlstra wrote: > > > On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote: > > > > > Well... we'd need to see (corruption && this-not-triggering) to be sure. > > > > > > Pete

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Wed, 20 Dec 2006, Peter Zijlstra wrote: > On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote: > > > Well... we'd need to see (corruption && this-not-triggering) to be sure. > > > > Peter, have you been able to trigger the corruption? > > Yes; however the mail I sent describing that see

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > OR: > > - page_mkclean_one() is simply buggy. GOLD! it seems to work with all this (full diff against current git). /me rebuilds full kernel to make sure... reboot... test... pff the tension... yay, still good! Andrei; would you

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote: > Well... we'd need to see (corruption && this-not-triggering) to be sure. > > Peter, have you been able to trigger the corruption? Yes; however the mail I sent describing that seems to be lost in space. /me quotes from the sent folder: >

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Wed, 2006-12-20 at 00:06 +0100, Peter Zijlstra wrote: > On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote: > > > Well... we'd need to see (corruption && this-not-triggering) to be sure. > > > > Peter, have you been able to trigger the corruption? > > Yes; however the mail I sent describi

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrew Morton
On Tue, 19 Dec 2006 14:51:55 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Tue, 19 Dec 2006, Peter Zijlstra wrote: > > > On Tue, 2006-12-19 at 10:59 -0800, Linus Torvalds wrote: > > > > > > On Tue, 19 Dec 2006, Linus Torvalds wrote: > > > > > > > > here's a totally ne

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Tue, 19 Dec 2006, Peter Zijlstra wrote: > On Tue, 2006-12-19 at 10:59 -0800, Linus Torvalds wrote: > > > > On Tue, 19 Dec 2006, Linus Torvalds wrote: > > > > > > here's a totally new tangent on this: it's possible that user code is > > > simply BUGGY. > > I'm sad to say this doesn't trig

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Florian Weimer
* Linus Torvalds: > Now, this should _matter_ only for user processes that are buggy, > and that have written to the page _before_ extending it with > ftruncate(). APT seems to properly extend the file before mapping it, by writing a zero byte at the desired position (creating a hole). 24986 ope
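The pattern described here (extend the file first by writing a single zero byte at the final position, then map it) can be sketched as follows. This is an illustration in Python, not APT's actual code; the function name create_and_map, the file name, and the sizes are assumptions chosen for the example.

```python
import mmap
import os
import tempfile


def create_and_map(path, size):
    """Create a file of `size` bytes by writing one zero byte at the last
    position (leaving a hole before it), then map it shared and writable."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.lseek(fd, size - 1, os.SEEK_SET)
        os.write(fd, b"\0")           # the file size is now exactly `size`
        return mmap.mmap(fd, size)    # shared, read/write by default on Unix
    finally:
        os.close(fd)                  # the mapping stays valid after close


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pkgcache.bin")
        mapping = create_and_map(path, 3 * mmap.PAGESIZE)
        mapping[:5] = b"hello"        # every store lands inside the file size
        mapping.flush()
        mapping.close()
        print(os.path.getsize(path))
```

Because the file already has its final size before the mapping is created, no store through the mapping falls beyond the end of the file, which is the case Linus singles out as the buggy one.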

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 10:59 -0800, Linus Torvalds wrote: > > On Tue, 19 Dec 2006, Linus Torvalds wrote: > > > > here's a totally new tangent on this: it's possible that user code is > > simply BUGGY. I'm sad to say this doesn't trigger :-(

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread dean gaudet
On Mon, 18 Dec 2006, Linus Torvalds wrote: > On Tue, 19 Dec 2006, Nick Piggin wrote: > > > > We never want to drop dirty data! (ignoring the truncate case, which is > > handled privately by truncate anyway) > > Bzzt. > > SURE we do. > > We absolutely do want to drop dirty data in the writeout

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Tue, 19 Dec 2006, Linus Torvalds wrote: > > here's a totally new tangent on this: it's possible that user code is > simply BUGGY. Btw, here's a simpler test-program that actually shows the difference between 2.6.18 and 2.6.19 in action, and why it could explain why a program like rtorren

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
Btw, here's a totally new tangent on this: it's possible that user code is simply BUGGY. There is one case where the kernel actually forcibly writes zeroes into a file: when we're writing a page that straddles the "inode->i_size" boundary. See the various writepages in fs/buffer.c, they all
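The zeroing described here can be modelled in user space. The sketch below is a toy model, not kernel code: it assumes 4 KiB pages, write_page is a hypothetical helper, and the i_size value 22183 is borrowed from the debug output quoted elsewhere in this thread purely as an example number. It does not try to describe pages lying wholly beyond i_size.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def write_page(page_data, page_index, i_size):
    """Return the bytes that would reach disk for one page in this toy
    model: anything past i_size within the page is replaced by zeroes."""
    assert len(page_data) == PAGE_SIZE
    page_start = page_index * PAGE_SIZE
    if i_size >= page_start + PAGE_SIZE:
        return bytes(page_data)             # page lies fully inside the file
    valid = max(0, i_size - page_start)     # bytes of this page inside i_size
    return bytes(page_data[:valid]) + b"\0" * (PAGE_SIZE - valid)


# A page full of 0xff written out while i_size is 22183: page 5 straddles it.
out = write_page(b"\xff" * PAGE_SIZE, 5, 22183)
print(out.count(b"\xff"), out.count(b"\0"))  # prints "1703 2393"
```

In this model, data stored through a mapping into the tail of the last page, before the file has been extended, comes back as zeroes after writeout, which is the kind of user-space bug this message raises as a possibility.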

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Tue, 19 Dec 2006, Nick Piggin wrote: > > Counterexample? Well AFAIKS, the clearing of PG_dirty in ttfb() in > response to finding all buffers clean is perfectly valid. What makes > you think otherwise? If the page really is clean, then why the heck can't we just clean the page table bits to

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Tue, 19 Dec 2006, Nick Piggin wrote: > > Now I'm not exactly sure how ext3 (or any other) filesystems make use > of this particular feature of try_to_free_buffers(), but it is clear > from the comments what it is for. So your patch isn't really a minimal > fix (ie. it would require an OK from

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 21:58 +1100, Nick Piggin wrote: > Peter Zijlstra wrote: > > On Tue, 2006-12-19 at 02:32 -0800, Andrew Morton wrote: > > >>Well it used to be. After 2.6.19 it can do the wrong thing for mapped > >>pages. But it turns out that we don't feed it mapped pages, apart from > >>pag

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Nick Piggin
Peter Zijlstra wrote: On Tue, 2006-12-19 at 02:32 -0800, Andrew Morton wrote: Well it used to be. After 2.6.19 it can do the wrong thing for mapped pages. But it turns out that we don't feed it mapped pages, apart from pagevec_strip() and possibly races against pagefaults. So how about th

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Nick Piggin
Andrew Morton wrote: On Tue, 19 Dec 2006 20:56:50 +1100 Nick Piggin <[EMAIL PROTECTED]> wrote: I think it could be very likely that indeed the bug is a latent one in a clear_page_dirty caller, rather than dirty-tracking itself. The only callers are try_to_free_buffers(), truncate and a few

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 02:32 -0800, Andrew Morton wrote: > On Tue, 19 Dec 2006 20:56:50 +1100 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > Linus Torvalds wrote: > > > > > NOTICE? First you make a BIG DEAL about how dirty bits should never get > > > lost, but THE VERY SAME FUNCTION actually very

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrew Morton
On Tue, 19 Dec 2006 02:32:55 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > > If a write-fault races with a read-fault and the write-fault loses, we forget > to mark the page dirty. No that isn't right, is it. The writer just retakes the fault and all the right things happen. Ho hum.

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Nick Piggin
Andrew Morton wrote: On Tue, 19 Dec 2006 20:56:50 +1100 Nick Piggin <[EMAIL PROTECTED]> wrote: Linus Torvalds wrote: NOTICE? First you make a BIG DEAL about how dirty bits should never get lost, but THE VERY SAME FUNCTION actually very much on purpose DOES drop the dirty bit for when it's

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrew Morton
On Tue, 19 Dec 2006 20:56:50 +1100 Nick Piggin <[EMAIL PROTECTED]> wrote: > Linus Torvalds wrote: > > > NOTICE? First you make a BIG DEAL about how dirty bits should never get > > lost, but THE VERY SAME FUNCTION actually very much on purpose DOES drop > > the dirty bit for when it's not in the

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Nick Piggin
Linus Torvalds wrote: On Tue, 19 Dec 2006, Nick Piggin wrote: Anyway it has the same issues as the others. See what happens when you run two test_clear_page_dirty_sync_ptes() consecutively, you still lose PG_dirty even though the page might actually be dirty. How can this happen? We'll only

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Martin Michlmayr
* Marc Haber <[EMAIL PROTECTED]> [2006-12-19 09:51]: > I do not have a clue about memory management at all, but is it > possible that you're testing on a box with too much memory? My box has > only 256 MB, and I used to use mutt with a _huge_ inbox with mutt > taking somewhat 150 MB. Add spamassass

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Marc Haber
On Tue, Dec 19, 2006 at 12:24:16AM -0800, Andrew Morton wrote: > Wow. I didn't expect that, because Marc Haber reported that ext3's > data=writeback > fixed it. Maybe he didn't run it for long enough? My test case is Debian's "aptitude update" running once an hour, and it was always the same f

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 10:00 +0100, Peter Zijlstra wrote: > On Tue, 2006-12-19 at 00:04 -0800, Linus Torvalds wrote: > > > Nobody has actually ever explained why "test_clear_page_dirty()" is good > > at all. > > > > - Why is it ever used instead of "clear_page_dirty_for_io()"? > > > > - What i

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Peter Zijlstra
On Tue, 2006-12-19 at 00:04 -0800, Linus Torvalds wrote: > Nobody has actually ever explained why "test_clear_page_dirty()" is good > at all. > > - Why is it ever used instead of "clear_page_dirty_for_io()"? > > - What is the difference? > > - Why would you EVER want to clear bits just in t

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Marc Haber
On Sun, Dec 17, 2006 at 09:43:08PM -0800, Andrew Morton wrote: > Six hours here of fsx-linux plus high memory pressure on SMP on 1k > blocksize ext3, mainline. Zero failures. It's unlikely that this testing > would pass, yet people running normal workloads are able to easily trigger > failures.

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Pekka Enberg
On 12/19/06, Andrew Morton <[EMAIL PROTECTED]> wrote: Wow. I didn't expect that, because Marc Haber reported that ext3's data=writeback fixed it. Maybe he didn't run it for long enough? I don't think it did fix it for Marc: http://marc.theaimsgroup.com/?l=linux-kernel&m=116625777306843&w=2

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrew Morton
On Tue, 19 Dec 2006 10:05:03 +0200 Andrei Popa <[EMAIL PROTECTED]> wrote: > > > > Also, it'd be useful if you could determine whether the bug appears with > > > > the ext2 filesystem: do s/ext3/ext2/ in /etc/fstab, or boot with > > > > rootfstype=ext2 if it's the root filesystem. > > > > I have

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
On Tue, 19 Dec 2006, Nick Piggin wrote: > > > > Anyway it has the same issues as the others. See what happens when you > > run two test_clear_page_dirty_sync_ptes() consecutively, you still lose > > PG_dirty even though the page might actually be dirty. > > How can this happen? We'll only test

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Linus Torvalds
bits in the PTE's, and never touched them before, we never even realized that the code that played with PG_dirty was totally insane" Now, that's just a theory. And yeah, it may be stated a bit provocatively. It may not be entirely correct. I'm just saying.. maybe it is?

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Andrei Popa
> > > Also, it'd be useful if you could determine whether the bug appears with > > > the ext2 filesystem: do s/ext3/ext2/ in /etc/fstab, or boot with > > > rootfstype=ext2 if it's the root filesystem. > > I have file corruption.

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Nick Piggin
Peter Zijlstra wrote: On Tue, 2006-12-19 at 15:36 +1100, Nick Piggin wrote: plain text document attachment (fs-fix.patch) Index: linux-2.6/fs/buffer.c === --- linux-2.6.orig/fs/buffer.c 2006-12-19 15:15:46.0 +1100 +++ lin

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Peter Zijlstra
On Mon, 2006-12-18 at 11:18 -0800, Linus Torvalds wrote: > > diff --git a/mm/rmap.c b/mm/rmap.c > > index d8a842a..3f9061e 100644 > > --- a/mm/rmap.c > > +++ b/mm/rmap.c > > @@ -448,7 +448,7 @@ static int page_mkclean_one(struct page > > goto unlock; > > > > entry = ptep_get_and

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Tue, 19 Dec 2006, Nick Piggin wrote: > > I wouldn't have thought it becomes clean by dropping it ;) Is this a > trick question? My answer is that we clean a page by taking some > action such that the underlying data matches the data in RAM... Sure. > We don't "drop" any data until it has

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Peter Zijlstra
On Tue, 2006-12-19 at 15:36 +1100, Nick Piggin wrote: > plain text document attachment (fs-fix.patch) > Index: linux-2.6/fs/buffer.c > === > --- linux-2.6.orig/fs/buffer.c2006-12-19 15:15:46.0 +1100 > +++ linux-2.6/fs/

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Nick Piggin
Linus Torvalds wrote: On Tue, 19 Dec 2006, Nick Piggin wrote: We never want to drop dirty data! (ignoring the truncate case, which is handled privately by truncate anyway) Bzzt. SURE we do. We absolutely do want to drop dirty data in the writeout path. How do you think dirty data ever _b

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Tue, 19 Dec 2006, Nick Piggin wrote: > > We never want to drop dirty data! (ignoring the truncate case, which is > handled privately by truncate anyway) Bzzt. SURE we do. We absolutely do want to drop dirty data in the writeout path. How do you think dirty data ever _becomes_ clean data?

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Nick Piggin
Linus Torvalds wrote: On Mon, 18 Dec 2006, Peter Zijlstra wrote: This should be safe; page_mkclean walks the rmap and flips the pte's under the pte lock and records the dirty state while iterating. Concurrent faults will either do set_page_dirty() before we get around to doing it or vice versa,

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
> > > If all of test_clear_page_dirty() has been commented out then the page > > > will > > > never become clean hence will never fall out of pagecache, so unless > > > Andrei > > > is doing a reboot before checking for corruption, perhaps the underlying > > > data on-disk is incorrect, but we c

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrew Morton
On Tue, 19 Dec 2006 03:44:51 +0200 Andrei Popa <[EMAIL PROTECTED]> wrote: > On Mon, 2006-12-18 at 17:21 -0800, Andrew Morton wrote: > > On Mon, 18 Dec 2006 16:57:30 -0800 (PST) > > Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > > > What happens if you only ifdef out that single thing? > > > >

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 16:57 -0800, Linus Torvalds wrote: > > On Tue, 19 Dec 2006, Andrei Popa wrote: > > > > > > > > nope, no file corruption at all. > > > > > > Ok. That's interesting, but I think you actually #ifdef'ed out too > > > much: > > > > > > It was really just the _inner_ "if (mappi

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 17:21 -0800, Andrew Morton wrote: > On Mon, 18 Dec 2006 16:57:30 -0800 (PST) > Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > What happens if you only ifdef out that single thing? > > > > The actual page-cleaning functions make sure to only clear the TAG_DIRTY > > bit _af

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrew Morton
On Mon, 18 Dec 2006 16:57:30 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > What happens if you only ifdef out that single thing? > > The actual page-cleaning functions make sure to only clear the TAG_DIRTY > bit _after_ the page has been marked for writeback. Is there some ordering >

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Gene Heskett
On Monday 18 December 2006 18:48, Andrei Popa wrote: >On Mon, 2006-12-18 at 14:32 -0800, Linus Torvalds wrote: >> On Mon, 18 Dec 2006, Andrei Popa wrote: >> > > This should be fairly easy to test: just change every single ", 1" >> > > case in the patch to ", 0". >> > > >> > > What happens for you i

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Tue, 19 Dec 2006, Andrei Popa wrote: > > > > > > nope, no file corruption at all. > > > > Ok. That's interesting, but I think you actually #ifdef'ed out too > > much: > > > > It was really just the _inner_ "if (mapping_cap_account_dirty(.." > > statement that I meant you should remove. >

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 16:04 -0800, Linus Torvalds wrote: > > On Tue, 19 Dec 2006, Andrei Popa wrote: > > > > > > There's exactly two call sites that call "page_mkclean()" (and that is the > > > only thing in turn that calls "page_mkclean_one()", which we already > > > determined will cau

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Tue, 19 Dec 2006, Andrei Popa wrote: > > the corrupted file has a chunk full of zeros > > http://193.226.119.62/corruption0.jpg > http://193.226.119.62/corruption1.jpg Thanks. Yup, filled with zeroes, and the corruption stops (but does _not_ start) at a page boundary. That _does_ look v

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 14:45 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Alessandro Suardi wrote: > > > > No idea whether this can be a data point or not, but > > here it goes... my P2P box is about to turn 5 days old > > while running nonstop one or both of aMule 2.1.3 and > > BitTorren

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Tue, 19 Dec 2006, Andrei Popa wrote: > > > > There's exactly two call sites that call "page_mkclean()" (and that is the > > only thing in turn that calls "page_mkclean_one()", which we already > > determined will cause the corruption). > > > > Can you just TOTALLY DISABLE that case for the

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 14:32 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Andrei Popa wrote: > > > > > > This should be fairly easy to test: just change every single ", 1" case > > > in > > > the patch to ", 0". > > > > > > What happens for you in that case? > > > > I have file corrupti

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Mon, 18 Dec 2006, Alessandro Suardi wrote: > > No idea whether this can be a data point or not, but > here it goes... my P2P box is about to turn 5 days old > while running nonstop one or both of aMule 2.1.3 and > BitTorrent 4.4.0 on ext3 mounted w/default options > on both IDE and USB disks.

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Mon, 18 Dec 2006, Andrei Popa wrote: > > > > This should be fairly easy to test: just change every single ", 1" case in > > the patch to ", 0". > > > > What happens for you in that case? > > I have file corruption. Magic. And btw, _thanks_ for being such a great tester. So now I have one m

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Gene Heskett
On Monday 18 December 2006 15:41, Linus Torvalds wrote: >On Mon, 18 Dec 2006, Linus Torvalds wrote: >> But at the same time, it's interesting that it still happens when we >> try to re-add the dirty bit. That would tell me that it's one of two >> cases: > >Forget that. There's a third case, which i

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Alessandro Suardi
On 12/18/06, Andrei Popa <[EMAIL PROTECTED]> wrote: On Mon, 2006-12-18 at 12:41 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Linus Torvalds wrote: > > > > But at the same time, it's interesting that it still happens when we try > > to re-add the dirty bit. That would tell me that it's on

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Peter Zijlstra
On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Andrei Popa wrote: > > > > I dropped that patch and added WARN_ON(1), the unified patch is > > attached. > > > > I got corruption: "Hash check on download completion found bad chunks, > > consider using "safe_sync"

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrew Morton
On Mon, 18 Dec 2006 12:14:35 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > OR: > > - page_mkclean_one() is simply buggy. > > And I'm starting to wonder about the second case. But it all LOOKS really > fine - I can't see anything wrong there (it uses the extremely > conservative "pte

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 12:41 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Linus Torvalds wrote: > > > > But at the same time, it's interesting that it still happens when we try > > to re-add the dirty bit. That would tell me that it's one of two cases: > > Forget that. There's a third c

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Mon, 18 Dec 2006, Linus Torvalds wrote: > > But at the same time, it's interesting that it still happens when we try > to re-add the dirty bit. That would tell me that it's one of two cases: Forget that. There's a third case, which is much more likely: - Andrew's patch had a ", 1" where i

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Mon, 18 Dec 2006, Andrei Popa wrote: > > I dropped that patch and added WARN_ON(1), the unified patch is > attached. > > I got corruption: "Hash check on download completion found bad chunks, > consider using "safe_sync"." Ok. That is actually _very_ interesting. It's interesting because (

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Andrei Popa
On Mon, 2006-12-18 at 11:18 -0800, Linus Torvalds wrote: > > On Mon, 18 Dec 2006, Andrei Popa wrote: > > > > I applied Linus patch, Andrew patch, Peter Zijlstra patches (the last > > two). All unified patch is attached. I tested and I have no corruption. > > That wasn't very interesting, because

Re: 2.6.19 file content corruption on ext3

2006-12-18 Thread Linus Torvalds
On Mon, 18 Dec 2006, Andrei Popa wrote: > > I applied Linus patch, Andrew patch, Peter Zijlstra patches (the last > two). All unified patch is attached. I tested and I have no corruption. That wasn't very interesting, because you also had the patch that just disabled "page_mkclean_one()" entire
