Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
I'm not subscribed to the kernel mailing list, so please cc any replies to me.

I'm building a firewall on a P133 with 48 MB of memory using RH 7.0, latest updates, etc., and kernel 2.4.1. I've built a customized install of RH (~200 MB) which I untar onto the system after building my raid arrays, etc., via a rescue CD I created using Timo's Rescue CD project. The booting kernel is 2.4.1-ac10: no networking, raid compiled in but raid1 as a module, reiserfs as a module, ext2 and iso-9660 compiled in, using sg support for CD-ROMs. I had to strip the kernel down so it would fit on a floppy, as the system does not support booting off of CD-ROM.

After booting and getting my initial file system in memory (20+ MB ramdisk), I created a partition for swap, formatted it, and ran swapon so I don't run out of memory. At this point I usually have 3-5 MB of free memory and 128 MB of swap.

I partitioned the 2 drives (on the 1st and 2nd controllers, 1.3 GB each) into 4 partitions total. The 1st is swap, and the next 3 (1 primary, 2 extended) are for raid1 arrays. I've given 20 MB to /boot (md0), 650 MB to / (md1), and the rest (400+ MB) to /var (md2). I format md0 as ext2 and md1 and md2 as reiserfs.

When I go to untar the image on the CD to /mnt/slash (which has md1 mounted on it), the system extracts about 30 MB of data and then just stops responding. No kernel output, etc. I can change to the other virtual consoles, but no other keyboard input is accepted. After resetting the machine, the raid arrays rebuild ok, and reiserfs gives me no problems other than usually replaying 2 or 3 transactions. If I tell tar to pick up at the last directory I saw extracted, it gets about another 30 MB of data and stops again. Whether I wait for the raid syncing to finish, or start right after the arrays become available, makes no difference.

I first tried with stock 2.4.1 and then went to 2.4.1-ac10 (the latest at the time I was playing with this), and it did exactly the same thing. If I format md1 and md2 with ext2, everything works fine. I was initially compiling in 386-only support and have also tried 586 support (no difference). I've tried both the r5 and tea hashes with reiserfs.

One thing I did notice was that the syncing of the raid1 arrays went in sequence (md0, md1, md2) instead of in parallel. I assume it is because the machine just doesn't have the horsepower, etc., or is it that I have multiple raid arrays on the same drives?

This isn't a life-or-death issue at the moment, but I would like to be able to use reiserfs in this scenario in the future. I have tested the same rescue CD boot image on a K6-2 450 MHz, 128 MB system: no raid, just one reiserfs partition, and it untarred without any issues. I'm thinking this is something specific to older, lower-memory machines?

--
James A. Pattie  [EMAIL PROTECTED]
Linux -- SysAdmin / Programmer
PC & Web Xperience, Inc.  http://www.pcxperience.com/
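[For reference, the layout described above in raidtools-era form — a minimal sketch, assuming the two drives are /dev/hda and /dev/hdc and that swap is the first partition on each; none of these device names come from the report:]

    # /etc/raidtab -- three RAID1 arrays mirrored across both drives
    raiddev /dev/md0                # 20 MB, will hold /boot (ext2)
        raid-level              1
        nr-raid-disks           2
        persistent-superblock   1
        chunk-size              4
        device                  /dev/hda2
        raid-disk               0
        device                  /dev/hdc2
        raid-disk               1
    # md1 (650 MB, /) and md2 (~400 MB, /var) repeat the same stanza
    # with the remaining partition pairs

    # build and format, roughly in the order the report describes
    mkswap /dev/hda1 && swapon /dev/hda1   # swap first; memory is tight
    mkraid /dev/md0
    mkraid /dev/md1
    mkraid /dev/md2
    mke2fs /dev/md0        # /boot stays ext2
    mkreiserfs /dev/md1    # / on reiserfs
    mkreiserfs /dev/md2    # /var on reiserfs
    mount /dev/md1 /mnt/slash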
Re: Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
Colonel wrote:
> In clouddancer.list.kernel.owner, you wrote:
> >
> > I'm not subscribed to the kernel mailing list, so please cc any replies
> > to me.
> >
> > I'm building a firewall on a P133 with 48 MB of memory using RH 7.0,
> > latest updates, etc., and kernel 2.4.1. I've built a customized install
> > of RH (~200 MB) which I untar onto the system after building my raid
> > arrays, etc., via a rescue CD I created using Timo's Rescue CD project.
> > The booting kernel is 2.4.1-ac10, no networking, raid compiled in but
> > raid1 as a module
>
> Hmm, raid as a module was always a Bad Idea(tm) in the 2.2 "alpha" raid
> (which was misnamed and is 2.4 raid). I suggest you change that and
> update, as I had no problems with 2.4.2-pre2/3, nor have any been posted
> to the raid list.

I just tried with 2.4.1-ac14, raid and raid1 compiled in, and it did the same thing. I'm going to try to compile reiserfs in (if I have enough room to still fit the kernel on the floppy with its initial ramdisk, etc.) and see what that does.

--
James A. Pattie
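[For anyone following along, the .config change under discussion looks roughly like this in a 2.4-era config — the option names are the standard ones, but this snippet is a sketch, not taken from the thread:]

    # multi-device (RAID) support built in, not modular
    CONFIG_BLK_DEV_MD=y
    CONFIG_MD_RAID1=y
    # and, per the plan above, reiserfs built in as well
    CONFIG_REISERFS_FS=y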
Re: Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
Tom Sightler wrote:
> > > I'm building a firewall on a P133 with 48 MB of memory using RH 7.0,
> > > latest updates, etc., and kernel 2.4.1. [...] The booting kernel is
> > > 2.4.1-ac10, no networking, raid compiled in but raid1 as a module
> >
> > Hmm, raid as a module was always a Bad Idea(tm) in the 2.2 "alpha"
> > raid (which was misnamed and is 2.4 raid). [...]
>
> > I just tried with 2.4.1-ac14, raid and raid1 compiled in, and it did
> > the same thing. I'm going to try to compile reiserfs in (if I have
> > enough room to still fit the kernel on the floppy with its initial
> > ramdisk, etc.) and see what that does.
>
> There seem to be several reports of reiserfs falling over when memory is
> low. It seems to be undetermined if this problem is actually reiserfs or
> MM related, but there are other threads on this list regarding similar
> issues. This would explain why the same disk would work on a different
> machine with more memory. Any chance you could add memory to the box
> temporarily just to see if it helps? That may help prove whether this is
> the problem or not.
>
> Later,
> Tom

Out of all the old 72-pin simms we have, we have it maxed out at 48 MB. I'm tempted to take the 2 drives out and put them in the K6-2, but that's too much of a hassle. I'm currently going to try 2.4.1-ac19 and see what happens.

The machine does have 128 MB of swap space working, and whenever I've checked memory usage (while the system was still responding), it never went over a couple megs of swap space used.

--
James A. Pattie
Re: Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
Tom Sightler wrote:
> > > There seem to be several reports of reiserfs falling over when memory
> > > is low. [...] Any chance you could add memory to the box temporarily
> > > just to see if it helps?
> >
> > Out of all the old 72-pin simms we have, we have it maxed out at 48 MB.
> > [...] The machine does have 128 MB of swap space working, and whenever
> > I've checked memory usage (while the system was still responding), it
> > never went over a couple megs of swap space used.
>
> Ah yes, but, from what I've read, the problem seems to occur when
> buffer/cache memory is low (<6MB); you could have tons of swap and still
> reach this level.
>
> Later,
> Tom

You were right! I managed to find another 32 MB of memory to bump it up to 64 MB total, and it worked perfectly. It appears that I had only about 4 MB of buffer/cache in the 48 MB system and over 15 MB in the 64 MB system. I did my install, switched back to 48 MB running normally, and it's working just fine.

Thanks,

--
James A. Pattie
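[For anyone wanting to watch for the same condition, a minimal sketch using standard procps tools — the ~6 MB figure is Tom's observation above, not a documented threshold:]

    # on a spare virtual console while the untar runs:
    watch -n 5 free
    # or, without watch(1):
    while true; do free | grep -i mem; sleep 5; done
    # the "buffers" and "cached" columns are the ones to watch; the
    # reports in this thread point at trouble once they fall below
    # roughly 6 MB combined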
Re: Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
Colonel wrote:
> Tom Sightler wrote:
> > There seem to be several reports of reiserfs falling over when memory
> > is low. It seems to be undetermined if this problem is actually
> > reiserfs or MM related, but there are other threads on this list
> > regarding similar issues. This would explain why the same disk would
> > work on a different machine with more memory. Any chance you could add
> > memory to the box temporarily just to see if it helps?
>
> Well, I didn't happen to start the thread, but your comments may explain
> some "gee I wonder if it died" problems I just had with my
> 2.4.1-pre2+reiser test box. It only has 16M, so it's always low on
> memory (never been a real problem in the past, however). The test
> situation is easily repeatable for me [1]. It's a 486 wall mount, so
> it's easier to convert the fs than add memory, and it showed about 200k
> free at the time of the sluggishness. Previous 2.4.1 testing with ext2
> didn't show any sluggishness, but I also didn't happen to run the test
> above either. When I come back to the office later, I'll convert the fs,
> repeat the test and pass on the results.
>
> [1] Since I decided to try to catch up on kernels, I had just grabbed
> -ac18, cd'd to ~linux and run "rm -r *" via an ssh connection. In a
> second connection, I tried a simple "dmesg" and waited over a minute for
> results (long enough to log in directly on the box and bring up top),
> followed by loading emacs for ftp transfers from kernel.org, which again
> 'went to sleep'.
>
> RogerL wrote:
> > If these are freezes, I had them too in 2.4.1; 2.4.2-pre1 fixed it for
> > me. Really I think it was the patch in handle_mm_fault setting
> > TASK_RUNNING.
> >
> > /RogerL
>
> Ohoh, I see that I fat-fingered the kernel version. The test box kernel
> is 2.4.2-pre2 with Axboe's loop4 patch to the loopback fs. It runs a
> three-partition drive: a small /boot in ext2, / as reiser, and swap. I
> am verifying that the freeze is repeatable at the moment, and so far I
> cannot cause free memory to drop to 200k, and a short ice age does not
> occur. Unless I can get that to repeat, the effort will be useless...
> the only real difference is swap; it was not initially active and now it
> is. Free memory never drops below 540k now, so I would suspect an MM
> influence. [EMAIL PROTECTED] didn't mention the memory values in his
> initial post, but it would be interesting to see whether his machine
> recovers if he simply leaves it alone (i.e. probable swap thrashing),
> and then determine if the freeze ever re-occurs. James seems to have
> better repeatability than I do. Rebooting and retrying still doesn't
> result in a noticeable freeze for me. Some other factor must have been
> involved that I didn't notice. Still seems like MM over reiser, though.

When the machine stopped responding the first time, I let it go over the weekend (2+ days) and it still didn't recover. I never saw a thrashing effect. The initial memory values were 2 MB free memory, < 1 MB cache. I never really looked at the cache values, as I wasn't sure how they affected the system. When the system was untarring my tarball, free memory would get down below 500 kB and swap usage would usually be around a couple of megs.

> PS for James:
> > One thing I did notice was that the syncing of the raid1 arrays went
> > in sequence, md0, md1, md2 instead of in parallel. I assume it is
> > because the machine just doesn't have the horsepower, etc., or is it
> > that I have multiple raid arrays on the same drives?
>
> Same drives.
That's what I thought. Thanks,

--
James A. Pattie
Re: Reiserfs, 3 Raid1 arrays, 2.4.1 machine locks up
Colonel wrote:
> Sender: [EMAIL PROTECTED]
> Date: Wed, 21 Feb 2001 08:45:02 -0600
> From: "James A. Pattie" <[EMAIL PROTECTED]>
>
> > When the machine stopped responding the first time, I let it go over
> > the weekend (2+ days) and it still didn't recover. I never saw a
> > thrashing effect. The initial memory values were 2 MB free memory,
> > < 1 MB cache. [...]
>
> Well, it still looks like you have a good test case to resolve the
> problem. Can you add memory per the above request?
>
> I should drop out of this; it seems I had a one-time event. Something to
> keep in mind is that /boot should either be ext2 or mounted differently
> under reiser (check their website for details). You should probably try
> the Magic SysRq stuff to see what's up at the time of the freeze. You
> should probably also run memtest86 to head off questions about your
> memory stability.

I added memory yesterday and got it to work after having 64 MB in the system; the free memory (cache/buffer) was over 30 MB. I didn't have any problems then. After I got everything installed, I bumped the memory back down to 48 MB and it is running fine. I don't have the 17+ MB ramdisk taking up memory anymore, so the system has > 15 MB of cache/buffer available at all times, even running ssh, sendmail, squid, firewalling, etc.

> Just to check on the raid setup: the drives are on separate controllers,
> and there is not a slow device on the same bus? I've been running the
> "2.4" raid for a couple of years and that was the usual problem.
> Reiserfs is probably more aggressive working the drive, and it may tend
> to unhide other system problems.

They are on separate controllers. The second controller has the CD-ROM drive (32x), which should be faster than the hard drives (since the drives are older).

> --
> "... being a Linux user is sort of like living in a house inhabited by
> a large family of carpenters and architects. Every morning when you
> wake up, the house is a little different. Maybe there is a new turret,
> or some walls have moved. Or perhaps someone has temporarily removed
> the floor under your bed." - Unix for Dummies, 2nd Edition

--
James A. Pattie
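[The two suggestions above in concrete form — a sketch using the standard 2.4 interfaces; the notail option is the usual /boot-on-reiserfs caveat (LILO cannot map tail-packed files), which is presumably what the website reference is about:]

    # enable Magic SysRq (kernel needs CONFIG_MAGIC_SYSRQ=y)
    echo 1 > /proc/sys/kernel/sysrq
    # then, at the console of a frozen box:
    #   Alt+SysRq+m       dump memory state
    #   Alt+SysRq+t       dump task states
    #   Alt+SysRq+s/u/b   emergency sync / remount read-only / reboot

    # the /boot-under-reiserfs workaround:
    mount -o notail /dev/md0 /boot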
PROBLEM: Reiserfs + 3Ware issue with 2.4.5
1. Reiserfs + 3Ware issue with 2.4.5

2. I'm building a rescue cd based on Timo's Rescue CD v0.5.4. I've compiled the 2.4.5 kernel for a 386, with all scsi controllers I might encounter compiled in, plus software raid and reiserfs compiled in. The only modules are scsi tapes, network cards, nfs and smbfs support.

When I boot my K6II-350 box (128 MB memory) off the cd, it boots up and eventually gets me to my prompt. In the process, the insmod of 3c59x gives me the following error:

    /lib/modules/2.4.5/kernel/drivers/net/3c59x.o: unresolved symbol del_timer_sync

(This is just something I noticed along the way, which is annoying but not a show stopper yet.)

The real issue is that after mounting my filesystems off the 3ware card and going to unmount them, the following happens:

    bash# umount tmp
    journal_begin called without kernel lock held
    kernel BUG at journal.c:423!
    invalid operand:
    CPU:    0
    EIP:    0010:[]
    EFLAGS: 00010282
    eax: 001d        ebx: c65aff24    ecx: c65f4000    edx: 0001
    esi: c4a1e800    edi:             ebp: 3b26397a    esp: c65afeac
    ds: 0018         es: 0018         ss: 0018
    Process umount (pid: 116, stackpage=c65af000)
    Stack: c027b0cc c027b264 01a7 c017b62f c027c281 5b86 0808 c0106fb0
           5b86 c48f1e50 c48f1df0 c65aff24 c4a1e800 c02c3560 c02c35d8
           c017b857 c65aff24 c4a1e800 000a c016e064 c65aff24
    Call Trace: [] [] [] [] [] [] [] [] [] [] []
    Code: 0f 0b 83 c4 0c c3 8d 76 00 31 c0 c3 90 31 c0 c3 90 53 31 db
    Segmentation fault
    bash#

/boot is ext2; /, /home, /tmp, /var, /usr are all reiserfs.

As long as I don't retry to unmount the filesystem or unmount another filesystem, the system is still usable. But when I try to unmount another filesystem, that process just appears to go into a never-ending state. Once I've locked up all my consoles, I can only hit the reset button. :( ps ax shows:

     51 tty3  SW  0:00 [kreiserfsd]
    119 tty3  D   0:00 umount home

I can get to my other virtual console and still look into the currently mounted filesystems; I just can't shut down the box, kill the umount process, etc. I can even look into the filesystem mounted as home that I am currently trying to umount, as seen in the ps output.

The box is currently running a RH 7.1 system with a custom-built 2.4.2 kernel, so I know reiserfs and the 3ware card worked correctly before 2.4.5. :)

ver_linux shows (on the box that built the kernel):

    Linux navi.zelda.pcxperience.com 2.4.5 #1 Sat Jun 2 13:26:40 CDT 2001 i586 unknown
    Gnu C                  2.96
    Gnu make               3.79.1
    binutils               2.10.91.0.2
    util-linux             2.10s
    mount                  2.11b
    modutils               2.4.2
    e2fsprogs              1.19
    reiserfsprogs          3.x.0f
    PPP                    2.4.0
    isdn4k-utils           3.1pre1
    Linux C Library        2.2.3
    Dynamic linker (ldd)   2.2.3
    Procps                 2.0.7
    Net-tools              1.57
    Console-tools          0.3.3
    Sh-utils               2.0
    Modules Loaded         tdfx 3c59x sb sb_lib uart401 sound soundcore

I'm attaching my .config file if that helps. I can't really provide more output from the test box, as I am having to manually type stuff in on my other machine (in this email). :(

--
James A. Pattie  [EMAIL PROTECTED]
Linux -- SysAdmin / Programmer
PC & Web Xperience, Inc.
http://www.pcxperience.com/

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_M386=y
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_X86_CMPXCHG is not set
# CONFIG_X86_XADD is not set
CONFIG_X86_L1_CACHE_SHIFT=4
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_TOSHIBA is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
CONFIG_MATH_EMULATION=y
# CONFIG_MTRR is not set
CONFIG_SMP=y

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_NAMES is not set
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
CONF
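[One hedged observation on the unresolved del_timer_sync symbol: in 2.4, that symbol is exported only by SMP builds (uniprocessor kernels define it as a macro), so an SMP/UP mismatch between the module and the running kernel is a plausible cause given the CONFIG_SMP=y in the attached config. A quick check on the affected box, using standard 2.4 interfaces:]

    # is the symbol exported by the running kernel at all?
    grep del_timer_sync /proc/ksyms
    # does the source tree the module was built from agree on SMP?
    grep '^CONFIG_SMP' /usr/src/linux/.config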