Re: Zero Divide in Kernel 3.12-rc4
Ingo,

this looks like it might be related to the ESP driver - scsi_finish_command called from the swapper process during apt-get dist-upgrade does seem plausible.

Some of the Amiga SCSI drivers did fiddle with the chip interrupt enable on SCSI interrupt entry, but I'd have thought the ESP core is reasonably thread-safe these days.

To pinpoint where in sd_completed_bytes this happens, I'd need the sd_mod module and the module symbol map.

Cheers,

Michael

I'm testing the ESP SCSI driver port by Tuomas and Michael to 3.12-rc4 and now got this kernel panic during heavy disk activity (apt-get dist-upgrade and, in parallel, an rsync backup by BackupPC):

Debian GNU/Linux jessie/sid spice ttyS0

spice login: [77568.07] *** ZERO DIVIDE *** FORMAT=2
[77568.08] Current process id is 0
[77568.09] BAD KERNEL TRAP:
[77568.10] Modules linked in: xt_multiport iptable_filter ip_tables x_tables ipv6 8390 loop evdev dmasound_paula mac_hid dmasound_core parport_amiga soundcore parport amimouse ext3 mbcache jbd dm_mod nbd sg sd_mod zorro7xx 53c700 hydra amiflop a3000
[77568.32] PC: [<0484c33a>] sd_completed_bytes+0x90/0xe8 [sd_mod]
[77568.33] SR: 2000 SP: 00277e58 a2: 0027e2e4
[77568.34] d0: d1: 007735a0 d2: d3: 0001
[77568.35] d4: d5: 007735a8 a0: 024dd000 a1: 024a0ea0
[77568.36] Process swapper (pid: 0, task=0027e2e4)
[77568.37] Frame format=2 instr addr=0484c336
[77568.39] Stack from 00277e90: 0812 0001 00200028 0004 0249d120 02be3090 0272c9e0 007735a0 00277f04 0484c5f8 0249d120 00277f30 000a 00276000 0100 0020 0004 0249d120 1000 02460614 002b9480 2002 0bb8 0249d100 70040200 024dd400 0013f838 0249d120 00277f30 002b9480 00276000 001d38e2 000e1cec 0249d120 0001 00276000 00277f30 00277f30 0002c8da 002b9480 00272704 000f 2598 08031470
[77568.95] Call Trace: [<0484c5f8>] sd_done+0x1d6/0x2aa [sd_mod]
[77568.97] [<1000>] kernel_pg_dir+0x0/0x1000
[77568.98] [<2002>] _start+0x2/0x8
[77568.99] [<0013f838>] scsi_finish_command+0x8e/0xb2
[77569.00] [<001d38e2>] printk+0x0/0x24
[77569.01] [<000e1cec>] blk_done_softirq+0x90/0x9c
[77569.02] [<0002c8da>] __do_softirq+0xa2/0x12a
[77569.03] [<2598>] badsys+0x6/0xa
[77569.04] [<00012b08>] slognd+0x74/0x8a
[77569.05] [<>] res_func+0x101f/0x141a
[77569.06] [<001d6944>] schedule_preempt_disabled+0x0/0xe
[77569.07] [<0002ca04>] do_softirq+0x2c/0x32
[77569.08] [<264c>] ret_from_exception+0x0/0xc
[77569.09] [<2598>] badsys+0x6/0xa
[77569.10] [<000454d6>] cpu_startup_entry+0x74/0xd6
[77569.11] [<0002721c>] kernel_thread+0x0/0x24
[77569.12] [<000f0abc>] strlen+0x0/0x14
[77569.13] [<001d307a>] rest_init+0x5e/0x66
[77569.14] [<002ca6e6>] start_kernel+0x38c/0x398
[77569.15] [<37ee>] setup_rt_frame+0x400/0x4be
[77569.16] [<37ee>] setup_rt_frame+0x400/0x4be
[77569.17] [<002c8854>] _sinittext+0x854/0x11ac
[77569.18]
[77569.19] Code: 4a80 6704 4c42 0001 2c01 2207 4c42 1406 <2c00> 2e01 2004 2204 6704 4c42 0001 2801 2205 4c42 1404 2800 2a01 202e fff8 222e
[77569.35] Disabling lock debugging due to kernel taint
[77569.36] Kernel panic - not syncing: Aiee, killing interrupt handler!
[77611.97] amikbd: Ctrl-Amiga-Amiga reset warning!!

I don't know whether this is related to the ESP driver or not, but maybe someone is better at reading this kind of output and can judge on this... :-)

--
Ciao...    //    Fon: 0381-2744150 .
  Ingo   \X/     http://blog.windfluechter.net
gpg pubkey: http://www.juergensmann.de/ij_public_key.

--
To unsubscribe from this list: send the line "unsubscribe linux-m68k" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

--
To UNSUBSCRIBE, email to debian-68k-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/f8389c591d50fe216b546e70d464c...@biophys.uni-duesseldorf.de
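A note on reading the trace above: the PC line gives the location as symbol+offset/length (sd_completed_bytes+0x90/0xe8), and the frame line gives the address of the faulting instruction (instr addr=0484c336). The small C sketch below works out where that instruction sits inside the function. It is only an illustration of the arithmetic; the helper names symbol_start and offset_in_symbol are hypothetical and not part of any kernel tool.

```c
#include <stdint.h>

/* Hypothetical helper: load address of a function, given the PC and the
 * "+0xNN" offset printed next to the symbol name in the oops. */
static uint32_t symbol_start(uint32_t pc, uint32_t offset)
{
	return pc - offset;
}

/* Hypothetical helper: offset of any address inside that function. */
static uint32_t offset_in_symbol(uint32_t addr, uint32_t start)
{
	return addr - start;
}
```

With the values from the trace, symbol_start(0x0484c33a, 0x90) gives 0x0484c2aa, and offset_in_symbol(0x0484c336, 0x0484c2aa) gives 0x8c. That offset is what one would look up in a disassembly of sd_mod, which is why the module and its symbol map are needed.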
Re: Zero Divide in Kernel 3.12-rc4
On Mon, Oct 21, 2013 at 9:34 AM, Michael Schmitz wrote:
> this looks like it might be related to the ESP driver - scsi_finish_command
> called from the swapper process during apt-get dist-upgrade does seem
> plausible.
>
> Some of the Amiga SCSI drivers did fiddle with the chip interrupt enable on
> SCSI interrupt entry, but I'd have thought the ESP core is reasonably
> thread-safe these days.
>
> To pinpoint where in sd_completed_bytes this happens, I'd need the sd_mod
> module and the module symbol map.

        /* be careful ... don't want any overflows */
        u64 factor = scmd->device->sector_size / 512;
        do_div(start_lba, factor);
        do_div(end_lba, factor);

scmd->device->sector_size should be 512, so factor should be 1.

Let's try a bit harder with a fresher mind and a cup of coffee and a mini-twix:

>> [77568.32] PC: [<0484c33a>] sd_completed_bytes+0x90/0xe8 [sd_mod]
>> [77568.33] SR: 2000 SP: 00277e58 a2: 0027e2e4
>> [77568.34] d0: d1: 007735a0 d2: d3: 0001
>> [77568.35] d4: d5: 007735a8 a0: 024dd000 a1: 024a0ea0

>> [77569.19] Code: 4a80 6704 4c42 0001 2c01 2207 4c42 1406 <2c00> 2e01
>> 2004 2204 6704 4c42 0001 2801 2205 4c42 1404 2800 2a01 202e fff8 222e

"4c42" is a division. It's the second one of the four divisions:

   0: 4a80         tstl %d0

d0 is zero, so the first division is skipped.

   2: 6704         beqs 0x8
   4: 4c42 0001    divull %d2,%d1,%d0
   8: 2c01         movel %d1,%d6
   a: 2207         movel %d7,%d1
   c: 4c42 1406    divul %d2,%d6,%d1

It's dividing by d2, which is zero.
So scmd->device->sector_size must be smaller than 512 (probably zero).

  10: 2c00         movel %d0,%d6
  12: 2e01         movel %d1,%d7
  14: 2004         movel %d4,%d0
  16: 2204         movel %d4,%d1
  18: 6704         beqs 0x1e
  1a: 4c42 0001    divull %d2,%d1,%d0
  1e: 2801         movel %d1,%d4
  20: 2205         movel %d5,%d1
  22: 4c42 1404    divul %d2,%d4,%d1
  26: 2800         movel %d0,%d4
  28: 2a01         movel %d1,%d5
  2a: 202e fff8    movel %fp@(-8),%d0

The posted binary has slightly different code (different addresses, and the
division is "4c40"):

00168404 :
  168404: 4e56 fff8         linkw %fp,#-8
  168408: 48e7 3f1c         moveml %d2-%d7/%a3-%a5,%sp@-
  16840c: 266e 0008         moveal %fp@(8),%a3
  168410: 206b 0054         moveal %a3@(84),%a0
  168414: 2828 0032         movel %a0@(50),%d4
  168418: 2a28 0036         movel %a0@(54),%d5
  16841c: 2c2b 0040         movel %a3@(64),%d6
  168420: 2e2b 0044         movel %a3@(68),%d7
  168424: 7001              moveq #1,%d0
  168426: b0a8 0022         cmpl %a0@(34),%d0
  16842a: 6600 00b2         bnew 1684de
  16842e: 486e fff8         pea %fp@(-8)
  168432: 4878 0060         pea 60
  168436: 2f2b 0058         movel %a3@(88),%sp@-
  16843a: 4eb9 0015 4e86    jsr 154e86
  168440: 4fef 000c         lea %sp@(12),%sp
  168444: 4a80              tstl %d0
  168446: 6700 0096         beqw 1684de
  16844a: 2053              moveal %a3@,%a0
  16844c: 2028 0054         movel %a0@(84),%d0
  168450: b0ab 0040         cmpl %a3@(64),%d0
  168454: 6400 0088         bccw 1684de
  168458: 2206              movel %d6,%d1
  16845a: 7409              moveq #9,%d2
  16845c: e4a9              lsrl %d2,%d1
  16845e: 2601              movel %d1,%d3
  168460: 4202              clrb %d2
  168462: d685              addl %d5,%d3
  168464: d584              addxl %d4,%d2
  168466: 0c80 01ff         cmpil #511,%d0
  16846c: 6212              bhis 168480
  16846e: da85              addl %d5,%d5
  168470: d984              addxl %d4,%d4
  168472: 2002              movel %d2,%d0
  168474: 2203              movel %d3,%d1
  168476: d281              addl %d1,%d1
  168478: d180              addxl %d0,%d0
  16847a: 2840              moveal %d0,%a4
  16847c: 2a41              moveal %d1,%a5
  16847e: 602a              bras 1684aa
  168480: 7209              moveq #9,%d1
  168482: e2a8              lsrl %d1,%d0
  168484: 2204              movel %d4,%d1
  168486: 2045              moveal %d5,%a0
  168488: 6704              beqs 16848e
  16848a: 4c40 1004         divull %d0,%d4,%d1
  16848e: 2a08              movel %a0,%d5
  168490: 4c40 5404         divul %d0,%d4,%d5
  168494: 2801              movel %d1,%d4
  168496: 2202              movel %d2,%d1
  168498: 2043              moveal %d3,%a0
  16849a: 6704              beqs 1684a0
  16849c: 4c40 1002         divull %d0,%d2,%d1
  1684a0: 2608              movel %a0,%d3
  1684a2: 4c40 3402         divul %d0,%d2,%d3
  1684a6: 2841              moveal %d1,%a4
  1684a8: 2a43              moveal %d3,%a5
  1684aa: 202e fff8         m
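
The arithmetic behind the conclusion above (a zero divisor means sector_size was below 512) can be sketched in plain C. This is a userspace illustration under stated assumptions, not the sd.c code: factor_for and scale_lba are hypothetical names, and ordinary 64-bit division stands in for the kernel's do_div(). It shows that integer division by 512 turns any sector_size below 512 into a factor of zero, and how a caller might reject that case before dividing.

```c
#include <stdint.h>

/* Hypothetical helper: the factor computed in the snippet quoted above. */
static uint64_t factor_for(unsigned int sector_size)
{
	return sector_size / 512;	/* integer division: 0 for anything below 512 */
}

/*
 * Hypothetical guarded scaling: divides *lba by the factor and returns 0,
 * or returns -1 and leaves *lba untouched when the factor would be zero.
 */
static int scale_lba(uint64_t *lba, unsigned int sector_size)
{
	uint64_t factor = factor_for(sector_size);

	if (factor == 0)
		return -1;	/* dividing here would trap, as in the oops */
	*lba /= factor;		/* stands in for do_div(*lba, factor) */
	return 0;
}
```

For example, factor_for(512) is 1 and factor_for(4096) is 8, while factor_for(256) and factor_for(0) are both 0, which is the case scale_lba refuses to divide by.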
Re: Linux 3.1-2-m68k config
Finn Thain dixit:

>CONFIG_ADB_MACIISI is disabled in -multi due to crashing bugs. Debian
>might want to do the same (until I get a chance to fix it). The serial

OK, thanks.

>Below is the same list after sort -k4 -k5 -k2. Note that some modules in
>-multi are disabled in the Debian config. That may or may not help you

Some do, yes.

>! CONFIG_DEFAULT_IOSCHED : "noop" "cfq"

size

>! CONFIG_FAT_DEFAULT_IOCHARSET : "utf8" "iso8859-1"
>! CONFIG_NLS_DEFAULT : "utf8" "iso8859-1"

by design

>! CONFIG_LOGO_LINUX_CLUT224: . y
>! CONFIG_LOGO_LINUX_MONO : . y
>! CONFIG_LOGO_LINUX_VGA16 : . y
>! CONFIG_LOGO_MAC_CLUT224 : . y

I never liked those boot logos tbh, and Debian disables them too.

>! CONFIG_ROOT_NFS : . y

by design (we use initrd)

>! CONFIG_SUN3LANCE : . y
>! CONFIG_SUN3X_ESP : . y

I think Sun3 support can’t coexist with the others, or at least I’ve
read about it. Anyway, I didn’t even enable all platforms in the
Debian kernel.

But if you think I should change any more than the three already
listed (fpu, brk, CONFIG_ADB_MACIISI) please say so. (Though the
bit about users triggering things is also true.)

bye,
//mirabilos
--
Beware of ritual lest you forget the meaning behind it.
yeah but it means if you really care about something, don't
ritualise it, or you will lose it. don't fetishise it, don't
obsess. or you'll forget why you love it in the first place.

Archive: http://lists.debian.org/pine.bsm.4.64l.1310210900100.3...@herc.mirbsd.org
Re: Bits from the Release Team (Jessie freeze info)
On 2013-10-19 16:38, Jeremiah C. Foster wrote:
> Hello,
>
> On Sun, Oct 13, 2013 at 05:01:31PM +0200, Niels Thykier wrote:
>
> [snip freeze policy]
>

Hi,

I s/-arm/-ports/'ed the CC, since I figured the rest of the porters
would find the answer equally interesting.

>> Results of porter roll-call
>> ===
>>
>> [...]
>>
>> That said, we would like to encourage porters behind all ports to
>> ensure that the toolchain is up to date and working. We are aware of
>> at least gcc on mips having its test suite disabled[GCC]. Other ports
>> may suffer from similar issues and we hope to have those resolved
>> sooner rather than later. We are currently waiting for the gcc
>> maintainers to compile a list of such issues.
>
> So I can extrapolate from this that ensuring that the toolchain is up
> to date and working is a key activity of a porter.

Yes; build-essential being broken is obviously a problem. But having
the same default compiler on all architectures is also desired.

> If my assumption is
> correct, is there a complete definition of the "toolchain" as we see
> it in Debian that a porter might reasonably be expected to use to do
> thier porting?
>

I do not have a complete list of packages, although it will definitely
include build-essential. My intuition is that "toolchain" should
include any compiler used by packages on that architecture[1] (e.g. if
the arch has built haskell packages, it should have a working haskell
compiler as well). But as said, that is my personal view and not an
official statement.

> In addition, I wonder if there is a way to report the status of the
> toolchain and what sort of expectations are there around "up to date"?

I would love for us to have an automated system to give us a
"weather-report" on the toolchain for each architecture. It would be
nice both for us to see how ports are doing and for porters to spot and
fix problems early.

As for up-to-date, I don't have a complete answer here. I seem to
remember the GCC maintainers being frustrated at having to maintain
gcc-4.6 (it is apparently still default for some architectures) despite
gcc-4.8 being the latest stable release.

> Is it expected to build Debian toolchain nightly and run a specific
> test suite? Is the expectation that one uses pbuilder and builds a set
> of packages?

What we have in the policy so far[2]:

"""
Installer: The architecture must have a working, tested installer.
[...]
Archive coverage: The architecture needs to have successfully compiled
the current version of the overwhelming part of the archive [...]
"""

Which implies "a set of packages" being "the current version of the
overwhelming part of the archive" plus all of d-i. However, that is not
something you "just build", so having a smaller set as a basic test
would probably be way more useful. I am not aware of such a "basic test
set", so feel free to propose one.

I like the "toolchain nightly" thing as well. I don't think it is
"required", but it sounds like the kind of thing that would help people
spot issues sooner rather than later!

> Perhaps this is outlined on the wiki somewhere and if not
> perhaps it ought to be?
>
> Regards,
>
> Jeremiah
>
>

Having documentation on it would definitely be a good thing. For actual
requirements, we should add them to the policy[2], but having a
wiki-page of "recommended porter practises/tests" would probably be a
nice addition too.

~Niels

[1] My rationale for this is that we would like to be able to
rebuild/reproduce builds, which would require a working compiler.

[2] http://release.debian.org/jessie/arch_policy.html

Archive: http://lists.debian.org/52654b6d.9020...@thykier.net