Re: Zero Divide in Kernel 3.12-rc4

2013-10-21 Thread Michael Schmitz

Ingo,

this looks like it might be related to the ESP driver - 
scsi_finish_command called from the swapper process during apt-get 
dist-upgrade does seem plausible.


Some of the Amiga SCSI drivers did fiddle with the chip interrupt 
enable on SCSI interrupt entry, but I'd have thought the ESP core is 
reasonably thread-safe these days.


To pinpoint where in sd_completed_bytes this happens, I'd need the 
sd_mod module and the module symbol map.
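
For orientation: the oops line "sd_completed_bytes+0x90/0xe8" already fixes the position inside the function, and the module plus symbol map are what tie that offset back to a source line. Below is a minimal sketch of the address arithmetic in plain user-space C. The struct and helper names are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for one "symbol+0xoff/0xsize" entry from an oops. */
struct oops_loc {
    char symbol[64];
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
};

/* Parse text such as "sd_completed_bytes+0x90/0xe8"; returns 0 on success. */
static int parse_oops_loc(const char *text, struct oops_loc *loc)
{
    assert(text != NULL && loc != NULL);
    if (sscanf(text, "%63[^+]+0x%lx/0x%lx",
               loc->symbol, &loc->offset, &loc->size) != 3)
        return -1;
    /* An offset beyond the function size means the entry is malformed. */
    return loc->offset <= loc->size ? 0 : -1;
}

/* Load address of the function, given the PC reported in the oops. */
static unsigned long function_base(unsigned long pc, const struct oops_loc *loc)
{
    return pc - loc->offset;
}
```

With the values from the oops (PC 0484c33a, offset 0x90), function_base() yields 0484c2aa, so the reported "instr addr=0484c336" sits at offset 0x8c, four bytes before the PC.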


Cheers,

Michael

I'm testing the port of the ESP SCSI driver by Tuomas and Michael on
3.12-rc4 and have now hit this kernel panic during heavy disk activity
(apt-get dist-upgrade in parallel with an rsync backup by BackupPC):


Debian GNU/Linux jessie/sid spice ttyS0

spice login: [77568.07] *** ZERO DIVIDE ***   FORMAT=2
[77568.08] Current process id is 0
[77568.09] BAD KERNEL TRAP: 
[77568.10] Modules linked in: xt_multiport iptable_filter 
ip_tables x_tables ipv6 8390 loop evdev dmasound_paula mac_hid 
dmasound_core parport_amiga soundcore parport amimouse ext3 mbcache 
jbd dm_mod nbd sg sd_mod zorro7xx 53c700 hydra amiflop a3000

[77568.32] PC: [<0484c33a>] sd_completed_bytes+0x90/0xe8 [sd_mod]
[77568.33] SR: 2000  SP: 00277e58  a2: 0027e2e4
[77568.34] d0: d1: 007735a0 d2: d3: 0001
[77568.35] d4: d5: 007735a8 a0: 024dd000 a1: 024a0ea0

[77568.36] Process swapper (pid: 0, task=0027e2e4)
[77568.37] Frame format=2 instr addr=0484c336
[77568.39] Stack from 00277e90:
 0812  0001 00200028 0004 0249d120 
02be3090
0272c9e0  007735a0 00277f04 0484c5f8 0249d120 00277f30 
000a
00276000 0100 0020 0004 0249d120 1000 02460614 
002b9480
2002 0bb8 0249d100 70040200  024dd400 0013f838 
0249d120
00277f30 002b9480 00276000 001d38e2 000e1cec 0249d120 0001 
00276000
00277f30 00277f30 0002c8da 002b9480 00272704 000f 2598 
08031470

[77568.95] Call Trace: [<0484c5f8>] sd_done+0x1d6/0x2aa [sd_mod]
[77568.97]  [<1000>] kernel_pg_dir+0x0/0x1000
[77568.98]  [<2002>] _start+0x2/0x8
[77568.99]  [<0013f838>] scsi_finish_command+0x8e/0xb2
[77569.00]  [<001d38e2>] printk+0x0/0x24
[77569.01]  [<000e1cec>] blk_done_softirq+0x90/0x9c
[77569.02]  [<0002c8da>] __do_softirq+0xa2/0x12a
[77569.03]  [<2598>] badsys+0x6/0xa
[77569.04]  [<00012b08>] slognd+0x74/0x8a
[77569.05]  [<>] res_func+0x101f/0x141a
[77569.06]  [<001d6944>] schedule_preempt_disabled+0x0/0xe
[77569.07]  [<0002ca04>] do_softirq+0x2c/0x32
[77569.08]  [<264c>] ret_from_exception+0x0/0xc
[77569.09]  [<2598>] badsys+0x6/0xa
[77569.10]  [<000454d6>] cpu_startup_entry+0x74/0xd6
[77569.11]  [<0002721c>] kernel_thread+0x0/0x24
[77569.12]  [<000f0abc>] strlen+0x0/0x14
[77569.13]  [<001d307a>] rest_init+0x5e/0x66
[77569.14]  [<002ca6e6>] start_kernel+0x38c/0x398
[77569.15]  [<37ee>] setup_rt_frame+0x400/0x4be
[77569.16]  [<37ee>] setup_rt_frame+0x400/0x4be
[77569.17]  [<002c8854>] _sinittext+0x854/0x11ac
[77569.18]
[77569.19] Code: 4a80 6704 4c42 0001 2c01 2207 4c42 1406 <2c00> 
2e01 2004 2204 6704 4c42 0001 2801 2205 4c42 1404 2800 2a01 202e fff8 
222e

[77569.35] Disabling lock debugging due to kernel taint
[77569.36] Kernel panic - not syncing: Aiee, killing interrupt 
handler!

[77611.97] amikbd: Ctrl-Amiga-Amiga reset warning!!


I don't know whether this is related to the ESP driver or not, but 
maybe someone is better at reading this kind of output and can judge 
on this... :-)


--
Ciao...  //Fon: 0381-2744150
. Ingo \X/ http://blog.windfluechter.net

gpg pubkey: http://www.juergensmann.de/ij_public_key.
--
To unsubscribe from this list: send the line "unsubscribe linux-m68k" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To UNSUBSCRIBE, email to debian-68k-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: 
http://lists.debian.org/f8389c591d50fe216b546e70d464c...@biophys.uni-duesseldorf.de



Re: Zero Divide in Kernel 3.12-rc4

2013-10-21 Thread Geert Uytterhoeven
On Mon, Oct 21, 2013 at 9:34 AM, Michael Schmitz wrote:
> this looks like it might be related to the ESP driver - scsi_finish_command
> called from the swapper process during apt-get dist-upgrade does seem
> plausible.
>
> Some of the Amiga SCSI drivers did fiddle with the chip interrupt enable on
> SCSI interrupt entry, but I'd have thought the ESP core is reasonably
> thread-safe these days.
>
> To pinpoint where in sd_completed_bytes this happens, I'd need the sd_mod
> module and the module symbol map.

/* be careful ... don't want any overflows */
u64 factor = scmd->device->sector_size / 512;
do_div(start_lba, factor);
do_div(end_lba, factor);

scmd->device->sector_size should be 512, so factor should be 1.

Let's try a bit harder with a fresher mind and a cup of coffee and
a mini-twix:

>> [77568.32] PC: [<0484c33a>] sd_completed_bytes+0x90/0xe8 [sd_mod]
>> [77568.33] SR: 2000  SP: 00277e58  a2: 0027e2e4
>> [77568.34] d0: d1: 007735a0 d2: d3: 0001
>> [77568.35] d4: d5: 007735a8 a0: 024dd000 a1: 024a0ea0

>> [77569.19] Code: 4a80 6704 4c42 0001 2c01 2207 4c42 1406 <2c00> 2e01
>> 2004 2204 6704 4c42 0001 2801 2205 4c42 1404 2800 2a01 202e fff8 222e

"4c42" is a division. It's the second one of the four divisions:

   0: 4a80   tstl %d0

d0 is zero, so the first division is skipped.

   2: 6704   beqs 0x8
   4: 4c42 0001   divull %d2,%d1,%d0
   8: 2c01   movel %d1,%d6
   a: 2207   movel %d7,%d1
   c: 4c42 1406   divul %d2,%d6,%d1

It's dividing by d2, which is zero. So scmd->device->sector_size must be
smaller than 512 (probably zero).

  10: 2c00   movel %d0,%d6
  12: 2e01   movel %d1,%d7
  14: 2004   movel %d4,%d0
  16: 2204   movel %d4,%d1
  18: 6704   beqs 0x1e
  1a: 4c42 0001   divull %d2,%d1,%d0
  1e: 2801   movel %d1,%d4
  20: 2205   movel %d5,%d1
  22: 4c42 1404   divul %d2,%d4,%d1
  26: 2800   movel %d0,%d4
  28: 2a01   movel %d1,%d5
  2a: 202e fff8   movel %fp@(-8),%d0
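
A minimal sketch of the arithmetic behind the zero-divisor conclusion above, in plain user-space C rather than kernel code: integer division makes the factor zero for any sector_size below 512, and a zero factor is what the divide instruction traps on. The helper names and the guard are assumptions for illustration only, not what sd.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same expression as in the quoted source: sector_size / 512. */
static uint64_t lba_factor(unsigned int sector_size)
{
    return sector_size / 512u;
}

/*
 * Hypothetical guarded scaling: stores lba / factor in *out and returns 0,
 * or returns -1 when the factor would be zero (sector_size < 512).
 */
static int scale_lba(uint64_t lba, unsigned int sector_size, uint64_t *out)
{
    uint64_t factor = lba_factor(sector_size);

    assert(out != NULL);
    if (factor == 0)
        return -1;      /* dividing here would be the zero divide */
    *out = lba / factor;
    return 0;
}
```

For sector_size 512 the factor is 1 and the LBA is unchanged. For 0 or 256 the factor is 0, which matches a zero divisor in d2.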

The posted binary has slightly different code (different addresses, and the
division is "4c40"):

00168404 :
  168404:   4e56 fff8   linkw %fp,#-8
  168408:   48e7 3f1c   moveml %d2-%d7/%a3-%a5,%sp@-
  16840c:   266e 0008   moveal %fp@(8),%a3
  168410:   206b 0054   moveal %a3@(84),%a0
  168414:   2828 0032   movel %a0@(50),%d4
  168418:   2a28 0036   movel %a0@(54),%d5
  16841c:   2c2b 0040   movel %a3@(64),%d6
  168420:   2e2b 0044   movel %a3@(68),%d7
  168424:   7001        moveq #1,%d0
  168426:   b0a8 0022   cmpl %a0@(34),%d0
  16842a:   6600 00b2   bnew 1684de
  16842e:   486e fff8   pea %fp@(-8)
  168432:   4878 0060   pea 60
  168436:   2f2b 0058   movel %a3@(88),%sp@-
  16843a:   4eb9 0015 4e86  jsr 154e86
  168440:   4fef 000c   lea %sp@(12),%sp
  168444:   4a80        tstl %d0
  168446:   6700 0096   beqw 1684de
  16844a:   2053        moveal %a3@,%a0
  16844c:   2028 0054   movel %a0@(84),%d0
  168450:   b0ab 0040   cmpl %a3@(64),%d0
  168454:   6400 0088   bccw 1684de
  168458:   2206        movel %d6,%d1
  16845a:   7409        moveq #9,%d2
  16845c:   e4a9        lsrl %d2,%d1
  16845e:   2601        movel %d1,%d3
  168460:   4202        clrb %d2
  168462:   d685        addl %d5,%d3
  168464:   d584        addxl %d4,%d2
  168466:   0c80  01ff  cmpil #511,%d0
  16846c:   6212        bhis 168480
  16846e:   da85        addl %d5,%d5
  168470:   d984        addxl %d4,%d4
  168472:   2002        movel %d2,%d0
  168474:   2203        movel %d3,%d1
  168476:   d281        addl %d1,%d1
  168478:   d180        addxl %d0,%d0
  16847a:   2840        moveal %d0,%a4
  16847c:   2a41        moveal %d1,%a5
  16847e:   602a        bras 1684aa
  168480:   7209        moveq #9,%d1
  168482:   e2a8        lsrl %d1,%d0
  168484:   2204        movel %d4,%d1
  168486:   2045        moveal %d5,%a0
  168488:   6704        beqs 16848e
  16848a:   4c40 1004   divull %d0,%d4,%d1
  16848e:   2a08        movel %a0,%d5
  168490:   4c40 5404   divul %d0,%d4,%d5
  168494:   2801        movel %d1,%d4
  168496:   2202        movel %d2,%d1
  168498:   2043        moveal %d3,%a0
  16849a:   6704        beqs 1684a0
  16849c:   4c40 1002   divull %d0,%d2,%d1
  1684a0:   2608        movel %a0,%d3
  1684a2:   4c40 3402   divul %d0,%d2,%d3
  1684a6:   2841        moveal %d1,%a4
  1684a8:   2a43        moveal %d3,%a5
  1684aa:   202e fff8   m
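
The opcode words quoted in the two listings ("4c42" in the oops, "4c40" in the posted binary) can be decoded mechanically. The sketch below assumes the usual 68020 long-divide layout: opcode word 0x4C40 plus the effective-address field, followed by an extension word holding the quotient register, a signed bit, a 64-bit-dividend bit and the remainder register. That layout is stated from memory and should be checked against the Programmer's Reference Manual. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a long divide with a data-register divisor. */
struct m68k_divl {
    unsigned divisor_reg;   /* EA register field of the opcode word */
    unsigned dq;            /* quotient register, extension bits 14-12 */
    unsigned dr;            /* remainder register, extension bits 2-0 */
    int is_signed;          /* extension bit 11: 1 = signed divide */
    int dividend64;         /* extension bit 10: 1 = 64-bit dividend Dr:Dq */
};

/*
 * Returns 0 and fills *d when op is 0x4C40..0x4C47 (EA mode 000, i.e. the
 * divisor is a data register); returns -1 for anything else.
 */
static int decode_divl(uint16_t op, uint16_t ext, struct m68k_divl *d)
{
    assert(d != NULL);
    if ((op & 0xFFF8u) != 0x4C40u)
        return -1;
    d->divisor_reg = op & 7u;
    d->dq = (ext >> 12) & 7u;
    d->dr = ext & 7u;
    d->is_signed = (ext >> 11) & 1u;
    d->dividend64 = (ext >> 10) & 1u;
    return 0;
}
```

Under that assumption, "4c42 1406" decodes to divisor d2, remainder d6, quotient d1, which agrees with the "divul %d2,%d6,%d1" shown in the disassembly of the oops code bytes. "4c40 1004" decodes to divisor d0, so in the posted binary the factor lives in d0 instead of d2.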

Re: Linux 3.1-2-m68k config

2013-10-21 Thread Thorsten Glaser
Finn Thain dixit:

>CONFIG_ADB_MACIISI is disabled in -multi due to crashing bugs. Debian 
>might want to do the same (until I get a chance to fix it). The serial 

OK, thanks.

>Below is the same list after sort -k4 -k5 -k2. Note that some modules in 
>-multi are disabled in the Debian config. That may or may not help you 

Some do, yes.

>! CONFIG_DEFAULT_IOSCHED   : "noop" "cfq"

size

>! CONFIG_FAT_DEFAULT_IOCHARSET : "utf8" "iso8859-1"
>! CONFIG_NLS_DEFAULT   : "utf8" "iso8859-1"

by design

>! CONFIG_LOGO_LINUX_CLUT224: . y
>! CONFIG_LOGO_LINUX_MONO   : . y
>! CONFIG_LOGO_LINUX_VGA16  : . y
>! CONFIG_LOGO_MAC_CLUT224  : . y

I never liked those boot logos tbh, and Debian disables them too.

>! CONFIG_ROOT_NFS  : . y

by design (we use initrd)

>! CONFIG_SUN3LANCE : . y
>! CONFIG_SUN3X_ESP : . y

I think Sun3 support can’t coexist with the others, or at
least I’ve read about it. Anyway, I didn’t even enable all
platforms in the Debian kernel.

But if you think I should change any more than the three
already listed (fpu, brk, CONFIG_ADB_MACIISI) please say
so. (Though the bit about users triggering things is also
true.)

bye,
//mirabilos
-- 
 Beware of ritual lest you forget the meaning behind it.
 yeah but it means if you really care about something, don't
ritualise it, or you will lose it. don't fetishise it, don't
obsess. or you'll forget why you love it in the first place.


Archive: 
http://lists.debian.org/pine.bsm.4.64l.1310210900100.3...@herc.mirbsd.org



Re: Bits from the Release Team (Jessie freeze info)

2013-10-21 Thread Niels Thykier
On 2013-10-19 16:38, Jeremiah C. Foster wrote:
> Hello,
> 
> On Sun, Oct 13, 2013 at 05:01:31PM +0200, Niels Thykier wrote:
> 
> [snip freeze policy]
>  

Hi,

I s/-arm/-ports/'ed the CC, since I figured the rest of the porters
would find the answer equally interesting.

>> Results of porter roll-call
>> ===
>>
>> [...]
>>
>> That said, we would like to encourage porters behind all ports to
>> ensure that the toolchain is up to date and working.  We are aware of
>> at least gcc on mips having its test suite disabled[GCC].  Other ports
>> may suffer from similar issues and we hope to have those resolved
>> sooner rather than later.  We are currently waiting for the gcc
>> maintainers to compile a list of such issues.
> 
> So I can extrapolate from this that ensuring that the toolchain is up
> to date and working is a key activity of a porter.

Yes; build-essential being broken is obviously a problem.  But having
the same default compiler on all architectures is also desired.

> If my assumption is
> correct, is there a complete definition of the "toolchain" as we see
> it in Debian that a porter might reasonably be expected to use to do
> their porting?
> 

I do not have a complete list of packages, although it will definitely
include build-essential.  My intuition is that "toolchain" should
include any compiler used by packages on that architecture[1] (e.g. if
the arch has built haskell packages, it should have a working haskell
compiler as well).  But as said, that is my personal view and not an
official statement.

> In addition, I wonder if there is a way to report the status of the
> toolchain and what sort of expectations are there around "up to date"?

I would love for us to have an automated system to give us a
"weather-report" on the toolchain for each architecture.  It would be
nice both for us to see how ports are doing and for porters to spot and
fix problems early.
  As for up-to-date, I don't have a complete answer here.  I seem to
remember the GCC maintainers being frustrated at having to maintain
gcc-4.6 (it is apparently still default for some architectures) despite
gcc-4.8 being the latest stable release.

> Is it expected to build Debian toolchain nightly and run a specific
> test suite? Is the expectation that one uses pbuilder and builds a set
> of packages?

What we have in the policy so far[2]:

"""
Installer: The architecture must have a working, tested installer.
[...]

Archive coverage: The architecture needs to have successfully compiled
the current version of the overwhelming part of the archive [...]
"""

Which implies "a set of packages" being "the current version of the
overwhelming part of the archive" plus all of d-i.  However, that is not
something you "just build", so having a smaller set as a basic test
would probably be way more useful.  I am not aware of such a "basic test
set", so feel free to propose one.

I like the "toolchain nightly" thing as well. I don't think it is
"required", but it sounds like the kind of thing that would help people
spot issues sooner rather than later!

> Perhaps this is outlined on the wiki somewhere and if not
> perhaps it ought to be?
> 
> Regards,
> 
> Jeremiah
> 
> 

Having documentation on it would definitely be a good thing.  For actual
requirements, we should add them to the policy[2], but having a
wiki-page of "recommended porter practises/tests" would probably be a
nice addition too.

~Niels

[1] My rationale for this is that we would like to be able to
rebuild/reproduce builds, which would require a working compiler.

[2] http://release.debian.org/jessie/arch_policy.html



Archive: http://lists.debian.org/52654b6d.9020...@thykier.net