Re: minidump size on amd64

2010-10-01 Thread Andriy Gapon
on 30/09/2010 01:26 Alan Cox said the following:
>  On 9/29/2010 3:41 PM, Andriy Gapon wrote:
>> So perhaps we need to add another level of indirection?
>> I.e. first dump contiguous array of "pseudo-pde" entries that would point to
>> chunks of "pseudo-pte" entries, so that "pseudo-pte" entries could be sparse.
>> This is instead of dumping 1GB of contiguous "pseudo-ptes" as we do now.
>>
> 
> That would be the best approach.  That said, for the foreseeable future, the
> kernel page table on amd64 will have two valid ranges, no more, no less.  So,
> if it's much easier to modify minidump to deal with a page table that is
> assumed to have two contiguous parts, just do it.  That assumption should
> remain valid for a few years.

I tried to go for the full thing, at least as I understood it.
Two hours of hacking frenzy and here is the result:

http://people.freebsd.org/~avg/amd64-minidump.diff

I hope that the code is not too ugly.  At least my testing shows that it's not
(badly) broken.

The idea: only the pages holding PDEs (both valid and invalid PDEs) are dumped
contiguously; valid pages holding PTEs are dumped the same way as physical data
pages (i.e. via dump_add_page, etc.), with no fake PTEs for 2MB pages.
The PDE area of the dump takes about 20MB as opposed to 1GB for the PTE area
(the math is obvious, but just in case).

libkvm is changed to treat the former PTE area as a PDE area and is also taught
to understand PG_PS in a PDE.
There is now the overhead of having to first read a PTE page in the
V-to-P-to-offset lookup for the !PG_PS case.  Perhaps we could cache all PTEs in
memory and have a lookup table for them, but I didn't bother with this possibly
premature optimization at this time.
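
A minimal sketch of the two-step lookup described above, not the code from the patch. The function name sketch_vtop, the read_pte_page callback and the toy PTE page are hypothetical; the PG_* masks follow the usual amd64 layout but are assumptions here. A PDE with PG_PS set resolves directly to a 2MB page; otherwise the PTE page it points to has to be fetched first, which is the extra read mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12                     /* 4KB pages */
#define PDR_SHIFT    21                     /* one PDE maps 2MB */
#define ENTRIES      512                    /* entries per page-table page */
#define PG_V         0x001ULL               /* entry is valid */
#define PG_PS        0x080ULL               /* PDE maps a 2MB superpage */
#define PG_FRAME     0x000ffffffffff000ULL  /* 4KB frame bits */
#define PG_PS_FRAME  0x000fffffffe00000ULL  /* 2MB frame bits */

/* Hypothetical callback: return the dumped PTE page at physical address pa. */
typedef const uint64_t *(*read_pte_page_fn)(uint64_t pa);

/* Toy PTE page standing in for a page found through the dump bitmap. */
const uint64_t demo_pte_page[ENTRIES] = { [3] = 0x7000ULL | PG_V };

const uint64_t *
demo_read_pte_page(uint64_t pa)
{
	return (pa == 0x1000ULL ? demo_pte_page : NULL);
}

/*
 * Translate va to a physical address using a contiguous PDE area that
 * starts at kva_base.  Returns 0 on success, -1 if va is not mapped.
 */
int
sketch_vtop(const uint64_t *pde_area, size_t npde, uint64_t kva_base,
    read_pte_page_fn read_pte_page, uint64_t va, uint64_t *pa)
{
	const uint64_t *ptes;
	uint64_t pde, pte;
	size_t pdeidx;

	assert(pde_area != NULL && pa != NULL);
	if (va < kva_base)
		return (-1);
	pdeidx = (size_t)((va - kva_base) >> PDR_SHIFT);
	if (pdeidx >= npde)
		return (-1);
	pde = pde_area[pdeidx];
	if ((pde & PG_V) == 0)
		return (-1);
	if ((pde & PG_PS) != 0) {
		/* 2MB page: no PTE page exists, the PDE holds the frame. */
		*pa = (pde & PG_PS_FRAME) | (va & ((1ULL << PDR_SHIFT) - 1));
		return (0);
	}
	/* 4KB page: the extra read of the PTE page happens here. */
	ptes = read_pte_page(pde & PG_FRAME);
	if (ptes == NULL)
		return (-1);
	pte = ptes[(va >> PAGE_SHIFT) & (ENTRIES - 1)];
	if ((pte & PG_V) == 0)
		return (-1);
	*pa = (pte & PG_FRAME) | (va & ((1ULL << PAGE_SHIFT) - 1));
	return (0);
}
```

A real consumer would still have to map the resulting physical address to a file offset through the dump bitmap; that step is left out of the sketch.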

There is an unrelated change in minidumpsys - "bitmap_frozen".
I had to do it despite having a patch in my local tree to stop other CPUs on
panic->dump.  Code in the dump path (peripheral disk driver, CAM, SIM driver,
something else?) seems to do some memory allocations and change the dump bitmap,
which leads to a mismatch between the dump size and the dump bitmap, and also
potentially to inconsistencies in the bitmap itself.  So I decided that it's a
good idea to freeze the bitmap once we have decided which pages we want to dump.

Some variables and structure fields with 'pte' in them should probably be
renamed to have 'pde' instead.

What do you think?
I will appreciate reviews, testing, comments, etc.
Thanks!
-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Examining the VM splay tree effectiveness

2010-10-01 Thread Andre Oppermann

On 30.09.2010 19:51, Ivan Voras wrote:

On 09/30/10 18:37, Andre Oppermann wrote:


Both the vmmap and page table make use of splay trees to manage the
entries and to speed up lookups compared to slow-to-traverse linked
lists or more memory-expensive hash tables.  Some structures though
do have an additional linked list to simplify ordered traversals.


The property of splay tree requiring *writes* for nearly every read
really is a thorn in the eye for SMP. It seems to me that even if the
immediate benefits from converting to something else are not directly
observable, it will still be worth doing it.


Fully agreed.


It's a shame that RCU is still a patent minefield :/

http://mirror.leaseweb.com/kernel/people/npiggin/patches/lockless/2.6.16-rc5/radix-intro.pdf


I'm not convinced that RCU is the only scalable way of sharing a
data structure across a possibly large number of CPUs.

The term "lockless" is often used and frequently misunderstood.
Having a lockless data structure *doesn't* mean that it is
performant, scalable, or both.  It heavily depends on a number
of additional factors.  Many times "lockless" just replaces a
simple lock/unlock cycle with a number of atomic operations on
the data structure.  This can easily backfire because an atomic
operation just hides the computational complexity and also dirties
the CPU's cache lines.  Generally on cache coherent architectures
almost all of the atomic operation is in HW with bus lock cycles,
bus snooping and whatnot.  While seemingly simple from the programmer's
point of view, the overhead and latency are still there.  Needless
to say that on more relaxed architectures (SPARC, Alpha, ...) the
overhead is higher.  Also important in the overall picture are the
memory barrier semantics of locks.  Some newer CPUs have started
to provide hardware implemented lock managers and combine them with
SMT features so that access to an already locked lock causes an
immediate HW thread switch and on unlock a switch back.  We also
have rm_locks (read mostly locks) that do not require synchronization
on read but have a more expensive write lock.  In UMA we use a mix
of global pools of elements with per-CPU caching of free elements.

As always the best approach depends on the dominant access pattern
of a structure.  It all comes down to the amortization cost of the
different locking or "lockless" strategies.  It's also important
to make sure that worst case behavior doesn't bring down the box.

As a rule of thumb I use this:

 a) make sure the lock is held for only a small amount of time
to avoid lock contention.
 b) do everything you can outside of the lock.
 c) if the lock is found to be heavily contended rethink the
whole approach and check if other data structures can be used.
 d) minimize write accesses to memory in the lock protected
shared data structure.
 e) PROFILE, DON'T SPECULATE! Measure the access pattern and
measure the locking/data access strategy's cost in terms
of CPU cycles consumed.

 f) on lookup heavy data structures avoid writing to memory and
thereby dirtying CPU caches.
 g) on modify heavy data structures avoid touching too many
elements.
 h) on lookup and modify heavy data structures that are used
across many CPUs all bets are off and a different data
structure approach should be considered resulting ideally
in case f).

It all starts with the hypothesis that a data structure is not
optimally locked.

--
Andre


Re: Examining the VM splay tree effectiveness

2010-10-01 Thread Andre Oppermann

On 01.10.2010 06:49, Matthew Dillon wrote:

 I don't remember the reference but I read a comprehensive comparison
 between various indexing methods about a year ago and the splay tree
 did considerably better than a RB-tree.  The RB-tree actually did
 fairly poorly.


It heavily depends on the access pattern of the data structure.  Is
it lookup, modify or insert/delete dominated?  Or a mix of any
of them?  How heavily is the data structure shared across CPUs?
Without this information it is impossible to make a qualified choice.

Making general comparative statements on indexing methods without
taking the access pattern and SMP/CPU cache behavior into account
is going to lead to the wrong approach 90% of the time. (Made that
number up ;-)


 Any binary tree-like structure makes fairly poor use of cpu caches.
 Splay trees work somewhat better as long as there is some locality
 of reference in the lookups, since the node being looked up is moved
 to the root.  It isn't a bad trade-off.


Again, it hugely depends on how good the locality is and how expensive
the CPU cache line dirtying of the splay rotation is.  You can quickly
fall off an amortization *cliff* here.

I agree with binary tree structures being a bit less optimal for CPU
caches because the tree node is embedded with the data element.  On
the plus side not many other data structures are optimal either.  And as
long as memory is only read it can be cached on multiple CPUs.  Writing
to it evicts it from every other CPU's cache and causes a high latency
memory access on the next read.


 On the negative side all binary-tree-like structures tend to be
 difficult to MP-lock in a fine-grained manner (not that very many
 structures are locked at that fine a grain anyway, but it's a data
 point).  Splay-trees are impossible to lock at a fine-grain due to
 the massive modifications made on any search and the movement
 of the root, and there are MP access issues too.


I doubt that fine grained locking of such data structures is beneficial
in many cases.  Fine grained locking implies more locks, more bus lock
cycles, more memory barriers and more CPU cache dirtying.  As long as
a data structure's global lock is not significantly contended, finer
locking does not lead to more parallel operation; it just creates a lot
of overhead for nothing.


 --

 What turned out to be the best indexing mechanism was a chained
 hash table whose hoppers were small linear arrays instead of single
 elements.  So instead of pointer-chaining each element you have a small
 for() loop for 4-8 elements before you chain.  The structure being
 indexed would NOT be integrated into the index directly, the index
 would point at the final structure from the hopper.


This makes a lot of sense if the index is sufficiently small, let's say
one or two ints.  When you go beyond that the advantage quickly fades
away.


 For our purposes such linear arrays would contain a pointer and
 an indexing value in as small an element as possible (8-16 bytes),
 the idea being that you make the absolute best use of your cache line
 and L1 cache / memory burst.  One random access (initial hash index),
 then linear accesses using a small indexing element, then one final
 random access to get to the result structure and validate that
 it's the one desired (at that point we'd be 99.9% sure that we have
 the right structure because we have already compared the index value
 stored in the hopper).  As a plus the initial hash index also makes
 MP locking the base of the chains easier.
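
A rough sketch of the chained hash with small linear-array hoppers described above, under stated assumptions: all names (ahash, hopper, slot), the bucket count, the 4-slot hopper size and the multiplicative hash are made up for illustration and are not taken from any kernel. Each slot holds only an index value and a pointer (16 bytes on LP64), and the indexed structure itself lives outside the table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HOPPER_SLOTS 4          /* linear scan length before chaining */
#define NBUCKETS     64         /* must match the 6-bit hash below */

struct slot {
	uint64_t key;               /* small indexing value */
	void    *obj;               /* points at the real structure */
};

struct hopper {
	struct slot    slots[HOPPER_SLOTS];
	unsigned       used;        /* number of occupied slots */
	struct hopper *next;        /* chain only after the array is full */
};

struct ahash {
	struct hopper *buckets[NBUCKETS];
};

/* One random access: multiplicative hash, top 6 bits select the bucket. */
unsigned
ahash_index(uint64_t key)
{
	return ((unsigned)((key * 0x9E3779B97F4A7C15ULL) >> 58));
}

/* Insert without checking for duplicates; returns 0 or -1 on ENOMEM. */
int
ahash_insert(struct ahash *h, uint64_t key, void *obj)
{
	struct hopper **head = &h->buckets[ahash_index(key)];
	struct hopper *hop;

	assert(obj != NULL);
	for (hop = *head; hop != NULL; hop = hop->next) {
		if (hop->used < HOPPER_SLOTS) {
			hop->slots[hop->used].key = key;
			hop->slots[hop->used].obj = obj;
			hop->used++;
			return (0);
		}
	}
	hop = calloc(1, sizeof(*hop));
	if (hop == NULL)
		return (-1);
	hop->slots[0].key = key;
	hop->slots[0].obj = obj;
	hop->used = 1;
	hop->next = *head;
	*head = hop;
	return (0);
}

/* Linear scan of each hopper, then follow the chain. */
void *
ahash_lookup(const struct ahash *h, uint64_t key)
{
	const struct hopper *hop;
	unsigned i;

	for (hop = h->buckets[ahash_index(key)]; hop != NULL; hop = hop->next)
		for (i = 0; i < hop->used; i++)
			if (hop->slots[i].key == key)
				return (hop->slots[i].obj);
	return (NULL);
}
```

The caller would still dereference the returned pointer and validate that it is the desired structure, which is the final random access mentioned above. Locking is left out; a per-bucket lock on the chain head would be one option.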


Agreed under the premise that the access pattern fits this behavior.


 I don't use arrayized chained hash tables in DFly either, but only
 because it's stayed fairly low on the priority list.  cpu isn't really
 a major issue on systems these days.  I/O is the bigger problem.
 RB-Trees are simply extremely convenient from the programming side,
 which is why we still use them.


Agreed with the emphasis on including lock/atomic cycles and CPU
cache hierarchy into I/O.


 I was surprised that splay trees did so well vs RB-trees, I never liked
 the memory writes splay trees do let alone the impossibility of
 fine-grained locking.  Originally I thought the writes would make
 performance worse, but they actually don't.  Still, if I were to change
 any topologies now I would definitely go with the chained-hash /
 small-linear-array / chain / small-linear-array / chain mechanic.  It
 seems to be the clear winner.


Without first studying the access pattern and applying it to
the various data structures this is a very speculative statement
to make.

--
Andre

Re: patch for topology detection of Intel CPUs

2010-10-01 Thread Andriy Gapon
on 06/09/2010 15:17 Andriy Gapon said the following:
> on 29/08/2010 12:25 Andriy Gapon said the following:
>> The below patch is against sources in FreeBSD tree, it should be applied
>> either to sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c
>> depending on the desired architecture:
>> http://people.freebsd.org/~avg/intel-cpu-topo.diff
> 
> I see that I am not getting as many testers as I expected, so I am going to
> commit the patch.
> 
> You still have a short while to either objectively object to the patch or to
> voluntarily test it :-)

This is now committed as r213323.
Many thanks to all the testers.
MFC is planned in one month's time.

-- 
Andriy Gapon


Re: Sleep/Lenovo SL410

2010-10-01 Thread Ian Smith
On Thu, 30 Sep 2010, Matt wrote:
 >  Success!
 > 
 > After setting every possible suspend/resume sysctl,
 > "sysctl hw.pci.do_power_resume=0"
 > allowed suspend and resume. Still beeps 1-3 times before suspend, with rapid
 > sleep light flashing until suspend complete.

Interesting; $someone may document do_power_resume a bit more $someday?

 > Kernel conf is attached.
 > World built from last Friday's CVS, -CURRENT
 > 
 > acpiconf -s3 works perfectly from console
 > previously opened windows are garbled until refresh in X

Some thinkpads have responded positively in this regard to setting
hw.syscons.sc_no_suspend_vtswitch=1

 > acpiconf -s4 causes shutdown, does not resume on power on.

Suspend To Disk is not expected to work; your laptop (like most) has no 
BIOS support for S4, as per your hw.acpi.s4bios: 0

cheers, Ian

 > Swap is 2x RAM.
 > 
 > VMstat -i rate is 540.
 > 
 > Thank you FreeBSD!
 > 
 > Matt


CACHE_LINE_SIZE too small, so struct vpglocks size alignment doesn't work

2010-10-01 Thread Svatopluk Kraus
Hello,

  the size of 'struct vpglocks' is padded to CACHE_LINE_SIZE in the
'sys/vm/vm_page.h' header file. I work on a 'coldfire' port where
CACHE_LINE_SIZE is 16 bytes and sizeof(struct mtx) is 20 bytes, so the
size alignment doesn't work.

  I solved it somehow, but I would like to learn how to solve it in the
spirit of FreeBSD. There are a couple of possibilities:

A1. Do nothing for small CACHE_LINE_SIZE.
A2. Pad to multiple of CACHE_LINE_SIZE.

B1. use #if with CACHE_LINE_SIZE
B2. use #if with __coldfire__

When I use the B1 solution I need to know the value of sizeof(struct mtx)
at preprocessing time. So, is it correct to use something like the
'assym.s' magic (sys/i386/i386/genassym.c) in MI code? Or does someone
have another suggestion?

   Regards, Svata


Re: CACHE_LINE_SIZE too small, so struct vpglocks size alignment doesn't work

2010-10-01 Thread Matthew Fleming
On Fri, Oct 1, 2010 at 1:00 PM, Svatopluk Kraus  wrote:
> Hello,
>
>  the size of 'struct vpglocks' is padded to CACHE_LINE_SIZE in the
> 'sys/vm/vm_page.h' header file. I work on a 'coldfire' port where
> CACHE_LINE_SIZE is 16 bytes and sizeof(struct mtx) is 20 bytes, so the
> size alignment doesn't work.
>
>  I solved it somehow, but I would like to learn how to solve it in the
> spirit of FreeBSD. There are a couple of possibilities:
>
> A1. Do nothing for small CACHE_LINE_SIZE.
> A2. Pad to multiple of CACHE_LINE_SIZE.
>
> B1. use #if with CACHE_LINE_SIZE
> B2. use #if with __coldfire__
>
> When I use the B1 solution I need to know the value of sizeof(struct mtx)
> at preprocessing time. So, is it correct to use something like the
> 'assym.s' magic (sys/i386/i386/genassym.c) in MI code? Or does someone
> have another suggestion?

What about padding to CACHE_LINE_SIZE - (sizeof(struct vpglocks) &
(CACHE_LINE_SIZE-1)) ?

Some compilers will complain about 0-sized arrays, but gcc isn't one of them.
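
One possible way to get option A2 (pad to a multiple of the cache line) without a zero-sized array, as a sketch only. All sk_ names are hypothetical stand-ins, the 16-byte line and 20-byte mutex are the numbers from the original mail, and this is not the definition used in sys/vm/vm_page.h. Because sizeof is evaluated by the compiler in the array bound, no preprocessing-time value of sizeof(struct mtx) is needed.

```c
#include <assert.h>
#include <stddef.h>

#define SK_CACHE_LINE_SIZE 16   /* assumed, as on the coldfire port */

/* Stand-in for struct mtx: 20 bytes, the size reported in the thread. */
struct sk_mtx {
	char opaque[20];
};

/* Round x up to the next multiple of y. */
#define SK_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/*
 * The pad member is sized to the rounded-up total, so it is never
 * zero-length and the union always occupies whole cache lines.
 */
union sk_vpglocks {
	struct sk_mtx data;
	char          pad[SK_ROUNDUP(sizeof(struct sk_mtx), SK_CACHE_LINE_SIZE)];
};

/*
 * Number of trailing pad bytes for a given size.  The outer modulo
 * yields 0, not a whole extra line, when size is already a multiple.
 */
size_t
sk_pad_bytes(size_t size, size_t line)
{
	assert(line > 0);
	return ((line - (size % line)) % line);
}
```

With these numbers the union takes 32 bytes, i.e. two 16-byte lines, and sk_pad_bytes(20, 16) is 12.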

Cheers,
matthew


ping6 cause kernel panic in nd6_output_lle()

2010-10-01 Thread Andrey Chernov
Pinging a nonexistent IPv6 address within the same prefixlen 64 (i.e. a
nonexistent neighbor) immediately causes a kernel panic in nd6_output_lle()

See
http://img837.imageshack.us/img837/7496/01102010f.jpg

Please fix.

-- 
http://ache.pp.ru/


Re: Examining the VM splay tree effectiveness

2010-10-01 Thread Ed Schouten
Andre,

* Andre Oppermann  wrote:
> A splay tree is an interesting binary search tree with insertion,
> lookup and removal performed in O(log n) *amortized* time.  With
> the *amortized* time being the crucial difference to other binary trees.
> On every access *including* lookup it rotates the tree to make the
> just found element the new root node.  For all gory details see:
>  http://en.wikipedia.org/wiki/Splay_tree
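
To make the "rotates on every access, including lookup" point concrete, here is a small sketch of the textbook top-down splay operation. It is not the code from the FreeBSD VM; struct node and both function names are made up for illustration. Note that a plain lookup is just a call to splay(), and that call rewrites child pointers and returns a new root.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int          key;
	struct node *left, *right;
};

/* Top-down splay: move the node with `key` (or a neighbor) to the root. */
struct node *
splay(struct node *t, int key)
{
	struct node N, *l, *r, *y;

	if (t == NULL)
		return (NULL);
	N.left = N.right = NULL;
	l = r = &N;
	for (;;) {
		if (key < t->key) {
			if (t->left == NULL)
				break;
			if (key < t->left->key) {       /* rotate right */
				y = t->left;
				t->left = y->right;
				y->right = t;
				t = y;
				if (t->left == NULL)
					break;
			}
			r->left = t;                    /* link right */
			r = t;
			t = t->left;
		} else if (key > t->key) {
			if (t->right == NULL)
				break;
			if (key > t->right->key) {      /* rotate left */
				y = t->right;
				t->right = y->left;
				y->left = t;
				t = y;
				if (t->right == NULL)
					break;
			}
			l->right = t;                   /* link left */
			l = t;
			t = t->right;
		} else
			break;
	}
	l->right = t->left;                     /* reassemble */
	r->left = t->right;
	t->left = N.right;
	t->right = N.left;
	return (t);
}

/* Insert n and return the new root; duplicate keys are ignored. */
struct node *
splay_insert(struct node *t, struct node *n)
{
	assert(n != NULL);
	n->left = n->right = NULL;
	if (t == NULL)
		return (n);
	t = splay(t, n->key);
	if (n->key < t->key) {
		n->left = t->left;
		n->right = t;
		t->left = NULL;
	} else if (n->key > t->key) {
		n->right = t->right;
		n->left = t;
		t->right = NULL;
	} else
		return (t);
	return (n);
}
```

Since every lookup stores to several nodes, readers cannot share the tree under a read-only lock, which is the SMP concern raised earlier in the thread.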

Even though a red-black tree is quite good since it guarantees a $2 \log
n$ upper bound, the problem is that it's quite computationally intensive.

Maybe it would be worth looking at other types of balanced trees? For
example, another type of tree which has only $O(\log n)$ amortized
insertion/removal/lookup time, but could already be a lot better in
practice, is a Treap.

Greetings,
-- 
 Ed Schouten 
 WWW: http://80386.nl/




Re: Sleep/Lenovo SL410

2010-10-01 Thread Paul B Mahol
On 10/1/10, Ian Smith  wrote:
> On Thu, 30 Sep 2010, Matt wrote:
>  >  Success!
>  >
>  > After setting every possible suspend/resume sysctl,
>  > "sysctl hw.pci.do_power_resume=0"
>  > allowed suspend and resume. Still beeps 1-3 times before suspend, with
> rapid
>  > sleep light flashing until suspend complete.
>
> Interesting; $someone may document do_power_resume a bit more $someday?
It is already documented.
>
>  > Kernel conf is attached.
>  > World built from last Friday's CVS, -CURRENT
>  >
>  > acpiconf -s3 works perfectly from console
>  > previously opened windows are garbled until refresh in X
>
> Some thinkpads have responded positively in this regard to setting
> hw.syscons.sc_no_suspend_vtswitch=1
>
>  > acpiconf -s4 causes shutdown, does not resume on power on.
>
> Suspend To Disk is not expected to work; your laptop (like most) has no
> BIOS support for S4, as per your hw.acpi.s4bios: 0

Suspend to disk does not work because FreeBSD does not support it.
(s4bios is irrelevant here)


Very interesting paper: An Analysis of Linux Scalability to many Cores

2010-10-01 Thread Andre Oppermann

Just saw the link to a very interesting paper on SMP scalability.
A very good read and highly relevant for our efforts as well.  In
certain areas we may already fare better, in others we still have
some work to do.

An Analysis of Linux Scalability to many Cores

ABSTRACT
 This paper analyzes the scalability of seven system applications
 (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce)
 running on Linux on a 48-core computer. Except for gmake, all
 applications trigger scalability bottlenecks inside a recent Linux
 kernel. Using mostly standard parallel programming techniques—
 this paper introduces one new technique, sloppy counters—
 these bottlenecks can be removed from the kernel or avoided by
 changing the applications slightly. Modifying the kernel required
 in total 3002 lines of code changes. A speculative conclusion from
 this analysis is that there is no scalability reason to give up on
 traditional operating system organizations just yet.

http://pdos.csail.mit.edu/papers/linux:osdi10.pdf

--
Andre



Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
I have recently acquired a new laptop (to replace the "Frankenlaptop"
I've been using for the last several years).

The new machine is a Dell Precision M4400, so it's pretty recent
technology compared to what I'm used to.  :-}

I installed FreeBSD 8.1-R on slice 1, customized it a bit to work in my
environment, then cloned slice 1 to slice 2, booted from slice 2,
populated /usr/src via "svn co" (pointing to stable/8), and upgraded
slice 2 to stable/8 as of r213245.

So far, so good.

I tinkered with it a bit more, building ports the way I want them, &c.;
the following day, I upgraded to r213267.

That went well, so I cloned slice 2 to slice 4, used "svn switch" to
flip /usr/src from slice 4 to head, booted from slice 4, and upgraded
slice 4 to 9.0-CURRENT as of r213267.

On the reboot following the install (the "smoke test"), I noticed that
the machine got most of the way through the kernel probes, then hung
(requiring a power cycle to break out of it).

It did this a few more times, then the next boot worked.

I thought this odd, but not necessarily demonstrating a problem with
FreeBSD: I hadn't had much experience with this particular hardware,
after all.

The following day (as is my usual pattern), I upgraded slice 2 to
stable/8 as of r213295 without incident.  After upgrading the installed
ports, I then booted from slice 4 (after several tries), then upgraded
slice 4 to head as of r213295.  Again, attempts to boot from slice 4
usually -- but not always -- would hang, always in the same place.

Now, I had had a somewhat-similar hang on my work desktop, which is
also a Dell machine.  And in that case -- though there were several
differences, some of which may well be relevant -- a BIOS upgrade
resolved that issue.

So I checked; the laptop had BIOS A19, and Dell had A23 available.

This morning, I upgraded slice 2 to stable/8 as of r213322, booted slice
4, and upgraded it to head as of r213322.  Again, it would hang more
often than not.

This afternoon, after receiving appropriate encouragement (that yes, I
probably could use Dell's Linux BIOS updater from a KNOPPIX
environment), I was able to successfully update the BIOS to A23.

Unfortunately, booting head (slice 4) still hangs -- usually.  I'm
unable to detect a pattern in why it sometimes boots OK, while most of
the time it hangs.

So when it hangs (today), it's running:

FreeBSD localhost 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r213322: Fri Oct  1 10:18:30 PDT 2010 r...@g1-222.catwhisker.org.:/usr/obj/usr/src/sys/CANARY  i386

And looking at the stable/8 /var/log/messages, when it boots under head,
it runs along:

...
Oct  1 13:37:41 localhost kernel: ugen6.1:  at usbus6
Oct  1 13:37:41 localhost kernel: uhub6:  on usbus6
Oct  1 13:37:41 localhost kernel: ugen7.1:  at usbus7
Oct  1 13:37:41 localhost kernel: uhub7:  on usbus7
Oct  1 13:37:41 localhost kernel: ad4: 238475MB  at ata2-master UDMA100 SATA 3Gb/s
Oct  1 13:37:41 localhost kernel: acd0: DVDR  at ata3-master UDMA100 SATA 1.5Gb/s
Oct  1 13:37:41 localhost kernel: hdac0: HDA Codec #0: IDT 92HD71B7
Oct  1 13:37:41 localhost kernel: pcm0:  at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: pcm1:  at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: pcm2:  at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: uhub0: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub1: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub2: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub4: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub5: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub6: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub3: 6 ports with 6 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub7: 6 ports with 6 removable, self powered
Oct  1 13:37:41 localhost kernel: acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): CAM status: SCSI Status Error
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): SCSI status: Check Condition
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): SCSI sense: NOT READY asc:3a,1 (Medium not present - tray closed)
Oct  1 13:37:41 localhost kernel: cd0 at ata3 bus 0 scbus1 target 0 lun 0
Oct  1 13:37:41 localhost kernel: cd0:  Removable CD-ROM SCSI-0 device
Oct  1 13:37:41 localhost kernel: cd0: 100.000MB/s transfersSMP: AP CPU #1 Launched!
Oct  1 13:37:41 localhost kernel:
Oct  1 13:37:41 localhost kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
Oct  1 13:37:41 localhost kernel: Trying to mount root from ufs:/dev/ad4s2a
Oct  1 13:37:41 localhost kernel: ugen2.2:  at usbus2


and stops right there when it hangs.

When it does not hang, the boot continues at that point

Re: ping6 cause kernel panic in nd6_output_lle()

2010-10-01 Thread Bjoern A. Zeeb

On Fri, 1 Oct 2010, Andrey Chernov wrote:


Pinging a nonexistent IPv6 address within the same prefixlen 64 (i.e. a
nonexistent neighbor) immediately causes a kernel panic in nd6_output_lle()

See
http://img837.imageshack.us/img837/7496/01102010f.jpg


want to try the patch from kern/148857?

/bz

--
Bjoern A. Zeeb  Welcome a new stage of life.


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 02:20:38PM -0700, David Wolfskill wrote:
> I have recently acquired a new laptop (to replace the "Frankenlaptop"
> I've been using for the last several years).
> ...
> While I'm not about to assume that this indicates something wrong
> with FreeBSD, I'm a bit less inclined to believe that it might be a
> hardware/BIOS issue than I was yesterday.
> 
> Here are some differences between what I saw with my work desktop vs.
> the new laptop:
> 
> * Desktop would reliably hang on each alternate boot.  No pattern
>   detected for laptop, but hangs predominate (by a factor of about 4:1).
> 
> * Desktop would hang on alternate boots regardless of which branch of
>   FreeBSD I was trying to boot.  Laptop only hangs on head.
> 
> * BIOS upgrade resolved issue with desktop.  So far, it hasn't with the
>   laptop.
> 
> How might I get sufficient appropriate additional detail that I might be
> able to help get this figured out, and possibly even fixed?
> ...

At a colleague's suggestion, I tried disabling some of the devices in
the BIOS.

Under System Configuration/Miscellaneous Devices, there are several
devices listed; each is enabled by default.

I found that disabling the "Module Bay" appears to avoid the hang --
reliably.

That appears to be the minimally-invasive change necessary to avoid the
hang.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 04:30:01PM -0700, David Wolfskill wrote:
> ...
> I found that disabling the "Module Bay" appears to avoid the hang --
> reliably.
> 
> That appears to be the minimally-invasive change necessary to avoid the
> hang.
> 

Until I realized what was in the Modular Bay: the CD/DVD reader/burner.

So I tried a variation on the theme:  I left all the devices enabled,
but I physically removed the device from the bay before booting -- and
was unable to get it to fail.

And -- just now -- I disabled the channel (via atacontrol(8)), inserted
the drive, and enabled the channel:

g1-222# atacontrol list
ATA channel 0:
Master:  no device present
Slave:   no device present
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  ad4  SATA revision 2.x
Slave:   no device present
ATA channel 3:
Master:  no device present
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master:  no device present
Slave:   no device present
g1-222# atacontrol detach ata3
g1-222# atacontrol list
ATA channel 0:
Master:  no device present
Slave:   no device present
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  ad4  SATA revision 2.x
Slave:   no device present
ATA channel 3:
Master:  no device present
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master:  no device present
Slave:   no device present
g1-222# atacontrol attach ata3
Master: acd0  SATA revision 1.x
Slave:   no device present
g1-222# atacontrol list
ATA channel 0:
Master:  no device present
Slave:   no device present
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  ad4  SATA revision 2.x
Slave:   no device present
ATA channel 3:
Master: acd0  SATA revision 1.x
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master:  no device present
Slave:   no device present
g1-222# 

This is running:

FreeBSD g1-222.catwhisker.org. 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r213322: Fri Oct  1 10:18:30 PDT 2010 r...@g1-222.catwhisker.org.:/usr/obj/usr/src/sys/CANARY  i386

Any ideas on what might be causing CURRENT to hang -- sometimes
-- given that it appears to involve the Modular Bay (or the specific
device that is in the bay during the hang)?

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread Brandon Gooch
On Fri, Oct 1, 2010 at 8:33 PM, David Wolfskill  wrote:
> On Fri, Oct 01, 2010 at 04:30:01PM -0700, David Wolfskill wrote:
>> ...
>> I found that disabling the "Module Bay" appears to avoid the hang --
>> reliably.
>>
>> That appears to be the minimally-invasive change necessary to avoid the
>> hang.
>> 
>
> Until I realized what was in the Modular Bay: the CD/DVD reader/burner.
>
> So I tried a variation on the theme:  I left all the devices enabled,
> but I physically removed the device from the bay before booting -- and
> was unable to get it to fail.
>
> And -- just now -- I disabled the channel (via atacontrol(8)), inserted
> the drive, and enabled the channel:
>
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222# atacontrol detach ata3
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222# atacontrol attach ata3
> Master: acd0  SATA revision 1.x
> Slave:       no device present
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master: acd0  SATA revision 1.x
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222#
>
> This is running:
>
> FreeBSD g1-222.catwhisker.org. 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r213322: Fri Oct  1 10:18:30 PDT 2010 r...@g1-222.catwhisker.org.:/usr/obj/usr/src/sys/CANARY  i386
>
> Any ideas on what might be causing CURRENT to hang -- sometimes
> -- given that it appears to involve the Modular Bay (or the specific
> device that is in the bay during the hang)?
>

If you haven't already, it may be worth trying 'options ATA_CAM' in
your kernel config.
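
For anyone following along, here is a minimal sketch of adding the option
to a kernel config file without duplicating it.  The helper name
add_ata_cam is hypothetical, and the config path in the usage note is an
assumption; CANARY is the kernel config name visible in the uname output
quoted above.

```shell
#!/bin/sh
# Hypothetical helper: append "options ATA_CAM" to a kernel config file
# unless an uncommented "options ATA_CAM" line is already there.
add_ata_cam() {
    conf=$1
    if grep -Eq '^[[:space:]]*options[[:space:]]+ATA_CAM([[:space:]]|$)' "$conf"; then
        echo "ATA_CAM already present in $conf"
    else
        printf 'options \tATA_CAM\n' >> "$conf"
        echo "added ATA_CAM to $conf"
    fi
}

# Example (path is an assumption, not taken from the thread):
#   add_ata_cam /usr/src/sys/i386/conf/CANARY
```

After that the kernel has to be rebuilt and installed in the usual way,
e.g. "make buildkernel KERNCONF=CANARY" followed by
"make installkernel KERNCONF=CANARY" from /usr/src.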

-Brandon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread Garrett Cooper
On Fri, Oct 1, 2010 at 6:33 PM, David Wolfskill  wrote:
> On Fri, Oct 01, 2010 at 04:30:01PM -0700, David Wolfskill wrote:
>> ...
>> I found that disabling the "Module Bay" appears to avoid the hang --
>> reliably.
>>
>> That appears to be the minimally-invasive change necessary to avoid the
>> hang.
>> 
>
> Until I realized what was in the Modular Bay: the CD/DVD reader/burner.
>
> So I tried a variation on the theme:  I left all the devices enabled,
> but I physically removed the device from the bay before booting -- and
> was unable to get it to fail.
>
> And -- just now -- I disabled the channel (via atacontrol(8)), inserted
> the drive, and enabled the channel:
>
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222# atacontrol detach ata3
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222# atacontrol attach ata3
> Master: acd0  SATA revision 1.x
> Slave:       no device present
> g1-222# atacontrol list
> ATA channel 0:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 1:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 2:
>    Master:  ad4  SATA revision 2.x
>    Slave:       no device present
> ATA channel 3:
>    Master: acd0  SATA revision 1.x
>    Slave:       no device present
> ATA channel 4:
>    Master:      no device present
>    Slave:       no device present
> ATA channel 5:
>    Master:      no device present
>    Slave:       no device present
> g1-222#
>
> This is running:
>
> FreeBSD g1-222.catwhisker.org. 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r213322: Fri Oct  1 10:18:30 PDT 2010 r...@g1-222.catwhisker.org.:/usr/obj/usr/src/sys/CANARY  i386
>
> Any ideas on what might be causing CURRENT to hang -- sometimes
> -- given that it appears to involve the Modular Bay (or the specific
> device that is in the bay during the hang)?

Do you have boot -v output?
Thanks,
-Garrett


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 07:22:33PM -0700, Garrett Cooper wrote:
> ...
> > Any ideas on what might be causing CURRENT to hang -- sometimes
> > -- given that it appears to involve the Modular Bay (or the specific
> > device that is in the bay during the hang)?
> 
> Do you have boot -v output?

Yes; please see ,
which has:

albert(8.1-S)[11] ls -lT
total 196
-rw-r--r--  1 david  wheel   11497 Oct  1 20:19:06 2010 console.log
-rw-r--r--  1 david  wheel   60397 Oct  1 19:26:23 2010 dmesg.boot
-rw-r--r--  1 david  wheel  114752 Oct  1 20:21:50 2010 messages
albert(8.1-S)[12] 

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread Garrett Cooper
On Fri, Oct 1, 2010 at 8:24 PM, David Wolfskill  wrote:
> On Fri, Oct 01, 2010 at 07:22:33PM -0700, Garrett Cooper wrote:
>> ...
>> > Any ideas on what might be causing CURRENT to hang -- sometimes
>> > -- given that it appears to involve the Modular Bay (or the specific
>> > device that is in the bay during the hang)?
>>
>>     Do you have boot -v output?
>
> Yes; please see ,
> which has:
>
> albert(8.1-S)[11] ls -lT
> total 196
> -rw-r--r--  1 david  wheel   11497 Oct  1 20:19:06 2010 console.log
> -rw-r--r--  1 david  wheel   60397 Oct  1 19:26:23 2010 dmesg.boot
> -rw-r--r--  1 david  wheel  114752 Oct  1 20:21:50 2010 messages
> albert(8.1-S)[12]

This might sound like a stupid idea, but can you try booting with
a CD/DVD in the drive?
Thanks,
-Garrett


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 08:56:13PM -0500, Brandon Gooch wrote:
> ...
> > Any ideas on what might be causing CURRENT to hang -- sometimes
> > -- given that it appears to involve the Modular Bay (or the specific
> > device that is in the bay during the hang)?
> >
> 
> If you haven't already, it may be worth trying 'options ATA_CAM' in
> your kernel config.

Well, I hadn't, so I tried it.  And since I had read the note:

# ATA_CAM:  Turn ata(4) subsystem controller drivers into cam(4)
#   interface modules. This deprecates all ata(4)
#   peripheral device drivers (atadisk, ataraid, atapicd,
#   atapifd, atapist, atapicam) and all user-level APIs.
#   cam(4) drivers and APIs will be connected instead.

I had an idea that /etc/fstab would need a global edit (though I
didn't know what device would be used).

Attempting a single-user mode boot clarified that for me:

:s/ad4/ada0/

Fortunately, I'm in the habit of keeping more than one bootable slice
around, so I was able to accomplish that.
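
The same fstab edit can be sketched non-interactively.  This is only an
illustration: the function name rename_ata_devices is hypothetical, and
the ad4 -> ada0 mapping is simply the one observed on this machine; other
systems may enumerate their disks differently under ATA_CAM.

```shell
#!/bin/sh
# Hypothetical filter: rewrite the legacy ata(4) disk name ad4 to the
# cam(4) name ada0 in fstab-style input read from stdin.
# The trailing [^0-9] keeps names such as /dev/ad40 from being touched.
rename_ata_devices() {
    sed -e 's|/dev/ad4\([^0-9]\)|/dev/ada0\1|g'
}

# Example: write the result to a scratch file and review it before
# replacing /etc/fstab by hand.
#   rename_ata_devices < /etc/fstab > /tmp/fstab.new
```

Writing to a scratch file first leaves the original /etc/fstab intact
until the new entries have been checked.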

And -- probably more important for this discussion -- I was unable to
re-create the failing condition: the machine booted Just Fine every
time I tried it, whether the CD/DVD drive was inserted or not.

Now, this seems a bit more like a circumvention or diagnostic aid
(to me) than an actual solution -- or am I misunderstanding?

Thanks for the suggestion, though!

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 08:37:50PM -0700, Garrett Cooper wrote:
> ...
> This might sound like a stupid idea, but can you try booting with
> a CD/DVD in the drive?

Ah -- sorry about that. :-(

OK; it may be a bit before I get a successful verbose boot without
ATA_CAM and with the CD/DVD drive inserted.  But I will work on it and
replace the files I placed with new ones.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread David Wolfskill
On Fri, Oct 01, 2010 at 08:37:50PM -0700, Garrett Cooper wrote:
> ...
> > albert(8.1-S)[11] ls -lT
> > total 196
> > -rw-r--r--  1 david  wheel   11497 Oct  1 20:19:06 2010 console.log
> > -rw-r--r--  1 david  wheel   60397 Oct  1 19:26:23 2010 dmesg.boot
> > -rw-r--r--  1 david  wheel  114752 Oct  1 20:21:50 2010 messages
> > albert(8.1-S)[12]
> 
> This might sound like a stupid idea, but can you try booting with
> a CD/DVD in the drive?

Again, my apologies.  And I got lucky: it booted OK on the second try.

The files are now:

albert(8.1-S)[2] ls -lT
total 232
-rw-r--r--  1 david  wheel   22824 Oct  1 20:50:04 2010 console.log
-rw-r--r--  1 david  wheel   60458 Oct  1 20:45:38 2010 dmesg.boot
-rw-r--r--  1 david  wheel  142589 Oct  1 20:47:47 2010 messages
albert(8.1-S)[3] 

Thanks.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


Re: Hang near end of kernel probes since r213267 (likely earlier)

2010-10-01 Thread Brandon Gooch
On Fri, Oct 1, 2010 at 10:38 PM, David Wolfskill  wrote:
> On Fri, Oct 01, 2010 at 08:56:13PM -0500, Brandon Gooch wrote:
>> ...
>> > Any ideas on what might be causing CURRENT to hang -- sometimes
>> > -- given that it appears to involve the Modular Bay (or the specific
>> > device that is in the bay during the hang)?
>> >
>>
>> If you haven't already, it may be worth trying 'options ATA_CAM' in
>> your kernel config.
>
> Well, I hadn't, so I tried it.  And since I had read the note:
>
> # ATA_CAM:              Turn ata(4) subsystem controller drivers into cam(4)
> #                       interface modules. This deprecates all ata(4)
> #                       peripheral device drivers (atadisk, ataraid, atapicd,
> #                       atapifd, atapist, atapicam) and all user-level APIs.
> #                       cam(4) drivers and APIs will be connected instead.
>
> I had an idea that /etc/fstab would need a global edit (though I
> didn't know what device would be used).
>
> Attempting a single-user mode boot clarified that for me:
>
>        :s/ad4/ada0/
>
> Fortunately, I'm in the habit of keeping more than one bootable slice
> around, so I was able to accomplish that.
>
> And -- probably more important for this discussion -- I was unable to
> re-create the failing condition: the machine booted Just Fine every
> time I tried it, whether the CD/DVD drive was inserted or not.
>
> Now, this seems a bit more like a circumvention or diagnostic aid
> (to me) than an actual solution -- or am I misunderstanding?
>
> Thanks for the suggestion, though!
>

The ATA_CAM option allows the use of the new CAM ATA infrastructure,
replacing the legacy operation of the ATA devices. The following may
be enlightening:

http://lists.freebsd.org/pipermail/freebsd-current/2009-December/013956.html

I've been using it now on all of my newer machines; it's well worth
the slight changes to your setups considering what you stand to gain.
I definitely would not consider it a work-around, unless you
absolutely require legacy operation of some device.

-Brandon