At 12:26 PM 12/04/2005 -0600, Scott Long wrote this to All:
David Sze wrote:
At 11:31 PM 10/04/2005 -0600, Scott Long wrote this to All:

Making a driver PAE-ified means either teaching it to do 64-bit
scatter-gather (assuming that the peripheral hardware can do this
and that it's documented), or teaching the driver to correctly handle
EINPROGRESS from bus_dmamap_load() along with using the proper busdma
tag limits.  The strategy I took with 6.x/5.x was the second one since
I didn't have good IPS docs in front of me and I wanted it to follow the
APIs correctly.  I did test it with 8GB of memory and it performed
correctly under load.  I haven't taken a close enough look at your
MFC patch to say for sure if it's correct or not.  I'm not sure if
I'll have time to take another look in the next few days, unfortunately.
Is there any chance you could test 5.x/6.0 under load with PAE just to
validate the assertion that it works correctly there?
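
The second strategy described above (treating EINPROGRESS from bus_dmamap_load() as "deferred", not as a failure) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not the ips driver and not kernel code, and every name in it (mock_dma, mock_dmamap_load, mock_dma_release, mock_softc, start_cmd, cmd_loaded) is hypothetical. The only behaviour taken from busdma is the contract that a load either runs its callback before returning 0, or returns EINPROGRESS and runs the callback later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: none of these names are FreeBSD kernel APIs. */

typedef void (*load_cb_t)(void *arg, int error);

/* Mock DMA layer; at most one deferred load can be pending. */
struct mock_dma {
    int       bounce_free;   /* bounce pages currently available */
    load_cb_t pending_cb;    /* callback of a deferred load, if any */
    void     *pending_arg;
};

/*
 * Models the load contract: with resources available the callback runs
 * before we return 0; otherwise the request is parked and EINPROGRESS is
 * returned, and the callback runs later from mock_dma_release().
 */
static int mock_dmamap_load(struct mock_dma *d, load_cb_t cb, void *arg)
{
    if (d->bounce_free > 0) {
        d->bounce_free--;
        cb(arg, 0);
        return 0;
    }
    d->pending_cb = cb;
    d->pending_arg = arg;
    return EINPROGRESS;
}

/* A bounce page is given back; run the deferred callback if one waits. */
static void mock_dma_release(struct mock_dma *d)
{
    d->bounce_free++;
    if (d->pending_cb != NULL) {
        load_cb_t cb = d->pending_cb;
        void *arg = d->pending_arg;

        d->pending_cb = NULL;
        d->bounce_free--;
        cb(arg, 0);
    }
}

/* Hypothetical per-device state. */
struct mock_softc {
    int frozen;     /* nonzero: do not start new commands */
    int submitted;  /* commands handed to the "hardware" */
};

/* Load callback: the command is submitted here, never in start_cmd(). */
static void cmd_loaded(void *arg, int error)
{
    struct mock_softc *sc = arg;

    assert(error == 0);
    sc->submitted++;
    sc->frozen = 0;     /* a deferred load has completed; thaw the queue */
}

/*
 * Returns 0 if the command was accepted (submitted now or deferred),
 * EBUSY while the queue is frozen, or the load error otherwise.
 */
static int start_cmd(struct mock_softc *sc, struct mock_dma *d)
{
    int error;

    if (sc->frozen)
        return EBUSY;
    error = mock_dmamap_load(d, cmd_loaded, sc);
    if (error == EINPROGRESS) {
        sc->frozen = 1; /* not a failure: wait for the callback */
        return 0;
    }
    return error;
}
```

In this model, a driver that treated EINPROGRESS as an ordinary error would fail or retry a command whose callback is still going to run. Freezing the queue until the callback fires avoids that.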

I had a chance to test 5.4-RC1 (i386) today with GENERIC, SMP, PAE, and SMP-PAE kernels (the last one is just PAE with "options SMP").
To recap, the hardware is an IBM xSeries 346, Dual Xeon 3GHz (non-EM64T), ServeRAID-7K.
GENERIC and SMP survived "make buildkernel", but PAE and SMP-PAE panicked reproducibly doing the same. The DDB stack trace doesn't appear to be anywhere near the IPS driver though, so I'm way out of my league.

Darnit, hard to say if this is an existing bug in 5.4 or if it's a bug/corruption in ips. Can you re-run with PAE disabled?

Works fine with PAE disabled (or at least I couldn't get it to panic), both UP and SMP kernels.


Would you be
willing to put the Giant lock back on top of the driver?  This would
mean modifying the call to bus_intr_config(), adding the D_GIANTNEEDED
flag to the disk structure in disk_create(), and switching the mutex
argument in bus_dma_tag_create() for the sg_dmatag tag.

I put Giant back in as you described (patch attached), but it still panicked with PAE enabled, on both UP and SMP kernels. The stack trace was very similar; the fault address (0x24) and the top three stack frames were the same as without Giant:

        propagate_priority
        turnstile_wait
        _mtx_lock_sleep

At this point I no longer have access to the hardware; the customer wanted his servers back. They're going into the datacenter with RELENG_4 (w/IPS stability patch), without PAE (so the top ~900MB of his 4GB RAM is lost to PCI-X address space).



Attachment: ips.RELENG_5_4.giant.patch
Description: Binary data

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable