from:"John Baldwin"

Re: [PATCH] hwpmc(4) syscall arguments fix

2010-11-01 Thread John Baldwin

On Friday, October 29, 2010 8:12:06 pm Oleksandr Tymoshenko wrote:
>  I ran into problems trying to get hwpmc to work on 64-bit MIPS
> system with big endian byte order. Turned out hwpmc syscall handler
> is byte-order and register_t size agnostic unlike the rest of syscalls.
> The best solution I have so far is a copy sys/sysproto.h approach:
> http://people.freebsd.org/~gonzo/patches/hwpmc-syscall.diff
> 
> Any other ideas how to get it fixed in more clean way?

Yes, a better way would be to add pmc_syscall() to sys/kern/syscalls.master as 
a NOSTD system call.  Then its arguments would be included in sysproto.h 
directly.
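
For illustration, here is a minimal userland sketch of the padding scheme that generated argument structures rely on, so that each argument fills one register-sized slot regardless of byte order.  All of the demo_* and DEMO_* names are hypothetical, the 64-bit register type is an assumption, and the two fields are placeholders rather than the real pmc_syscall() arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's register_t on a 64-bit machine. */
typedef int64_t demo_register_t;

/* Bytes needed to pad a type out to one full register-sized slot. */
#define DEMO_PAD(t) \
	(sizeof(demo_register_t) <= sizeof(t) ? \
	    0 : sizeof(demo_register_t) - sizeof(t))

/* On big-endian machines the value lives at the high-address end of the slot. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define DEMO_PADL(t)	DEMO_PAD(t)
#define DEMO_PADR(t)	0
#else
#define DEMO_PADL(t)	0
#define DEMO_PADR(t)	DEMO_PAD(t)
#endif

/*
 * Placeholder argument structure: every member is surrounded by padding so
 * that it occupies exactly one slot.  Zero-length arrays are a compiler
 * extension that GCC and Clang accept.
 */
struct demo_pmc_syscall_args {
	char op_l_[DEMO_PADL(int)]; int op; char op_r_[DEMO_PADR(int)];
	char arg_l_[DEMO_PADL(void *)]; void *arg; char arg_r_[DEMO_PADR(void *)];
};
```

With this layout the handler can read each field by name without caring whether a 32-bit value was passed in a 64-bit register on a big-endian machine.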

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

Re: Fileops in file.h

2010-11-08 Thread John Baldwin

On Sunday, November 07, 2010 10:08:08 am Fernando Apesteguía wrote:
> Hi,
> 
> I'm trying to understand some pieces of the FreeBSD kernel.
> Having a look at struct fileops in file.h I was wondering why other
> file related functions don't have an entry in the function vector. I
> was thinking of mmap, fsync or sendfile.
> 
> Can anyone tell me the reason?

Mostly that it hasn't been done yet.  If there was a clean way to do an 
f_mmap() and get some of the type-specific knowledge out of vm_mmap.c I'd 
really like it.
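
As a rough sketch of the idea, an optional mmap entry in a per-type function vector could look like the following.  This is userland code with hypothetical demo_* names; it is not the real struct fileops layout and it does not reflect how vm_mmap.c is organized.

```c
#include <errno.h>
#include <stddef.h>

struct demo_file;

/* Hypothetical per-type operations vector, loosely modeled on struct fileops. */
struct demo_fileops {
	int (*fo_read)(struct demo_file *fp);
	int (*fo_mmap)(struct demo_file *fp, size_t len);	/* may be NULL */
};

struct demo_file {
	const struct demo_fileops *f_ops;
	size_t f_mapped;	/* bytes "mapped" so far, for illustration only */
};

/* Generic entry point: no switch on the file type, just ask the vector. */
int
demo_mmap(struct demo_file *fp, size_t len)
{
	if (fp->f_ops->fo_mmap == NULL)
		return (ENODEV);	/* this file type cannot be mapped */
	return (fp->f_ops->fo_mmap(fp, len));
}

/* Example type-specific implementation, e.g. for a vnode-backed file. */
static int
demo_vn_mmap(struct demo_file *fp, size_t len)
{
	fp->f_mapped += len;
	return (0);
}

/* One vector that supports mapping and one that does not. */
const struct demo_fileops demo_vnops = { NULL, demo_vn_mmap };
const struct demo_fileops demo_pipeops = { NULL, NULL };
```

The type-specific knowledge then lives next to each file type instead of in the generic mapping code.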

-- 
John Baldwin

Re: [PATCH] mptutil(8) - capture errors and percolate up to caller

2010-11-08 Thread John Baldwin

On Saturday, November 06, 2010 4:13:23 am Garrett Cooper wrote:
> Similar to r214396, this patch deals with properly capturing error
> and passing it up to the caller in mptutil just in case the errno
> value gets stomped on by warn*(3); this patch deals with an improper
> use of warn(3), and also some malloc(3) errors, as well as shrink down
> some static buffers to fit the data being output.
> If someone could review and help me commit this patch it would be
> much appreciated; all I could do is run negative tests on my local box
> and minor positive tests on my vmware fusion instance because it
> doesn't fully emulate a fully working mpt(4) device (the vmware
> instance consistently crashed with a warning about the mpt
> controller's unimplemented features after I poked at it enough).
> I'll submit another patch to fix up style(9) in this app if requested.
> Thanks!

The explicit 'return (ENOMEM)' calls are fine as-is.  I do not think they need 
changing.

Having static char arrays of '15' rather than '16' is probably pointless.  The 
stack is already at least 4-byte aligned on all the architectures we support, 
so a 15-byte char array will actually be 16 bytes.  It was chosen to be a good
enough value, not an exact fit.  An exact fit is not important here.

Moving the 'buf' in mpt_raid_level() is a style bug.  It should stay where it 
is.  Same with 'buf' in mpt_volstate() and mpt_pdstate().

IOC_STATUS_SUCCESS() returns a boolean, so it is appropriate to test it with ! 
rather than == 0.  It is also easier for a person to read the code that way.
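
For reference, the error-capture pattern the patch is aiming for can be sketched as below.  demo_open_check() is a hypothetical helper, not code from mptutil; it only shows errno being saved before warn(3) has a chance to change it.

```c
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open a path and hand the failure reason back
 * to the caller as an errno value (0 on success).
 */
int
demo_open_check(const char *path)
{
	int error, fd;

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		error = errno;		/* save before warn(3) can change it */
		warn("open(%s)", path);	/* still reports the original errno */
		return (error);
	}
	close(fd);
	return (0);
}
```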

-- 
John Baldwin

Re: libkvm: consumers of kvm_getprocs for non-live kernels?

2010-11-11 Thread John Baldwin

On Wednesday, November 10, 2010 3:41:52 pm Ulrich Spörlein wrote:
> Hi,
> 
> I have this cleanup of libkvm sitting in my tree and it needs a little
> bit of testing, especially the function kvm_proclist, which is only
> called from kvm_deadprocs which is only called from kvm_getprocs when kd
> is not ALIVE.
> 
> The only consumer in our tree that I can make out is *probably* kgdb, as
> ps(1), top(1), w(1), pkill(1), fstat(1), systat(1), pmcstat(8) and
> bsnmpd don't really work on coredumps

ps and fstat certainly work fine on crashdumps.  w did before devfs (it 
doesn't have a good way to map the device entries from the crashed kernel to 
the entries in wtmp IIRC).  kvm_getprocs() is certainly actively used by 
various programs on crashdumps and works.

-- 
John Baldwin

Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin

On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> I'm trying to build a high-level language wrapper around kqueue/kevent,
> specifically, a Perl wrapper.
> 
> (In fact I am trying to fix this bug:
>   http://rt.cpan.org/Public/Bug/Display.html?id=61481
> )
> 
> My plan is to use the  void *udata  field of a kevent watcher to store a
> pointer to some user-provided Perl data structure (an SV*), to associate
> with the event. Typically this could be a code reference for an event
> callback or similar, but the exact nature doesn't matter. It's a pointer
> to a reference-counted data structure. SvREFCNT_dec(sv) is the function
> used to decrement the reference counter.
> 
> To account for the fact that the kernel stores a pointer here, I'm
> artificially increasing the reference count on the object, so that it
> still remains alive even if the rest of the Perl code drops it, to rely
> on getting it back out of the kernel in an individual kevent. At some
> point when the kernel has finished looking after the event, this count
> needs to be decreased again, so the structure can be freed.
> 
> I am having trouble trying to work out how to do this, or rather, when.
> I have the following problems:
> 
>  * If the event was registered using EV_ONESHOT, when it gets fired the
>    flags that come back in the event structure do not include EV_ONESHOT.
> 
>  * Some events can only happen once, such as watching for EVFILT_PROC
>    NOTE_EXIT events.
> 
>  * The kernel can silently drop watches, such as when the process calls
>    close() on a filehandle with an EVFILT_READ or EVFILT_WRITE watch.
> 
>  * There doesn't seem to be a way to query that pointer back out of the
>    kernel, in case the user code wants to EV_DELETE the watch.
> 
> These problems all mean that I never quite know when I ought to call
> SvREFCNT_dec() on that pointer.
> 
> My current best-attack plan looks like the following:
> 
>  a) Store a structure in the  void *udata  that contains the actual SV*
> pointer and a flag to remember if the event had been installed as
> EV_ONESHOT (or remember if it was one of the event types that is
> oneshot anyway)
> 
>  b) Store an entire mapping in userland from filter+identity to pointer,
> so that if userland wants to EV_DELETE the watch early, it has the
> pointer to be able to drop it.
> 
> I can't think of a solution to the close() problem at all, though.
> 
> Part a of my solution seems OK (though I'd wonder why the flags back
> from the kernel don't contain EV_ONESHOT), but part b confuses me. I had
> thought the point of kqueue/kevent is the O(1) nature of it, which is
> among why the kernel is storing that  void *udata  pointer in the first
> place. If I have to store a mapping from every filter+identity back to
> my data pointer, why does the kernel store one at all? I could just
> ignore the udata field and use my mapping for my own purposes.
> 
> Have I missed something here, then? I was hoping there'd be a nice way
> for kernel to give me back those pointers so I can just decrement a
> refcount on it, and have it reclaimed. 

I think the assumption is that userland actually maintains a reference on the 
specified object (e.g. a file descriptor) and will know to drop the associated 
data when the file descriptor is closed.  That is, think of the kevent as a 
member of an eventable object rather than a separate object that has a 
reference to the eventable object.  When the eventable object's reference 
count drops to zero in userland, then the kevent should be deleted, either via 
EV_DELETE, or implicitly (e.g. by closing the associated file descriptor).

I think in your case you should not give the kevent a reference to your 
object, but instead remove the associated event for a given object when an 
object's refcount drops to zero.
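
A minimal sketch of that ownership model follows.  The demo_* names are hypothetical and the deregistration hook is only a stand-in for an EV_DELETE call; the point is that the event is torn down by the object's last release rather than holding a reference of its own.

```c
#include <stdlib.h>

/* Counts how often the deregistration hook ran (for illustration only). */
int demo_unregister_calls;

struct demo_watched {
	int refs;		/* userland reference count */
	int registered;		/* nonzero while an event is installed */
};

/* Stand-in for deregistration; real code would issue EV_DELETE here. */
static void
demo_unregister(struct demo_watched *w)
{
	w->registered = 0;
	demo_unregister_calls++;
}

struct demo_watched *
demo_watch_new(void)
{
	struct demo_watched *w = calloc(1, sizeof(*w));

	if (w == NULL)
		return (NULL);
	w->refs = 1;		/* the creator's reference */
	w->registered = 1;	/* pretend the event was added here */
	return (w);
}

void
demo_watch_hold(struct demo_watched *w)
{
	w->refs++;
}

/* Returns 1 if this call destroyed the object, 0 otherwise. */
int
demo_watch_release(struct demo_watched *w)
{
	if (--w->refs > 0)
		return (0);
	if (w->registered)
		demo_unregister(w);	/* the event dies with its owner */
	free(w);
	return (1);
}
```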

-- 
John Baldwin

Re: Phantom sysctl

2010-11-15 Thread John Baldwin

On Monday, November 15, 2010 12:53:57 pm Garrett Cooper wrote:
> According to SYSCTL_INT(9):
> 
>  The SYSCTL kernel interfaces allow code to statically declare sysctl(8)
>  MIB entries, which will be initialized when the kernel module containing
>  the declaration is initialized.  When the module is unloaded, the sysctl
>  will be automatically destroyed.
> 
> The sysctl should be reaped when the module is unloaded. My dumb
> test kernel module [1] doesn't seem to do that though (please note
> that the OID test_int_sysctl is created, and not reaped... FWIW it's
> kind of bizarre that test_int_sysctl is created in the first place,
> given what I've seen when SYSCTL_* gets executed):

I believe I have seen this work properly before.  Look for 'sysctl' in
sys/kern/kern_linker.c to see the sysctl hooks invoked on kldload and
kldunload to manage these sysctls.  You will probably want to start your
debugging in the unload hook as it sounds like the node is not being
fully deregistered.

-- 
John Baldwin

Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin

On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > I think the assumption is that userland actually maintains a reference on
> > the specified object (e.g. a file descriptor) and will know to drop the
> > associated data when the file descriptor is closed.  That is, think of the
> > kevent as a member of an eventable object rather than a separate object
> > that has a reference to the eventable object.  When the eventable object's
> > reference count drops to zero in userland, then the kevent should be
> > deleted, either via EV_DELETE, or implicitly (e.g. by closing the
> > associated file descriptor).
> 
> Ah. Well, that could be considered a bit more awkward for the use case I
> wanted to apply. The idea was that the  udata  would refer effectively
> to a closure, to invoke when the event happens. The idea being you can
> just add an event watcher by, say:
> 
>   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
>  print STDERR "The child process $pid has now exited\n";
>   } );
> 
> So, the kernel's udata pointer effectively holds the only reference to
> this anonymous closure. It's much more flexible this way, especially for
> oneshot events like that.
> 
> The beauty is also that the kevents() loop can simply know that the
> udata is always a code reference so just has to invoke it to do whatever
> the original caller wanted to do.
> 
> Keep in mind my use-case here; I'm not trying to be one specific
> application, it's a general-purpose kevent-wrapping library.

So is GCD (Apple's libdispatch).  It also implements closures on top of
kevent.  However, it doesn't expose kevent() directly.  Instead, it uses
kevent to implement, for example, asynchronous I/O on a socket.  Since it
is logically managing the life cycle of the socket, it knows when the
socket is closed and cleans up then.

> > I think in your case you should not give the kevent a reference to your 
> > object, but instead remove the associated event for a given object when an 
> > object's refcount drops to zero.
> 
> Well that's certainly doable in longrunning watches, but I don't think
> it sounds very convenient for a oneshot event; see the above example for
> justification.

For the above case, if you know an event is one-shot, you should either
use EV_ONESHOT, or use a wrapper around the closure that clears the event
after the closure runs (or possibly before the closure runs?).
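
One possible shape for such a wrapper is sketched below.  The demo_* names are hypothetical; the wrapper remembers at registration time whether the event is one-shot and releases itself after the callback has run.  A real event loop would pass the udata pointer of each delivered event to demo_dispatch().

```c
#include <stdlib.h>

typedef void (*demo_callback_t)(void *arg);

/* Wrapper stored in udata: the callback plus what the loop must remember. */
struct demo_closure {
	demo_callback_t cb;
	void *arg;
	int oneshot;		/* recorded when the event is registered */
};

struct demo_closure *
demo_closure_new(demo_callback_t cb, void *arg, int oneshot)
{
	struct demo_closure *c = malloc(sizeof(*c));

	if (c == NULL)
		return (NULL);
	c->cb = cb;
	c->arg = arg;
	c->oneshot = oneshot;
	return (c);
}

/*
 * Run the closure for a delivered event.  Returns 1 if the wrapper was
 * released (one-shot event), 0 if it is still owned by the registration.
 */
int
demo_dispatch(void *udata)
{
	struct demo_closure *c = udata;

	c->cb(c->arg);
	if (!c->oneshot)
		return (0);
	free(c);
	return (1);
}

/* Sample callback used for illustration: bump an int counter. */
void
demo_count_cb(void *arg)
{
	(*(int *)arg)++;
}
```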

> Also it again begs my question, worth repeating here:
> 
> On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> > I had
> > thought the point of kqueue/kevent is the O(1) nature of it, which is
> > among why the kernel is storing that  void *udata  pointer in the first
> > place. If I have to store a mapping from every filter+identity back to
> > my data pointer, why does the kernel store one at all? I could just
> > ignore the udata field and use my mapping for my own purposes.
> 
> If you're saying that in my not-so-rare use case, I don't want to be
> using udata, and instead keeping my own mapping, why does the kernel
> provide this udata field at all?

Your use case is rare.  Almost all consumers of kevent() that I've seen
use kevent() as one part of a system that maintains the lifecycle of objects.
Those objects are only accessed within the system, so the system knows when
an object is closed and can release the resources at the same time.

-- 
John Baldwin

Re: breaking the crunchgen logic into a share/mk file

2010-11-16 Thread John Baldwin

On Tuesday, November 16, 2010 8:01:43 am Andrey V. Elsukov wrote:
> On 08.11.2010 15:31, Adrian Chadd wrote:
> >> I've broken out the crunchgen logic from src/rescue/rescue into a
> >> share/mk file - that way it can be reused in other areas.
> >>
> >> The diff is here: http://people.freebsd.org/~adrian/crunchgen-mk.diff
> >>
> >> This bsd.crunchgen.mk file is generic enough to use in my
> >> busybox-style thing as well as for src/rescue/rescue/.
> >>
> >> Comments, feedback, etc welcome!
> 
> It seems this broke usage of livefs from sysinstall.
> sysinstall does check for /rescue/ldconfig and can not find it there.
> I think attached patch can fix this issue (not tested).

Err, are there no longer hard links to all of the frontends for a given 
crunch?  If so, that is a problem as it will make rescue much harder to use.

-- 
John Baldwin

Re: Software interrupts; where to start

2010-11-16 Thread John Baldwin

On Tuesday, November 16, 2010 12:08:51 pm Nathan Vidican wrote:
> What I would like to do, is replace the above scenario with one wherein the
> program writing to the serial port is always connected and running, but not
> polling; ideally having some sort of interrupt or signal triggered from
> within memcached when a value is altered. Sort of a 're-sync' request
> asserting that the program sending data out the serial port should 'loop
> once'. I'd like to continue with the use of memcached as it provides a
> simple way for multiple systems to query the values in the array as well,
> (ie: some devices need not change the data, but only view it; given the
> latency requirements memcached operates ideally). This trigger should be
> asynchronous in that it should be fired and forgotten by memcached (by
> nature of the hardware designed, no error-checking nor receipt would be
> needed).
> 
> I'm just not sure where to start? Could someone send me the right RTFM link
> to start from, or perhaps suggest a better way to look at solving this
> problem? Ideally any example code to look at with a simple signal or
> interrupt type of handler would be great. What I'm leaning towards is
> modifying memcached daemon to send a signal or trigger an interrupt of some
> sort to tell the other program communicating with the device to re-poll
> once. It would also be nice to have a way to trigger from other programs
> too.

A simple solution would be to create a pipe shared between memcached and the 
process that writes to the serial port.  memcached would write a dummy byte to 
the pipe each time it updates the values.  Your app could either use 
select/poll/kqueue or a blocking read on the pipe to sleep until memcached 
does an update.  That requires modifying memcached, though.  I'm not familiar 
enough with memcached to know if it already has some sort of signalling 
facility that you could use directly.
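
The two halves of the pipe approach can be sketched as follows.  The demo_* function names are hypothetical, and how the two processes come to share the descriptors (inheriting a pipe(2) across fork, or opening a named FIFO) is left out.

```c
#include <poll.h>
#include <unistd.h>

/* Writer side: poke the other process after updating the shared values. */
int
demo_notify(int wfd)
{
	char token = 1;

	return (write(wfd, &token, 1) == 1 ? 0 : -1);
}

/*
 * Reader side: sleep until poked or until timeout_ms expires.
 * Returns 1 if notified, 0 on timeout, -1 on error.
 */
int
demo_wait(int rfd, int timeout_ms)
{
	struct pollfd pfd = { .fd = rfd, .events = POLLIN };
	char buf[64];
	int n;

	n = poll(&pfd, 1, timeout_ms);
	if (n <= 0)
		return (n);
	/* Drain the pipe so several queued pokes collapse into one wake-up. */
	if (read(rfd, buf, sizeof(buf)) <= 0)
		return (-1);
	return (1);
}
```

The serial-port program would call demo_wait() with a negative timeout to block indefinitely and run one pass of its loop each time it returns 1.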

-- 
John Baldwin

Re: breaking the crunchgen logic into a share/mk file

2010-11-20 Thread John Baldwin

On Tuesday, November 16, 2010 8:45:08 am Andrey V. Elsukov wrote:
> On 16.11.2010 16:29, John Baldwin wrote:
> > Err, are there no longer hard links to all of the frontends for a given 
> > crunch?  If so, that is a problem as it will make rescue much harder to use.
> 
> Yes, probably this patch is not needed and it should be fixed somewhere in
> makefiles. But currently rescue does not have any hardlinks:
> http://pub.allbsd.org/FreeBSD-snapshots/i386-i386/9.0-HEAD-20101116-JPSNAP/cdrom/livefs/rescue/
> 
> And what it was before:
> http://pub.allbsd.org/FreeBSD-snapshots/i386-i386/9.0-HEAD-20101112-JPSNAP/cdrom/livefs/rescue/

That definitely needs to be fixed.

-- 
John Baldwin

Re: new cpuid bits

2010-11-22 Thread John Baldwin

On Friday, November 19, 2010 10:39:53 am Andriy Gapon wrote:
> 
> Guys,
> 
> I would like to add definitions for couple more useful CPUID bits, but I am
> greatly confused about how to name them.
> I failed to deduce the naming convention from the existing definitions and I
> am not sure how to make the names proper and descriptive.
> 
> The bits in question are returned by CPUID.6 in EAX and ECX.
> CPUID.6 block is described by both AMD and Intel as "Thermal and Power
> Management (Leaf)".  Bits in EAX are defined only for Intel at present, the
> bit in ECX is defined for both.
> 
> Description/naming of the bits from the specifications:
> EAX[0]: Digital temperature sensor is supported if set
> EAX[1]: Intel Turbo Boost Technology Available
> EAX[2]: ARAT. APIC-Timer-always-running feature is supported if set.
> ECX[0]:
>   Intel: Hardware Coordination Feedback Capability (Presence of Bits MCNT and
>          ACNT MSRs).
>   AMD:  EffFreq: effective frequency interface.
> 
> How does the following look to you?
> I will appreciate suggestions/comments.

Looks fine to me.

-- 
John Baldwin

Re: Quick i386 question...

2010-11-22 Thread John Baldwin

On Saturday, November 20, 2010 3:38:58 pm Sergio Andrés Gómez del Real wrote:
> If an interrupt is received while in protected mode with paging enabled,
> is the linear address of the IDT stored in the IDTR translated using the
> paging-hierarchy structures?
> I have looked at the interrupt/exception chapter in the corresponding
> Intel manual but can't find the answer. Maybe I overlooked.

Yes.  A linear address is the flat virtual address after segments are taken 
into account.  It is the address used as an input to the paging support in the 
MMU.
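
As a worked example of what the paging hardware does with that input, the sketch below splits a 32-bit linear address into its lookup fields.  It assumes classic non-PAE paging with 4 KB pages; the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Fields of a 32-bit linear address under non-PAE paging with 4 KB pages. */
struct demo_linear_parts {
	uint32_t pde_index;	/* bits 31..22: page directory entry */
	uint32_t pte_index;	/* bits 21..12: page table entry */
	uint32_t offset;	/* bits 11..0: byte within the page */
};

struct demo_linear_parts
demo_split_linear(uint32_t linear)
{
	struct demo_linear_parts p;

	p.pde_index = linear >> 22;
	p.pte_index = (linear >> 12) & 0x3ff;
	p.offset = linear & 0xfff;
	return (p);
}
```

For the arbitrary address 0x12345678 this gives directory index 72, table index 837 and offset 0x678.  The base address held in the IDTR goes through the same lookup as any other linear address.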

-- 
John Baldwin

Re: Best way to determine if an IRQ is present

2010-11-22 Thread John Baldwin

On Saturday, November 20, 2010 4:58:02 pm Garrett Cooper wrote:
> Trying to do a complete solution for kern/145385, Andriy has
> raised concerns about IRQ mapping to CPUs; while I've put
> together more pieces of the puzzle, I'm a bit confused how I determine
> whether or not an IRQ is available for use.
> Sure, I could linear probe a series of IRQs, but that would
> probably be expensive, and different architectures treat IRQs
> differently, so building assumptions based on the fact that IRQ
> hierarchy is done in a particular order is probably not the best thing
> to do.
> I've poked around kern/kern_cpuset.c and kern/kern_intr.c a bit
> but I may have missed something important...

Well, the real solution is actually larger than described in the PR.  What you 
really want to do is take the logical CPUs offline when they are "halted".  
Taking a CPU offline should trigger an EVENTHANDLER that various bits of code 
could hook into.  On platforms that support binding interrupts to CPUs (x86 
and sparc64 at least), the MD code would install an event handler that 
searches the MD interrupt tables (e.g. the interrupt_sources[] array on x86) 
and moves bound interrupts to other CPUs.  However, I think all the interrupt
bits will be MD, not MI.
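
The handler's job can be sketched in userland as below.  The table layout, the bitmask of online CPUs and all demo_* names are hypothetical; this is not the x86 interrupt_sources[] code, only the shape of the walk such a handler would do.

```c
#include <stddef.h>

#define DEMO_NCPU	4

/* Hypothetical interrupt source entry. */
struct demo_intsrc {
	int irq;
	int bound_cpu;		/* -1 if not bound to a specific CPU */
};

/* Pick the first online CPU other than 'avoid'; -1 if there is none. */
static int
demo_pick_cpu(unsigned online_mask, int avoid)
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NCPU; cpu++)
		if (cpu != avoid && (online_mask & (1u << cpu)) != 0)
			return (cpu);
	return (-1);
}

/*
 * What an offline handler might do: move every source bound to 'cpu' to
 * another online CPU.  Returns the number of sources moved.
 */
int
demo_cpu_offline(struct demo_intsrc *srcs, size_t n, unsigned online_mask,
    int cpu)
{
	size_t i;
	int moved = 0, target;

	target = demo_pick_cpu(online_mask, cpu);
	if (target < 0)
		return (0);	/* nowhere to move them */
	for (i = 0; i < n; i++) {
		if (srcs[i].bound_cpu == cpu) {
			srcs[i].bound_cpu = target;
			moved++;
		}
	}
	return (moved);
}
```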

-- 
John Baldwin

Re: Building my own release ISOs

2010-11-22 Thread John Baldwin

On Sunday, November 21, 2010 8:31:22 pm Sean Bruno wrote:
> Does this look about right to build from a test branch?
> 
> sudo make release SVNROOT=ssh+svn://svn.freebsd.org/base
> SVNBRANCH=projects/sbruno_64cpus MAKE_ISOS=y MAKE_DVD=y NO_FLOPPIES=y
> NODOC=y NOPORTSATALL=y WORLD_FLAGS=-j32 KERNEL_FLAGS=-j32
> BUILDNAME=sbruno CHROOTDIR=/new_release

Sure.  Note, though, that you don't have to create a branch just to build a 
release with a patch.  You can always use LOCAL_PATCHES to apply patches to 
the source tree you build a release against.

-- 
John Baldwin

Re: Best way to determine if an IRQ is present

2010-11-25 Thread John Baldwin


Andriy Gapon wrote:
> on 22/11/2010 16:24 John Baldwin said the following:
> > Well, the real solution is actually larger than described in the PR.  What you 
> > really want to do is take the logical CPUs offline when they are "halted".  
> > Taking a CPU offline should trigger an EVENTHANDLER that various bits of code 
> > could invoke.  In the case of platforms that support binding interrupts to 
> > CPUs (x86 and sparc64 at least), they would install an event handler that 
> > searches the MD interrupt tables (e.g. the interrupt_sources[] array on x86) 
> > and move bound interrupts to other CPUs.  However, I think all the interrupt
> > bits will be MD, not MI.
> 
> That's a good idea and a comprehensive approach.
> One minor technical detail - should an offlined CPU be removed from all_cpus 
> mask/set?

That's tricky.  In other e-mails I've had on this topic, the idea has 
been to have a new online_cpus mask and maybe a CPU_ONLINE() test macro 
similar to CPU_ABSENT().  In that case, an offline CPU should still be 
in all_cpus, but many places that use all_cpus would need to use 
online_cpus instead.
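
A sketch of how the two masks could sit side by side is shown below.  Everything here is hypothetical (plain unsigned bitmasks and DEMO_* macro names); it only illustrates that a CPU can be present in one mask and missing from the other.

```c
/* Hypothetical masks; bit N set means CPU N. */
unsigned demo_all_cpus = 0x0f;		/* CPUs present in the system */
unsigned demo_online_cpus = 0x0f;	/* subset that is currently online */

#define DEMO_CPU_ABSENT(id)	((demo_all_cpus & (1u << (id))) == 0)
#define DEMO_CPU_ONLINE(id)	((demo_online_cpus & (1u << (id))) != 0)

/* Taking a CPU offline only touches the online mask. */
void
demo_set_offline(int id)
{
	demo_online_cpus &= ~(1u << id);
}

/* Example consumer that should iterate over online CPUs, not all CPUs. */
int
demo_count_online(void)
{
	int id, count = 0;

	for (id = 0; id < 32; id++)
		if (DEMO_CPU_ONLINE(id))
			count++;
	return (count);
}
```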


--
John Baldwin

Re: How to debug BTX loader?

2010-11-30 Thread John Baldwin

On Monday, November 29, 2010 1:01:27 pm Darmawan Salihun wrote:
> Hi guys, 
> 
> I'm currently working on a BIOS for a custom Single Board Computer (SBC). 
> I have the required BIOS source code and tools at hand. 
> However, the boot process always gets stuck in the BTX loader 
> (the infamous "ACPI autoload failed") when I booted out of USB stick 
> (with the FreeBSD 8.1 USB stick image). 
> 
> I could get the system to boot into FreeBSD 8.1 
> (by keeping the CDROM tray open and close it when the board looks for 
> boot device, otherwise BTX will reboot instantly). 

Are you getting an actual BTX error message or a freeze?  BTX is just a 
minikernel written all in assembly.  It doesn't handle loading the kernel, 
etc.  All that work is done by the /boot/loader program (which is written in 
C).  You can find all the source to the boot code in src/sys/boot.  The BTX 
kernel is in src/sys/boot/i386/btx/btx/.

However, to debug this further we would need more info such as what exactly 
you are seeing (a hang, a BTX fault with a register dump, etc.).

-- 
John Baldwin

Re: 8.1-RELEASE hangs on reboot

2010-12-01 Thread John Baldwin

On Tuesday, November 30, 2010 8:23:19 pm Ondřej Majerech wrote:
> Hello,
> 
> my 8.1-R system has just started hanging on reboot. Specifically after
> I svn up'd my source and updated from 8.1-R-p1 to -p2.
> 
> Some kind of hang occurs on every reboot attempt. Usually it hangs at
> the "Rebooting..." message, but sometimes the thing just locks up
> before it even syncs disks. shutdown -p now seems to shutdown the
> system successfully each time.
> 
> So I booted into single-user mode, executed "reboot" and during the
> "Syncing disks" I pressed Ctrl-Alt-Escape to break into the debugger.
> There I single-stepped with the "s" command until the thing simply
> stopped doing anything. (Even if I pressed NumLock, the LED on the
> keyboard wouldn't turn off.)
> 
> The screen content at the moment of hang is (dutifully typed over as
> the thing is dead and I don't have a serial cable):
> 
> [thread pid 12 tid 100017 ]
> Stopped at sckbdevent+0x5f: call _mtx_unlock_flags
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags: pushq %rbp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x1: movq %rsp,%rbp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x4: subq $0x20,%rsp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x8: movq %rbx,(%rsp)
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0xc: movq %r12,0x8(%rsp)
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x11: movq %rdi,%rbx
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x14: movq %r13,0x10(%rsp)
> db>
> E
> 
> Including that "E" at the end.

No good ideas here, though I think we just turned off PSL_T by
accident so it ran for a while before hanging after this.  'E' must be
the start of a message on the console.

> As I said, it's 8.1-RELEASE-p2; it's on AMD64. I'm using custom kernel
> which only differs from GENERIC by addition of the debugging options:
> 
> options INVARIANTS
> options INVARIANT_SUPPORT
> options WITNESS
> options DEBUG_LOCKS
> options DEBUG_VFS_LOCKS
> options DIAGNOSTIC
> 
> I tried rebooting with ACPI disabled, but the thing panicked on boot with
> 
> panic: Duplicate free of item 0xff00025e from zone
> 0xff00bfdcc2a0(1024)
> 
> cpuid = 0
> KDB: enter: panic
> [thread pid 0 tid 10 ]
> Stopped at kdb_enter+0x3d: movq $0, 0x6b2d20(%rip)
> db> bt
> Tracing pid 0 tid 10 td 0x80c63fc0
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x17b
> uma_dbg_free() at uma_dbg_free+0x171
> uma_zfree_arg() at uma_zfree_arg+0x68
> free() at free+0xcd
> device_set_driver() at device_set_driver+0x7c
> device_attach() at device_attach+0x19b
> bus_generic_attach() at bus_generic_attach+0x1a
> pci_attach() at pci_attach+0xf1

The free() should be the one that frees the softc, but that implies the device 
had a previous driver and softc.  Maybe add some debug info to 
device_set_driver() to print out the previous driver's name (and maybe the 
value of the pointer) before free'ing the softc.  You could use gdb on the 
kernel.debug and the pointer value to figure out exactly which driver was the 
previous one and look to see if its probe routine does something funky with 
the softc pointer.

-- 
John Baldwin

Re: How to debug BTX loader?

2010-12-02 Thread John Baldwin

On Wednesday, December 01, 2010 4:09:42 pm Darmawan Salihun wrote:
> Hi John, 
> 
> One of the BTX faults shows the register dump in the attachment. 
> I hope this could help. Anyway, If I were to try to interpret 
> such register dump, where should I start? I understand x86/x86_64 
> assembly pretty much, but I'm not quite well versed with the 
> FreeBSD code using it. 

Looks like the mailing list stripped the attachment.  Can you post the 
attachment at a URL?

-- 
John Baldwin

Re: How to debug BTX loader?

2010-12-02 Thread John Baldwin

On Thursday, December 02, 2010 2:12:04 pm Darmawan Salihun wrote:
> Hi John, 
> 
> The BTX crash message is in the attachment.

Ok, so clearly the instruction pointer has jumped off into the weeds given 
that the instruction stream is all 0xff.  The instruction pointer value 
(0xc09d3600) implies that this is in the kernel already during early kernel 
startup (before the kernel installs its own IDT with its own fault and 
exception handlers).  It might be helpful to pull up gdb on your kernel.debug 
file and do 'l *0xc09d3600' to see what you get.  Looking at the stack, 
'0xc1830188' might be another address in the kernel.

-- 
John Baldwin

Re: coretemp(4)/amdtemp(4) and sysctl nodes

2010-12-06 Thread John Baldwin

On Friday, December 03, 2010 1:05:02 pm m...@freebsd.org wrote:
> There are very few uses in FreeBSD mainline code of
> sysctl_remove_oid(), and I was looking at potentially removing them.
> However, the use in coretemp/amdtemp has me slightly stumped.
> 
> Each device provides a device_get_sysctl_ctx sysctl_ctx that is
> automatically cleaned up when the device goes away.  Yet the sysctl
> nodes for both amdtemp and coretemp use the context of other devices,
> rather than their own.  I can't quite figure out why, though the two
> are slightly different enough that they may have different reasons.
> 
> For coretemp(4) I don't see how the parent device can be removed first,
> since we are a child device.  So from my understanding it makes no
> sense to have an explicit sysctl_remove_oid() and attach in the
> parent's sysctl_ctx.

Well, you would want 'kldunload coretemp.ko' to remove the sysctl node even 
though the parent device is still around.  I suspect the same case is true
for amdtemp.  Probably these drivers should use a separate sysctl context.
I'm not sure how the sysctl code handles removing a node that has an active 
context though.

-- 
John Baldwin

Re: small dtrace patch for review

2010-12-06 Thread John Baldwin

On Friday, December 03, 2010 11:57:42 am Andriy Gapon wrote:
> 
> The patch is not about DTrace functionality, but about infrastructure use in
> one particular place.
> http://people.freebsd.org/~avg/dtrace_gethrtime_init.diff
> I believe that sched_pin() is needed there to make sure that "host"/base CPU
> stays the same for all calls to smp_rendezvous_cpus().
> The pc_cpumask should just be a cosmetic change.

Looks good to me.

-- 
John Baldwin

Re: atomic_set_xxx(&x, 0)

2010-12-07 Thread John Baldwin

On Tuesday, December 07, 2010 12:58:43 pm Andriy Gapon wrote:
> 
> $ glimpse atomic_set_ | fgrep -w 0
> /usr/src/sys/dev/arcmsr/arcmsr.c:   atomic_set_int(&acb->srboutstandingcount, 0);
> /usr/src/sys/dev/arcmsr/arcmsr.c:   atomic_set_int(&acb->srboutstandingcount, 0);
> /usr/src/sys/dev/jme/if_jme.c:  atomic_set_int(&sc->jme_morework, 0);
> /usr/src/sys/dev/jme/if_jme.c:  atomic_set_int(&sc->jme_morework, 0);
> /usr/src/sys/dev/ale/if_ale.c:  atomic_set_int(&sc->ale_morework, 0);
> /usr/src/sys/mips/rmi/dev/xlr/rge.c:    atomic_set_int(&(priv->frin_to_be_sent[i]), 0);
> /usr/src/sys/dev/drm/drm_irq.c: atomic_set_rel_32(&dev->vblank[i].count, 0);
> /usr/src/sys/dev/cxgb/ulp/tom/cxgb_tom.c:   atomic_set_int(&t->tids_in_use, 0);
> 
> I wonder if these are all bugs and atomic_store_xxx() was actually intended?

They are most likely bugs.  You can probably ask yongari@ about jme(4) and
ale(4) and np@ about cxgb(4).  drm_irq looks like it wants an
atomic_store_rel().  Not sure who to ask about arcmsr(4).  I'm not sure
arcmsr(4) really needs the atomic ops at all, but it should be using
atomic_fetchadd() and atomic_readandclear() instead of some of the current
atomic ops.
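
To illustrate why atomic_set_xxx(&x, 0) is suspicious: in atomic(9) the "set" operations OR bits into the target, so a value of 0 changes nothing, while a "store" overwrites the target.  The following is only a userland sketch using C11 <stdatomic.h>; the model_* names are hypothetical stand-ins and not the kernel implementation.

```c
#include <stdatomic.h>

/*
 * Hypothetical userland models of the atomic(9) semantics discussed above.
 * These are NOT the kernel functions, only stand-ins with the same effect.
 */

/* Models atomic_set_int(p, v): *p |= v. */
static void
model_atomic_set_int(atomic_uint *p, unsigned int v)
{
	atomic_fetch_or(p, v);
}

/* Models atomic_store_rel_int(p, v): *p = v with release ordering. */
static void
model_atomic_store_rel_int(atomic_uint *p, unsigned int v)
{
	atomic_store_explicit(p, v, memory_order_release);
}
```

With a counter that currently holds 5, model_atomic_set_int(&count, 0) leaves it at 5, whereas model_atomic_store_rel_int(&count, 0) resets it to 0, which is presumably what the listed call sites intended.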

-- 
John Baldwin

Re: getting rid of some -mno-* flags under sys/boot

2010-12-20 Thread John Baldwin

On Sunday, December 19, 2010 12:42:01 pm Garrett Cooper wrote:
> On Sun, Dec 19, 2010 at 3:23 AM, Alexander Best  wrote:
> > hi there,
> >
> > i think some of the -mno-* flags in sys/boot/* can be scrubbed, since
> > they're already being included from ../Makefile.inc.
> 
> Looks good.
> 
> > also TARGET cleandir leaves some files behind in i386/gptboot which should
> > be fixed by this patch.
> 
> AHA. This might fix the issue I've seen rebuilding stuff with
> gptzfsboot for a good while now where I have to (on mostly rare
> occasions with -j24, etc typically after updating my source tree)
> rebuild it manually. gptzfsboot and zfsboot also need the fix, BTW.
> The only thing is that these files live under the common directory, so
> shouldn't common clean them up (I see that common doesn't have a
> Makefile though, only a Makefile.inc -- ouch)?
> FWIW though, wouldn't it be better to avoid this accidental bug
> and unnecessary duplication by doing something like the following?
> 
> # ...
> 
> OBJS=zfsboot.o sio.o gpt.o drv.o cons.o util.o
> CLEANFILES+= gptzfsboot.out ${OBJS}
> 
> gptzfsboot.out: ${BTXCRT} ${OBJS}
> # ...

Yes, an OBJS would be good.  Also, gptboot.c was recently changed to not
#include ufsread.c, so that explicit dependency can be removed, as can the
GPTBOOT_UFS variable.

Similar fixes probably apply to gptzfsboot.

BTW, the code in common/ is not built into a library, but specific boot
programs (typically /boot/loader on different platforms) include specific
objects.

-- 
John Baldwin

Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin

On Monday, December 27, 2010 6:07:35 am Darmawan Salihun wrote:
> Hi, 
> 
> I'm trying to install FreeBSD 8.0 on AMD Geode LX800 (CS5536 "southbridge"). 
> However, it cannot detect the IDE controller (in the CS5536) correctly. It 
> says something similar to this: 
> "IDE controller not present"

Hmm, I can't find a message like that anywhere.  Can you get the exact message 
you are seeing?

> I did lspci in Linux (BackTrack 3) 
> and I saw that the IDE controller Base Address Registers (BARs) 
> are all disabled (only contains zeros), 
> except for one of them (BAR4). 
> BAR4 decodes 16 bytes of I/O ports (FFF0h-FFFFh). 
> The decoded ports "seems" to conform to the PCI IDE specification 
> for "native-PCI IDE controller" (relocatable within the 
> 16-bit I/O address space). 
> 
> I did "cat /proc/ioports" and I found that 
> the following I/O port address ranges decoded correctly 
> to the IDE controller in the CS5536 "southbridge":
> 
> 1F0h-1F7h 
> 3F6h 
> 170h-177h
> FFF0h-FFFFh
> 
> My question: 
> Does FreeBSD require the IDE controller BARs 
> to be programmed to also decode 
> legacy I/O ports ranges (1F0h-1F7h,3F6h and 170h-177h)? 

No.  We hardcode the ISA ranges for BARs 0 through 3 unless a PCI IDE controller 
has the corresponding "Primary" or "Secondary" native-mode bit set in its 
programming interface register; for a channel in legacy mode we don't even look 
at its BARs.  We do always examine BARs 4 and 5 using the normal probing scheme 
of writing all 1's, etc.  The code in question looks like this:

/*
 * For ATA devices we need to decide early what addressing mode to use.
 * Legacy demands that the primary and secondary ATA ports sits on the
 * same addresses that old ISA hardware did. This dictates that we use
 * those addresses and ignore the BAR's if we cannot set PCI native
 * addressing mode.
 */
static void
pci_ata_maps(device_t bus, device_t dev, struct resource_list *rl, int force,
    uint32_t prefetchmask)
{
	struct resource *r;
	int rid, type, progif;
#if 0
	/* if this device supports PCI native addressing use it */
	progif = pci_read_config(dev, PCIR_PROGIF, 1);
	if ((progif & 0x8a) == 0x8a) {
		if (pci_mapbase(pci_read_config(dev, PCIR_BAR(0), 4)) &&
		    pci_mapbase(pci_read_config(dev, PCIR_BAR(2), 4))) {
			printf("Trying ATA native PCI addressing mode\n");
			pci_write_config(dev, PCIR_PROGIF, progif | 0x05, 1);
		}
	}
#endif
	progif = pci_read_config(dev, PCIR_PROGIF, 1);
	type = SYS_RES_IOPORT;
	if (progif & PCIP_STORAGE_IDE_MODEPRIM) {
		pci_add_map(bus, dev, PCIR_BAR(0), rl, force,
		    prefetchmask & (1 << 0));
		pci_add_map(bus, dev, PCIR_BAR(1), rl, force,
		    prefetchmask & (1 << 1));
	} else {
		rid = PCIR_BAR(0);
		resource_list_add(rl, type, rid, 0x1f0, 0x1f7, 8);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x1f0,
		    0x1f7, 8, 0);
		rid = PCIR_BAR(1);
		resource_list_add(rl, type, rid, 0x3f6, 0x3f6, 1);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x3f6,
		    0x3f6, 1, 0);
	}
	if (progif & PCIP_STORAGE_IDE_MODESEC) {
		pci_add_map(bus, dev, PCIR_BAR(2), rl, force,
		    prefetchmask & (1 << 2));
		pci_add_map(bus, dev, PCIR_BAR(3), rl, force,
		    prefetchmask & (1 << 3));
	} else {
		rid = PCIR_BAR(2);
		resource_list_add(rl, type, rid, 0x170, 0x177, 8);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x170,
		    0x177, 8, 0);
		rid = PCIR_BAR(3);
		resource_list_add(rl, type, rid, 0x376, 0x376, 1);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x376,
		    0x376, 1, 0);
	}
	pci_add_map(bus, dev, PCIR_BAR(4), rl, force,
	    prefetchmask & (1 << 4));
	pci_add_map(bus, dev, PCIR_BAR(5), rl, force,
	    prefetchmask & (1 << 5));
}
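
The per-channel decision made by the function above can be sketched in isolation as follows.  This is only an illustration: the bit values are assumed to match FreeBSD's pcireg.h, and the struct and the helper ide_ports_for() are hypothetical names that do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Programming-interface bits (values assumed to match pcireg.h). */
#define PCIP_STORAGE_IDE_MODEPRIM	0x01	/* primary channel in native mode */
#define PCIP_STORAGE_IDE_MODESEC	0x04	/* secondary channel in native mode */

/* Hypothetical result type: where the ports of one channel come from. */
struct ide_channel_ports {
	bool		from_bars;	/* true: probe the BARs instead */
	uint16_t	cmd_start;	/* legacy command block start */
	uint16_t	cmd_end;	/* legacy command block end */
	uint16_t	ctl_port;	/* legacy control port */
};

/* channel 0 is the primary channel, any other value the secondary one. */
static struct ide_channel_ports
ide_ports_for(uint8_t progif, int channel)
{
	struct ide_channel_ports p = { false, 0, 0, 0 };
	uint8_t native_bit;

	native_bit = (channel == 0) ? PCIP_STORAGE_IDE_MODEPRIM :
	    PCIP_STORAGE_IDE_MODESEC;
	if (progif & native_bit) {
		/* Native mode: the BARs describe the ports. */
		p.from_bars = true;
		return (p);
	}
	/* Legacy mode: fixed ISA addresses, BARs are ignored. */
	if (channel == 0) {
		p.cmd_start = 0x1f0; p.cmd_end = 0x1f7; p.ctl_port = 0x3f6;
	} else {
		p.cmd_start = 0x170; p.cmd_end = 0x177; p.ctl_port = 0x376;
	}
	return (p);
}
```

For example, a controller reporting progif 0x8a has neither native-mode bit set, so both channels would fall back to the fixed ISA ranges regardless of what BARs 0 through 3 contain.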


-- 
John Baldwin

Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin

On Tuesday, December 28, 2010 1:38:05 pm Darmawan Salihun wrote:
> Hi,
> 
> --- On Tue, 12/28/10, John Baldwin  wrote:
> 
> > From: John Baldwin 
> > Subject: Re: PCI IDE Controller Base Address Register setting
> > To: freebsd-hackers@freebsd.org
> > Cc: "Darmawan Salihun" 
> > Date: Tuesday, December 28, 2010, 10:20 AM
> > On Monday, December 27, 2010 6:07:35 am Darmawan Salihun wrote:
> > > Hi,
> > >
> > > I'm trying to install FreeBSD 8.0 on AMD Geode LX800 (CS5536 "southbridge").
> > > However, it cannot detect the IDE controller (in the CS5536) correctly. It
> > > says something similar to this:
> > > "IDE controller not present"
> >
> > Hmm, I can't find a message like that anywhere.  Can you get the exact message
> > you are seeing?
> > 
> 
> It says: 
> 
> "No disks found! Please verify that your disk controller is being properly
> probed at boot time."

Oh, so this is a message from the installer.  Can you capture a verbose dmesg
via a serial console perhaps?  Or at least the kernel probe messages for your
ATA controller?

-- 
John Baldwin

Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin

On Tuesday, December 28, 2010 2:10:59 pm Darmawan Salihun wrote:
> Hi, 
> 
> --- On Tue, 12/28/10, John Baldwin  wrote:
> 
> > From: John Baldwin 
> > Subject: Re: PCI IDE Controller Base Address Register setting
> > To: "Darmawan Salihun" 
> > Cc: freebsd-hackers@freebsd.org
> > Date: Tuesday, December 28, 2010, 1:52 PM
> > On Tuesday, December 28, 2010 1:38:05 pm Darmawan Salihun wrote:
> > > Hi,
> > >
> > > --- On Tue, 12/28/10, John Baldwin  wrote:
> > >
> > > > From: John Baldwin 
> > > > Subject: Re: PCI IDE Controller Base Address Register setting
> > > > To: freebsd-hackers@freebsd.org
> > > > Cc: "Darmawan Salihun" 
> > > > Date: Tuesday, December 28, 2010, 10:20 AM
> > > > On Monday, December 27, 2010 6:07:35 am Darmawan Salihun wrote:
> > > > > Hi,
> > > > >
> > > > > I'm trying to install FreeBSD 8.0 on AMD Geode LX800 (CS5536 "southbridge").
> > > > > However, it cannot detect the IDE controller (in the CS5536) correctly. It
> > > > > says something similar to this:
> > > > > "IDE controller not present"
> > > >
> > > > Hmm, I can't find a message like that anywhere.  Can you get the exact message
> > > > you are seeing?
> > > >
> > >
> > > It says:
> > >
> > > "No disks found! Please verify that your disk controller is being properly
> > > probed at boot time."
> >
> > Oh, so this is a message from the installer.  Can you capture a verbose dmesg
> > via a serial console perhaps?
> 
> I'm not sure if I can do this because I've tried a couple of times 
> but nothing comes out of the serial console. Perhaps a wrong baud rate 
> setting? 
> I set it to 9600bps and 8-N-1 back then. Is that correct? 

Yes, that should be correct.  You have to turn the console on however (it is
not enabled by default).  The simplest way to do this is probably to hit the
key option to break into the loader prompt when you see the boot menu (I think
it is option '6').  Then enter 'boot -D' at the 'OK' prompt.  This should boot
with both the video and serial consoles enabled with the video console as the
primary console.  For a verbose boot, use 'boot -Dv'.

If you want to test out the serial console before you boot, you can instead
enter 'set console="vidconsole,comconsole"' at the prompt.  You should then
see an OK prompt on both the screen and the serial port.

Note that the serial console is hardcoded to use the default I/O ports for
COM1.

> > Or at least the kernel probe messages for your ATA controller?
> > 
> 
> I recall that pressing Alt+F2 during the installation would open-up 
> another console, full with log messages. Would that be enough? 

Actually, the kernel probe messages are on the main console, but you can hit
scroll lock to freeze the console and then use page up to go back in history
and find the messages.

-- 
John Baldwin

Re: PCI IDE Controller Base Address Register setting

2011-01-03 Thread John Baldwin

On Saturday, January 01, 2011 2:58:12 pm Darmawan Salihun wrote:
> 
> --- On Thu, 12/30/10, Darmawan Salihun  wrote:
> 
> > From: Darmawan Salihun 
> > Subject: Re: PCI IDE Controller Base Address Register setting
> > To: "John Baldwin" 
> > Cc: freebsd-hackers@freebsd.org
> > Date: Thursday, December 30, 2010, 3:28 PM
> > --- On Tue, 12/28/10, John Baldwin  wrote:
> >
> > > From: John Baldwin 
> > > Subject: Re: PCI IDE Controller Base Address Register setting
> > > To: "Darmawan Salihun" 
> > > Cc: freebsd-hackers@freebsd.org
> > > Date: Tuesday, December 28, 2010, 2:22 PM
> > > On Tuesday, December 28, 2010 2:10:59 pm Darmawan Salihun wrote:
> > > > Hi,
> > > >
> > > > --- On Tue, 12/28/10, John Baldwin  wrote:
> > > >
> > > > > From: John Baldwin 
> > > > > Subject: Re: PCI IDE Controller Base Address Register setting
> > > > > To: "Darmawan Salihun" 
> > > > > Cc: freebsd-hackers@freebsd.org
> > > > > Date: Tuesday, December 28, 2010, 1:52 PM
> > > > > On Tuesday, December 28, 2010 1:38:05 pm Darmawan Salihun wrote:
> > > > > > Hi,
> > > > > >
> > > > > > --- On Tue, 12/28/10, John Baldwin  wrote:
> > > > > >
> > > > > > > From: John Baldwin 
> > > > > > > Subject: Re: PCI IDE Controller Base Address Register setting
> > > > > > > To: freebsd-hackers@freebsd.org
> > > > > > > Cc: "Darmawan Salihun" 
> > > > > > > Date: Tuesday, December 28, 2010, 10:20 AM
> > > > > > > On Monday, December 27, 2010 6:07:35 am Darmawan Salihun wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I'm trying to install FreeBSD 8.0 on AMD Geode LX800 (CS5536 "southbridge").
> > > > > > > > However, it cannot detect the IDE controller (in the CS5536) correctly. It
> > > > > > > > says something similar to this:
> > > > > > > > "IDE controller not present"
> > > > > > >
> > > > > > > Hmm, I can't find a message like that anywhere.  Can you get the exact message
> > > > > > > you are seeing?
> > > > > > >
> > > > > >
> > > > > > It says:
> > > > > >
> > > > > > "No disks found! Please verify that your disk controller is being properly
> > > > > > probed at boot time."
> > > > >
> > > > > Oh, so this is a message from the installer.  Can you capture a verbose dmesg
> > > > > via a serial console perhaps?
> > > >
> > > > I'm not sure if I can do this because I've tried a couple of times
> > > > but nothing comes out of the serial console. Perhaps a wrong baud rate setting?
> > > > I set it to 9600bps and 8-N-1 back then. Is that correct?
> > >
> > > Yes, that should be correct.  You have to turn the console on however (it is
> > > not enabled by default).  The simplest way to do this is probably to hit the
> > > key option to break into the loader prompt when you see the boot menu (I think
> > > it is option '6').  Then enter 'boot -D' at the 'OK' prompt.  This should boot
> > > with both the video and serial consoles enabled with the video console as the
> > > primary console.  For a verbose boot, use 'boot -Dv'
> > >
> > 
> > Thanks, I tested this option and it worked. 
> > I could see the debugging messages. 
> > 
> > FreeBSD cannot detect the disk in all of the IDE
> > interfaces.  
> >

Re: PANIC: thread_exit: Last thread exiting on its own.

2011-01-03 Thread John Baldwin

On Friday, December 31, 2010 4:22:36 am Lev Serebryakov wrote:
> Hello, Giovanni.
> You wrote on December 31, 2010, 1:56:20:
> 
> >>  I've  got  this  panic on reboot from geom_raid5.
> > Could you please provide some backtrace? Have you got a core?
>   Backtrace was very simple (I've reproduced it from memory, but it
>   really was that simple):
> 
>   
>   panic()
>   thread_exit()
>   kthread_exit()
>   g_raid5_worker()
>   fork_trampoline()
>   ...
> 
>   No core, because I didn't have dumpdev configured :(
> 
> > Which revision of -STABLE are you running(or when last src update)?
>   uname shows:
> 
> FreeBSD 8.2-PRERELEASE #2: Tue Dec 21 01:17:16 MSK 2010
> 
>   I've  rebuilt  kernel  RIGHT after `csup', so difference is no more
>  than several hours.

Looks like 204087 needs to be MFC'd.

-- 
John Baldwin

Re: [patch] have rtprio check that arguments are numeric; change atoi to strtol

2011-01-04 Thread John Baldwin

On Tuesday, January 04, 2011 6:25:02 am Kostik Belousov wrote:
> On Tue, Jan 04, 2011 at 11:40:45AM +0100, Giorgos Keramidas wrote:
> @@ -123,12 +121,28 @@ main(argc, argv)
>   }
>   exit(0);
>   }
> - exit (1);
> + exit(1);
> +}
> +
> +static int
> +parseint(const char *str, const char *errname)
> +{
> + char *endp;
> + long res;
> +
> + errno = 0;
> + res = strtol(str, &endp, 10);
> + if (errno != 0 || endp == str || *endp != '\0')
> + err(1, "%s shall be a number", errname);

Small nit, maybe use 'must' instead of 'shall'.
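
For reference, the three checks in the quoted hunk (errno, endp == str, and *endp != '\0') are the usual way to make strtol() reject input that atoi() would silently accept.  Below is a self-contained sketch of the same idea that returns a status instead of exiting; parse_int_strict() is a hypothetical name and is not part of the rtprio patch, and the explicit int range check is an addition of this sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 integer strictly.
 * Returns false for empty input, trailing junk, or out-of-range values.
 */
static bool
parse_int_strict(const char *str, int *out)
{
	char *endp;
	long res;

	errno = 0;
	res = strtol(str, &endp, 10);
	/* Overflow of long, no digits consumed, or leftover characters. */
	if (errno != 0 || endp == str || *endp != '\0')
		return (false);
	/* The value fits in a long but may still not fit in an int. */
	if (res < INT_MIN || res > INT_MAX)
		return (false);
	*out = (int)res;
	return (true);
}
```

With this helper, "42" and "-7" parse successfully, while "", "abc", and "12abc" are rejected, whereas atoi("12abc") would return 12 without any indication of a problem.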

-- 
John Baldwin

Re: Building "third-party" modules for kernel with debug options?

2011-01-07 Thread John Baldwin

On Friday, January 07, 2011 7:15:59 am Lev Serebryakov wrote:
> Hello, Freebsd-hackers.
> 
> 
>   I've found that "struct bio" depends on the state of the "DIAGNOSTIC"
> flag ("options DIAGNOSTIC" in kernel config). But when I build a
> third-party GEOM (or any other) module with using of ,
> there is no access to these options. So, a module built from ports can
> fail on the user's kernel, even if it is built with proper kernel sources in
> "/usr/src/sys". Is there any solution for this problem?
> 
> P.S. NB: GEOM module is only example, question is about modules &
> kernel options in general, so I put this message on Hackers list.

In general we try to avoid having "public" kernel data structures change size 
when various kernel options are in use.  Some noticeable exceptions to this 
rule are PAE (i386-only) and LOCK_PROFILING (considered to be something users 
would not typically use).  DIAGNOSTIC might arguably be considered the same as 
LOCK_PROFILING, but I am surprised it affects bio.  In this case, however, it 
should only affect a GEOM module that uses bio_pblkno, since you should be 
using kernel routines to allocate bio structures rather than malloc'ing one 
directly.  Perhaps phk@ would ok moving bio_pblkno up above the optional 
diagnostic fields.
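
The layout problem can be shown with a small stand-alone sketch.  The structs below are hypothetical stand-ins and not the real struct bio; they only demonstrate how an optional field placed before a member shifts that member's offset, and how moving the member above the optional field keeps the offset stable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout as seen by a module built without the option. */
struct demo_bio_plain {
	uint8_t	 cmd;
	int64_t	 pblkno;	/* member a module accesses directly */
};

/* Hypothetical layout of a kernel built with the option enabled. */
struct demo_bio_diag {
	uint8_t	 cmd;
	void	*diag_caller;	/* optional diagnostic field */
	int64_t	 pblkno;	/* now lives at a different offset */
};

/* Same optional field, but placed after the commonly used member. */
struct demo_bio_reordered {
	uint8_t	 cmd;
	int64_t	 pblkno;	/* offset no longer depends on the option */
	void	*diag_caller;
};
```

A module compiled against the first layout and loaded into a kernel using the second would read pblkno from the wrong offset; with the third layout, only code that touches the diagnostic field itself would notice the difference.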

-- 
John Baldwin

Re: [rfc] allow to boot with >= 256GB physmem

2011-01-21 Thread John Baldwin

On Friday, January 21, 2011 11:09:10 am Sergey Kandaurov wrote:
> Hello.
> 
> Some time ago I faced with a problem booting with 400GB physmem.
> The problem is that vm.max_proc_mmap type overflows with
> such high value, and that results in a broken mmap() syscall.
> The max_proc_mmap value is a signed int and roughly calculated
> at vmmapentry_rsrc_init() as u_long vm_kmem_size quotient:
> vm_kmem_size / sizeof(struct vm_map_entry) / 100.
> 
> Although at the time it was introduced at svn r57263 the value
> was quite low (f.e. the related commit log stands:
> "The value defaults to around 9000 for a 128MB machine."),
> the problem is observed on amd64 where KVA space after
> r212784 is factually bound to the only physical memory size.
> 
> With INT_MAX here being 0x7fffffff and sizeof(struct vm_map_entry)
> being 120, it's enough to have slightly less than 256GB to be able
> to reproduce the problem.
> 
> I rewrote vmmapentry_rsrc_init() to set large enough limit for
> max_proc_mmap just to protect from integer type overflow.
> As it's also possible to live tune this value, I also added a
> simple anti-shoot constraint to its sysctl handler.
> I'm not sure though if it's worth to commit the second part.
> 
> As this patch may cause some bikeshedding,
> I'd like to hear your comments before I will commit it.
> 
> http://plukky.net/~pluknet/patches/max_proc_mmap.diff

Is there any reason we can't just make this variable and sysctl a long?
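
The arithmetic behind the report can be checked with a short sketch.  It assumes that the quotient vm_kmem_size / sizeof(struct vm_map_entry) is stored into the int before the division by 100, which is what the "slightly less than 256GB" figure in the report suggests; the function names are hypothetical and the size of 120 bytes is the one quoted above.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* sizeof(struct vm_map_entry) as quoted in the report above. */
#define VM_MAP_ENTRY_SIZE	120u

/* Entry count before the final division by 100, computed in 64 bits. */
static uint64_t
map_entries_for(uint64_t kmem_size)
{
	return (kmem_size / VM_MAP_ENTRY_SIZE);
}

/* True when that intermediate count no longer fits in a signed int. */
static bool
overflows_int(uint64_t kmem_size)
{
	return (map_entries_for(kmem_size) > (uint64_t)INT_MAX);
}

/* The same limit kept in a wide type, as with a long on amd64. */
static long long
max_proc_mmap_wide(uint64_t kmem_size)
{
	return ((long long)(kmem_size / VM_MAP_ENTRY_SIZE / 100));
}
```

Under these assumptions the threshold is 2^31 * 120 bytes, i.e. 240 GiB: overflows_int() is false for 128 GiB and true from 240 GiB upward, while the wide variant yields 35791394 for 400 GiB without overflowing.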

-- 
John Baldwin

Re: pci_suspend/pci_resume of custom pcie board

2011-01-25 Thread John Baldwin

On Tuesday, January 25, 2011 9:47:35 am Philip Soeberg wrote:
> Hi,
> 
> I'm in a particular problem where I need to set my custom pcie adapter 
> into d3hot power-mode and a couple of seconds later reset it back to d0.
> The board has an FPGA directly attached to the pcie interface, and as I 
> need to re-configure the FPGA on the fly, I have to ensure the datalink 
> layer between the upstream bridge and my device is idle to prevent any 
> hiccups.
> 
> On linux I simply do a pci_save_state(device) followed by 
> pci_set_power_state(device, d3hot), then after my magic on my board, I 
> do the reverse: pci_set_power_state(device, d0) followed by 
> pci_restore_state(device).
> 
> On FreeBSD, say 8, I've found the pci_set_powerstate function, which is 
> documented in PCI(9), but that function does not save nor restore the 
> config space.
> 
> I've tried, just for the fun of it, to go via pci_cfg_save(device, 
> dinfo, 0) with dinfo being device_get_ivars(device) and then 
> subsequently restoring the config space back via pci_cfg_restore(), but 
> since both those functions are declared in  I'm 
> not sure if I'm supposed to use those directly or not.. Besides, I'm not 
> really having any luck with that approach.
> 
> Reading high and low on the net suggest that not all too many driver 
> devs are concerned with suspend/resume operation of their device, and if 
> they are, leave it to user-space to decide when to suspend/resume a 
> device.. I would like to be able to save off my device' config space, 
> put it to sleep (d3hot), wake it back up (d0) and restore the device' 
> config space directly from the device' own driver..
> 
> Anyone who can help me with this?

Use this:

pci_cfg_save(dev, dinfo, 0);
pci_set_powerstate(dev, PCI_POWERSTATE_D3);

/* do stuff */

/* Will set state to D0. */
pci_cfg_restore(dev, dinfo);

We probably should create some wrapper routines (pci_save_state() and 
pci_restore_state() would be fine) that hide the 'dinfo' detail as that isn't 
something device drivers should have to know.

-- 
John Baldwin

Re: rtld optimizations

2011-01-26 Thread John Baldwin

On Wednesday, January 26, 2011 10:25:27 am Mark Felder wrote:
> On Tue, 25 Jan 2011 22:49:11 -0600, Alexander Kabaev   
> wrote:
> 
> >  The only extra quirk that said commit
> > does is an optimization of a dlsym() call, which is hardly ever in
> > critical performance path.
> 
> It's really not my place to say, but it seems strange that if an  
> optimization is available people would ignore it because they don't think  
> it's important enough. I don't understand this mentality; if it's not  
> going to break anything and it obviously can improve performance in  
> certain use cases, why not merge it and make FreeBSD even better?

Many things that seem obvious aren't actually true, hence the need for
actual testing and benchmarks.

-- 
John Baldwin

Re: Namecache lock contention?

2011-01-28 Thread John Baldwin

On Friday, January 28, 2011 8:46:07 am Ivan Voras wrote:
> I have this situation on a PHP server:
> 
> 36623 www 1  760   237M 30600K *Name   6   0:14 47.27% php-cgi
> 36638 www 1  760   237M 30600K *Name   3   0:14 46.97% php-cgi
> 36628 www 1 1050   237M 30600K *Name   2   0:14 46.88% php-cgi
> 36627 www 1 1050   237M 30600K *Name   0   0:14 46.78% php-cgi
> 36639 www 1 1050   237M 30600K *Name   5   0:14 46.58% php-cgi
> 36643 www 1 1050   237M 30600K *Name   7   0:14 46.39% php-cgi
> 36629 www 1  760   237M 30600K *Name   1   0:14 46.39% php-cgi
> 36642 www 1 1050   237M 30600K *Name   2   0:14 46.39% php-cgi
> 36626 www 1 1050   237M 30600K *Name   5   0:14 46.19% php-cgi
> 36654 www 1 1050   237M 30600K *Name   7   0:13 46.19% php-cgi
> 36645 www 1 1050   237M 30600K *Name   1   0:14 45.75% php-cgi
> 36625 www 1 1050   237M 30600K *Name   0   0:14 45.56% php-cgi
> 36624 www 1 1050   237M 30600K *Name   6   0:14 45.56% php-cgi
> 36630 www 1  760   237M 30600K *Name   7   0:14 45.17% php-cgi
> 36631 www 1 1050   237M 30600K RUN 4   0:14 45.17% php-cgi
> 36636 www 1 1050   237M 30600K *Name   3   0:14 44.87% php-cgi
> 
> It looks like periodically most or all of the php-cgi processes are 
> blocked in "*Name" for long enough that "top" notices, then continue, 
> probably in a "thundering herd" way. From grepping inside /sys the most 
> likely suspect seems to be something in the namecache, but I can't find 
> exactly a symbol named "Name" or string beginning with "Name" that would 
> be connected to a lock.

In vfs_cache.c:

static struct rwlock cache_lock;
RW_SYSINIT(vfscache, &cache_lock, "Name Cache");

What are the php scripts doing?  Do they all try to create and delete files at 
the same time (or do renames)?

-- 
John Baldwin

Re: Divide-by-zero in loader

2011-01-28 Thread John Baldwin

On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
> I spent a few days chasing down a bug and I'm wondering if a loader
> change would be appropriate.
> 
> So we have these new front-panel LCDs, and like everything these days
> it's a SoC.  Normally it presents to FreeBSD as a USB communications
> device (ucom), but when the SoC is sitting in its own boot loader, it
> presents as storage (umass).  If the box is rebooted in this state,
> the reboot gets into /boot/loader and then reboots itself.  (It took a
> few days just to figure out I was getting into /boot/loader, since the
> only prompt I could definitively stop at was boot2).
> 
> Anyways, I eventually debugged it to the device somehow presenting
> itself to /boot/loader with a geometry of 1024/256/0, and since od_sec
> is 0 that causes a divide-by-zero error in bd_io() while the loader is
> trying to figure out if this is GPT or MBR formatted.  We're still
> trying to figure out why the loader sees this incorrect geometry.
> 
> But meanwhile, this patch fixes the issue, and I wonder if it would be
> a useful safety-belt for other devices where an incorrect geometry can
> be seen?

That's probably fine.  A sector count of zero is invalid for CHS.  However, 
probably we should not even be using C/H/S at all if the device claims to 
support EDD.  We already use raw LBAs if it supports EDD, and we should 
probably just ignore C/H/S altogether if it supports EDD.
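
As an illustration of why a sector count of zero is fatal, here is a generic LBA-to-C/H/S conversion with the kind of safety-belt discussed above.  This is only a sketch of the standard conversion and not the actual bd_io() code; struct chs and lba_to_chs() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for a cylinder/head/sector address. */
struct chs {
	uint32_t cyl;
	uint32_t head;
	uint32_t sec;
};

/*
 * Convert an LBA to C/H/S for a geometry of 'heads' heads and 'sectors'
 * sectors per track.  Returns false for an invalid geometry instead of
 * dividing by zero.
 */
static bool
lba_to_chs(uint64_t lba, uint32_t heads, uint32_t sectors, struct chs *out)
{
	if (heads == 0 || sectors == 0)
		return (false);
	out->cyl = (uint32_t)(lba / ((uint64_t)heads * sectors));
	out->head = (uint32_t)((lba / sectors) % heads);
	out->sec = (uint32_t)(lba % sectors) + 1;	/* sectors are 1-based */
	return (true);
}
```

With the 1024/256/0 geometry from the report, the guard rejects the request up front; with a conventional 255-head, 63-sector geometry, LBA 63 maps to cylinder 0, head 1, sector 1.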

-- 
John Baldwin

Re: Divide-by-zero in loader

2011-01-28 Thread John Baldwin

On Friday, January 28, 2011 2:14:45 pm Matthew Fleming wrote:
> On Fri, Jan 28, 2011 at 11:00 AM, John Baldwin  wrote:
> > On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
> >> I spent a few days chasing down a bug and I'm wondering if a loader
> >> change would be appropriate.
> >>
> >> So we have these new front-panel LCDs, and like everything these days
> >> it's a SoC.  Normally it presents to FreeBSD as a USB communications
> >> device (ucom), but when the SoC is sitting in its own boot loader, it
> >> presents as storage (umass).  If the box is rebooted in this state,
> >> the reboot gets into /boot/loader and then reboots itself.  (It took a
> >> few days just to figure out I was getting into /boot/loader, since the
> >> only prompt I could definitively stop at was boot2).
> >>
> >> Anyways, I eventually debugged it to the device somehow presenting
> >> itself to /boot/loader with a geometry of 1024/256/0, and since od_sec
> >> is 0 that causes a divide-by-zero error in bd_io() while the loader is
> >> trying to figure out if this is GPT or MBR formatted.  We're still
> >> trying to figure out why the loader sees this incorrect geometry.
> >>
> >> But meanwhile, this patch fixes the issue, and I wonder if it would be
> >> a useful safety-belt for other devices where an incorrect geometry can
> >> be seen?
> >
> > That's probably fine.  A sector count of zero is invalid for CHS.  However,
> > probably we should not even be using C/H/S at all if the device claims to
> > support EDD.  We already use raw LBAs if it supports EDD, and we should
> > probably just ignore C/H/S altogether if it supports EDD.
> 
> This is all almost entirely outside my knowledge, but at the moment
> bd_eddprobe() requires a geometry of 1023/255/63 before it attempts to
> check if EDD can be used.  Is that check incorrect?

Well, it is very conservative in that it only uses EDD if it thinks it can't
use C/H/S.  It would be interesting to see if simply checking for a sector
count of 0 there would avoid the divide-by-zero and let your device "work".

However, it might actually be useful to always use EDD if possible, esp.
EDD3 since that lets you not use bounce buffers down in 1MB.

> In my specific case I know there's no bootable stuff on this disk; the
> earlier layers bypassed it correctly without a problem.
> 
> Thanks,
> matthew
> 

-- 
John Baldwin

Re: NVIDIA (port) driver fails to create /dev/nvidactl; 8.2Prerelease

2011-01-31 Thread John Baldwin

On Friday, January 28, 2011 3:43:12 pm Duane H. Hesser wrote:
> I am attempting to replace the 'nv' X11 driver with the "official"
> nvidia driver from the x11/nvidia-driver port, in order to handle
> the AVCHD video files from my Canon HF S20.
> 
> I have been trying for several days now, having read the nvidia
> README file in /usr/local/share and everything Google has to offer.
> 
> Unfortunately devilfs is smarter and meaner than I.
> 
> The 'xorg.conf' file is created by nvidia-xconfig.  The console
> output when calling 'startx' to begin the frustration is
> 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> X.Org X Server 1.7.5
> Release Date: 2010-02-16
> X Protocol Version 11, Revision 0
> Build Operating System: FreeBSD 8.1-RELEASE i386 
> Current Operating System: FreeBSD belinda.androcles.org 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #3: Thu Jan 27 13:45:06 PST 2011 r...@belinda.androcles.org:/usr/obj/usr/src/sys/BELINDA i386
> Build Date: 08 January 2011  05:52:50PM
>  
> Current version of pixman: 0.18.4
> Before reporting problems, check http://wiki.x.org
> to make sure that you have the latest version.
> Markers: (--) probed, (**) from config file, (==) default setting,
> (++) from command line, (!!) notice, (II) informational,
> (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
> (==) Log file: "/var/log/Xorg.0.log", Time: Fri Jan 28 11:32:46 2011
> (==) Using config file: "/etc/X11/xorg.conf"
> NVIDIA: could not open the device file /dev/nvidiactl (No such file or directory).
> (EE) Jan 28 11:32:46 NVIDIA(0): Failed to initialize the NVIDIA kernel module. Please see the
> (EE) Jan 28 11:32:46 NVIDIA(0): system's kernel log for additional error messages and
> (EE) Jan 28 11:32:46 NVIDIA(0): consult the NVIDIA README for details.
> (EE) NVIDIA(0):  *** Aborting ***
> (EE) Screen(s) found, but none have a usable configuration.
> 
> Fatal server error:
> no screens found

You don't have an nvidia0 device attached to vgapci0.  I would suggest adding 
printfs to the nvidia driver's probe routine to find out why it failed to 
probe.

-- 
John Baldwin

Re: weird characters in top(1) output

2011-02-01 Thread John Baldwin

On Tuesday, February 01, 2011 8:11:54 am Alexander Best wrote:
> On Tue Feb  1 11, Sergey Kandaurov wrote:
> > On 1 February 2011 15:24, Alexander Best  wrote:
> > > hi there,
> > >
> > > i was doing the following:
> > >
> > > top inf > ~/output
> > >
> > > when i noticed that this was missing the overall statistics line. so i 
> > > went
> > > ahead and did:
> > >
> > > top -d2 inf > ~/output
> > >
> > > funny thing is that for the second output some weird characters seem to 
> > > get
> > > spammed into the overall statistics line:
> > >
> > > last pid: 14320;  load averages:  0.42,  0.44,  0.37  up 1+14:02:02
> > > 13:21:05
> > > 249 processes: 1 running, 248 sleeping
> > > CPU: ^[[3;6H 7.8% user,  0.0% nice, 10.6% system,  0.6% interrupt, 81.0% 
> > > idle
> > > Mem: 1271M Active, 205M Inact, 402M Wired, 67M Cache, 212M Buf, 18M Free
> > > Swap: 18G Total, 782M Used, 17G Free, 4% Inuse
> > >
> > > this only seems to happen when i redirect the top(1) output to a file. if 
> > > i do:
> > >
> > > top -d2 inf
> > >
> > > ...everything works fine. i verified the issue under zsh(1) and sh(1).
> > 
> > My quick check shows that this is a regression between 7.2 and 7.3.
> > Reverting r196382 fixes this bug for me.
> 
> thanks for the help. indeed reverting r196382 fixes the issue.

Hmm, you need more than 10 CPUs to understand the reason for that fix.
Without it all of the updated per-CPU states are off by one column so you
get weird screen effects.  The "garbage" characters are actually just a
terminal sequence to move the cursor.  top uses these things a _lot_ to
move the cursor around.

You can try this instead though, it figures out the appropriate number of
spaces rather than using Move_to() for these two routines:

Index: display.c
===
--- display.c   (revision 218032)
+++ display.c   (working copy)
@@ -447,12 +447,14 @@
 /* print tag and bump lastline */
 if (num_cpus == 1)
printf("\nCPU: ");
-else
-   printf("\nCPU %d: ", cpu);
+else {
+   value = printf("\nCPU %d: ", cpu);
+   while (value++ <= cpustates_column)
+   printf(" ");
+}
 lastline++;
 
 /* now walk thru the names and print the line */
-Move_to(cpustates_column, y_cpustates + cpu);
 while ((thisname = *names++) != NULL)
 {
if (*thisname != '\0')
@@ -532,7 +534,7 @@
 register char **names;
 register char *thisname;
 register int *lp;
-int cpu;
+int cpu, value;
 
 for (cpu = 0; cpu < num_cpus; cpu++) {
 names = cpustate_names;
@@ -540,11 +542,13 @@
 /* show tag and bump lastline */
 if (num_cpus == 1)
printf("\nCPU: ");
-else
-   printf("\nCPU %d: ", cpu);
+else {
+   value = printf("\nCPU %d: ", cpu);
+   while (value++ <= cpustates_column)
+   printf(" ");
+}
 lastline++;
 
-Move_to(cpustates_column, y_cpustates + cpu);
 while ((thisname = *names++) != NULL)
 {
if (*thisname != '\0')
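
The idea behind the patch (pad with spaces up to a fixed column instead of
emitting a cursor-movement sequence) can be sketched in isolation. This is
not top's code: format_cpu_tag() is a hypothetical helper that writes into a
buffer so the result is easy to inspect, and the column argument stands in
for cpustates_column.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write "CPU <n>: " into buf, then pad with plain spaces until the
 * text is `column` characters wide.  Returns the final length, or -1
 * if the buffer is too small.  No terminal escape sequences are used,
 * so the output stays clean when redirected to a file.
 */
int
format_cpu_tag(char *buf, size_t len, int cpu, int column)
{
    int n;

    n = snprintf(buf, len, "CPU %d: ", cpu);
    if (n < 0 || (size_t)n >= len)
        return (-1);

    while (n < column) {
        if ((size_t)n + 1 >= len)
            return (-1);
        buf[n++] = ' ';
    }
    buf[n] = '\0';
    return (n);
}
```

With a column of 8, both "CPU 3: " and "CPU 12: " end up 8 characters wide,
so the per-CPU states that follow line up whether the CPU number has one
digit or two.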

-- 
John Baldwin

Re: Strange problems in the old libc malloc routines

2011-02-02 Thread John Baldwin

On Wednesday, February 02, 2011 01:04:15 pm Andrew Duane wrote:
> We are still using the FreeBSD 6 malloc routines, and are rather suddenly
> having a large number of problems with one or two of our programs. Before
> I dig into the 100+ crash dumps I have, I thought I'd see if anyone else
> has ever encountered this.
> 
> The problems all seem to stem from some case of malloc returning the
> pointer "1" instead of either NULL or a valid pointer. Always exactly "1".
> Where this goes bad depends on where it happens (in the program or inside
> malloc itself), but that pointer value of "1" is always involved. Some of
> the structures like page_dir look corrupted too. It seems as if maybe the
> "1" is coming from sbrk(0) which is just returning the value of curbrk
> (which is correct, and not even close to "1").

Could it be related to calls to malloc(0) perhaps?  phkmalloc uses a constant 
for those that defaults to the last byte in a page (e.g. 4095 on x86).  I'm 
not sure what platform you are using malloc on, but is it possible that you 
have ZEROSIZEPTR set to 1 somehow?  Even so, if that is true free() should 
just ignore that pointer and not corrupt its internal state.
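
When digging through the crash dumps, it may help to separate "small pointer
returned for a zero-size request" from "small pointer returned for a real
request".  A rough sketch of such a triage helper follows; ex_classify() and
its enum are invented for illustration and are not part of any malloc
implementation.  It assumes that no real allocation can live in the first
page of the address space.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_ptr_class {
    EX_PTR_OK,          /* plausible heap address */
    EX_PTR_NULL,        /* allocation failed */
    EX_PTR_ZERO_SIZE,   /* small sentinel for a zero-size request */
    EX_PTR_BOGUS        /* small value for a non-zero request */
};

/*
 * Classify a pointer returned by an allocator for a request of
 * `size` bytes.  Anything below `page_size` cannot be a real
 * allocation under the assumption stated above.
 */
enum ex_ptr_class
ex_classify(const void *p, size_t size, size_t page_size)
{
    uintptr_t v = (uintptr_t)p;

    if (p == NULL)
        return (EX_PTR_NULL);
    if (v >= page_size)
        return (EX_PTR_OK);
    return (size == 0 ? EX_PTR_ZERO_SIZE : EX_PTR_BOGUS);
}
```

A value of 1 coming back for a 16-byte request would be classified as bogus,
while 4095 coming back for a zero-byte request would match the sentinel
behaviour described above.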

-- 
John Baldwin

Re: Analyzing wired memory?

2011-02-08 Thread John Baldwin

 0xff000f6965e8
ff80798e6000 - ff80798e7000 object 0xff000f6965e8
ff80798e7000 - ff80798e8000 object 0xff000f6965e8
ff80798e8000 - ff80798ea000 kmem_alloc() / contigmalloc()
ff80798ea000 - ff8079949000 kmem_alloc_nofault() (kstack/mapdev)
ff8079949000 - ff807994a000 kmem_alloc() / contigmalloc()
ff807994a000 - ff807994b000 object 0xff0060568af8
ff807994b000 - ff8079969000 kmem_alloc_nofault() (kstack/mapdev)
ff8079969000 - ff807996b000 
ff807996b000 - ff80799b kmem_alloc() / contigmalloc()
ff80799b - ff80799b1000 object 0xff00606caca8
ff80799b1000 - ff80799b2000 object 0xff00606caca8
ff80799b2000 - ff80799b6000 kmem_alloc() / contigmalloc()
ff80799b6000 - ff80799b7000 object 0xff0060568af8
ff80799b7000 - ff80799b8000 object 0xff0060568af8
ff80799b8000 - ff8079cbc000 kmem_alloc() / contigmalloc()
ff8079cbc000 - ff807aa0e000 kmem_alloc_nofault() (kstack/mapdev)
ff807aa0e000 - 8000 
8000 - 808164e8 text/data/bss
808164e8 - 81822000 bootstrap data

(The various objects inserted directly into the kernel_map are likely from
the nvidia driver.)

The 'kvm' command in my gdb script is mostly MI, but some bits are MD such as
the code to handle the 'AP stacks' region.

-- 
John Baldwin

Re: CFR: FEATURE macros for AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/PMC/SYSV/...

2011-02-11 Thread John Baldwin

On Friday, February 11, 2011 4:30:28 am Alexander Leidinger wrote:
> Hi,
> 
> during the last GSoC various FEATURE macros were added to the system.  
> Before committing them, I would like to get some review (like if macro  
> is in the correct file, and for those FEATURES where the description  
> was not taken from NOTES if the description is OK).
> 
> If nobody complains, I would like to commit this in 1-2 weeks. If you  
> need more time to review, just tell me.
> 
> Here is the list of affected files (for those impatient ones which do  
> not want to look at the attached patch before noticing that they are  
> not interested to look at it):

Hmm, so what is the rationale for adding FEATURE() macros?  Do we just want to 
add them for everything or do we want to add them on-demand as use cases for 
each knob arrive?  Some features can already be inferred (e.g. if KTR is 
compiled in, then the debug.ktr.mask sysctl will exist).  Also, in the case of 
KTR, I'm not sure that any userland programs need to alter their behavior 
based on whether or not that feature was present.

-- 
John Baldwin

Re: problem with build mcelog

2011-02-11 Thread John Baldwin

On Friday, February 11, 2011 7:48:39 am venom wrote:
> Hello.
> 
> i am trying build mcelog
> 
> 
> FreeBSD  8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #0: Fri Jan 14 04:15:56
> UTC 2011 root@freebsd:/usr/obj/usr/src/sys/GENERIC amd64
> 
> 
> # fetch
> http://ftp2.pl.freebsd.org/pub/FreeBSD/distfiles/mcelog-1.0pre2.tar.gz
> # tar -xf mcelog-1.0pre2.tar.gz
> # cd mcelog-1.0pre2
> # fetch http://people.freebsd.org/~jhb/mcelog/mcelog.patch
> # fetch http://people.freebsd.org/~jhb/mcelog/memstream.c

Oops, I just updated mcelog.patch and it should work fine now.

-- 
John Baldwin

Re: map share memory to kernel space

2011-02-14 Thread John Baldwin

On Monday, February 14, 2011 4:18:50 am beezarliu wrote:
> Hackers,
> 
> I want to access a userland share memory in a kernel thread. 
> So I tried to map the share memory to the kernel space. 
> The basic idea is to map the shm_object into kernel_map
> when the share memory is created.
> 
> Using the following patch, I found the vm_object in kernel_map,
> and the vm_object in the address space of userland process are the same.
> But their content in the kernel and userland address mapped are different.
> 
> It's very strange since they are exactly the same vm_object.
> Am I missing something? Please help.

Hmm, this is a bit of code I use for something similar to map a VM object into 
the kernel.  It does not use vm_page_grab() directly though:

VM_OBJECT_LOCK(obj);
vm_object_reference_locked(obj);
VM_OBJECT_UNLOCK(obj);

/* Map the object into the kernel_map and wire it. */
kva = vm_map_min(kernel_map);
ofs = foff & PAGE_MASK;
foff = trunc_page(foff);
size = round_page(size + ofs);
rv = vm_map_find(kernel_map, obj, foff, &kva, size, TRUE,
    VM_PROT_READ | VM_PROT_WRITE, VM_PROT_READ | VM_PROT_WRITE, 0);
if (rv == KERN_SUCCESS) {
    rv = vm_map_wire(kernel_map, kva, kva + size,
        VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES);
    if (rv == KERN_SUCCESS) {
        *memp = (void *)(kva + ofs);
        return (0);
    }
    vm_map_remove(kernel_map, kva, kva + size);
} else
    vm_object_deallocate(obj);

Unmapping the object is easy of course:

kva = (vm_offset_t)mem;
ofs = kva & PAGE_MASK;
kva = trunc_page(kva);
size = round_page(size + ofs);
vm_map_remove(kernel_map, kva, kva + size);
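
The offset and size arithmetic in both snippets is the same: remember the
offset within the first page, truncate the start down to a page boundary,
and round the length up so it still covers the last byte.  A userland sketch
of just that arithmetic, assuming 4 KB pages; ex_page_span() and the EX_*
macros are illustrative stand-ins for trunc_page(), round_page() and
PAGE_MASK, not kernel interfaces.

```c
#include <stdint.h>

#define EX_PAGE_SIZE    4096ULL                 /* assumed page size */
#define EX_PAGE_MASK    (EX_PAGE_SIZE - 1)

struct ex_span {
    uint64_t base;  /* page-aligned start (trunc_page(foff)) */
    uint64_t ofs;   /* offset of the data within the first page */
    uint64_t len;   /* page-rounded length (round_page(size + ofs)) */
};

/* Compute the page-aligned span that covers [foff, foff + size). */
struct ex_span
ex_page_span(uint64_t foff, uint64_t size)
{
    struct ex_span s;

    s.ofs = foff & EX_PAGE_MASK;
    s.base = foff & ~EX_PAGE_MASK;
    s.len = (size + s.ofs + EX_PAGE_MASK) & ~EX_PAGE_MASK;
    return (s);
}
```

For example, 0x20 bytes starting at offset 0x1ff0 straddle a page boundary,
so the span is two pages long starting at 0x1000, with the data beginning
0xff0 bytes into the mapping.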


-- 
John Baldwin

Re: map share memory to kernel space

2011-02-15 Thread John Baldwin

On Monday, February 14, 2011 10:12:59 pm beezarliu wrote:
> On 2011-02-14 22:56:28, John Baldwin wrote:
> >On Monday, February 14, 2011 4:18:50 am beezarliu wrote:
> >> Hackers,
> >> 
> >> I want to access a userland share memory in a kernel thread. 
> >> So I tried to map the share memory to the kernel space. 
> >> The basic idea is to map the shm_object into kernel_map
> >> when the share memory is created.
> >> 
> >> Using the following patch, I found the vm_object in kernel_map,
> >> and the vm_object in the address space of userland process are the same.
> >> But their content in the kernel and userland address mapped are different.
> >> 
> >> It's very strange since they are exactly the same vm_object.
> >> Am I missing something? Please help.
> >
> >Hmm, this is a bit of code I use for something similar to map a VM object 
> >into 
> >the kernel.  It does not use vm_page_grab() directly though:
> 
> 
> Initially, I wanted to allocate all the pages needed when the share memory is 
> created,
> in order to reduce page fault at the time it's used.
> This seems to be an extra step.  :(
> 
> John, thank you very much!

I think vm_map_wire() will find pages for you when it wires the object.

-- 
John Baldwin

Re: [PATCH] fix impossible case with waitpid(2) in truss

2011-02-15 Thread John Baldwin

On Monday, February 14, 2011 11:12:02 pm Garrett Cooper wrote:
> Hi,
> waitpid(2) returns a value in the set { -1, 0, <pid> } (-1 in the
> event of an ERROR, 0 when WNOHANG is specified, <pid> when the process
> exits according to wait(2)); it never returns a value < -1.
> If someone could commit this patch it would be appreciated.
> Thanks,
> -Garrett

I went with '< 0' to match the style used for ptrace() invocations in other 
parts of truss.  All four calls to waitpid() in truss were broken in this 
fashion.
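
For reference, the usual shape of a correct check looks roughly like the
sketch below.  This is not the truss code; spawn_and_reap() is a made-up
helper that forks a child, reaps it, and treats any negative return from
waitpid() as an error (retrying on EINTR).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`, wait for it, and return its
 * exit status.  Returns -1 if fork() or waitpid() fails or the child
 * did not exit normally.
 */
int
spawn_and_reap(int code)
{
    int status;
    pid_t pid, r;

    pid = fork();
    if (pid < 0)
        return (-1);
    if (pid == 0)
        _exit(code);            /* child */

    /* waitpid() reports errors as -1, so test for "< 0". */
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    if (r < 0)
        return (-1);
    return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```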

-- 
John Baldwin

Re: ichsmb - correct locking strategy?

2011-02-22 Thread John Baldwin

On Friday, February 18, 2011 9:10:47 am Svatopluk Kraus wrote:
> Hi,
> 
>   I try to figure out locking strategy in FreeBSD and found 'ichsmb'
> device. There is a mutex which protects smb bus (ichsmb device). For
> example in ichsmb_readw() in sys/dev/ichsmb/ichsmb.c, the mutex is
> locked and a command is written to bus, then unbounded (but with
> timeout) sleep is done (mutex is unlocked during sleep). After sleep a
> word is read from bus and the mutex is unlocked.
> 
>   1. If use of the device IS NOT serialized by the surrounding layers, then
> more calls to this function (or others) can be started or even completed
> before the first one is finished. The mutex doesn't protect the smb bus.
> 
>   2. If use of the device IS serialized by the surrounding layers, then the
> mutex is useless.
> 
>   Moreover, I haven't mentioned the interrupt routine, which uses the mutex
> and the smb bus too.
> 
>   Am I right? Or did I miss something?

Hmm, the mutex could be useful if you have an smb controller with an interrupt 
handler (I think ichsmb or maybe intpm can support an interrupt handler) to 
prevent concurrent access to device registers.  That is the purpose of the 
mutex at least.  There is a separate locking layer in smbus itself (see 
smbus_request_bus(), etc.).

-- 
John Baldwin

Re: problem with build mcelog

2011-02-22 Thread John Baldwin

On Friday, February 18, 2011 7:11:07 am Sergey Kandaurov wrote:
> On 18 February 2011 14:13, venom  wrote:
> > On 02/11/2011 11:31 PM, John Baldwin wrote:
> >>
> >> On Friday, February 11, 2011 7:48:39 am venom wrote:
> >>>
> >>> Hello.
> >>>
> >>> i am trying build mcelog
> >>>
> >>>
> >>> FreeBSD  8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #0: Fri Jan 14
> >>> 04:15:56
> >>> UTC 2011 root@freebsd:/usr/obj/usr/src/sys/GENERIC amd64
> >>>
> >>>
> >>> # fetch
> >>> http://ftp2.pl.freebsd.org/pub/FreeBSD/distfiles/mcelog-1.0pre2.tar.gz
> >>> # tar -xf mcelog-1.0pre2.tar.gz
> >>> # cd mcelog-1.0pre2
> >>> # fetch http://people.freebsd.org/~jhb/mcelog/mcelog.patch
> >>> # fetch http://people.freebsd.org/~jhb/mcelog/memstream.c
> >>
> >> Oops, I just updated mcelog.patch and it should work fine now.
> >>
> >
> > |--- //depot/vendor/mcelog/tsc.c2010-03-05 20:24:22.0 
> > |+++ //depot/projects/mcelog/tsc.c2010-03-05 21:09:24.0 
> > --
> > Patching file tsc.c using Plan A...
> > Hunk #1 succeeded at 15.
> > Hunk #2 succeeded at 52.
> > Hunk #3 succeeded at 75.
> > Hunk #4 succeeded at 156.
> > done
> > 12:12:46 ~/temp/MCE/mcelog-1.0pre2
> > # gmake FREEBSD=yes
> > Makefile:92: .depend: No such file or directory
> > cc -MM -I. p4.c k8.c mcelog.c dmi.c tsc.c core2.c bitfield.c intel.c
> > nehalem.c dunnington.c tulsa.c config.c memutil.c msg.c eventloop.c
> > leaky-bucket.c memdb.c server.c client.c cache.c rbtree.c memstream.c >
> > .depend.X && mv .depend.X .depend
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o mcelog.o mcelog.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o p4.o p4.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o k8.o k8.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o dmi.o dmi.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o tsc.o tsc.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o core2.o core2.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o bitfield.o
> > bitfield.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o intel.o intel.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o nehalem.o nehalem.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o dunnington.o
> > dunnington.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o tulsa.o tulsa.c
> > cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
> > -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
> > -Wmissing-declarations -Wdeclaration-after-statement  -o config.o config.c
> > config.c:135: error: static declaration of 'getline' follows non-static
> > declaration
> > /usr/include/stdio.h:370: error: previous declaration of 'getline' was here
> > gmake: *** [config.o] Error 1
> >
> 
> A local getline() needs the FreeBSD version check.
> 
> %%%
> --- config.c.olg2011-02-18 14:57:52

Re: seeking into /dev/{null,zero}

2011-02-22 Thread John Baldwin

On Tuesday, February 22, 2011 11:46:05 am Garrett Cooper wrote:
> (Please bottom post)
> 
> On Tue, Feb 22, 2011 at 8:31 AM, Andrew Duane  wrote:
> > I thought seeking past EOF was valid; writing something creates a file
> > with a hole in it. I always assumed that was standard semantics.
> 
> That's with SET_HOLE/SET_DATA though, correct? If so, outside of
> that functionality I would assume relatively standard POSIX semantics.

Err, no, you can always seek past EOF and then call write(2) to extend a file 
(it does an implicit ftruncate(2)).  SEEK_HOLE and SEEK_DATA are different, 
they are just used to discover sparse regions within a file.

From the manpage:

 The lseek() system call allows the file offset to be set beyond the end
 of the existing end-of-file of the file.  If data is later written at
 this point, subsequent reads of the data in the gap return bytes of zeros
 (until data is actually written into the gap).
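
A small test program makes the behaviour easy to confirm.  The sketch below
is illustrative only: write_past_eof() is a made-up helper that creates a
scratch file under /tmp, seeks past the (empty) end of file, writes a single
byte, and then reads back the first byte of the gap.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Seek `gap` bytes past the start of a new, empty file and write one
 * byte there.  Stores the byte found at offset 0 in *first and
 * returns the resulting file size, or -1 on any failure.
 */
off_t
write_past_eof(off_t gap, unsigned char *first)
{
    char path[] = "/tmp/seekdemo.XXXXXX";
    struct stat sb;
    off_t size = -1;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return (-1);
    unlink(path);               /* file disappears once fd is closed */

    if (lseek(fd, gap, SEEK_SET) == gap &&
        write(fd, "x", 1) == 1 &&
        pread(fd, first, 1, 0) == 1 &&
        fstat(fd, &sb) == 0)
        size = sb.st_size;

    close(fd);
    return (size);
}
```

With a gap of 4096 the file ends up 4097 bytes long and the byte at offset 0
reads back as zero, matching the manpage text quoted above.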

-- 
John Baldwin

Re: Super pages

2011-02-23 Thread John Baldwin

On Wednesday, February 23, 2011 11:54:59 am Dr. Baud wrote:
> > On 23/02/2011 14:03, Dr. Baud wrote:
> > >
> > >  In general, is it unadvisable to disable super pages?
> > 
> > I don't think there would be any effect on the stability of operation if 
> > you disable superpages, but generally (except in cases of CPU bugs) you 
> > would not need to. Your system should operate a bit faster with 
> > superpages enabled.
> 
> When is the memory allocated via contigmalloc freed? I have a test kernel 
> module that allocates memory in 8MB chunks until contigmalloc says enough 
> (the ginormous.c/Makefile attachment). I also have a bash script that
> displays the interesting memory related kernel state variables (the mem
> attachment). When I load and unload the kernel module and display the VM
> pages stats I never see the wired pages nor free pages change:

Err, it's freed when you call contigfree().  If you leak the memory when you
do a kldunload, it is just lost until you reboot.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

Re: mtx_init/lock_init and uninitialized struct mtx

2011-02-24 Thread John Baldwin

On Thursday, February 24, 2011 10:47:27 am Dmitry Krivenok wrote:
> Hello Hackers,
> 
> Is it allowed to call mtx_init on a mutex defined as an auto variable
> and not initialized explicitly, i.e.:

It does expect you to zero it first.  I've considered adding a MTX_NEW flag to 
disable this check for places where the developer knows it is safe.  Most 
mutexes are allocated in an already-zero'd structure or BSS as Patrick noted,
so they are already correct.  It is a trade off between catching double 
initializations and requiring extra M_ZERO flags or bzero() calls in various 
places.
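
To see why stack garbage can trip the check, here is a userland model of the
double-init test.  It is only a model: struct ex_lock, ex_lock_init() and
the EX_LO_INITIALIZED bit value are stand-ins chosen for illustration, and
the function returns -1 where the kernel would panic via KASSERT.

```c
#include <string.h>

#define EX_LO_INITIALIZED   0x00010000U     /* illustrative flag bit */

struct ex_lock {
    unsigned int lo_flags;
};

/*
 * Model of the double-initialization check: refuse to initialize a
 * lock whose flags already carry the "initialized" bit.  Returns 0 on
 * success and -1 where the real code would assert.
 */
int
ex_lock_init(struct ex_lock *lock)
{
    if (lock->lo_flags & EX_LO_INITIALIZED)
        return (-1);
    lock->lo_flags = EX_LO_INITIALIZED;
    return (0);
}
```

If the structure is filled with arbitrary bytes before the call, the bit may
already be set and the init is rejected; after a memset() to zero (the
userland equivalent of bzero() or M_ZERO) the first init succeeds and only a
genuine second init is caught.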

> static int foo()
> {
>    struct mtx m;  // Uninitialized auto variable, so its value is undefined.
>mtx_init(&m, "my_mutex", NULL, MTX_DEF);
>…
>// Do something
>...
>mtx_destroy(&m);
>return 0;
> }
> 
> I encountered a problem with such code on a kernel compiled with
> INVARIANTS option.
> The problem is that mtx_init calls lock_init(&m->lock_object) and
> lock_init, in turn, calls:
> 
>  79 /* Check for double-init and zero object. */
>  80 KASSERT(!lock_initalized(lock), ("lock \"%s\" %p already initialized",
>  81 name, lock));
> 
> lock_initialized() just checks that a bit is set in lo_flags field of
> struct lock_object:
> 
> 178 #define lock_initalized(lo) ((lo)->lo_flags & LO_INITIALIZED)
> 
> However, the structure containing this field is never initialized
> (neither in mtx_init nor in lock_init).
> So, assuming that the mutex was defined as auto variable, the content
> of lock_object field of struct mtx
> is also undefined:
> 
>  37 struct mtx {
>  38 struct lock_object  lock_object;/* Common lock properties. */
>  39 volatile uintptr_t  mtx_lock;   /* Owner and flags. */
>  40 };
> 
> In some cases, the initial value of lo_flags _may_ have the
> "initialized" bit set and KASSERT will call panic.
> 
> Is it user's responsibility to properly (how exactly?) initialize
> struct mtx, e.g.
> memset(&m, '\0', sizeof(struct mtx));
> 
> Or should mtx_init() explicitly initialize all fields of struct mtx?
> 
> Thanks in advance!
> 
> -- 
> Sincerely yours, Dmitry V. Krivenok
> e-mail: krivenok.dmi...@gmail.com
> skype: krivenok_dmitry
> jabber: krivenok_dmi...@jabber.ru
> icq: 242-526-443
> 

-- 
John Baldwin

Re: listing all modules compiled into a kernel instance

2011-03-01 Thread John Baldwin

On Tuesday, March 01, 2011 6:49:17 am Daniel O'Connor wrote:
> 
> On 01/03/2011, at 15:10, Carl wrote:
> > Kernel drivers can be (and in at least one case are) compiled into the 
> > kernel but are not reported when queried for, at least not in a way that 
> > I am aware of. For example, the ucom driver is present in the GENERIC
> > kernel in this way. My expectation was that "kldstat -v" would list it,
> > if present, but it does not. A design flaw?
> 
> Sounds like a bug, but I'm not sure where..
> 
> Maybe ucom doesn't appear because it doesn't have a DRIVER_MODULE() 
> declaration (because it isn't a driver).

Yes, that would explain it.

-- 
John Baldwin

Re: listing all modules compiled into a kernel instance

2011-03-02 Thread John Baldwin

On Tuesday, March 01, 2011 3:01:48 pm Carl wrote:
> On 2011-03-01 3:20 AM, Maxim Khitrov wrote:
> > kldstat provides information about components that were loaded
> > dynamically. If your kernel was built with INCLUDE_CONFIG_FILE option
> > (enabled by default in GENERIC), then you can see the static
> > components using:
> >
> > config -x /boot/kernel/kernel
> 
> As has been shown though, "kldstat -v" actually does show static 
> components, at least those declared with DRIVER_MODULE(), and "config 
> -x" does not improve on the situation at all because components like 
> ucom were not cited in the configuration file. IMHO, there needs to be a 
> reliable way to query an existing kernel that yields a _complete_ list 
> of which components are actually included.
> 
> On 2011-03-01 5:00 AM, John Baldwin wrote:
> >> Maybe ucom doesn't appear because it doesn't have a DRIVER_MODULE()
> >> declaration (because it isn't a driver).
> >
> > Yes, that would explain it.
> 
> I can explicitly include ucom in a kernel by adding "device ucom" in the 
> configuration file, in which case it would call DRIVER_MODULE(), right? 
> That would then make it appear in the "kldstat -v" list? So why is it a 
> driver when it's done explicitly, but not a driver when done implicitly? 
> That makes no sense to me since the functionality doesn't change. IMHO, 
> this is a bug that needs to be fixed, not just for ucom but any 
> implicitly included driver.

No, the _source_ code of device ucom has to explicitly say "I am a module 
named 'foo'" using a DECLARE_MODULE() macro (or another macro such as 
DRIVER_MODULE() that invokes DECLARE_MODULE()).  The 'device ucom' in a config 
file does not generate this, that is just an instruction that config(8) uses 
when looking in sys/conf/files to see which C source files to include in the 
kernel build.

-- 
John Baldwin

Re: listing all modules compiled into a kernel instance

2011-03-03 Thread John Baldwin

On Thursday, March 03, 2011 3:03:02 am Carl wrote:
> On 2011-03-01 2:13 PM, John Baldwin wrote:
> >> On 2011-03-01 5:00 AM, John Baldwin wrote:
> >>>> Maybe ucom doesn't appear because it doesn't have a DRIVER_MODULE()
> >>>> declaration (because it isn't a driver).
> >>>
> >>> Yes, that would explain it.
> >>
> >> I can explicitly include ucom in a kernel by adding "device ucom" in the
> >> configuration file, in which case it would call DRIVER_MODULE(), right?
> >> That would then make it appear in the "kldstat -v" list? So why is it a
> >> driver when it's done explicitly, but not a driver when done implicitly?
> >> That makes no sense to me since the functionality doesn't change. IMHO,
> >> this is a bug that needs to be fixed, not just for ucom but any
> >> implicitly included driver.
> >
> > No, the _source_ code of device ucom has to explicitly say "I am a module
> > named 'foo'" using a DECLARE_MODULE() macro (or another macro such as
> > DRIVER_MODULE() that invokes DECLARE_MODULE()).  The 'device ucom' in a 
> > config
> > file does not generate this, that is just an instruction that config(8) uses
> > when looking in sys/conf/files to see which C source files to include in the
> > kernel build.
> 
> My wording was unclear. I do understand that it's the source code rather 
> than the configuration file that invokes the macro. The argument I am 
> making is that no matter how the ucom source code ends up being compiled 
> into the kernel, the end result is that the ucom device is functionally 
> present and available at run time. As such, it makes no sense to me that 
> one can discover it's presence/availability with "kldstat -v" _only_ 
> when compiled in as a consequence of a "device ucom" statement. As a 
> user I care about accurate reporting when I query for information and 
> currently "kldstat -v" cannot be relied upon. I shouldn't have to 
> concern myself with what mechanism caused ucom to be included, but only 
> that it was. Moreover, I suggest that for all practical purposes, a 
> module is a module by virtue of its behaviour, not because 
> DECLARE_MODULE() was invoked. Thus my assertion that this is a bug.

Ah, but your assertion is what is wrong.  There is no 'apic' module for
'device apic' for example.  Also, a single 'device foo' might enable
multiple "modules" (e.g. if foo supports devices on both PCI and ISA
buses, you will have foo/pci and foo/isa modules).

A module != a kld.  A kld file may contain zero or more modules.  Most kld's
include at least one module.

> Until it is fixed, please tell me how I can reliably query an existing 
> kernel for a list of its functional modules/drivers.

There are ways to query multiple things about the kernel, but they are more
specific than a nebulous "module" concept:

- kldstat lists the kld's currently loaded
- kldstat -v lists the declared modules in all of the kld's
- lsvfs lists the filesystems currently available
- all new-bus device drivers end up in the kldstat -v output as
  'driver/parent', but this does not work for devices that are actually
  support libraries shared by other drivers (e.g. ucom).

ucom is a bit special as it isn't an actual driver, it's a library of routines
shared by various USB serial drivers: u3g, uark, ubsa, uftdi, etc.  Those are
the "real" drivers that one would want to test for.  By itself 'device ucom'
doesn't buy you anything.  'device ucom' is probably dubious as if you put
'device u3g' in your kernel config, the kernel will automatically include the
USB serial driver library routines.  If you 'kldload u3g.ko' it will
automatically load 'ucom.ko' as a dependency, so an explicit 'device ucom' is
generally not needed.  There is no 'device uether' for the common USB ethernet
routines shared by all the USB ethernet drivers (though there is a uether.ko),
and 'device ucom' should probably be removed.

-- 
John Baldwin

Re: hw.physmem (loader.conf and sysctl)

2011-03-04 Thread John Baldwin

On Friday, March 04, 2011 12:48:55 pm Andriy Gapon wrote:
> on 04/03/2011 16:36 Dmitry Krivenok said the following:
> > Hello Hackers,
> > I've limited the amount of physical memory visible for my FreeBSD-8.2 by 
> > adding
> > the following in loader.conf:
> > 
> > $ cat /boot/loader.conf | grep hw.physmem
> > hw.physmem="500M"
> > $
> > 
> > However, according to sysctl, the system sees
> > 
> > $ sysctl hw.physmem
> > hw.physmem: 507445248
> > $
> > 
> > The difference is (500 * 2**20 - 507445248) / 2**20 == 16.0625 Mb.
> > How does the system use this "hidden" memory?
> 
> Some memory is taken by structures that describe usable pages.
> There is one vm_page_t structure per each 4KB page.
> I believe that that memory is excluded from physmem.

Also, the message buffer for dmesg, and the kernel binary itself.
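
The arithmetic in the question can be checked with a few lines of C.  This is
only a sketch: hidden_bytes() is a made-up helper name, and the code says
nothing about how the missing memory is split among the vm_page array, the
message buffer and the kernel image.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that were requested via hw.physmem in loader.conf but are not
 * reported by the hw.physmem sysctl.  hidden_bytes() is a hypothetical
 * helper for illustration, not a kernel or libc function.
 */
static uint64_t
hidden_bytes(uint64_t requested_mib, uint64_t reported_bytes)
{
    uint64_t requested_bytes = requested_mib << 20;   /* MiB -> bytes */

    assert(requested_bytes >= reported_bytes);
    return (requested_bytes - reported_bytes);
}
```

For the values in this thread, hidden_bytes(500, 507445248) is 16842752
bytes, which is the 16.0625 MB computed above.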

-- 
John Baldwin

Re: UMA zone alloc on large UMA zone causing Page fault in kernel mode

2011-03-08 Thread John Baldwin

On Monday, March 07, 2011 5:54:47 pm Jonathan Stuart wrote:
> Hiya all,
> 
> Does anyone have any idea why a large UMA zone created without the 
> UMA_ZONE_PAGEABLE flag would page fault in kernel mode when I uma_zalloc one 
> item with M_ZERO | M_WAITOK.  The fault takes place during the bzero'ing of
> the memory.. the pointer *looks* valid, as well.  This does not happen with 
> some smaller zones I've been using.
> The zone shows up in ddb's "show uma", and it's size 620756992.
> 
> Do I need to use UMA_ZONE_NOFREE to keep it in memory?  This was not my 
> understanding of that flag.

No, that just prevents free slabs from being returned to the system.

Is this on amd64?  If so, here are some questions and things you can try:

See if it is a direct-mapped address.  If it is, then I'm at a loss as to why
this would fault.  Maybe verify that the physical addresses backed by that
range are valid via SMAP?

If it is not direct-mapped, grab my gdb scripts from
http://www.freebsd.org/~jhb/gdb/.  Then fire up kgdb on the crashdump, cd to
a directory holding the scripts and do 'source gdb6'.  Then you can use the
'kvm' command to display a rough layout of the kernel's address space.  Make
sure the virtual address range is backed by something valid.

If it is then you might want to write some custom gdb scripts to find the
right vm_map_entry in the kernel_map for your address range and find the
backing VM object and starting offset.  Then you can use gdb to examine
the pages assigned to that VM object at that offset and ensure they are
valid, etc.  You might also try examining the PTE's directly as well.

-- 
John Baldwin

Re: [GSoc] Timeconter Performance Improvements

2011-03-25 Thread John Baldwin

On Thursday, March 24, 2011 9:34:35 am Jing Huang wrote:
> Hi,
> 
>Thanks for your replay. That is just my self-introduction:) I want
> to borrow the shared memory idea from KVM, I am not want to port a
> whole KVM:)  But for this project, there are some basic problems.
> 
> As I know, tsc counter is CPU specific. If the process running on
> a multi-core platform, we must consider switching problem. The one
> way, we can let the kernel to take of this. When switching to another
> CPU, the kernel will reset the shared memory according to the new CPU.
> The second way, we can use CPUID instruction to get the info of
> current CPU, which can be executed in user mode ether. At the same
> time, the kernel maintains shared memory for each CPU. When invoke
> gettimeofday, the function will calculate precise time with current
> CPU's shared memory.
> 
>I don't know which is better? Could I need to deal other problems?

For modern Intel CPUs you can just assume that the TSCs are in sync across 
packages.  They also have invariant TSCs, meaning that the frequency doesn't 
change.  You can easily export a copy of the current 'timehands' structure 
when the TSC is used as the timecounter and then just reimplement bintime() in 
userland.  This assumes you use the TSC as the kernel's timecounter, but you 
really need to do that so that ntp_adjtime() is taken into account, etc.

That will give a very fast and very cheap timecounter.
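
A rough userland sketch of that reimplementation follows.  Everything in it
is an assumption for illustration: the struct layout only mirrors the general
shape of the kernel's timehands (offset, scale, generation), the names
shared_timehands, user_bintime and user_binuptime() are hypothetical, and the
raw counter value is passed in by the caller rather than read with rdtsc.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout of the data the kernel would export to userland. */
struct shared_timehands {
    atomic_uint th_generation;    /* 0 while the kernel is updating */
    uint64_t    th_offset_sec;    /* uptime at th_offset_count, seconds */
    uint64_t    th_offset_frac;   /* ... plus a 2^-64 second fraction */
    uint64_t    th_scale;         /* 2^-64 second units per counter tick */
    uint32_t    th_offset_count;  /* counter value when offset was taken */
    uint32_t    th_counter_mask;  /* valid bits of the counter */
};

struct user_bintime {
    uint64_t sec;
    uint64_t frac;
};

/*
 * Convert a raw counter reading into uptime.  Returns 0 on success and -1
 * if the snapshot was being updated; the caller would then retry or fall
 * back to the system call.
 */
static int
user_binuptime(struct shared_timehands *th, uint32_t count,
    struct user_bintime *bt)
{
    unsigned int gen;
    uint64_t delta, add, frac;

    gen = atomic_load_explicit(&th->th_generation, memory_order_acquire);
    if (gen == 0)
        return (-1);

    /* Ticks elapsed since the snapshot, then scaled to a time fraction. */
    delta = (count - th->th_offset_count) & th->th_counter_mask;
    add = th->th_scale * delta;

    bt->sec = th->th_offset_sec;
    frac = th->th_offset_frac + add;
    if (frac < add)               /* the fraction wrapped: carry a second */
        bt->sec++;
    bt->frac = frac;

    /* Reject the result if the kernel changed the snapshot meanwhile. */
    atomic_thread_fence(memory_order_acquire);
    if (atomic_load_explicit(&th->th_generation, memory_order_relaxed) != gen)
        return (-1);
    return (0);
}
```

The multiplication relies on the delta staying small enough that
th_scale * delta does not overflow 64 bits, which holds as long as the
snapshot is refreshed regularly.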

I believe we already have a shared page (it holds the signal trampoline now)
for at least the x86 platform (probably some others as well).

-- 
John Baldwin

Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread John Baldwin

On Saturday, March 26, 2011 08:16:46 am Peter Jeremy wrote:
> On 2011-Mar-25 08:18:38 -0400, John Baldwin  wrote:
> >For modern Intel CPUs you can just assume that the TSCs are in sync across
> >packages.  They also have invariant TSC's meaning that the frequency
> >doesn't change.
> 
> Synchronised P-state invariant TSCs vastly simplify the problem but
> not everyone has them.  Should the fallback be more complexity to
> support per-CPU TSC counts and varying frequencies or a fallback to
> reading the time via a syscall?

I think we should just fall back to a syscall in that case.  We will also need 
to do that if the TSC is not used as the timecounter (or always duplicate the 
ntp_adjtime() work we do for the current timecounter for the TSC timecounter).

Doing this easy case may give us the most bang for the buck, and it is also a 
good first milestone.  Once that is in place we can decide what the value is 
in extending it to support harder variations.

One thing we do need to think about is whether the shared page should just
export a fixed set of global data, or whether it should export routines.  The
latter approach is more complex, but it makes the ABI boundary between
userland and the kernel more friendly to future changes.  I believe Linux
takes the latter approach?

-- 
John Baldwin

Re: New Boot-Loader

2011-03-28 Thread John Baldwin

On Monday, March 28, 2011 12:48:03 am Devin Teske wrote:
> Hi fellow hackers,
> 
> I'm designing an open-sourced replacement boot-loader for FreeBSD. I feel 
> that the existing options in the boot-loader menu today can be whittled down 
> significantly with a stateful menu system rather than a single-action item 
> menu system.

Are you reimplementing loader from scratch or just hacking on the 4th scripts 
to display the menu, etc.?

> In designing the new menu, I'd like to get your opinions. From old:
> 
>   FreeBSD 8.1-RELEASE: twitpic.com/4e485w
> 
> to new:
> 
>   Replacement Boot-Loader: twitpic.com/4e46ol
> 
> NOTE: The final release will have a single-user mode option.
> 
> The new menu allows for more flexibility as selecting options 2 ("Boot 
> Verbose") or 3 ("ACPI Support") independently toggles the status, updates the 
> menu item, and redisplays the menu -- ever-waiting until the user ultimately 
> presses ENTER, "1", or escapes to the prompt and types "boot". Thus, one could 
> potentially launch single-user mode with verbosity on and ACPI disabled (if 
> one so desired).

This is good.  I think DFly already does this and I had a low priority item on 
my todo list to eventually implement this in the current menu myself.

-- 
John Baldwin

Re: Include file search path

2011-03-30 Thread John Baldwin

On Tuesday, March 29, 2011 5:20:30 pm m...@freebsd.org wrote:
> I thought I knew something about how the compiler looks for include
> files, but now I think maybe I don't know much. :-)
> 
> So here's what I'm pondering.  When I build a library, like e.g. libc,
> where do the include files get pulled from?  They can't (shouldn't) be
> the ones in /usr/include, but I don't see a -nostdinc like for the
> kernel.  There are -I directives in the Makefile for
> -I${.CURDIR}/include -I${.CURDIR}/../../include, etc., but that won't
> remove /usr/include from the search path.
> 
> I see in the gcc documentation that -I paths are searched before the
> standards paths.  But isn't the lack of -nostdinc a bug (not just for
> libc, but for any library in /usr/src/lib)?  It somewhat feels to me
> that all of the libraries and binaries in the source distribution
> should use -nostdinc and include only from the source distribution
> itself.  This isn't always an issue, but for source upgrades it seems
> crucial, and for a hacker it saves difficulties with having to install
> headers before re-building.
> 
> Is that the intent, and it's not fully implemented?  How badly would
> things break if -nostdinc was included in e.g. bsd.lib.mk? (This would
> break non-base libraries, yes?  But as a thought experiment for the
> base, how far off are we?)

If you are building a library by hand you do want to use the includes from 
/usr/include.  I am not sure how we accomplish this during buildworld (but we do).
I think we actually build the compiler in the cross-tools stage such that
it uses the /usr/include directory under ${WORLDTMP} in place of /usr/include
in the default search path.

Some other folks might be able to verify that (perhaps ru@?).

-- 
John Baldwin

Re: Add SUM sysctl

2011-04-18 Thread John Baldwin

On Saturday, April 16, 2011 10:24:44 am rank1see...@gmail.com wrote:
> After compilation of kernel and world in MUM, kernel is installed in MUM, 
> but to install world, we reboot into SUM, then install world. (HANDBOOK)
> Now, in case of GELI usage AND if upgrading is taking place, i.e; 8.2 -> 
> 8.3, once you reboot into SUM to install world, you are doomed, BECAUSE 
> ...
> Kernel will bitch (GELI part), about world->kernel mismatch and you won't 
> be able to install world as you cant decrypt geom providers!!
> The only way to save yourself in that case is to restore /boot/kernel.old, 
> or one is doomed.

This seems broken to me.  An 8.3 kernel+modules should be able to handle GELI 
devices with an 8.2 world.  If they can't, it means someone broke the ABI.  
Even a 9.0 kernel should work fine with an 8.x-stable world.

-- 
John Baldwin

Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread John Baldwin

On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
> Hi,
> 
> I should know the answer to this, but... When reading a global kernel
> variable, where its modifications are protected by a mutex, is it
> necessary to get the mutex lock to just read its value?
> 
> For example:
> A if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
>   return (EPERM);
> versus
> B MNT_ILOCK(mp);
>  if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
>   MNT_IUNLOCK(mp);
>   return (EPERM);
>  }
>  MNT_IUNLOCK(mp);
> 
> My hunch is that B is necessary if you need an up-to-date value
> for the variable (mp->mnt_kern_flag in this case).
> 
> Is that correct?

You already have good followups from Attilio and Kostik, but one thing to keep 
in mind is that if a simple read is part of a larger "atomic operation" then 
it may still need a lock.  In this case Kostik points out that another lock 
prevents updates to mnt_kern_flag so that this is safe.  However, if not for 
that you would need to consider the case that another thread sets the flag on 
the next instruction.  Even the B case above might still have that problem 
since you drop the lock right after checking it and the rest of the function 
is implicitly assuming the flag is never set perhaps (or it needs to handle 
the case that the flag might become set in the future while MNT_ILOCK() is 
dropped).

One way you can make that code handle that race is by holding MNT_ILOCK() 
around the entire function, but that approach is often only suitable for a 
simple routine.
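
As a userland analogy only (a pthread mutex stands in for MNT_ILOCK(), and
demo_mount, DEMO_UNMOUNTF and demo_start_op() are invented names), holding
the lock across both the check and the work that depends on it might look
like this:

```c
#include <errno.h>
#include <pthread.h>

#define DEMO_UNMOUNTF 0x1   /* stands in for MNTK_UNMOUNTF */

/* Hypothetical stand-in for struct mount; not the kernel structure. */
struct demo_mount {
    pthread_mutex_t lock;   /* plays the role of MNT_ILOCK()/MNT_IUNLOCK() */
    int flags;
    int active_ops;
};

/*
 * The flag test and the update that relies on it happen under one lock
 * acquisition, so another thread cannot set the flag in between.
 */
static int
demo_start_op(struct demo_mount *mp)
{
    int error = 0;

    pthread_mutex_lock(&mp->lock);
    if ((mp->flags & DEMO_UNMOUNTF) != 0)
        error = EPERM;
    else
        mp->active_ops++;
    pthread_mutex_unlock(&mp->lock);
    return (error);
}
```

This only closes the race if whoever sets the flag also takes the same lock
and then looks at active_ops before proceeding.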

-- 
John Baldwin

Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread John Baldwin

On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote:
> > On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
> > > Hi,
> > >
> > > I should know the answer to this, but... When reading a global
> > > kernel
> > > variable, where its modifications are protected by a mutex, is it
> > > necessary to get the mutex lock to just read its value?
> > >
> > > For example:
> > > A if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
> > >   return (EPERM);
> > > versus
> > > B MNT_ILOCK(mp);
> > >  if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
> > >   MNT_IUNLOCK(mp);
> > >   return (EPERM);
> > >  }
> > >  MNT_IUNLOCK(mp);
> > >
> > > My hunch is that B is necessary if you need an up-to-date value
> > > for the variable (mp->mnt_kern_flag in this case).
> > >
> > > Is that correct?
> > 
> > You already have good followups from Attilio and Kostik, but one thing
> > to keep
> > in mind is that if a simple read is part of a larger "atomic
> > operation" then
> > it may still need a lock. In this case Kostik points out that another
> > lock
> > prevents updates to mnt_kern_flag so that this is safe. However, if
> > not for
> > that you would need to consider the case that another thread sets the
> > flag on
> > the next instruction. Even the B case above might still have that
> > problem
> > since you drop the lock right after checking it and the rest of the
> > function
> > is implicitly assuming the flag is never set perhaps (or it needs to
> > handle
> > the case that the flag might become set in the future while
> > MNT_ILOCK() is
> > dropped).
> > 
> > One way you can make that code handle that race is by holding
> > MNT_ILOCK()
> > around the entire function, but that approach is often only suitable
> > for a
> > simple routine.
> > 
> All of this makes sense. What I was concerned about was memory cache
> consistency and what (if anything) has to be done to make sure a thread
> doesn't see a stale cached value for the memory location.
> 
> Here's a generic example of what I was thinking of:
> (assume x is a global int and y is a local int on the thread's stack)
> - time proceeds down the screen
> thread X on CPU 0          thread Y on CPU 1
> x = 0;
>                             x = 0; /* 0 for x's location in CPU 1's memory cache */
> x = 1;
>                             y = x;
> --> now, is "y" guaranteed to be 1 or can it get the stale cached 0 value?
> if not, what needs to be done to guarantee it?

Well, the bigger problem is getting the CPU and compiler to order the
instructions such that they don't execute out of order, etc.  Because of that,
even if your code has 'x = 0; x = 1;' as adjacent statements in thread X,
the 'x = 1' may actually execute a good bit after the 'y = x' on CPU 1.
Locks force that to synchronize as the CPUs coordinate around the lock cookie
(e.g. the 'mtx_lock' member of 'struct mtx').
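
The ordering a lock provides can be illustrated in userland with C11 atomics.
This is an analogy, not kernel code: 'ready' plays the role of the lock
cookie, and publish() and try_consume() are made-up names.  The release store
keeps the write to 'payload' from being reordered after it, and the acquire
load keeps the read of 'payload' from being reordered before it.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* plain data, written before the flag */
static atomic_bool ready;    /* stands in for the lock cookie */

/* Writer: store the data, then publish it with release semantics. */
static void
publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the data is read only after the flag was seen with acquire. */
static bool
try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return (false);
    *out = payload;
    return (true);
}
```

A reader that sees 'ready' as true is guaranteed to see the payload written
before it; a reader that sees false learns nothing and has to try again.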

> Also, I see cases of:
>  mtx_lock(&np);
>  np->n_attrstamp = 0;
>  mtx_unlock(&np);
> in the regular NFS client. Why is the assignment mutex locked? (I had assumed
> it was related to the above memory caching issue, but now I'm not so sure.)

In general I think writes to data that are protected by locks should always be
protected by locks.  In some cases you may be able to read data using "weaker"
locking than writing requires: "no locking" can be a form of weaker locking, a
read/shared lock is weak, and if a variable is protected by multiple locks,
then any single lock is weak but sufficient for reading, while all of the
associated locks must be held for writing.  Writing generally requires "full"
locking (write locks, etc.).

The case above may be excessive caution on my part, but I'd rather be safe than
sorry for writes.

-- 
John Baldwin

Re: Is there some implicit locking of device methods?

2011-04-26 Thread John Baldwin

On Tuesday, April 26, 2011 5:52:27 am Bartosz Fabianowski wrote:
> Hi list
> 
> I am trying to move a device driver out from under Giant on 8-STABLE. 
> The driver has the usual probe/attach/detach and 
> open/close/read/ioctl/poll/purge methods. So far, all were protected by 
> each other by Giant. With that disabled, I am wondering whether I need 
> to guard against scenarios like the following:
> 
> 1. attach() is running and executes make_dev(). Before attach() has 
> finished, someone calls open() on the newly created device node and 
> tries to read from a device that is not fully instantiated.
> 
> 2. read() is running when the device is suddenly pulled (this is a USB 
> device) so that detach() gets run. Thus, detach() starts tearing down 
> data structures that read() may still be accessing.
> 
> 3. attach() is running when the device is pulled again, triggering 
> detach(). Now, attach() and detach() are running concurrently, the first 
> one initializing data structures and the second one tearing them down again.
> 
> Obviously, I can avoid races under these conditions by protecting each 
> of the above functions with a mutex. What puzzles is me is that no other 
> device seems to be doing this. There never is a mutex involved in any 
> attach(), detach(), open() methods... Is there some kind of implicit 
> locking going on that I am not aware of? Are DEVMETHODs automatically 
> protected from each other and the world? Are methods referenced by a 
> struct cdevsw similarly protected from each other somehow?

The probe/attach/detach stuff is still under Giant, but you will have to do 
your own locking to handle the other races.  Many of these races can be 
handled without locks, however:

 - Don't call make_dev() until your device is fully ready in attach() (e.g.
   at the end.. NIC drivers tend to call ether_ifattach() as the very last
   thing for this same reason).
 - Call destroy_dev() as the first thing in detach.  destroy_dev() will block
   until any calls to your cdevsw routines return (so if a thread is in read()
   when detach happens, destroy_dev() will hang until your read() call
   returns).  Note that if you need to wake up waiting threads when
   destroy_dev() is called, you can provide a d_purge method in your cdevsw.
   This function is called from destroy_dev() every 100 milliseconds in a loop
   while there are waiting threads.
 - The Giant protection for new-bus should prevent attach/detach from running
   concurrently I believe (either that or the USB bus itself should ensure
   that the two instances of your device have separate device_t instances with
   separate softc's, so concurrent attach/detach should not matter except that
   they may both try to talk to the same hardware perhaps?  In that case that
   is something the USB bus driver should fix by preventing a device from
   attaching at an existing address until any existing device at that address
   is fully detached).

-- 
John Baldwin

Re: Is there some implicit locking of device methods?

2011-04-26 Thread John Baldwin

On Tuesday, April 26, 2011 10:27:14 am Warner Losh wrote:
> 
> On Apr 26, 2011, at 7:42 AM, John Baldwin wrote:
> > - The Giant protection for new-bus should prevent attach/detach from running
> >   concurrently I believe (either that or the USB bus itself should ensure
> >   that the two instances of your device have seperate device_t instances 
> > with
> >   separate softc's, so current attach/detach should not matter except that
> >   they may both try to talk to the same hardware perhaps?  In that case that
> >   is something the USB bus driver should fix by prevent a device from
> >   attaching at an existing address until any existing device at that address
> >   is fully detached).
> 
> I thought that if we held Giant when we're about to go to sleep that we drop
> it as a special case.  So if any newbus-releated function sleeps, we can
> have a situation where attach is running and detach gets called.  There is
> (or was) some code to cope with this in CardBus, iirc.  I'm surprised there
> isn't any in USB, since Hans was the one that alerted me to this issue.

Yes, Giant doesn't really provide too much help here.  However, the real fix
should be in the USB bus, and USB peripheral drivers should not have to worry
about handling concurrent attach/detach (they can't really handle it safely
anyway).

-- 
John Baldwin

Re: problem with build mcelog

2011-04-26 Thread John Baldwin

On Tuesday, April 26, 2011 10:10:44 am Vladimir Laskov wrote:
> have problem for i386
> ==
> 
> # gmake FREEBSD=yes i386=yes
> Makefile:92: .depend: No such file or directory
> cc -MM -I. p4.c k8.c mcelog.c dmi.c tsc.c core2.c bitfield.c intel.c 
> nehalem.c dunnington.c tulsa.c config.c memutil.c msg.c eventloop.c 
> leaky-bucket.c memdb.c server.c client.c cache.c rbtree.c memstream.c > 
> .depend.X && mv .depend.X .depend
> cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers 
> -Wno-unused-parameter -Wstrict-prototypes -Wformat-security 
> -Wmissing-declarations -Wdeclaration-after-statement  -o mcelog.o mcelog.c
> In file included from mcelog.c:52:
> mcelog.h:112: error: expected identifier before numeric constant
> mcelog.c: In function 'bankname':
> mcelog.c:138: error: 'CPU_NEHALEM' undeclared (first use in this function)
> mcelog.c:138: error: (Each undeclared identifier is reported only once
> mcelog.c:138: error: for each function it appears in.)
> mcelog.c:138: error: 'CPU_DUNNINGTON' undeclared (first use in this 
> function)
> mcelog.c:138: error: 'CPU_TULSA' undeclared (first use in this function)
> mcelog.c: In function 'mce_filter':
> mcelog.c:163: error: 'CPU_NEHALEM' undeclared (first use in this function)
> mcelog.c:163: error: 'CPU_DUNNINGTON' undeclared (first use in this 
> function)
> mcelog.c:163: error: 'CPU_TULSA' undeclared (first use in this function)
> mcelog.c: At top level:
> mcelog.c:218: error: 'CPU_NEHALEM' undeclared here (not in a function)
> mcelog.c:218: error: array index in initializer not of integer type
> mcelog.c:218: error: (near initialization for 'cputype_name')
> mcelog.c:219: error: 'CPU_DUNNINGTON' undeclared here (not in a function)
> mcelog.c:219: error: array index in initializer not of integer type
> mcelog.c:219: error: (near initialization for 'cputype_name')
> mcelog.c:220: error: 'CPU_TULSA' undeclared here (not in a function)
> mcelog.c:220: error: array index in initializer not of integer type
> mcelog.c:220: error: (near initialization for 'cputype_name')
> mcelog.c: In function 'decodefatal':
> mcelog.c:835: warning: integer constant is too large for 'long' type
> mcelog.c:838: warning: integer constant is too large for 'long' type
> mcelog.c:921: warning: integer constant is too large for 'long' type
> mcelog.c:923: warning: integer constant is too large for 'long' type
> gmake: *** [mcelog.o] Error 1

Oops, please try this additional patch:

--- //depot/projects/mcelog/mcelog.c   2010-08-25 11:41:19.0 
+++ /home/jhb/work/p4/mcelog/mcelog.c   2010-08-25 11:41:19.0 
@@ -29,6 +29,10 @@
 #include 
 #include 
 #include 
+#ifdef __i386__
+/* Conflicts with 'enum cputype' in . */
+#undef CPU_P4
+#endif
 #include 
 #include 
 #include 

-- 
John Baldwin

Re: Is there some implicit locking of device methods?

2011-04-26 Thread John Baldwin

On Tuesday, April 26, 2011 10:42:17 am Hans Petter Selasky wrote:
> On Tuesday 26 April 2011 16:37:17 John Baldwin wrote:
> > On Tuesday, April 26, 2011 10:27:14 am Warner Losh wrote:
> > > On Apr 26, 2011, at 7:42 AM, John Baldwin wrote:
> > > > - The Giant protection for new-bus should prevent attach/detach from
> > > > running
> > > > 
> > > >   concurrently I believe (either that or the USB bus itself should
> > > >   ensure that the two instances of your device have seperate device_t
> > > >   instances with separate softc's, so current attach/detach should not
> > > >   matter except that they may both try to talk to the same hardware
> > > >   perhaps?  In that case that is something the USB bus driver should
> > > >   fix by prevent a device from attaching at an existing address until
> > > >   any existing device at that address is fully detached).
> > > 
> > > I thought that if we held Giant when we're about to go to sleep that we
> > > drop it as a special case.  So if any newbus-releated function sleeps,
> > > we can have a situation where attach is running and detach gets called. 
> > > There is (or was) some code to cope with this in CardBus, iirc.  I'm
> > > surprised there isn't any in USB, since Hans was the one that alerted me
> > > to this issue.
> > 
> > Yes, Giant doesn't really provide too much help here.  However, the real
> > fix should be in the USB bus, and USB peripheral drivers should not have
> > to worry about handling concurrent attach/detach (they can't really handle
> > it safely anyway).
> 
> Hi,
> 
> All detach/attach/suspend/resume functions on a device tree belonging to the 
> same USB controller are executed from a single thread, which is called the 
> root HUB thread.

Ok, that should work fine then to serialize the detach and attach.

-- 
John Baldwin

Re: Is there some implicit locking of device methods?

2011-04-27 Thread John Baldwin

On Tuesday, April 26, 2011 8:17:09 pm Bartosz Fabianowski wrote:
> > If you needs per-file private data for cdev, you would be better served
> > by cdevpriv(9) KPI. Cloning is too hard to use correctly for such task.
> 
> Thanks, I just got that working. To help those going down a similar path 
> in the future, I would like to note quickly that the following must be 
> added to the cdevsw structure to ensure proper clean-up:
> 
> .d_flags = D_TRACKCLOSE
> 
> I just spent hours debugging panics until I realized only the last 
> close() was triggering a call to my .d_close method.

Err, if you use cdevpriv you shouldn't even have a d_close method.  All your 
d_close logic should be in the cdevpriv destructor, and the kernel will call 
your destructor when all references to an open file descriptor go away (i.e. 
it is closed).

-- 
John Baldwin

Re: problem with build mcelog

2011-04-27 Thread John Baldwin

On Wednesday, April 27, 2011 3:41:07 am Vladimir Laskov wrote:
> On 04/26/2011 07:43 PM, John Baldwin wrote:
> > --- //depot/projects/mcelog/mcelog.c2010-08-25 11:41:19.0 
> > 
> > +++ /home/jhb/work/p4/mcelog/mcelog.c   2010-08-25 11:41:19.0 
> > 
> > @@ -29,6 +29,10 @@
> >   #include
> >   #include
> >   #include
> > +#ifdef __i386__
> > +/* Conflicts with 'enum cputype' in. */
> > +#undef CPU_P4
> > +#endif
> >   #include
> >   #include
> >   #include
> >
> thanks, it work
> 
> my questions:
> 
>   - how to work mcelog without mcelogdevice ?
>   - Is it possible to use mcelog in daemon mode in FreeBSD?

mcelog uses the hw.mca sysctls, not a device in /dev in FreeBSD.  The daemon 
mode is not currently supported.

-- 
John Baldwin

Re: Is there some implicit locking of device methods?

2011-04-27 Thread John Baldwin

On Wednesday, April 27, 2011 9:22:43 am Bartosz Fabianowski wrote:
> > Err, if you use cdevpriv you shouldn't even have a d_close method.  All
> > your d_close logic should be in the cdevpriv destructor
> 
> I see. There is no documentation for any of this, so I just implemented 
> it in the way I *thought* it should work:
> 
> .d_close = drv_close,
> 
> int drv_close(...) {
>devfs_clear_cdevpriv();
> }
> 
> static void cdevpriv_dtr(void *data) {
>free(data, M_USBDEV);
> }
> 
> If I understand you correctly, I can leave out the drv_close() method. 
> When close() is called, devfs_clear_cdevpriv() will be executed 
> implicitly for me and my destructor will run - right?

Yes, if you only care about cleaning up per-fd data.

If you have some sort of state that needs to get created on first open and 
then removed on last close, you may still want to use a d_close() method, but 
there are edge cases where it still may not be called.  So, for that sort of 
data I would still depend on the cdevpriv destructor and use a reference 
count between open() and the destructor to know when to clean up shared 
state.
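
As a rough illustration of the reference-count approach, here is a minimal
userland model in C.  It is a sketch, not driver code: drv_shared, drv_priv,
drv_open() and drv_priv_dtr() are hypothetical names, malloc()/free() stand in
for the kernel allocator, and a real driver would register the destructor
with devfs_set_cdevpriv() from its d_open method and protect the counter with
a lock.

```c
#include <stdlib.h>

/* Hypothetical shared state: created on first open, torn down on last dtor. */
struct drv_shared {
	int refs;		/* number of live per-fd private structures */
	int hw_started;		/* stands in for "shared state exists" */
};

/* Hypothetical per-fd data, one per open file descriptor. */
struct drv_priv {
	struct drv_shared *shared;
};

/* Modeled on d_open: allocate per-fd data and take a reference. */
struct drv_priv *
drv_open(struct drv_shared *sh)
{
	struct drv_priv *p = malloc(sizeof(*p));

	if (p == NULL)
		return (NULL);
	if (sh->refs++ == 0)
		sh->hw_started = 1;	/* first open: create shared state */
	p->shared = sh;
	return (p);
}

/* Modeled on the cdevpriv destructor: runs once per open file. */
void
drv_priv_dtr(void *data)
{
	struct drv_priv *p = data;
	struct drv_shared *sh = p->shared;

	if (--sh->refs == 0)
		sh->hw_started = 0;	/* last reference: tear down */
	free(p);
}
```

Each open takes a reference and each destructor call drops one, so the shared
state goes away only after the last per-fd structure is freed, without relying
on d_close being called.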

-- 
John Baldwin

Re: Look of boot2, on HDD

2011-05-02 Thread John Baldwin

On Saturday, April 30, 2011 12:48:35 pm Alexander Motin wrote:
> Garrett Cooper wrote:
> > 2011/4/29  :
> >> /boot/boot2STAGE 2 bootstrap file
> >> Understands the FreeBSD file system enough, to find files on it, and can 
> >> provide a simple interface to choose the kernel or loader to run.
> >>
> >> Once sys is fully booted, HDD is 'ada0'.
> >> However, STAGE 2, sees it, as a 'ad4', at boot process, which is same 
> >> seen, by booted sys, when I turn off AHCI.
> >>
> >> So, here is the riddle ...
> >> On fully booted sys, how do I query STAGE 2, to tell me, how it'll see, my 
> >> 'ada0' HDD?
> > 
> > This is a very interesting catch:
> > 
> > /usr/src/sys/boot/pc98/boot2/boot2.c:static const char *const
> > dev_nm[NDEV] = {"ad", "da", "fd"};
> > /usr/src/sys/boot/i386/boot2/boot2.c:static const char *const
> > dev_nm[NDEV] = {"ad", "da", "fd"};
> > 
> > It probably will be a no-op soon because of some of the
> > compatibility changes Alex made, but still a potential point of
> > confusion nonetheless.
> 
> Pardon my ignorance, but could somebody shed some light for me on this
> list of names? Why does the much more sophisticated loader(8) refer to
> disks as disk0/1/..., while boot2 tries to mimic something it has no idea
> about, using very limited information from random sources? Are these
> names important for anything?

They are no longer important.  Before /boot/loader existed, boot2 passed
the root device to the kernel via 'bootdev'.  It basically handled
floppies (fdX for BIOS devices < 0x80) and hard drives (devices starting
at 0x80, either ATA (wdX) or SCSI (sdX)).  I think the user could hint
what the root device was via /boot.config, similar to 'vfs.root.mountfrom' in
loader.conf.

Due to CAM (in 3.x) and sos's new ATA (in 4.x), wd and sd were renamed to
'ad' and 'da'.  At this point however, it is mostly archaic.  boot2
still passes info in bootdev that the loader uses, but all the loader
cares about is the BIOS device number and the partition/slice information
on that device.

I would be happy for boot2 to be changed to use the same naming scheme
that /boot/loader uses (diskX), but it's fairly low priority.
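
To make the BIOS-number-based naming concrete, a small C sketch of mapping a
BIOS device number to a loader-style name might look like the following.
bios_dev_name() is a hypothetical helper, not something in boot2 or the
loader, and keeping the "fd" prefix for devices below 0x80 is an assumption
carried over from the old scheme described above.

```c
#include <stdio.h>

/*
 * Format a name for a BIOS device number into buf.
 * Devices below 0x80 are treated as floppies, devices from 0x80 up as
 * hard drives numbered from zero.  Returns what snprintf() returns.
 */
int
bios_dev_name(unsigned int biosdev, char *buf, size_t len)
{
	if (biosdev < 0x80)
		return (snprintf(buf, len, "fd%u", biosdev));
	return (snprintf(buf, len, "disk%u", biosdev - 0x80));
}
```

With this, BIOS device 0x80 comes out as disk0 regardless of whether the
kernel later calls the drive ad4 or ada0, which is the point of not guessing
at kernel driver names in boot2.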

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> This version (1.1) works nearly identically to the standard menu that ships
> with FreeBSD in that it detects whether ACPI is enabled (truth be told, I
> actually re-used the "acpienabled?" function verbatim from /boot/beastie.4th
> by Scott Long and Aleksander Fafula). The ACPI detection of my boot loader
> (version 1.1 or higher) should be identical to the detection of the current
> boot-loader.
> 
> I would be willing to bet that your workstation -- while running the default
> boot loader -- displays "Boot FreeBSD with ACPI enabled" for option #2
> (indicating that ACPI appears to be disabled from your system's perspective).
> 
> As far as I know, the loader does not know that ACPI is compiled into your
> kernel. Rather the ACPI menuitem (both in the default boot-loader menu and in
> my version 1.1) hinges on whether "acpi_load" is defined (and is enabled).
> 
> On a side-note, the same exact code is displaying ACPI as enabled for me
> (running under Parallels 4 on Mac OS X 10.6.7) at boot time. Yet, I do not
> have acpi_load in loader.conf(5), though I do have a kernel with ACPI
> built-in. My guess is that loader(8) is setting load_acpi="YES", which I
> verify immediately after executing loader(8) and the loader.4th start-word
> (which reads loader.conf(5) among other things).

Err, note that the acpienabled stuff is all different in HEAD than in 7/8
since acpi.ko no longer exists.  You should use the scheme from HEAD for
handling ACPI present vs ACPI enabled/disabled.

-- 
John Baldwin

Re: Runtime check for PAE option on BSD 6+ i386

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 9:10:26 am Philip Soeberg wrote:
> Hi fellow FreeBSD hackers,
> 
> I've been using the following poor-man's approach in my driver init for 
> ages in an attempt at detecting PAE option on BSD 6 (or greater) i386 
> kernels, as I depend on bus_dma(9) but provide a loadable kernel module only.
> 
>  >>>
>    if (sizeof(void *) == 4) {
>      /* Widen before multiplying so the byte count cannot overflow. */
>      if (((uint64_t)cnt.v_page_count * cnt.v_page_size) / 1073741824 >= 4) {
>        printf("FreeBSD i386 detected with PAE option enabled. FreeBSD PAE type\n");
>        printf("kernels do not support loadable modules which use DMA. Please\n");
>        printf("reconfigure your kernel for non-PAE or switch to amd64 kernel.\n");
>        return EFAULT;
>      }
>    }
> <<<

Hmmm, even this isn't really accurate as some folks may choose to enable PAE
even with < 4GB to get PG_NX functionality.
 
> afaik there's a sysctl method of checking this per BSD7 (or is it 8?), 
> but what about BSD6? Any hints on how I can runtime detect the above?

Definitely a kern.features.pae sysctl in 7.  I don't see anything similar in
6.

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 12:31:14 pm Devin Teske wrote:
> 
> On May 3, 2011, at 4:45 AM, John Baldwin wrote:
> 
> > On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> >> This version (1.1) works nearly identically to the standard menu that
> >> ships with FreeBSD in that it detects whether ACPI is enabled (truth be
> >> told, I actually re-used the "acpienabled?" function verbatim from
> >> /boot/beastie.4th by Scott Long and Aleksander Fafula). The ACPI
> >> detection of my boot loader (version 1.1 or higher) should be identical
> >> to the detection of the current boot-loader.
> 
> Ugh. By "current", I meant 8.1-RELEASE (wasn't expecting this stuff to be 
> different in HEAD, which it is).
> 
> 
> > Err, note that the acpienabled stuff is all different in HEAD than in 7/8
> > since acpi.ko no longer exists.  You should use the scheme from HEAD for
> > handling ACPI present vs ACPI enabled/disabled.
> > 
> > -- 
> > John Baldwin
> 
> 
> Ok, I see the new "acpipresent?" word (which replaces the "arch-i386"
> environment-test). Does this imply that we're going to support ACPI on
> non-i386 platforms (or already do)?

amd64 and ia64 have always supported ACPI.  ia64 effectively requires it.
However, "hint.acpi.0.rsdp" is set by biosacpi.c in the i386 loader bits,
so other platforms will not set it, so the arch-i386 test is no longer
needed.
 
> I also see the rewritten "acpienabled?" word. Nice. I'll slurp it in to
> make my ACPI detection the same as HEAD.
> 
> I also performed some backward compatibility tests. Looks like this will be
> backward compatible with 8.1-RELEASE (loader_version == 11). However, the
> code in HEAD appears to not work in 8.0-RELEASE (loader_version == 8).

Hmm, which part does not work in 8.0?  arch-i386 has existed since at least
4.x I thought, and the ACPI bits have been setting hint.acpi.0.rsdp since 7.0
(sys/boot/i386/libi386/biosacpi.c CVS rev 1.12 added it).

-- 
John Baldwin

Re: Runtime check for PAE option on BSD 6+ i386

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 1:43:39 pm Kostik Belousov wrote:
> On Tue, May 03, 2011 at 11:44:32AM -0400, John Baldwin wrote:
> > On Tuesday, May 03, 2011 9:10:26 am Philip Soeberg wrote:
> > > Hi fellow FreeBSD hackers,
> > > 
> > > I've been using the following poor-man's approach in my driver init for 
> > > ages in an attempt at detecting PAE option on BSD 6 (or greater) i386 
> > > kernels, as I depend on bus_dma(9) but provide a loadable kernel module 
> > > only.
> > > 
> > >  >>>
> > >    if (sizeof(void *) == 4) {
> > >      /* Widen before multiplying so the byte count cannot overflow. */
> > >      if (((uint64_t)cnt.v_page_count * cnt.v_page_size) / 1073741824 >= 4) {
> > >        printf("FreeBSD i386 detected with PAE option enabled. FreeBSD PAE type\n");
> > >        printf("kernels do not support loadable modules which use DMA. Please\n");
> > >        printf("reconfigure your kernel for non-PAE or switch to amd64 kernel.\n");
> > >        return EFAULT;
> > >      }
> > >    }
> > > <<<
> > 
> > Hmmm, even this isn't really accurate as some folks may choose to enable PAE
> > even with < 4GB to get PG_NX functionality.
> >  
> > > afaik there's a sysctl method of checking this per BSD7 (or is it 8?), 
> > > but what about BSD6? Any hints on how I can runtime detect the above?
> > 
> > Definitely a kern.features.pae sysctl in 7.  I don't see anything similar
> > in 6.
> 
> Read %cr4 and test the bit there.

Oh, cute. :)
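
For reference, the check suggested above amounts to testing the PAE bit
(bit 5) of %cr4.  A minimal sketch in C follows; cr4_has_pae() is a
hypothetical helper name, and the sketch takes the register value as an
argument so it can be exercised outside the kernel.  In an i386 kernel
module the value would presumably come from rcr4(), and the mask is the one
the kernel headers call CR4_PAE.

```c
#include <stdint.h>

/* Bit 5 of %cr4: physical address extension enabled. */
#define	SKETCH_CR4_PAE	0x00000020u

/* Return 1 if the given %cr4 value has PAE turned on, 0 otherwise. */
int
cr4_has_pae(uint32_t cr4)
{
	return ((cr4 & SKETCH_CR4_PAE) != 0);
}
```

Unlike the memory-size heuristic, this does not care how much RAM is
installed, so it also catches kernels that enable PAE with less than 4GB.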

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 2:57:34 pm Devin Teske wrote:
> > From: John Baldwin [mailto:j...@freebsd.org]
> > Sent: Tuesday, May 03, 2011 10:33 AM
> > To: Devin Teske
> > Cc: freebsd-hackers@freebsd.org; Olivier SMEDTS
> > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> >
> > On Tuesday, May 03, 2011 12:31:14 pm Devin Teske wrote:
> > >
> > > On May 3, 2011, at 4:45 AM, John Baldwin wrote:
> > >
> > > > On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> > > > > This version (1.1) works nearly identically to the standard menu
> > > > > that ships with FreeBSD in that it detects whether ACPI is
> > > > > enabled (truth be told, I actually re-used the "acpienabled?"
> > > > > function verbatim from /boot/beastie.4th by Scott Long and
> > > > > Aleksander Fafula). The ACPI detection of my boot loader (version
> > > > > 1.1 or higher) should be identical to the detection of the current
> > > > > boot-loader.
> > >
> > > Ugh. By "current", I meant 8.1-RELEASE (wasn't expecting this stuff
> > > to be different in HEAD, which it is).
> > >
> > >
> > > > Err, note that the acpienabled stuff is all different in HEAD than
> > > > in 7/8 since acpi.ko no longer exists.  You should use the scheme
> > > > from HEAD for handling ACPI present vs ACPI enabled/disabled.
> > > >
> > > > --
> > > > John Baldwin
> > >
> > >
> > > Ok, I see the new "acpipresent?" word (which replaces the "arch-i386"
> > > environment-test). Does this imply that we're going to support ACPI on
> > > non-i386 platforms (or already do)?
> >
> > amd64 and ia64 have always supported ACPI.  ia64 effectively requires it.
> > However, "hint.acpi.0.rsdp" is set by biosacpi.c in the i386 loader
> > bits, so other platforms will not set it, so the arch-i386 test is no
> > longer needed.
> 
> If "hint.acpi.0.rsdp" is only set in the i386 pieces, wouldn't that imply
> that the "acpipresent?" would return FALSE on IA64?

Yes.  Right now the ACPI menu item is not displayed on ia64 and it never has
been.  You can't actually boot IA64 with ACPI disabled, so there's no reason
for it to be in the menu.

> > > I also see the rewritten "acpienabled?" word. Nice. I'll slurp it in
> > > to make my ACPI detection the same as HEAD.
> > >
> > > I also performed some backward compatibility tests. Looks like this
> > > will be backward compatible with 8.1-RELEASE (loader_version == 11).
> > > However, the code in HEAD appears to not work in 8.0-RELEASE
> > > (loader_version == 8).
> >
> > Hmm, which part does not work in 8.0?  arch-i386 has existed since at
> > least 4.x I thought, and the ACPI bits have been setting
> > hint.acpi.0.rsdp since 7.0 (sys/boot/i386/libi386/biosacpi.c CVS rev 1.12
> > added it).
> >
> > --
> > John Baldwin
> 
> I've got this 8.0-STABLE box. I don't know exactly when it was installed, but
> /boot/loader has a timestamp from June 2010 and when I execute:
> 
>   s" loader_version" environment? drop .
> 
> I get "8", whereas when I boot the same exact hardware with 8.1-RELEASE, I get
> "11".
> 
> When I boot 8.0-STABLE (loader_version 8), I do not have "hint.acpi.0.rsdp"
> whereas if I boot 8.1-RELEASE (loader_version 11), I do get
> "hint.acpi.0.rsdp". (NOTE: this is on the exact same hardware, without
> changing any BIOS settings between boots).
> 
> Is it possible that my 8.0-STABLE has a loader that is older than
> 7.0-RELEASE? I'm trying to figure out why this 8.0-STABLE box is not setting
> hint.acpi.0.rsdp.

What is the output of 'kenv | grep acpi' from your old loader?

Hmm, sys/boot/i386/loader/version was bumped to 1.1 in 5.0 release.  It was
bumped to 1.0 in 5.0-CURRENT.  It was last 0.8 (so "8") in 4.x.
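
The mapping between the revision in the version file and the number seen
from Forth ("0.8" reads back as 8, "1.1" as 11) can be sketched in C as
follows.  loader_version_from_rev() is a hypothetical helper, and it assumes
a single-digit major and a single-digit minor; it only illustrates the
arithmetic and is not a copy of the loader's code.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Convert a "major.minor" revision string such as "1.1" to the integer
 * major * 10 + minor.  Returns -1 if the string is not of that exact form.
 */
int
loader_version_from_rev(const char *rev)
{
	if (rev == NULL || !isdigit((unsigned char)rev[0]) || rev[1] != '.' ||
	    !isdigit((unsigned char)rev[2]) || rev[3] != '\0')
		return (-1);
	return ((rev[0] - '0') * 10 + (rev[2] - '0'));
}
```

By that arithmetic a loader reporting 8 was built from a tree whose version
file still said 0.8.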

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 4:17:23 pm Devin Teske wrote:
> > -Original Message-
> > From: John Baldwin [mailto:j...@freebsd.org]
> > Sent: Tuesday, May 03, 2011 12:20 PM
> > To: Devin Teske
> > Cc: freebsd-hackers@freebsd.org
> > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > 
> > On Tuesday, May 03, 2011 2:57:34 pm Devin Teske wrote:
> > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > Sent: Tuesday, May 03, 2011 10:33 AM
> > > > To: Devin Teske
> > > > Cc: freebsd-hackers@freebsd.org; Olivier SMEDTS
> > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > >
> > > > On Tuesday, May 03, 2011 12:31:14 pm Devin Teske wrote:
> > > > >
> > > > > On May 3, 2011, at 4:45 AM, John Baldwin wrote:
> > > > >
> > > > > > On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> > > > > > > This version (1.1) works nearly identically to the standard
> > > > > > > menu that ships with FreeBSD in that it detects whether ACPI
> > > > > > > is enabled (truth be told, I actually re-used the "acpienabled?"
> > > > > > > function verbatim from /boot/beastie.4th by Scott Long and
> > > > > > > Aleksander Fafula). The ACPI detection of my boot loader
> > > > > > > (version 1.1 or higher) should be identical to the detection
> > > > > > > of the current boot-loader.
> > > > >
> > > > > Ugh. By "current", I meant 8.1-RELEASE (wasn't expecting this
> > > > > stuff to be different in HEAD, which it is).
> > > > >
> > > > >
> > > > > > Err, note that the acpienabled stuff is all different in HEAD
> > > > > > than in 7/8 since acpi.ko no longer exists.  You should use the
> > > > > > scheme from HEAD for handling ACPI present vs ACPI enabled/disabled.
> > > > > >
> > > > > > --
> > > > > > John Baldwin
> > > > >
> > > > >
> > > > > Ok, I see the new "acpipresent?" word (which replaces the "arch-i386"
> > > > > environment-test). Does this imply that we're going to support
> > > > > ACPI on non-i386 platforms (or already do)?
> > > >
> > > > amd64 and ia64 have always supported ACPI.  ia64 effectively requires
> > > > it.  However, "hint.acpi.0.rsdp" is set by biosacpi.c in the i386
> > > > loader bits, so other platforms will not set it, so the arch-i386 test
> > > > is no longer needed.
> > >
> > > If "hint.acpi.0.rsdp" is only set in the i386 pieces, wouldn't that
> > > imply that the "acpipresent?" would return FALSE on IA64?
> > 
> > Yes.  Right now the ACPI menu item is not displayed on ia64 and it never
> > has been.  You can't actually boot IA64 with ACPI disabled, so there's no
> > reason for it to be in the menu.
> 
> This raises a concern for my menu. Unlike the current menu, which blanks-out
> menuitem #2 for IA64, I've chosen instead to insert an inoperative menuitem
> with the text "ACPI Support: N/A".

Hmm, I think you should just leave the menu item blank or not listed.  It
doesn't make sense to see a knob about ACPI support on a ppc box for example,
and other platforms may grow platform-specific knobs in the future as well.

The current menu item is only blank as a hack to avoid renumbering the items.
If you are already changing that around, then I'd just leave it out altogether
unless ACPI is detected by the loader.
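
The decision described here can be modeled in a few lines of C.  This is
only a sketch of the logic, not the Forth in the loader: acpi_menu_item() and
the enum are hypothetical names, the two arguments stand for the values of
"hint.acpi.0.rsdp" and "hint.acpi.0.disabled" in the loader environment (NULL
when unset), and treating any "disabled" value other than "0" as disabled is
an assumption.

```c
#include <stddef.h>
#include <string.h>

enum acpi_item {
	ACPI_ITEM_OMIT,		/* ACPI not detected: no menu item at all */
	ACPI_ITEM_ENABLED,	/* show the item, ACPI currently enabled */
	ACPI_ITEM_DISABLED	/* show the item, ACPI currently disabled */
};

/* Decide what the menu should do about ACPI given two environment values. */
enum acpi_item
acpi_menu_item(const char *rsdp, const char *disabled)
{
	if (rsdp == NULL)
		return (ACPI_ITEM_OMIT);
	if (disabled != NULL && strcmp(disabled, "0") != 0)
		return (ACPI_ITEM_DISABLED);
	return (ACPI_ITEM_ENABLED);
}
```

On a platform whose loader never sets the rsdp hint, the first test fails and
the item is simply left out, so no "N/A" entry is needed.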

> So what do you think I should do?
> 
> a. Rewrite both "acpipresent?" and "acpienabled?" to be backward compatible
> with 6.x/older or
> b. embrace the future and simply warn about backward compatibility (or lack
> thereof) with respect to ACPI support.
> 
> NOTE: Route (a) may not be possible unless the loader_version was bumped at
> the same time that hint.acpi.0.rsdp was added.

(a) is not possible for the reason you mention.  I wouldn't worry about
supporting 6.x at this point, esp. if it is going to be a pain.

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 4:47:26 pm Devin Teske wrote:
> > -Original Message-
> > From: John Baldwin [mailto:j...@freebsd.org]
> > Sent: Tuesday, May 03, 2011 1:36 PM
> > To: Devin Teske
> > Cc: freebsd-hackers@freebsd.org
> > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > 
> > On Tuesday, May 03, 2011 4:17:23 pm Devin Teske wrote:
> > > > -Original Message-
> > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > Sent: Tuesday, May 03, 2011 12:20 PM
> > > > To: Devin Teske
> > > > Cc: freebsd-hackers@freebsd.org
> > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > >
> > > > On Tuesday, May 03, 2011 2:57:34 pm Devin Teske wrote:
> > > > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > > > Sent: Tuesday, May 03, 2011 10:33 AM
> > > > > > To: Devin Teske
> > > > > > Cc: freebsd-hackers@freebsd.org; Olivier SMEDTS
> > > > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > > > >
> > > > > > On Tuesday, May 03, 2011 12:31:14 pm Devin Teske wrote:
> > > > > > >
> > > > > > > On May 3, 2011, at 4:45 AM, John Baldwin wrote:
> > > > > > >
> > > > > > > > On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> > > > > > > > > This version (1.1) works nearly identically to the
> > > > > > > > > standard menu that ships with FreeBSD in that it detects
> > > > > > > > > whether ACPI is enabled (truth be told, I actually re-used
> > > > > > > > > the "acpienabled?" function verbatim from
> > > > > > > > > /boot/beastie.4th by Scott Long and Aleksander Fafula).
> > > > > > > > > The ACPI detection of my boot loader (version 1.1 or
> > > > > > > > > higher) should be identical to the detection of the
> > > > > > > > > current boot-loader.
> > > > > > >
> > > > > > > Ugh. By "current", I meant 8.1-RELEASE (wasn't expecting this
> > > > > > > stuff to be different in HEAD, which it is).
> > > > > > >
> > > > > > >
> > > > > > > > Err, note that the acpienabled stuff is all different in
> > > > > > > > HEAD than in 7/8 since acpi.ko no longer exists.  You should
> > > > > > > > use the scheme from HEAD for handling ACPI present vs ACPI
> > > > > > > > enabled/disabled.
> > > > > > > >
> > > > > > > > --
> > > > > > > > John Baldwin
> > > > > > >
> > > > > > >
> > > > > > > Ok, I see the new "acpipresent?" word (which replaces the
> > > > > > > "arch-i386" environment-test). Does this imply that we're
> > > > > > > going to support ACPI on non-i386 platforms (or already do)?
> > > > > >
> > > > > > amd64 and ia64 have always supported ACPI.  ia64 effectively
> > > > > > requires it.  However, "hint.acpi.0.rsdp" is set by biosacpi.c in
> > > > > > the i386 loader bits, so other platforms will not set it, so the
> > > > > > arch-i386 test is no longer needed.
> > > > >
> > > > > If "hint.acpi.0.rsdp" is only set in the i386 pieces, wouldn't
> > > > > that imply that the "acpipresent?" would return FALSE on IA64?
> > > >
> > > > Yes.  Right now the ACPI menu item is not displayed on ia64 and it
> > > > never has been.  You can't actually boot IA64 with ACPI disabled, so
> > > > there's no reason for it to be in the menu.
> > >
> > > This raises a concern for my menu. Unlike the current menu, which
> > > blanks-out menuitem #2 for IA64, I've chosen instead to insert an
> > > inoperative menuitem with the text "ACPI Support: N/A".
> > 
> > Hmm, I think you should just leave the menu item blank or not listed.  It
> > doesn't make sense to see a knob about ACPI support on a ppc box for
> > example, and other platforms may grow platform-specific knobs in the
> > future as well.
> > 
> > The current menu item is only blank as a hack to avoid renumbering the
> > items.  If you are already changing that around, then I'd just leave it
> > out altogether unless ACPI is detected by the loader.
> > 
> 
> I too avoid renumbering of the items.
> 
> Having never actually booted a PPC or IA64 FreeBSD installation... is it the
> case that the numbers displayed jump from 1 to 3 (no blank line in-between 1
> and 3, correct)?

Actually, I think PPC/IA64, etc. do not display the ACPI menu item at all and
they are numbered differently from i386 and amd64.  The ACPI menu item is only
blank if ACPI is not present on i386 and amd64.

-- 
John Baldwin

Re: [UPDATE] New Boot-Loader Menu -- version 1.1

2011-05-03 Thread John Baldwin

On Tuesday, May 03, 2011 5:22:20 pm Devin Teske wrote:
> > -Original Message-
> > From: John Baldwin [mailto:j...@freebsd.org]
> > Sent: Tuesday, May 03, 2011 2:01 PM
> > To: Devin Teske
> > Cc: freebsd-hackers@freebsd.org
> > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > 
> > On Tuesday, May 03, 2011 4:47:26 pm Devin Teske wrote:
> > > > -Original Message-
> > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > Sent: Tuesday, May 03, 2011 1:36 PM
> > > > To: Devin Teske
> > > > Cc: freebsd-hackers@freebsd.org
> > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > >
> > > > On Tuesday, May 03, 2011 4:17:23 pm Devin Teske wrote:
> > > > > > -Original Message-
> > > > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > > > Sent: Tuesday, May 03, 2011 12:20 PM
> > > > > > To: Devin Teske
> > > > > > Cc: freebsd-hackers@freebsd.org
> > > > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > > > >
> > > > > > On Tuesday, May 03, 2011 2:57:34 pm Devin Teske wrote:
> > > > > > > > From: John Baldwin [mailto:j...@freebsd.org]
> > > > > > > > Sent: Tuesday, May 03, 2011 10:33 AM
> > > > > > > > To: Devin Teske
> > > > > > > > Cc: freebsd-hackers@freebsd.org; Olivier SMEDTS
> > > > > > > > Subject: Re: [UPDATE] New Boot-Loader Menu -- version 1.1
> > > > > > > >
> > > > > > > > On Tuesday, May 03, 2011 12:31:14 pm Devin Teske wrote:
> > > > > > > > >
> > > > > > > > > On May 3, 2011, at 4:45 AM, John Baldwin wrote:
> > > > > > > > >
> > > > > > > > > > On Monday, May 02, 2011 8:48:31 pm Devin Teske wrote:
> > > > > > > > > > > This version (1.1) works nearly identically to the
> > > > > > > > > > > standard menu that ships with FreeBSD in that it
> > > > > > > > > > > detects whether ACPI is enabled (truth be told, I
> > > > > > > > > > > actually re-used the
> > > > "acpienabled?"
> > > > > > > > > > > function verbatim from /boot/beastie.4th by Scott Long
> > > > > > > > > > > and Aleksander Fafula). The ACPI detection of my boot
> > > > > > > > > > > loader (version
> > > > > > > > > > > 1.1 or higher) should be identical to the detection of
> > > > > > > > > > > the current boot-loader.
> > > > > > > > >
> > > > > > > > > Ugh. By "current", I meant 8.1-RELEASE (wasn't expecting
> > > > > > > > > this stuff to be different in HEAD, which it is).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Err, note that the acpienabled stuff is all different in
> > > > > > > > > > HEAD than in 7/8 since acpi.ko no longer exists.  You
> > > > > > > > > > should use the scheme from HEAD for handling ACPI
> > > > > > > > > > present vs ACPI
> > > > enabled/disabled.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > John Baldwin
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok, I see the new "acpipresent?" word (which replaces the
> > > "arch-i386"
> > > > > > > > > environment-test). Does this imply that we're going to
> > > > > > > > > support ACPI on
> > > > > > > > > non-i386 platforms (or already do)?
> > > > > > > >
> > > > > > > > amd64 and ia64 have always supported ACPI.  ia64 effectively
> > > > > > > > requires
> > > it.
> > > > > > > > However, "hint.acpi.0.rsdp" is set by biosacpi.c in the i386
> > > > > > > > loader bits, so other platforms will not set it, so the
> > > > > > > > arch-i386 test is no longer
> >

Re: thread_lock vs panic/trap

2011-05-06 Thread John Baldwin

On Friday, May 06, 2011 5:11:57 am Andriy Gapon wrote:
> 
> Can a current thread panic or receive a trap while some other thread holds its
> thread_lock (the same lock as pointed to by the td_lock)?

I'm sure it's theoretically possible.  If the thread is running just about
anywhere and another thread is changing its cpuset for example, then you could
run into this.

> And a related question, can there be a reason for a thread in panic or kdb
> context to try to get the thread_lock?

I think it isn't safe to try to grab one's own thread lock in panic or kdb
for this reason.

-- 
John Baldwin

Re: Fwd: [PATCH v2 3/4] x86, head_32/64.S: Enable SMEP

2011-05-18 Thread John Baldwin

On Wednesday, May 18, 2011 8:31:15 am Oliver Pinter wrote:
> On 5/18/11, Kostik Belousov  wrote:
> > On Wed, May 18, 2011 at 02:03:07AM +0200, Oliver Pinter wrote:
> >> -- Forwarded message --
> >> From: Fenghua Yu 
> >> Date: Mon, 16 May 2011 14:34:44 -0700
> >> Subject: [PATCH v2 3/4] x86, head_32/64.S: Enable SMEP
> >> To: Ingo Molnar, Thomas Gleixner, H Peter Anvin, Asit K Mallick,
> >> Linus Torvalds, Avi Kivity, Arjan van de Ven, Andrew Morton, Andi Kleen
> >> Cc: linux-kernel, Fenghua Yu
> >>
> >> From: Fenghua Yu 
> >>
> >> Enable newly documented SMEP (Supervisor Mode Execution Protection) CPU
> >> feature in kernel.
> >>
> >> SMEP prevents the CPU in kernel-mode from jumping to an executable page
> >> that does not have the kernel/system flag set in the pte. This prevents
> >> the kernel from executing user-space code accidentally or maliciously, so
> >> it for example prevents kernel exploits from jumping to specially
> >> prepared user-mode shell code. The violation will cause page fault #PF
> >> and will have error code identical to XD violation.
> >>
> >> CR4.SMEP (bit 20) is 0 at power-on. If the feature is supported by CPU
> >> (X86_FEATURE_SMEP), enable SMEP by setting CR4.SMEP. New kernel
> >> option nosmep disables the feature even if the feature is supported by
> >> CPU.
> >>
> >> Signed-off-by: Fenghua Yu 
> >
> > So, where is the mentioned documentation for SMEP ? Rev. 38 of the
> > Intel(R) 64 and IA-32 Architectures Software Developer's Manual does
> > not contain the description, at least at the places where I looked and
> > expected to find it.
> 
> http://www.intel.com/Assets/PDF/manual/325384.pdf
> 
> Intel® 64 and IA-32 Architectures Software Developer’s Manual
>Volume 3 (3A & 3B):
>  System Programming Guide

Which revision?  It is not documented in revision 38 from April 2011.

I just downloaded that link, and it is still revision 38 and has no mention 
of 'SMEP'.  Also, bit 20 of CR4 is still marked as Reserved in that manual 
(section 2.5).

-- 
John Baldwin

Re: device_detach() on a device used by ixgbe driver (FreeBSD 7-STABLE through to 9-CURRENT)

2011-05-23 Thread John Baldwin

On Monday, May 23, 2011 10:13:41 am Philip Soeberg wrote:
> Hi fellow FreeBSD hackers,
> 
> I've just completed designing a new driver for Intel's IXGBE suite of 
> network adapters, but am building my driver as a kernel module to be 
> loaded after system boot.
> 
> The current sys/dev/ixgbe/ixgbe.c driver, which attaches to Intel's adapters, 
> returns a zero in its probe() function (which is equal to 
> BUS_PROBE_SPECIFIC). This has the distinct disadvantage that I cannot, 
> through my module, call a device_detach() on the devices I support, and 
> afterward expect to be probed for them. A BUS_PROBE_SPECIFIC, according 
> to the wording in sys/sys/bus.h, informs the OS that "Only I can use this 
> device".
> 
> I assume this (spanning FreeBSD 7.0-STABLE through to FreeBSD 
> 9-CURRENT) is in error? I would expect sys/dev/ixgbe/ixgbe.c's probe() 
> function to return BUS_PROBE_DEFAULT, which is the "Base OS default 
> driver"..

Yes, that is true.

> If this is true, then we should probably also update 
> sys/kern/device_if.m's description of the probe() method as to reflect 
> the BUS_PROBE_* return values in a clearer way than is currently described.
> Do you want me to provide a patch? (it's really a one liner for ixgbe.c 
> and a couple of alterations to the device_if.m, if need be)

device_if.m was probably just never updated when BUS_PROBE_* were added.
Updating it would be a good thing.

> I would also expect the ixgbe.c driver to do a quick resource_disabled() 
> in its attach() function, so that we can disable specific adapters 
> through kenv hint.ix.0.disabled=1..

That is not universally supported (i.e. it's not a part of new-bus 
specifically).  For buses that support hinted devices, they do all generally 
support being able to disable a hinted device, but disabling bus-enumerated 
devices is not generally supported.

> Given that I can't use device_detach() on a device hogged by the IXGBE 
> driver, can any one of you help me with a way around this problem? I 
> can't use the hints, and I can't detach() the device.. how can I get my 
> kernel module to attach the device?

I think ixgbe has to be fixed to use BUS_PROBE_DEFAULT.  Very few drivers 
should use '0' for their probe return value.
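
To illustrate why the return value matters, here is a minimal userland
sketch of the bidding among drivers.  It is a simplified model, not the
new-bus code: the three constants are the values I believe sys/sys/bus.h
defines, while struct model_driver and model_pick_driver() are hypothetical
names made up for this example.

```c
#include <stddef.h>

/* Probe return values, assumed to match sys/sys/bus.h (only three shown). */
#define BUS_PROBE_SPECIFIC	0	/* "Only I can use this device" */
#define BUS_PROBE_VENDOR	(-10)	/* vendor-supplied driver */
#define BUS_PROBE_DEFAULT	(-20)	/* base OS default driver */

/* Hypothetical stand-in for a driver: a name and a fixed probe result. */
struct model_driver {
	const char *name;
	int probe_result;	/* > 0 models an errno, i.e. no match */
};

/*
 * Simplified model of the bidding: positive results are failures, a result
 * of 0 ends the search immediately, otherwise the value closest to 0 wins.
 * Returns the winning driver, or NULL if no driver matched.
 */
const struct model_driver *
model_pick_driver(const struct model_driver *drv, size_t n)
{
	const struct model_driver *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (drv[i].probe_result > 0)
			continue;		/* probe failed */
		if (drv[i].probe_result == 0)
			return (&drv[i]);	/* exclusive claim, stop here */
		if (best == NULL || drv[i].probe_result > best->probe_result)
			best = &drv[i];
	}
	return (best);
}
```

In this model a module returning BUS_PROBE_VENDOR outbids a driver that
returns BUS_PROBE_DEFAULT, but it can never outbid a driver that is probed
earlier and returns 0.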

-- 
John Baldwin

Re: device_detach() on a device used by ixgbe driver (FreeBSD 7-STABLE through to 9-CURRENT)

2011-05-24 Thread John Baldwin

On Monday, May 23, 2011 1:22:50 pm Philip Soeberg wrote:
> On 23-05-2011 16:32, John Baldwin wrote:
> >> I assume this (transcanding from FreeBSD 7.0-STABLE through to FreeBSD
> >> 9-CURRENT) is in error? I would expect sys/dev/ixgbe/ixgbe.c's probe()
> >> function to return BUS_PROBE_DEFAULT, which is the "Base OS default
> >> driver"..
> >
> > Yes, that is true.
> >
> >> If this is true, then we should probably also update
> >> sys/kern/device_if.m's description of the probe() method as to reflect
> >> the BUS_PROBE_* return values in a clearer way than is currently described.
> >> Do you want me to provide a patch? (it's really a one liner for ixgbe.c
> >> and a couple of alterations to the device_if.m, if need be)
> >
> > device_if.m was probably just never updated from when BUS_PROBE_* were 
> > added.
> > Updating it would be a good thing.
> I'll submit a patch tomorrow with an updated description and a fix for 
> the ixgbe then..
> >
> >> I would also expect the ixgbe.c driver to do a quick resource_disabled()
> >> in it's attach() function, so that we can disable specific adapters
> >> through kenv hint.ix.0.disabled=1..
>  >
> > I think ixgbe has to be fixed to use BUS_PROBE_DEFAULT.  Very few drivers
> > should use '0' for their probe return value.
> >
> 
> but since it does return zero, do you have any idea how I can force it 
> to detach allowing me in instead? I've been stabbing high and low at it 
> for hours now, and nothing seem to get me anywhere.. short of hacking 
> the ixgbe_attach() function address, I can't seem to figure out a way to 
> kill the systems way of re-attaching the device to the ixgbe just after 
> I've detached it.
> 
> rather frustrating.. It's like a catch-22 problem..
> 
> and worse, the ixgbe driver is per default included as a static module, 
> so loader.conf "ix_load=no" will have no effect (unless I'm mistaken?)
> 
> I'm running out of ideas as to how I can attach myself to that Intel 
> device instead of the ixgbe when it is linked static and with a return 
> of zero in it's probe().. And I'm also out of ideas as to how to disable 
> that damn module altogether, short of recompiling the kernel..
> 
> any ideas?

Short of dynamically patching ixgbe_probe()'s return value at runtime?
No, no ideas. :(

-- 
John Baldwin

Re: device_detach() on a device used by ixgbe driver (FreeBSD 7-STABLE through to 9-CURRENT)

2011-05-24 Thread John Baldwin

On Monday, May 23, 2011 3:08:05 pm Andrew Boyer wrote:
> 
> On May 23, 2011, at 10:32 AM, John Baldwin wrote:
> 
> > On Monday, May 23, 2011 10:13:41 am Philip Soeberg wrote:
> >> I would also expect the ixgbe.c driver to do a quick resource_disabled() 
> >> in it's attach() function, so that we can disable specific adapters 
> >> through kenv hint.ix.0.disabled=1..
> > 
> > That is not universally supported (i.e. it's not a part of new-bus 
> > specifically).  For buses that support hinted devices, they do all generally
> > support being able to disable a hinted device, but disabling bus-enumerated
> > devices is not generally supported.
> > 
> 
> FYI, I submitted a patch to Jack to add this in all of the e1000/ixgbe
> drivers.  Setting a disabled="1" hint causes the attach to fail with ENXIO.  I
> don't know if it's 'correct' or not but it serves a purpose in our testing and
> I thought it would be useful for others.

One patch I have had for a while is a way to disable specific PCI devices, but 
that's not quite the same thing as it disables all drivers for a given device.

(It adds support for a 'hw.pcidisabled=1'
tunable).

-- 
John Baldwin

Re: Problem with running simple pthreads program under gdb-7.2 (Invalid selected thread)

2011-05-26 Thread John Baldwin

On Wednesday, May 25, 2011 8:35:28 pm Raphael Kubo da Costa wrote:
> Dmitry Krivenok  writes:
> 
> > As you can see program exited normally w/o any errors.
> > Then I run the same program under gdb-7.2
> >
> > $ /usr/local/bin/gdb72 --args t
> > GNU gdb (GDB) 7.2 [GDB v7.2 for FreeBSD]
> > Copyright (C) 2010 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-portbld-freebsd8.2".
> > For bug reporting instructions, please see:
> > <http://www.gnu.org/software/gdb/bugs/>...
> > Reading symbols from /big/work/coverage/csxroot/src/t/t...done.
> > (gdb) r
> > Starting program: /big/work/coverage/csxroot/src/t/t
> > [New LWP 100162]
> > [New Thread 800a041c0 (LWP 100162)]
> > [New Thread 800a0ae40 (LWP 100171)]
> > [Thread 800a0ae40 (LWP 100171) exited]
> > Invalid selected thread.
> > (gdb) q
> > A debugging session is active.
> >
> >Inferior 1 [process 7756] will be killed.
> >
> > Quit anyway? (y or n) y
> > $
> >
> > In this case I got "Invalid selected thread." right after the thread has
> > exited.
> > Looks like gdb is unable to switch to another thread.
> 
> In my case, I get the following error when running your program (and
> many others) with the gdb72 package (installed via portmaster -PP
> devel/gdb):
> 
> (gdb) r
> Starting program: /tmp/test-base
> [New LWP 100315]
> Cannot get thread info, Thread ID=100315, generic error
> (gdb) q
> A debugging session is active.
> 
> Inferior 1 [process 84832] will be killed.
> 
> Quit anyway? (y or n) y
> 
> If I compile the port myself, I can't run any binary (PR ports/152896,
> which has been unanswered despite my efforts):
> 
>   Reading symbols from /usr/local/bin/gdb72...I'm sorry, Dave, I can't
>   do that.  Symbol format `elf64-x86-64-freebsd' unknown.

You need to uninstall libreadline, or turn off the hack to try to use
libreadline from ports.  There is no easy way to fix the gdb build to
use /usr/local only for the bits that need readline and not have it use
the wrong binutils headers from /usr/local as well.

-- 
John Baldwin

Re: Problem with running simple pthreads program under gdb-7.2 (Invalid selected thread)

2011-05-26 Thread John Baldwin

On Thursday, May 26, 2011 3:37:13 am Andriy Gapon wrote:
> on 26/05/2011 03:35 Raphael Kubo da Costa said the following:
> > If I compile the port myself, I can't run any binary (PR ports/152896,
> > which has been unanswered despite my efforts):
> > 
> >   Reading symbols from /usr/local/bin/gdb72...I'm sorry, Dave, I can't
> >   do that.  Symbol format `elf64-x86-64-freebsd' unknown.
> > 
> 
> This is a somewhat known issue that John was going to fix a while ago.

Actually, it's not really fixable if you have libreadline installed from ports 
and binutils installed from ports.  I'd like to just remove the hack to use
libreadline from ports if possible.

-- 
John Baldwin

Re: Problem with running simple pthreads program under gdb-7.2 (Invalid selected thread)

2011-05-27 Thread John Baldwin

On Thursday, May 26, 2011 8:03:19 pm Raphael Kubo da Costa wrote:
> Andriy Gapon  writes:
> 
> > on 26/05/2011 16:33 John Baldwin said the following:
> >> On Thursday, May 26, 2011 3:37:13 am Andriy Gapon wrote:
> >>> on 26/05/2011 03:35 Raphael Kubo da Costa said the following:
> >>>> If I compile the port myself, I can't run any binary (PR ports/152896,
> >>>> which has been unanswered despite my efforts):
> >>>>
> >>>>   Reading symbols from /usr/local/bin/gdb72...I'm sorry, Dave, I can't
> >>>>   do that.  Symbol format `elf64-x86-64-freebsd' unknown.
> >>>>
> >>>
> >>> This is a somewhat known issue that John was going to fix a while ago.
> >> 
> >> Actually, it's not really fixable if you have libreadline installed from
> >> ports and binutils installed from ports.  I'd like to just remove the hack
> >> to use libreadline from ports if possible.
> >
> > I referred to this option as a fix.
> 
> Thanks, removing readline from ports worked like a charm after I
> recompiled devel/gdb. Installing it from the package still doesn't work,
> though.
> 
> Shouldn't the hack in the port be reversed, ie. if readline from ports
> is installed the port is marked as BROKEN?

Well, if you have readline from ports but don't have binutils in ports it 
actually works ok.

No idea why the package built in the package cluster is busted though.

-- 
John Baldwin

Re: ndis driver presents the valid WiFi network as having the name 0x000000

2011-05-31 Thread John Baldwin

On Friday, May 27, 2011 5:14:09 pm Yuri wrote:
> Underlying card is Broadcom BCM94312MCGSG (mini-card for laptop) with 
> Windows driver.
> This same card and driver work fine with pretty much any other network I 
> tried.
> But this one particular network shows as 0x00 and I can't connect to it.
> Another FreeBSD desktop with native ath driver and apple both connect to 
> it fine.
> 
> What might be causing such weird behavior?
> Is this a known problem?
> Any way to troubleshoot this?

I have this same problem.  I've had to resort to using wpa_cli to 'select' my 
network at work that has this issue and then using 'ap_scan 2' to force 
wpa_supplicant to associate with it.  You also will want ndis_events running 
if you need to do WPA authentication.

-- 
John Baldwin

Re: Active slice, only for a next boot

2011-05-31 Thread John Baldwin

On Monday, May 30, 2011 1:42:39 pm Dieter BSD wrote:
> And it works great.  Except that one of the 27 stages of boot
> code that FreeBSD uses INSISTS on booting the active slice,
> so you can tell the MBR to boot slice 3 and slice 3's boot
> code sees that slice 4 is active and boots slice 4.

There are only 3 stages, and boot1.S is what looks at the active slice.  
Unfortunately it doesn't have a better way to do this as the only input it 
gets from boot0 or any other MBR boot loader is the BIOS drive number in %dl.
I'm not sure how else you would detect that a non-active slice was booted from 
when that is your only input.

One could define some extended structure to pass that information and send it
in a register, but you'd still have to cope with MBR boot loaders that don't 
pass it (e.g. the Windows ones if you are dual-booting with Windows) and you'd 
need to have some sanity checks to make sure one doesn't treat garbage input 
as valid.
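
For reference, a rough userland sketch of the kind of decision being
described, assuming the standard MBR layout (four 16-byte entries starting
at offset 0x1be, boot flag in byte 0, type in byte 4, FreeBSD type 0xa5).
It models "prefer an active FreeBSD slice, else the first FreeBSD slice";
it is not a transcription of boot1.S, and model_pick_slice() and the MBR_*
names are invented for this example.

```c
#include <stdint.h>

#define MBR_PART_OFFSET		0x1be	/* start of the four slice entries */
#define MBR_PART_SIZE		16	/* bytes per slice entry */
#define MBR_NPART		4
#define MBR_FLAG_ACTIVE		0x80	/* boot indicator, byte 0 of an entry */
#define MBR_TYPE_OFF		4	/* partition type, byte 4 of an entry */
#define MBR_TYPE_FREEBSD	0xa5

/*
 * Return the 1-based slice number to boot from, or 0 if the sector holds no
 * FreeBSD slice.  An active FreeBSD slice is preferred; otherwise the first
 * FreeBSD slice is used.  Note that the only input is the sector itself:
 * nothing here can tell which slice the MBR loader actually chose.
 */
int
model_pick_slice(const uint8_t mbr[512])
{
	int first_fbsd = 0;
	int i;

	for (i = 0; i < MBR_NPART; i++) {
		const uint8_t *e = mbr + MBR_PART_OFFSET + i * MBR_PART_SIZE;

		if (e[MBR_TYPE_OFF] != MBR_TYPE_FREEBSD)
			continue;
		if (e[0] & MBR_FLAG_ACTIVE)
			return (i + 1);
		if (first_fbsd == 0)
			first_fbsd = i + 1;
	}
	return (first_fbsd);
}
```

With FreeBSD slices 3 and 4 present and slice 4 marked active, this returns
4 regardless of which slice the MBR loader jumped to, which is the behavior
complained about above.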

-- 
John Baldwin

Re: ndis driver presents the valid WiFi network as having the name 0x000000

2011-05-31 Thread John Baldwin

On Tuesday, May 31, 2011 12:36:43 pm Bernhard Schmidt wrote:
> On Tuesday 31 May 2011 16:29:15 John Baldwin wrote:
> > On Friday, May 27, 2011 5:14:09 pm Yuri wrote:
> > > Underlying card is Broadcom BCM94312MCGSG (mini-card for laptop) with 
> > > Windows driver.
> > > This same card and driver work fine with pretty much any other network I 
> > > tried.
> > > But this one particular network shows as 0x00 and I can't connect to it.
> > > Another FreeBSD desktop with native ath driver and apple both connect to 
> > > it fine.
> > > 
> > > What might be causing such weird behavior?
> > > Is this a known problem?
> > > Any way to troubleshoot this?
> > 
> > > I have this same problem.  I've had to resort to using wpa_cli to
> > > 'select' my network at work that has this issue and then using
> > > 'ap_scan 2' to force wpa_supplicant to associate with it.  You also will
> > > want ndis_events running if you need to do WPA authentication.
> 
> Are you using -D bsd or -D ndis as the driver for wpa_supplicant?

Whatever the defaults are (which would be -D ndis).

-- 
John Baldwin

Re: Bug in ksched_setscheduler?

2011-06-02 Thread John Baldwin

On Wednesday, June 01, 2011 12:42:42 pm Dmitry Krivenok wrote:
> Hello Hackers,
> I think I found a bug in ksched_setscheduler() function.
> 
> 209 rtp.prio = p4prio_to_rtpprio(param->sched_priority);
>
> Shouldn't we use p4prio_to_tsprio instead of p4prio_to_rtpprio at the line 
> 209?
> This macro is defined but never used in kernel code:
> 
> $ grep -r 'p4prio_to_tsprio' /usr/src/sys/
> /usr/src/sys/kern/ksched.c:#define p4prio_to_tsprio(P) ((PRI_MAX_TIMESHARE - PRI_MIN_TIMESHARE) - (P))
> $
> 
> Is it a real bug or just my misunderstanding of something?

I think it is a real bug.  Can you come up with a test case to show it?
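
As a starting point, here is a small sketch of why the choice of macro
matters.  It only compares the arithmetic of the two mappings; it is not a
kernel test case.  The model_* names are hypothetical, the rtpprio mapping
is my recollection of the macro in ksched.c, and the constants are assumed
values (the timeshare bounds in particular differ between releases).

```c
/*
 * Illustrative constants only.  MODEL_RTP_PRIO_MAX is assumed to mirror
 * RTP_PRIO_MAX; the timeshare bounds are assumed and release-dependent.
 */
#define MODEL_RTP_PRIO_MAX		31
#define MODEL_PRI_MIN_TIMESHARE		160
#define MODEL_PRI_MAX_TIMESHARE		223

/* Mapping applied on the quoted line 209 (assumed definition). */
int
model_p4prio_to_rtpprio(int p)
{
	return (MODEL_RTP_PRIO_MAX - p);
}

/* Mapping from the macro that grep shows as defined but never used. */
int
model_p4prio_to_tsprio(int p)
{
	return ((MODEL_PRI_MAX_TIMESHARE - MODEL_PRI_MIN_TIMESHARE) - p);
}
```

Under these assumptions the same sched_priority maps to values that differ
by a constant offset, so a test that sets a timeshare priority and reads it
back should be able to tell the two apart.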

-- 
John Baldwin

Re: [RFC] Enabling invariant TSC timecounter on SMP

2011-06-03 Thread John Baldwin

On Friday, June 03, 2011 2:03:55 am Andriy Gapon wrote:
> > Consecutive RDTSCs used on a same CPU is always incremental but we 
> > cannot 100% guarantee that on two cores, even if TSC is derived from 
> > the same clock.  I am hoping at least latency difference (I believe 
> > it's about few tens of cycles max) is "eaten up" by lowering 
> > resolution.  It's not perfect but it's better than serialization 
> > (Linux) or heuristics (OpenSolaris), just because there are few rare 
> > conditions to consider.  Thoughts?
> 
> I am still not sure which case this code should solve.
> 
> Thread T1: x1 = rdtsc() on CPU1;
> Thread T1: x2 = rdtsc() on CPU2;
> x2 < x1 ?
> Or?

Yes, that can happen.

-- 
John Baldwin

Re: some strange constructs (bugs?) in if_tun.c

2011-06-03 Thread John Baldwin

On Thursday, June 02, 2011 12:24:21 pm Martin Birgmeier wrote:
> I am looking at net/if_tun.c, function tunwrite() (this is 7.4, but 8.2 
> is nearly the same):
> 
> There is a local variable "error" which is initialized to zero and then 
> seemingly never changed, until it is used as a return value if 
> m_uiotombuf() fails:
> 
> ...
>  int error = 0;
> ...
>  if ((m = m_uiotombuf(uio, M_DONTWAIT, 0, 0, M_PKTHDR)) == NULL) {
>  ifp->if_ierrors++;
>  return (error);
>  }
> ...
> a little further down, we see
> ...
>  if (m->m_len < sizeof(family) &&
>  (m = m_pullup(m, sizeof(family))) == NULL)
>  return (ENOBUFS);
> ...
> 
> As far as I can see, the first return amounts to "drop the packet, but 
> don't tell anything about it", whereas the second amounts to "drop the 
> packet and say it's due to ENOBUFS".
> 
> However, the first case is much more like ENOBUFS, so shouldn't we 
> simply say "return (ENOBUFS)" there and remove the "error" variable 
> altogether?

Yes, this error seems to have been introduced in 137101 when if_tun was 
switched to use m_uiotombuf() rather than a home-rolled version.  tap(4) had 
the same bug, but it was fixed in 163986.  I think this patch should be ok for 
tun(4):

Index: if_tun.c
===================================================================
--- if_tun.c	(revision 222565)
+++ if_tun.c	(working copy)
@@ -126,7 +126,7 @@ static void tunclone(void *arg, struct ucred *cred
 		    int namelen, struct cdev **dev);
 static void	tuncreate(const char *name, struct cdev *dev);
 static int	tunifioctl(struct ifnet *, u_long, caddr_t);
-static int	tuninit(struct ifnet *);
+static void	tuninit(struct ifnet *);
 static int	tunmodevent(module_t, int, void *);
 static int	tunoutput(struct ifnet *, struct mbuf *, struct sockaddr *,
 		    struct route *ro);
@@ -494,14 +494,13 @@ tunclose(struct cdev *dev, int foo, int bar, struc
 	return (0);
 }
 
-static int
+static void
 tuninit(struct ifnet *ifp)
 {
 	struct tun_softc *tp = ifp->if_softc;
 #ifdef INET
 	struct ifaddr *ifa;
 #endif
-	int error = 0;
 
 	TUNDEBUG(ifp, "tuninit\n");
 
@@ -528,7 +527,6 @@ tuninit(struct ifnet *ifp)
 	if_addr_runlock(ifp);
 #endif
 	mtx_unlock(&tp->tun_mtx);
-	return (error);
 }
 
 /*
@@ -552,12 +550,12 @@ tunifioctl(struct ifnet *ifp, u_long cmd, caddr_t
 		mtx_unlock(&tp->tun_mtx);
 		break;
 	case SIOCSIFADDR:
-		error = tuninit(ifp);
-		TUNDEBUG(ifp, "address set, error=%d\n", error);
+		tuninit(ifp);
+		TUNDEBUG(ifp, "address set\n");
 		break;
 	case SIOCSIFDSTADDR:
-		error = tuninit(ifp);
-		TUNDEBUG(ifp, "destination address set, error=%d\n", error);
+		tuninit(ifp);
+		TUNDEBUG(ifp, "destination address set\n");
 		break;
 	case SIOCSIFMTU:
 		ifp->if_mtu = ifr->ifr_mtu;
@@ -857,7 +855,6 @@ tunwrite(struct cdev *dev, struct uio *uio, int fl
 	struct tun_softc *tp = dev->si_drv1;
 	struct ifnet	*ifp = TUN2IFP(tp);
 	struct mbuf	*m;
-	int		error = 0;
 	uint32_t	family;
 	int		isr;
 
@@ -877,7 +874,7 @@ tunwrite(struct cdev *dev, struct uio *uio, int fl
 
 	if ((m = m_uiotombuf(uio, M_DONTWAIT, 0, 0, M_PKTHDR)) == NULL) {
 		ifp->if_ierrors++;
-		return (error);
+		return (ENOBUFS);
 	}
 
 	m->m_pkthdr.rcvif = ifp;


-- 
John Baldwin

Re: Why user time of the process depends on machine load?

2011-06-16 Thread John Baldwin

On Wednesday, June 15, 2011 5:55:17 pm Dan Nelson wrote:
> In the last episode (Jun 15), Yuri said:
> > When I test performance of the code, I always observe dependency of CPU
> > user time on the presence of other CPU intense processes.  Same CPU-only
> > deterministic process that on the quiet machine completes in 220 user
> > seconds in the presence of, for example, kde rebuild would complete in
> > 261, 266 or even 379 user seconds.  I am talking about times shown by
> > time(1), not actual an execution time.  It's the same time as getrusage(2)
> > returns in ru_utime field.
> > 
> > Why time that process takes in user seconds depends on what other 
> > processes are running?
> > 
> > FreeBSD-8.2 STABLE on i7 CPU @ 9200 @ 2.67GHz.
> 
> Some possible factors:
> 
> o Intel Turbo Boost, which raises the clock rate of a single core if the
>   other cores are idle.  A single process on an idle system will run faster.
> 
> o i7 chips have a shared L3 cache across all cores, so a single process on
>   an idle system will tend to have more of its data in cache compared to a
>   system with multiple processes, so it spends less time waiting for slower
>   physical memory lookups.
> 
> o Process accounting isn't exact.  I may be wrong, but I don't think
>   timestamps are taken every time a syscall is invoked and returns.  Some
>   time marked as "user" may actually be "system" time, in which case you may
>   be seeing the effect of contention in the kernel as more processes are
>   run.

This is very true.  You can only really trust the sum of system + user time
and compare that across runs.
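
A minimal sketch of computing that sum from a struct rusage, as filled in by
getrusage(2).  The helper names tv_to_seconds() and total_cpu_seconds() are
made up for this example.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to seconds as a double. */
static double
tv_to_seconds(struct timeval tv)
{
	return ((double)tv.tv_sec + (double)tv.tv_usec / 1e6);
}

/* Total CPU time (user + system) recorded in a struct rusage, in seconds. */
double
total_cpu_seconds(const struct rusage *ru)
{
	return (tv_to_seconds(ru->ru_utime) + tv_to_seconds(ru->ru_stime));
}
```

A benchmark would call getrusage(RUSAGE_SELF, &ru) at the end of the run (or
RUSAGE_CHILDREN in a wrapper) and compare total_cpu_seconds(&ru) across
runs, rather than ru_utime alone.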

-- 
John Baldwin

Re: Fwd: Shooting trouble on a PCI bus hang

2011-06-20 Thread John Baldwin

On Sunday, June 19, 2011 12:35:49 pm Ansar Mohammed wrote:
> I appreciate that. The system works fine with NetBSD, LInux and Windows XP,
> so I doubt its hardware.
> 
> Interesting though that OpenBSD has the same issue.
> 
> A question about the debug kernel load process: as it hangs on *
> pci_print_verbose* in pci.c, can I deduce that this is the exact code
> segment that is the issue?

Well, if that was the last device on the bus it might be in a device driver
probe routine.  You can try adding more printfs to device_probe(), etc. to 
output each driver name as it probes each device perhaps.

-- 
John Baldwin

Re: Shooting trouble on a PCI bus hang

2011-06-20 Thread John Baldwin

On Monday, June 20, 2011 4:03:26 pm Ansar Mohammed wrote:
> So with a plethora of printfs in pci.c I managed to track it down to a call
> to *pci_read_bar *against an AMD CS5536 PCI-ISA bridge. This somehow
> intermittently hangs on bootup.
> Any suggestions?

Hmm, do you know what kind of BAR it is, or perhaps the raw value of the BAR
that we first read that was set up by the firmware?

-- 
John Baldwin

Re: IPI and I/O interrupts

2011-06-22 Thread John Baldwin

On Wednesday, June 22, 2011 3:59:06 am Sushanth Rai wrote:
> Hi,
> 
> I would like to understand little bit about the FreeBSD interrupt handling
> on x86.
> 
> When a cpu is processing an IPI, let's say cpu is running IPI_STOP handler,
> are I/O interrupts like the timer interrupt disabled ? Conversely if the cpu
> is holding a spinlock, which means it has disabled interrupts, can it process
> an IPI. My understanding is executing "cli" instruction disables the maskable
> interrupts. I was wondering if IPIs are part of that.

Yes, IPIs generally are blocked.  We do use an NMI IPI when entering the
debugger (and possibly during panics), but general IPIs like TLB shootdowns, 
etc. are all maskable interrupts.  Also, all of the IPI handlers (and the 
lapic timer interrupt) operate like normal device interrupt handlers using 
interrupt gates (which block interrupts equivalent to cli).

-- 
John Baldwin

Re: Unit Tests for FreeBSD Kernel APIs?

2011-06-24 Thread John Baldwin

On Friday, June 24, 2011 3:23:11 am Sebastian Huber wrote:
> Hello,
> 
> exists there some unit tests for FreeBSD kernel APIs, e.g. mutex(9),
> condvar(9), etc.?
> 
> Have a nice day!

Hmm, I have a kernel module that does some tests, but it is not in the tree.  
One of the issues is that many of the tests you want to do for some of these
APIs involve timing.  For rwlocks, for example, I used KTR traces and used
a kernel module that forked 4 threads to all compete over a single lock.  I
then verified via KTR traces that every branch was taken (and made liberal
use of KASSERT()s which caught a few edge cases I had missed initially).

-- 
John Baldwin

Re: pri_to_rtp returns invalid initial priority

2011-07-07 Thread John Baldwin

On Thursday, July 07, 2011 6:37:02 am Dmitry Krivenok wrote:
> Hi Hackers,
> I've developed a simple kld which demonstrates a problem I found on my
> FreeBSD-8.2.

Maybe revision 222802?

-- 
John Baldwin

Re: System Fails to Boot (Deadlock?)

2011-07-13 Thread John Baldwin

On Tuesday, July 12, 2011 6:10:05 pm Brandon Falk wrote:
> Hello,
> 
> My machine somehow fails to boot into FreeBSD (off the cd, since it 
> fails to boot, I can't install it in the first place). It goes through 
> and ends up stopping deadlock style. No printouts of errors (besides a 
> ppc problem, which is not fatal) and no crash/oops/panic. It just... 
> stops and locks up. I know my motherboards PCI system has failed, as 
> well as lan (getting a new one tomorrow), which is probably what is 
> causing the problem. The thing is that Windows 7 and Linux happen to 
> boot just fine on my machine, so although my system has a failing 
> motherboard and may cause errors, I still feel like it shouldn't be 
> causing this much of an issue on boot. I tried verbose logging and got 
> no more information anyways. It will be nearly impossible to diagnose 
> this error so I'm looking for tips on where to look.

How far does it get when it locks?  Is it able to load /boot/loader off
of the CD ok?  Is it getting into the kernel far enough to output stuff to the 
console?  Is it getting into sysinstall and hanging at some sysinstall screen?

Also, is this host on the network and able to PXE?  It's a lot easier to test 
custom kernels if needed using PXE than CDs (no need to burn a new CD each 
time, etc.).

-- 
John Baldwin
