Re: Doesn't LAPIC timer stop when CPU goes to sleep state?

2012-07-09 Thread Andriy Gapon
on 09/07/2012 08:41 mnln.l4 said the following:
> I have FreeBSD 9.0-STABLE r237285. In my sysctl -a output, I see
> 
> kern.eventtimer.et.LAPIC.flags: 7
> 
> I am under the impression LAPIC timer may stop when CPU goes to sleep
> state. Shouldn't the flag be 15 as the example in EVENTTIMERS(4) man
> page?

You don't have to be under an impression when you can get the facts from
publicly available specifications and code.

> I have H67MA-E35 motherboard with Intel H67 chipset, and G840 dual core CPU.
> 
> Thanks.

-- 
Andriy Gapon

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: bge problems in RELENG_9, bge0: watchdog timeout -- resetting

2012-07-09 Thread Sean Bruno
On Wed, 2012-07-04 at 18:01 -0700, YongHyeon PYUN wrote:
> here is a WIP version at the following URL.
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
> 
> I have a couple of positive feedbacks but it seems it still has
> some issues. Let me know whether it makes any difference on your
> box. 

I grabbed these updates and applied them cleanly to stable/9 on a Dell
R620 with a quad port BCM5720. I still see watchdog timeouts and reset
indications.  I am able to ping out of the box for a short amount of
time before the device hangs and times out.



-bash-4.2# ping XXX.XXX.XXX.1
PING XXX.XXX.XXX.1 (XXX.XXX.XXX.XXX): 56 data bytes
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
ping: sendto: Network is down
Jul  9 17:31:41  x89 kernel: bge2: watchdog timeout -- resetting
Jul  9 17:31:41  x89 kernel: bge2: link state changed to DOWN
Jul  9 17:31:41  x89 kernel: bge2: link state changed to DOWN
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from XXX.XXX.XXX.1: icmp_seq=9 ttl=64 time=1.408 ms
Jul  9 17:31:45  x89 kernel: bge2: link state changed to UP
Jul  9 17:31:45  x89 kernel: bge2: link state changed to UP
64 bytes from 10.73.149.1: icmp_seq=10 ttl=64 time=1.697 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=11 ttl=64 time=1.835 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=12 ttl=64 time=1.390 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=13 ttl=64 time=1.392 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=14 ttl=64 time=1.392 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=15 ttl=64 time=1.848 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=16 ttl=64 time=1.389 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=17 ttl=64 time=1.541 ms
64 bytes from XXX.XXX.XXX.1: icmp_seq=18 ttl=64 time=1.575 ms

The stats counters don't really show much here, but here they are
regardless.
dev.bge.2.%desc: Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x572
dev.bge.2.%driver: bge
dev.bge.2.%location: slot=0 function=0 handle=\_SB_.PCI0.PE1C.NDX0
dev.bge.2.%pnpinfo: vendor=0x14e4 device=0x165f subvendor=0x1028 subdevice=0x1f5b class=0x02
dev.bge.2.%parent: pci1
dev.bge.2.forced_collapse: 0
dev.bge.2.msi: 1
dev.bge.2.forced_udpcsum: 0
dev.bge.2.stats.FramesDroppedDueToFilters: 0
dev.bge.2.stats.DmaWriteQueueFull: 0
dev.bge.2.stats.DmaWriteHighPriQueueFull: 0
dev.bge.2.stats.NoMoreRxBDs: 0
dev.bge.2.stats.InputDiscards: 0
dev.bge.2.stats.InputErrors: 0
dev.bge.2.stats.RecvThresholdHit: 0
Jul  9 17:33:35  x89 kernel: bge2: link state changed to DOWN
dev.bge.2.stats.rx.ifHCInOctets: 109580
dev.bge.2.stats.rx.Fragments: 0
dev.bge.2.stats.rx.UnicastPkts: 212
dev.bge.2.stats.rx.MulticastPkts: 282
dev.bge.2.stats.rx.BroadcastPkts: 543
dev.bge.2.stats.rx.FCSErrors: 0
dev.bge.2.stats.rx.AlignmentErrors: 0
dev.bge.2.stats.rx.xonPauseFramesReceived: 0
dev.bge.2.stats.rx.xoffPauseFramesReceived: 0
dev.bge.2.stats.rx.ControlFramesReceived: 0
dev.bge.2.stats.rx.xoffStateEntered: 0
dev.bge.2.stats.rx.FramesTooLong: 0
dev.bge.2.stats.rx.Jabbers: 0
dev.bge.2.stats.rx.UndersizePkts: 0
dev.bge.2.stats.tx.ifHCOutOctets: 30916
dev.bge.2.stats.tx.Collisions: 0
dev.bge.2.stats.tx.XonSent: 0
dev.bge.2.stats.tx.XoffSent: 0
dev.bge.2.stats.tx.InternalMacTransmitErrors: 0
dev.bge.2.stats.tx.SingleCollisionFrames: 0
dev.bge.2.stats.tx.MultipleCollisionFrames: 0
dev.bge.2.stats.tx.DeferredTransmissions: 0
dev.bge.2.stats.tx.ExcessiveCollisions: 0
dev.bge.2.stats.tx.LateCollisions: 0
dev.bge.2.stats.tx.UnicastPkts: 203
dev.bge.2.stats.tx.MulticastPkts: 0
dev.bge.2.stats.tx.BroadcastPkts: 3






Build failure xorg-drivers with Clang

2012-07-09 Thread Robert
Greetings

I am trying to build a 9.0 Stable system and am getting this error when
building xorg meta port. I have clang set up as follows

make.conf
 cat /etc/make.conf
CC=clang
CXX=clang++
CPP=clang-cpp
# added by use.perl 2012-07-09 07:23:29
PERL_VERSION=5.16.0

src.conf
CC=clang
CXX=clang++
CPP=clang-cpp

uname -a
FreeBSD pita9bsd.shasta204.local 9.0-STABLE FreeBSD 9.0-STABLE #0: Sun Jul  8 18:52:16 PDT 2012 root@pita9bsd.shasta204.local:/usr/obj/usr/src/sys/GENERIC  i386

The error:

In file included from xf86Helper.c:54:
In file included from ../../../hw/xfree86/os-support/xf86_OSlib.h:451:
./compiler.h:1104:24: error: invalid operand in inline asm: 'in${0:B} ($1)'
__asm__ __volatile__("in%B0 (%1)" :
   ^
./compiler.h:1104:24: error: unknown use of instruction mnemonic without a size suffix
<inline asm>:1:2: note: instantiated into assembly here
in (%dx)
^
In file included from xf86Helper.c:54:
In file included from ../../../hw/xfree86/os-support/xf86_OSlib.h:451:
./compiler.h:1104:24: error: invalid operand in inline asm: 'in${0:B} ($1)'
__asm__ __volatile__("in%B0 (%1)" :
   ^
./compiler.h:1104:24: error: unknown use of instruction mnemonic without a size suffix
<inline asm>:1:2: note: instantiated into assembly here
in (%dx)

In file included from xf86Helper.c:54:
In file included from ../../../hw/xfree86/os-support/xf86_OSlib.h:451:
./compiler.h:1104:24: error: invalid operand in inline asm: 'in${0:B} ($1)'
__asm__ __volatile__("in%B0 (%1)" :
   ^
./compiler.h:1104:24: error: unknown use of instruction mnemonic without a size suffix
<inline asm>:1:2: note: instantiated into assembly here
in (%dx)
^
In file included from xf86Helper.c:54:
In file included from ../../../hw/xfree86/os-support/xf86_OSlib.h:451:
./compiler.h:1104:24: error: invalid operand in inline asm: 'in${0:B} ($1)'
__asm__ __volatile__("in%B0 (%1)" :
   ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[5]: *** [xf86Helper.lo] Error 1
gmake[4]: *** [all] Error 2
gmake[3]: *** [all-recursive] Error 1
gmake[2]: *** [all] Error 2
gmake[1]: *** [all-recursive] Error 1
gmake: *** [all-recursive] Error 1
*** [do-build] Error code 1

I just retrieved the ports this morning with portsnap fetch extract.

Any help appreciated.

Robert


Re: nfs-bug when server for 9-Stable becomes client as well ?

2012-07-09 Thread Arno J. Klaassen
Vincent Hoffman  writes:

> On 06/07/2012 18:51, Arno J. Klaassen wrote:
>> Vincent Hoffman  writes:
>>
>>> On 06/07/2012 14:19, Arno J. Klaassen wrote:
>>>> Hello,
>>>>
>>>> looks like I discovered a probable bug in the nfs-code, very
>>>> easy to reproduce in my setup :
>>>>
>>>>
>>>>    Machine-1 : Today's 9-stable, exporting /files (ufs) and /z2 (zfs)
>>>>
>>>>    Machine-2 : 8-stable as of April the 10th exporting /raid1
>>>>
>>>> On Machine-1 I mount /raid1 (rw,nfsv3,intr,tcp,rsize=32768,wsize=32768)
>>>> and start a script on this mount looping something like :
>>>>
>>>>   dd if=/dev/random of=BIG bs=1048576 count=${SIZE}
>>>>   cp -fp BIG BIG2
>>>>   cmp -x BIG BIG2
>>>>
>>>> I let this run for 24 hours (from time to time stressing Machine-1 with
>>>> other scripts, including provoking heavy swapping), no problem at all.
>>>>
>>>> However, then I mount /z2 (rw,nfsv3,intr,tcp,rsize=32768,wsize=32768)
>>>> on Machine-2, and *immediately* the above loop on Machine-1 fails :
>>>>
>>>>   Copying file ...cp: BIG: Permission denied
>>>>
>>>> No console messages this time, last time I got
>>>>
>>>>   kernel: nfs_getpages: error 13
>>>>   kernel: vm_fault: pager read error, pid 87803 (cmp)
>>>>
>>>> on Machine-1.
>>>>
>>>> I repeated this scenario by replacing Machine-2 with a good old
>>>> 6-4-stable one, same outcome.
>>>>
>>>> Please tell me what I could do to nail this down a bit more.
>>> It's possible (although not definite) that you have hit a mountd bug
>>> as documented in PRs
>>>
>>> kern/131342
>>> kern/136865
>> especially kern/131342 looks similar and quite old; funny I never hit
>> this before, I basically do the same tests since 'ages' on each new box.
>> Could be that faster network/cpu reveals some race condition; I notice
>> as well that this server is the first (IIRC) that uses 3 different IRQs
>> for network interrupts (em(4) Intel(R) PRO/1000).
> Certainly possible and seems reasonable enough.

Just my $0.02: I glanced at kern/131342, and the culprit looks like
a non-atomic operation between invalidating the old /etc/exports and
validating the new /etc/exports.
I wonder whether checking that /var/run/mountd.pid is newer than
/etc/exports, and skipping that operation if so, would be an acceptable
band-aid (if I understood correctly, a rewrite of mountd correcting this,
amongst other things, is close to hitting -current?).
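
A minimal sketch of the check suggested above, assuming the modification
time of /var/run/mountd.pid is a fair stand-in for when mountd last
started. The helper name exports_reload_needed is hypothetical, and this is
only an illustration of the idea, not what mountd does:

```python
import os


def exports_reload_needed(pid_path="/var/run/mountd.pid",
                          exports_path="/etc/exports"):
    """Return True when exports_path was modified after pid_path.

    Hypothetical helper: it only compares modification times and does
    not talk to mountd in any way.
    """
    try:
        pid_mtime = os.stat(pid_path).st_mtime
        exports_mtime = os.stat(exports_path).st_mtime
    except FileNotFoundError:
        # If either file is missing we cannot show that the exports are
        # unchanged, so err on the side of reloading.
        return True
    return exports_mtime > pid_mtime
```

A caller would skip the reload only when this returns False. Whether that
is actually safe with respect to the race in kern/131342 is an open
question here, not something this sketch establishes.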

>>> I've recently asked on -CURRENT about this and had a patch to try from
>>> Rick, I'm testing it now but it doesn't seem to fix it for me, just
>>> improve it, although I'm trying to get enough runs to be a valid sample.
>>> (see
>>> http://docs.freebsd.org/cgi/getmsg.cgi?fetch=377627+0+archive/2012/freebsd-current/20120701.freebsd-current
>>> )
>>>
>>> What I did for my production nas was edit mount.c so it didn't send a
>>> SIGHUP to mountd as suggested by Rick, as it was easy to do and non
>>> intrusive.
>> hmm, this means I should patch each fbsd-client, no? May be easier to
>> patch mountd to ignore SIGHUP and use some non-standard signal to force
>> re-init?
> No, just patch /sbin/mount on the nfs server so it doesn't send the SIGHUP
> to mountd.

[In my case] it's the mount on a client that causes the server to fail,
so I don't see how patching /sbin/mount on the nfs server should fix this.
I don't remember whether it's possible to distinguish a -1 signal (SIGHUP)
sent by a process from one sent from a terminal; if it is, another
band-aid would be to ignore those sent by a process altogether?

Merci

Arno


> you can manually HUP mountd if needed.
>>
>> Arno
>>
>>
>>> Vince
>>>
>>>> Thanx in advance,
>>>>
>>>> Best, Arno

-- 

  Arno J. Klaassen

  SCITO S.A.
  8 rue des Haies
  F-75020 Paris, France
  http://scito.com
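
For anyone wanting to repeat the test, here is a rough sketch of one pass
of the dd / cp / cmp loop quoted earlier in this message. The function name
write_copy_compare and its parameters are made up for illustration; the
directory is assumed to be the NFS mount under test:

```python
import filecmp
import os
import shutil


def write_copy_compare(directory, size_bytes, chunk=1 << 20):
    """Write random data to BIG, copy it to BIG2, and compare the two.

    Returns True when both files have identical contents.
    """
    big = os.path.join(directory, "BIG")
    big2 = os.path.join(directory, "BIG2")

    # Roughly "dd if=/dev/random of=BIG": write size_bytes of random data.
    with open(big, "wb") as out:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            out.write(os.urandom(n))
            remaining -= n

    # Roughly "cp -fp BIG BIG2": copy contents and metadata.
    shutil.copy2(big, big2)

    # Roughly "cmp BIG BIG2": byte-for-byte comparison, not just stat data.
    return filecmp.cmp(big, big2, shallow=False)
```

Calling it in a loop on the mounted directory and stopping at the first
False result or raised PermissionError would approximate the original
script.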


Re: Doesn't LAPIC timer stop when CPU goes to sleep state?

2012-07-09 Thread mnln.l4
http://wiki.freebsd.org/TuningPowerConsumption/ says the LAPIC timer stops
when the CPU is in C3, yet I see the flag set to 7. Of course, I can
read the full spec and code
(http://fxr.watson.org/fxr/source/x86/x86/local_apic.c?im=bigexcerpts#L251)
to find out that some processors' LAPIC timers keep running in C3, but the
mailing list can be helpful for getting a quick answer.
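
To make the 7 versus 15 question concrete, here is a small sketch that
decodes the flags value. The bit meanings are the ones I understand
EVENTTIMERS(4) to list (1 periodic, 2 one-shot, 4 per-CPU, 8 may stop when
the CPU sleeps, 16 power-of-2 divisors only); the function name
decode_et_flags is just for illustration:

```python
# Bit values as described in EVENTTIMERS(4); treat them as an assumption
# and check the man page on your own release.
ET_FLAGS = {
    1: "periodic",
    2: "one-shot",
    4: "per-CPU",
    8: "may stop when the CPU sleeps (C3)",
    16: "power-of-2 divisors only",
}


def decode_et_flags(value):
    """Return the names of all flag bits set in value, lowest bit first."""
    return [name for bit, name in ET_FLAGS.items() if value & bit]
```

With this reading, 7 is periodic, one-shot and per-CPU, and 15 is the same
set plus the "may stop in C3" bit, which is the bit I was asking about.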

On Mon, Jul 9, 2012 at 3:59 AM, Andriy Gapon  wrote:
> on 09/07/2012 08:41 mnln.l4 said the following:
>> I have FreeBSD 9.0-STABLE r237285. In my sysctl -a output, I see
>>
>> kern.eventtimer.et.LAPIC.flags: 7
>>
>> I am under the impression LAPIC timer may stop when CPU goes to sleep
>> state. Shouldn't the flag be 15 as the example in EVENTTIMERS(4) man
>> page?
>
> You don't have to be under an impression when you can get the facts from
> publicly available specifications and code.
>
>> I have H67MA-E35 motherboard with Intel H67 chipset, and G840 dual core CPU.
>>
>> Thanks.
>
> --
> Andriy Gapon
>


Re: ? IO performance regression, post 8.1

2012-07-09 Thread Charles Owens


Charles Owens
Great Bay Software, Inc.
v: 603.617.4844   m: 603.866.0860

On 6/22/12 10:22 AM, John Baldwin wrote:

On Thursday, June 21, 2012 10:36:04 pm Charles Owens wrote:

On 6/15/12 8:04 AM, John Baldwin wrote:

On Friday, June 15, 2012 12:28:59 am Charles Owens wrote:

Hello FreeBSD folk,

We're seeing what appears to be a storage performance regression as we
try to move from 8.1 (i386) to 8.3.   We looked at 8.2 also and it
appears that the regression happened between 8.1 and 8.2.

Our system is an Intel S5520UR Server with 12 GB RAM, dual 4-core CPUs.
Storage is a LSI MegaSAS 1078 controller (mfi) in a RAID-10
configuration, using UFS + geom_journal for filesystem.

Postgresql performance, as seen via pgbench, dropped by approx 20%.
This testing was done with our usual PAE-enabled kernels.  We then went
back to GENERIC kernels and did comparisons using "bonnie", results
below.  Following that is a kernel boot log.

Notably, we're seeing this regression only with our RAID mfi(4) based
systems.  Also, from looking at FreeBSD source changelogs it appears
that the mfi(4) code has seen some changes since 8.1.

Between 8.1 and 8.2 mfi has not had any significant changes.  The only changes
made to sys/dev/mfi were to add a new constant:


svn diff svn+ssh://svn.freebsd.org/base/releng/8.1/sys/dev/mfi \
    svn+ssh://svn.freebsd.org/base/releng/8.2/sys/dev/mfi
Index: mfireg.h
===================================================================
--- mfireg.h	(.../8.1/sys/dev/mfi)	(revision 237134)
+++ mfireg.h	(.../8.2/sys/dev/mfi)	(revision 237134)
@@ -975,7 +975,9 @@
 	MFI_PD_STATE_OFFLINE = 0x10,
 	MFI_PD_STATE_FAILED = 0x11,
 	MFI_PD_STATE_REBUILD = 0x14,
-	MFI_PD_STATE_ONLINE = 0x18
+	MFI_PD_STATE_ONLINE = 0x18,
+	MFI_PD_STATE_COPYBACK = 0x20,
+	MFI_PD_STATE_SYSTEM = 0x40
 };
 
 union mfi_ld_ref {


The difference in write performance must be due to something else.  You
mentioned you are using UFS + gjournal.  I think gjournal uses BIO_FLUSH, so I
wonder if this is related:


r212939 | gibbs | 2010-09-20 19:39:00 -0400 (Mon, 20 Sep 2010) | 61 lines

MFC 212160:

Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic.
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

The barrier semantics of bioq_insert_tail() were broken in two ways:

   o In bioq_disksort(), an added bio could be inserted at the head of
     the queue, even when a barrier was present, if the sort key for
     the new entry was less than that of the last queued barrier bio.

   o The last_offset used to generate the sort key for newly queued bios
     did not stay at the position of the barrier until either the
     barrier was de-queued, or a new barrier (which updates last_offset)
     was queued.  When a barrier is in effect, we know that the disk
     will pass through the barrier position just before the
     "blocked bios" are released, so using the barrier's offset for
     last_offset is the optimal choice.

sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
   o Update last_offset in bioq_insert_tail().

   o Only update last_offset in bioq_remove() if the removed bio is
     at the head of the queue (typically due to a call via
     bioq_takefirst()) and no barrier is active.

   o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
     set prev to the barrier and cur to its next element.  Now that
     last_offset is kept at the barrier position, this change isn't
     strictly necessary, but since we have to take a decision branch
     anyway, it does avoid one, no-op, loop iteration in the while
     loop that immediately follows.

   o In bioq_disksort(), bypass the normal sort for bios with the
     BIO_ORDERED attribute and instead insert them into the queue
     with bioq_insert_tail().  bioq_insert_tail() not only gives
     the desired command order during insertion, but also provides
     barrier semantics so that commands disksorted in the future
     cannot pass the just enqueued transaction.

sys/sys/bio.h:
  Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
  Use an ordered command for SCSI/ATA-NCQ commands issued in
  response to bios with the BIO_ORDERED flag set.

sys/cam/scsi/scsi_da.c
  Use an ordered tag when issuing a synchronize cache command.

  Wrap some lines to 80 columns.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
  Mark bios with the BIO_FLUSH command as BIO_ORDERED.

Sponsored by:   Spectra Logic Corporation
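
To illustrate what the commit message above describes, here is a toy model
of the queue in Python. It is a simplified sketch and not the kernel code:
the class and method names are invented, bios carry only an offset, and
last_offset is advanced to the removed bio's offset rather than to the end
of its transfer:

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound of the sort key


@dataclass(eq=False)  # identity comparison, so equal offsets stay distinct
class Bio:
    offset: int
    ordered: bool = False  # stands in for the BIO_ORDERED flag


class BioQueue:
    def __init__(self):
        self.queue = []
        self.last_offset = 0
        self.barrier = None  # last bio queued with insert_tail(), if any

    def _key(self, bio):
        # One-way elevator: offsets behind last_offset wrap to large keys.
        return (bio.offset - self.last_offset) & MASK64

    def insert_tail(self, bio):
        # Appending creates a barrier and pins last_offset to its position.
        self.queue.append(bio)
        self.barrier = bio
        self.last_offset = bio.offset

    def disksort(self, bio):
        if bio.ordered:
            self.insert_tail(bio)
            return
        # Never sort ahead of an active barrier: start just after it.
        pos = 0 if self.barrier is None else self.queue.index(self.barrier) + 1
        key = self._key(bio)
        while pos < len(self.queue) and key >= self._key(self.queue[pos]):
            pos += 1
        self.queue.insert(pos, bio)

    def takefirst(self):
        if not self.queue:
            return None
        bio = self.queue.pop(0)
        if self.barrier is None:
            self.last_offset = bio.offset  # only move when no barrier is active
        elif bio is self.barrier:
            self.barrier = None
        return bio


def demo_order():
    """Queue three sorted bios, one ordered flush, then two more sorted bios."""
    q = BioQueue()
    for off in (50, 10, 30):
        q.disksort(Bio(off))
    q.disksort(Bio(20, ordered=True))
    q.disksort(Bio(5))
    q.disksort(Bio(25))
    return [b.offset for b in q.queue]
```

In demo_order() the two bios queued after the ordered one are sorted
relative to the barrier's offset (25 before 5) but both stay behind it,
which is the ordering cost a BIO_FLUSH-heavy workload would pay in this
model.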


Can you try perhaps commenting out the 'bp->bio_flags |= BIO_ORDERED' line