Re: Libmicro!! pthread_create Resource temporarily unavailable

2005-11-27 Thread David Xu

Ricardo A. Reis wrote:


Hi George,

 


pthread_create: Resource temporarily unavailable

 


Which version of FreeBSD please?
   



I tested libmicro on 6.0-RELEASE and on the latest cvsup'ed RELENG_6; neither
works.


Thanks


Ricardo A. Reis
UNIFESP
Unix and Network Admin

 


The pthread_create benchmark is trying to create 2 threads; the
default allowed number is 1500.

See sysctl kern.threads.max_threads_per_proc.


___
freebsd-performance@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Libmicro!! pthread_create Resource temporarily unavailable

2005-11-29 Thread David Xu


Ricardo A. Reis wrote:



>the pthread_create benchmark is trying to create 2 threads, the
>default allowed number is 1500.


Hi,

I increased the default limits for *per_proc to 4:

sysctl -a |grep threads
kern.threads.thr_concurrency: 0
kern.threads.thr_scope: 0
kern.threads.virtual_cpu: 2
kern.threads.max_threads_hits: 0
kern.threads.max_groups_per_proc: 4
kern.threads.max_threads_per_proc: 4
kern.threads.debug: 0
vm.stats.vm.v_kthreads: 69


But this is not a solution.



Thanks


Ricardo A. Reis
UNIFESP
Unix and Network Admin


You cannot create 4 threads, because each thread
has a 2M stack by default; you will run out of address space.

David Xu
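
As a rough illustration of the address-space argument above, the arithmetic can be sketched as follows. It assumes the 2M default stack mentioned in the reply and a 32-bit process with at most 4 GiB of address space (the usable user portion on i386 is smaller in practice). The 40k figure is the per-process limit discussed in this thread; the function names are made up for this sketch.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

def stack_reservation(threads, stack_bytes=2 * MIB):
    """Address space reserved for thread stacks alone."""
    return threads * stack_bytes

def max_threads(address_space_bytes, stack_bytes=2 * MIB):
    """Upper bound on threads if the address space held nothing but stacks."""
    return address_space_bytes // stack_bytes

if __name__ == "__main__":
    # GiB of address space needed for 40k threads with 2M stacks
    print(stack_reservation(40_000) / GIB)
    # Bound on thread count for a 4 GiB address space
    print(max_threads(4 * GIB))
```

Raising the sysctl limit alone therefore does not make that many threads possible; the stack reservation becomes the binding constraint long before the limit is reached.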



Re: Libmicro!! pthread_create Resource temporarily unavailable

2005-11-29 Thread David Xu

Ricardo A. Reis wrote:


>>You can not create 4 threads, because each thread
>>defaults has 2M stack, you will out of address space.

David,

I am not attempting to create 40k threads; I merely increased the 
per-process limit to 40k.

I would like to test libmicro with its default settings.



Ricardo A. Reis
UNIFESP


Try giving it an option, for example:
./pthread_create -B 1499






Re: More threads

2005-12-06 Thread David Xu

Michael Vince wrote:

Hi All,

I have been benchmarking a Java servlet I created, and I want it to be able 
to handle a massive number of simultaneous connections.
So far using benchmarking tools I have been able to get around 2,565 
threads on the Tomcat 5.5 Java process (with native 1.4.2 Java) 
according to ps -auxwH | grep -c 'java'


But I just can't seem to get past that mark. I have a lot of memory; 
currently Tomcat is allocated 2 GB of it.


When I max it out with my benchmarks I get this in catalina Tomcat 5.5 
but there is still plenty of free memory to the Tomcat process.
SEVERE: Caught exception (java.lang.OutOfMemoryError: unable to create 
new native thread) executing [EMAIL PROTECTED], 
terminating thread


I have been testing with my libmap.conf with these different types of 
implementations.


[/usr/local/jdk1.4.2/]
#libpthread.so   libc_r.so
#libpthread.so.2   libc_r.so.6
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so
libc_r.so.6 libthr.so.2
libc_r.so   libthr.so

I have been getting the best performance with libthr and have been using 
the above libthr and with these settings below.


I have set my max_threads in /etc/sysctl.conf to a massive amount.

kern.threads.max_threads_per_proc=4
kern.threads.max_groups_per_proc=4

ps -auxwH | grep -c 'java'
2565

I am using Apache2.2 with the new built in AJP module which has been a 
great addition to Apache 2.2.
I have been able to get the setup performing most stable (no 503 status 
errors) but with less performance / threads with the libmap.conf below.

[/usr/local/jdk1.4.2/]
libpthread.so   libc_r.so
libpthread.so.2   libc_r.so.6

I am on 6.0 Release i386, dual Intel P4, generic SMP kernel.

With Tomcat on Windows XP I have been able to get it running better.
Does anyone know of other sysctls that increase the number of 
threads available in libthr, which I am assuming uses the sysctls above? Does 
anyone know how many threads can be created in Java on FreeBSD?


Cheers,
Mike


The number of threads you can create with libthr is limited by the
following factors:
1) sysctl:
kern.threads.max_threads_per_proc
kern.threads.max_groups_per_proc

2) stack
The per-thread userland stack: the default is 2M on 64-bit platforms
and 1M on 32-bit platforms. I don't know whether Java supports
adjusting the default per-thread stack size. If it cannot, we may add
an environment variable to the thread libraries, for example
LIBPTHREAD_THREAD_STACKSIZE, to allow the user to set the default stack size.

A thread also needs a kernel-mode stack when it is in the kernel; if
I am right, it is 16K bytes per thread. If you create too many
threads, make sure neither side exhausts its address space, and
because a kernel stack cannot be swapped out while the process is
running, you should make sure physical memory is not exhausted either.

3) memory limits
Check them with the limits command:

Resource limits (current):
  cputime          infinity secs
  filesize         infinity kB
  datasize           524288 kB
  stacksize           65536 kB
  coredumpsize     infinity kB
  memoryuse        infinity kB
  memorylocked     infinity kB
  maxprocesses         5547
  openfiles           11095
  sbsize           infinity bytes
  vmemoryuse       infinity kB

If the address space is not large enough, you have to reconfigure the kernel
to allow a larger one. Correct me if I am wrong.

David Xu
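
Where a program controls its own thread creation, the per-thread stack size from factor 2 can usually be lowered before the threads are started. The following is only a minimal sketch of that idea, using Python's `threading.stack_size` as a stand-in for whatever knob a given runtime offers. The 256 KiB value and the function name are arbitrary assumptions, not recommendations from this thread.

```python
import threading

def run_with_small_stacks(n_threads, stack_bytes=256 * 1024):
    """Start n_threads workers with a reduced per-thread stack size."""
    # Applies to threads started after this call; returns the previous setting.
    previous = threading.stack_size(stack_bytes)
    results = []
    lock = threading.Lock()

    def worker(i):
        with lock:
            results.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        # Restore the earlier setting so later threads are unaffected.
        threading.stack_size(previous)
    return sorted(results)
```

With a smaller stack, more threads fit into the same address space, at the cost of less headroom for deep recursion or large stack buffers in each thread.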



Re: new benchmarks. WAS: FreeBSD MySQL still WAY slower than Linux

2005-12-08 Thread David Xu

Greg 'groggy' Lehey wrote:


I've heard this claim again and again, and I intend to look at it when
I have time.  I find it difficult to believe that this alone could
explain the sometimes horrendous performance differences (3 to 1) that
have been reported.

Can somebody tell me:

1.  How many calls there are per second?
2.  Where they're coming from?  This would involve profiling, of
   course.

Greg
--
See complete headers for address and phone numbers.
 


You can find the ktrace result for mysql here:
http://people.freebsd.org/~davidxu/mysql/mysql_ktrace.txt
gettimeofday() follows almost every network I/O.

You can also find its I/O sizes:
http://people.freebsd.org/~davidxu/mysql/iosize.txt

I guess the gettimeofday() calls are related to mysql's
connection keepalive work, which sounds like a very silly method.

David Xu
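
One way to approach the "how many calls per second" question from a trace like this is to count syscall entries by name. The sketch below assumes kdump-style lines of the form `<pid> <command> CALL <syscall>(<args>)`. The sample text is invented for illustration and is not taken from the linked trace.

```python
import re
from collections import Counter

# Assumed kdump-style line: "<pid> <command> CALL  <syscall>(<args>)"
CALL_RE = re.compile(r"^\s*\d+\s+\S+\s+CALL\s+(\w+)")

def count_syscalls(trace_text):
    """Count syscall entries (CALL records) by syscall name."""
    counts = Counter()
    for line in trace_text.splitlines():
        m = CALL_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Invented sample in the assumed format, not real trace data.
SAMPLE = """\
  701 mysqld   CALL  read(0x1d,0x8a1c000,0x4)
  701 mysqld   RET   read 4
  701 mysqld   CALL  gettimeofday(0xbf5fe9a8,0)
  701 mysqld   RET   gettimeofday 0
  701 mysqld   CALL  write(0x1d,0x8a20000,0x47)
  701 mysqld   RET   write 71/0x47
  701 mysqld   CALL  gettimeofday(0xbf5fe9a8,0)
  701 mysqld   RET   gettimeofday 0
"""

if __name__ == "__main__":
    for name, n in count_syscalls(SAMPLE).most_common():
        print(name, n)
```

Dividing each count by the duration of the traced interval gives a calls-per-second figure for that syscall.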



Re: mysql benchmarks

2005-12-09 Thread David Xu

Gustavo A. Baratto wrote:


Since the last post just had freebsd numbers, I'm re-posting it including
Linux as well. Both linux and freebsd numbers were taken from the same 
box:


...

Can you try the TSC timer on FreeBSD? Someone reported that using the TSC
timer boosts the performance of super-smack significantly.

David Xu



Re: Benchmark for MySQL 5.0 + FreeBSD 6-STABLE

2005-12-11 Thread David Xu

Gea-Suan Lin wrote:

3*2*2*2 = 24 cases:

# Compile Options: none, WITH_PROC_SCOPE_PTH=yes, WITH_LINUXTHREADS=yes
# /etc/libmap.conf: none (libpthread), libthr
# kern.timecounter.choice: ACPI-fast, TSC
# kernel: ULE+PREEMPTION, ULE
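
The 24 cases are simply the cross product of the four option lists above. A small sketch of enumerating them follows; the variable names are chosen here for illustration.

```python
from itertools import product

compile_options = ["none", "WITH_PROC_SCOPE_PTH=yes", "WITH_LINUXTHREADS=yes"]
libmap = ["none (libpthread)", "libthr"]
timecounter = ["ACPI-fast", "TSC"]
kernel = ["ULE+PREEMPTION", "ULE"]

# Every combination of one choice from each list: 3 * 2 * 2 * 2 cases.
cases = list(product(compile_options, libmap, timecounter, kernel))

if __name__ == "__main__":
    for number, case in enumerate(cases, start=1):
        print(number, *case, sep=" | ")
```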

I put the detail information in my blog:

http://blog.gslin.org/archives/2005/12/12/252/


This is the first time I have heard that one can map LINUXTHREADS
to libthr or others by using libmap.conf. Can anyone confirm
it is possible?

David Xu



Re: mysql benchmarks

2005-12-13 Thread David Xu

Peter Pentchev wrote:


So, if both systems use gettimeofday, then the slowness may be somewhere else.
   



E...  I think David might have meant that the original poster should
simply set kern.timecounter.hardware to 'TSC', not i8254 or something
else.  This would not change whether MySQL uses gettimeofday() or not,
it would simply change the in-kernel method of obtaining the actual
time of day - at least that's how I understand it :)

G'luck,
Peter

 


Err, I just wanted to know the best performance current FreeBSD can
achieve. The timer problem has already been discussed several times
on the list; I don't want to repeat it here.

David Xu




Re: mysql performance on 4 * dualcore opteron

2006-04-05 Thread David Xu
On Wednesday 05 April 2006 00:42, Sven Petai wrote:
> 
> hi
> 
> Before I begin, let me just say that I'm probably aware of most of the threads 
> about mysql performance in various fbsd lists over the last couple of years, so 
> please let's not concentrate on the usual points made over and over again, 
> like how filesystems are mounted under linux, how fast time() is, or how 
> various combinations of scheduler/threading library/compiler flags give you 
> ~5-10% better performance. It's very unlikely that any of these reasons, or 
> even all of them together, can explain performance differences of 2-3 * 
> 
Can you disable the log-bin option in my.cnf to see whether it is an FS bottleneck
when you are running update-smack? Please run Linux and FreeBSD
with the same hardware and my.cnf configuration, thanks.
I know this is not quite right, but it can be used to narrow down some
kernel performance problems.

Regards,
David Xu


Re: mysql performance on 4 * dualcore opteron

2006-04-08 Thread David Xu
On Thursday 06 April 2006 17:12, Michael Vince wrote:

> I have also done benchmarking with libthr against Apache using 'ab' and 
> found it can deliver an extra amount of megabytes/sec of data (I think 
> it was about an extra 2000/requests sec) at the cost of giving the 
> server from what I remember almost double the 'average load' according 
> to 'top'
> Given that if your machine has nothing else to do but deliver data 
> purely from Apache then even libthr is more worth while for Apache as well.
> 
> Mike

libpthread uses M:N threads by default, which means a thread may be on
the userland scheduler's run queue without the FreeBSD kernel knowing about it, 
so it will not be shown in the load average. The default system tools are not
very useful here.

David Xu


Re: mysql performance on 4 * dualcore opteron

2006-04-08 Thread David Xu
On Saturday 08 April 2006 17:44, Michael Vince wrote:
> I have also tried putting my Perl under libthr for a single thread log
> analyzer and to my surprise it even could process logs faster.
>
I don't know why; I only know that I did some micro-optimizations in libthr,
and the library is small and may be fully cached in the L1 cache on an Athlon
XP/64 CPU. Don't take it seriously. ;-)

> libthr is also really useful for actually paying attention to tops 'thr'
> column since it does show actual true thread number activity, under
> pthread it shows a couple and under libc_r I could have 1000 threads
> going but top just shows 1.
>
> Mike


Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-06 Thread David Xu
On Sunday 07 May 2006 06:39, Attila Nagy wrote:
> On 2006. 05. 06. 22:50, Robert Watson wrote:
> >> The machine is a quad core Xeon LV server, the client side is
> >> sysbench, accessing mysql 4.1.8 on a socket. Heap table, simple test.
> >
> > Which threading library is that with, btw?  If libpthread, could you run
> > the same test with libthr, and vice versa?
>
> thr and with dynamically linked mysql, because when I link it with
> -static, it dies with sig11 at startup.
>
Our static thread library build is broken. My standpoint is that weak symbol 
overriding never works correctly. If you have any ideas, we can discuss
them on arch@; we will of course be at a disadvantage if it is not
fixed.

> You can find the updated picture at the previous link.

David Xu


Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-07 Thread David Xu
On Saturday 06 May 2006 22:16, Robert Watson wrote:
> Dear all,
>
> Attached, please find a patch implementing more fine-grained locking for
> the POSIX local socket subsystem (UNIX domain socket subsystem).  In the
> current implementation, we use a single global subsystem lock to protect
> all local IPC over the PF_LOCAL socket type.  This has low overhead, but
> can result in significant contention, especially for workloads like MySQL
> which push a great deal of data through UNIX domain sockets, and involve
> many parties.  The hope is that by improving granularity, we can reduce
> contention sufficiently to overcome the added cost of increased locking
> overhead (a side-effect of greater granularity).  At a high level, here are
> the changes made:

I have tested the patch on my dual PIII machine; the test is super-smack's
select-key.smack. As you said, performance is improved. It seems the uidinfo
lock is one of the bottlenecks in my test. It might be caused by
chgsbsize(), because it seems every socket I/O has to execute that code
even when the sockets are unrelated. Note that, unlike Kris, I don't have
local changes.

   max    total   count   avg cnt_hold cnt_lock name
  2913  1110657 1200300 02569034374 kern/kern_resource.c:1172 (sleep mtxpool)
  5603  7259406  204321352089920075 kern/kern_descrip.c:378 (Giant)
  6987  1817410 1369257 110739 7324 kern/kern_descrip.c:1990 (filedesc structure)
  3713  3771857  12005331 4553 4612 kern/uipc_usrreq.c:617 (unp_global_mtx)
  7339  1903685 1574656 1 3389 3710 kern/kern_descrip.c:2145 (sleep mtxpool)
 91334  1798700 1369257 1 3227 7916 kern/kern_descrip.c:2011 (filedesc structure)
  4764   223419  204440 1 2549 1693 kern/kern_descrip.c:385 (filedesc structure)
410546  2002932 1369257 1 2238 4103 kern/kern_descrip.c:2010 (sleep mtxpool)
  5064   248004  169944 1 1152 1777 kern/kern_sig.c:1002 (process lock)
39   149033   74420 2  760  866 kern/kern_synch.c:220 (process lock)
  5567   209654  204321 1  691 1566 kern/kern_descrip.c:438 (filedesc structure)
70   386915   63807 6  527  412 kern/subr_sleepqueue.c:374 (process lock)
  4358   486842   66802 7  463  291 kern/vfs_bio.c:2424 (vnode interlock)
  6430   488186  214393 2  347  420 kern/kern_lock.c:163 (lockbuilder mtxpool)
  3057   251010   68159 3  313 2290 kern/vfs_vnops.c:796 (vnode interlock)
 13126 15731880 108009214  294  227 kern/uipc_usrreq.c:581 (so_snd)
  316167316   66402 1  293  267 kern/vfs_bio.c:1464 (vnode interlock)
  3395   205078  204321 1  270  447 kern/kern_descrip.c:433 (sleep mtxpool)
  301197692232342  213  239 kern/kern_synch.c:218 (Giant)
71 99333721 2  185  185 i386/i386/pmap.c:2235 (vm page queue mutex)
14 34543512 0  121  155 vm/vm_fault.c:909 (vm page queue mutex)
  3700  2120096  12004617  119   22 kern/uipc_usrreq.c:705 (so_rcv)
 9 23792024 1  103  121 vm/vm_fault.c:346 (vm page queue mutex)
  9389   131186  120070 1   941 kern/uipc_socket.c:1101 (so_snd)
 4 24003016 0   85 4385 vm/vm_fault.c:851 (vm page queue mutex)
99 6403 59610   84   50 i386/i386/pmap.c:2649 (vm page queue mutex)
  5109   201972  204330 0   77  306 kern/sys_socket.c:176 (so_snd)
  189247770203023   66   15 vm/vm_fault.c:686 (vm object)
  3380   360770   66716 5   61  117 kern/vfs_bio.c:1445 (buf queue lock)
 7 15971380 1   53   90 vm/vm_fault.c:136 (vm page queue mutex)
   34789307826710   50  330 kern/kern_timeout.c:240 (Giant)
1410199   12448 0   38  100 kern/kern_sx.c:157 (lockbuilder mtxpool)
15104213046 3   37   32 vm/vm_fault.c:1009 (vm page queue mutex)
  3704  2757298  12004622   310 kern/uipc_usrreq.c:696 (unp_mtx)
27 1519 510 2   27   21 vm/vm_object.c:637 (vm page queue mutex)
29 76515374 1   26   44 geom/geom_io.c:68 (bio queue)
  4517255406495 3   23  250 kern/kern_um

Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-07 Thread David Xu
On Monday 08 May 2006 07:04, Kris Kennaway wrote:

> i.e. apparently not a large difference, but still a large proportion
> of cases where multiple CPUs are woken at once on the same chain.
>
> Kris
This is because there is no sleepable mutex available, so I had to use
msleep and wakeup. This is suboptimal; I may add the MTX_QUIET flag there
to make WITNESS shut up.


Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-08 Thread David Xu
On Monday 08 May 2006 14:52, Kris Kennaway wrote:
> OK, David's patch fixes the umtx thundering herd (and seems to give a
> 4-6% boost).  I also fixed a thundering herd in FILEDESC_UNLOCK (which
> was also waking up 2-7 CPUs at once about 30% of the time) by doing
> s/wakeup/wakeup_one/.  This did not seem to give a performance impact
> on this test though.
>
> filedesc contention is down by a factor of 3-4, with corresponding
> reduction in the average hold time.  The process lock contention
> coming from the signal delivery wakeup has also gone way down for some
> reason.
>

I found that mysqld frequently calls alarm() in its file thr_alarm.c, and 
thr_kill() to send SIGALRM to its timer thread to wake it up. The timer 
thread itself is blocked in sigwait(). Normally the alarm timer expires
within a second, so the kernel periodically calls psignal to find
a thread that can handle the signal. This means the kernel has to periodically
walk the thread list with the process lock and scheduler lock held, which is
very expensive.

Most of the time thr_kill will wake up the timer thread earlier. In the thr_kill
syscall, the kernel has to walk the thread list to find a thread whose
id matches the given id. The function thread_find()
uses a linear search, which is slow: if there are lots of threads in
the process, the process lock will be held too long. I think that's the 
reason why you have seen so much process lock contention. If you
define USE_ALARM_THREAD in the mysql header file, the contention should
decrease (I hope). Patch:

--- my_pthread.h.old	Mon May  8 18:16:56 2006
+++ my_pthread.h	Mon May  8 18:17:07 2006
@@ -267,6 +267,8 @@
 
 /* Test first for RTS or FSU threads */
 
+#define USE_ALARM_THREAD
+
 #if defined(PTHREAD_SCOPE_GLOBAL) && !defined(PTHREAD_SCOPE_SYSTEM)
 #define HAVE_rts_threads
 extern int my_pthread_create_detached;
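
The cost of the linear thread lookup described before the patch can be illustrated with a toy model. This is not the kernel's data structure or code: the `Thread` class, the id values, and the indexed lookup are all assumptions made for comparison, and nothing in this message proposes an index as the fix.

```python
class Thread:
    """Toy stand-in for a kernel thread structure; only the id matters here."""
    def __init__(self, tid):
        self.tid = tid

def find_linear(threads, tid):
    """Walk the list until the id matches: O(n) steps while the lock is held."""
    for steps, td in enumerate(threads, start=1):
        if td.tid == tid:
            return td, steps
    return None, len(threads)

def build_index(threads):
    """Map id -> thread so that a lookup no longer depends on list length."""
    return {td.tid: td for td in threads}

if __name__ == "__main__":
    threads = [Thread(100000 + i) for i in range(1000)]
    td, steps = find_linear(threads, 100999)
    print("linear search steps:", steps)
    print("indexed lookup finds same thread:", build_index(threads)[100999] is td)
```

In the worst case the linear walk touches every thread in the process, which is why the time the lock is held grows with the thread count.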


> unp contention has risen a bit.  The other big gain is to sleep
> mtxpool contention, which roughly doubled:
>
> /*
>  * Change the total socket buffer size a user has used.
>  */
> int
> chgsbsize(uip, hiwat, to, max)
> struct  uidinfo *uip;
> u_int  *hiwat;
> u_int   to;
> rlim_t  max;
> {
> rlim_t new;
>
> UIDINFO_LOCK(uip);
>
> So the next question is how can that be optimized?
>
It may use atomic_cmpset_int in a loop to avoid a context switch, or use an
adaptive mutex, but there is no adaptive mutex type you can specify.
rlim_t is a 64-bit integer, so the atomic operation cannot be used on it, but a 64-bit 
integer might not be necessary for a socket buffer size.
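
The compare-and-set retry loop suggested here can be modeled as below. This is only a sketch of the pattern: Python has no atomic_cmpset_int, so the `AtomicInt` class uses a lock to stand in for the CPU instruction. The accounting rule (refuse growth beyond the limit, always allow shrinking) is an assumption about what chgsbsize() is meant to do, and all names are hypothetical.

```python
import threading

class AtomicInt:
    """Models a word updated by compare-and-set; the lock stands in for the CPU instruction."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        """Store new only if the current value still equals expected."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True

def charge_sbsize(total, old_hiwat, to, limit):
    """Retry until the update lands; return False if growth would exceed limit."""
    while True:
        cur = total.load()
        new = cur + to - old_hiwat
        if to > old_hiwat and new > limit:
            return False
        if total.compare_and_set(cur, new):
            return True
        # Another thread changed the total in between: recompute and retry.
```

The point of the pattern is that a thread losing the race retries immediately instead of sleeping on a mutex, so no context switch is needed for such a short update.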

> Kris

David Xu


Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-08 Thread David Xu
On Tuesday 09 May 2006 02:43, Kris Kennaway wrote:

> Hmm, with this patch mysql 4.1 seems to crash at startup.  I haven't
> yet had time to investigate.  Is anyone else seeing this?
>
> Kris

I have only tested mysql 4.0; I will try 4.1 later.


Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-09 Thread David Xu
On Wednesday 10 May 2006 02:23, Kris Kennaway wrote:
> There are at least several issues here:
>
> * Factor of >two difference in performance across the board (all
> loads) relative to Linux.  This may be general issues like
> gettimeofday() on Linux vs FreeBSD; clearly there is something *very
> big* to blame here.  Mysql does do *lots* of such calls, so the cost
> of them is surely a component in performance, the only question is if
> it's the main one.
>
As far as I recall, gettimeofday is not a syscall on Linux; they call
it vgettimeofday. They also have a lower-overhead syscall mechanism, vsyscall,
which uses sysenter/sysexit when the CPU supports it. They do the calculation
in userland; the page is exported by the kernel and can be executed by userland.
At least I saw the idea on one serious hacker's blog, but now I cannot find 
the URL.

> * When you get some of the locking out of the way (per this thread)
> FreeBSD has 44% better peak performance on Sven's test on amd64, but
> tails off by about 33% at higher loads compared to unpatched.  I see
> similar changes on 12-CPU E4500, but not as much performance gain (may
> be due to other reasons).  i.e. optimizing the locking allows a new,
> bigger bottleneck to suck on center stage.  This is the basis for my
> observation about libthr at high loads.  It is not the same as the
> above issue.
>
> Kris

Fixing one of the big lock contentions is not enough; you have to fix them
all. It is easy to see that the second contention then becomes the top one. :-)



Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

2006-05-09 Thread David Xu

Tim Kientzle wrote:


I recall Matt talking about implementing gettimeofday()
without a syscall.  The basic idea is to have the kernel
record some constants in a page that's mapped across
all processes, then libc can just read the time from
a known location.

It might be nice to combine this with some of the
other ideas being tossed around here:
  * On each clock tick, store a base time in
a known location (page mapped read-only, no-execute
across all memory maps)
  * libc can just read the base time (accurate
to the clock rate) from a constant.  Very, very fast.
  * For higher resolution, the kernel could record
TSC and CPU clock speed data (per-CPU? Hmmm...)
and libc could use that to fine-tune the time?

Still some details I need to think through...

Tim
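
The higher-resolution variant in the last bullet amounts to adding the cycles elapsed since the last tick to the stored base time. A minimal sketch of that arithmetic follows, with hypothetical parameter names, assuming a constant TSC frequency and a reader on the same CPU that recorded the base value (the per-CPU question raised above is not addressed).

```python
def fine_time(base_seconds, base_tsc, tsc_now, tsc_hz):
    """Base time recorded at the last tick plus the cycles elapsed since then."""
    return base_seconds + (tsc_now - base_tsc) / tsc_hz

if __name__ == "__main__":
    # Half a second's worth of cycles on an assumed 2.8 GHz counter.
    print(fine_time(1000.0, 0, 1_400_000_000, 2_800_000_000))
```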



One of the problems in implementing it is atomicity:
if multiple integers need to be updated by the kernel,
userland may get an inconsistent result. The way to avoid the
problem is to use two generation numbers.

Check do_vgettimeofday(struct timeval * tv) in:
http://gsu.linux.org.tr/~mpekmezci/kernelapi/unitedlinux/arch/x86_64/kernel/vsyscall.c.html

Another problem is how you tell userland the address of the kernel
page. Do you use a fixed address, or tell it via program headers like
the PT_TLS set by the kernel? Check /usr/src/lib/libc/gen/tls.c.

David Xu
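
The two-generation-number idea can be sketched as follows. This is a simplified model with hypothetical names, not the code from the linked vsyscall.c: the writer bumps one counter before touching the fields and copies it to the second counter afterwards, and the reader accepts a snapshot only if both counters agree. A real implementation would also need memory barriers, which this Python model does not show.

```python
class TimePage:
    """Models the shared page: two generation counters around the time fields."""
    def __init__(self):
        self.gen_begin = 0
        self.gen_end = 0
        self.sec = 0
        self.usec = 0

    def start_write(self):
        self.gen_begin += 1              # announce that an update is in progress

    def finish_write(self):
        self.gen_end = self.gen_begin    # mark the update as complete

    def publish(self, sec, usec):
        self.start_write()
        self.sec = sec
        self.usec = usec
        self.finish_write()

def try_read(page):
    """Return (sec, usec), or None if an update overlapped the read."""
    end = page.gen_end                   # read the 'complete' counter first
    sec, usec = page.sec, page.usec
    if page.gen_begin != end:            # then the 'in progress' counter
        return None
    return sec, usec

def read_time(page):
    """Retry until a consistent snapshot is obtained."""
    while True:
        snapshot = try_read(page)
        if snapshot is not None:
            return snapshot
```

The reader never blocks the writer; it only pays for a retry in the rare case that an update lands in the middle of its read.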



Re: Fine-grained locking for POSIX local sockets ( UNIX domain sockets )

2006-05-11 Thread David Xu
On Friday 12 May 2006 01:58, Robert Watson wrote:
> On Thu, 11 May 2006, Scott Long wrote:
> >> So I guess the real question is: do we want to merge the UNIX domain
> >> socket locking work?  The MySQL gains sound good, the performance drop
> >> under very high load seems problematic, and there are more general
> >> questions about performance with other workloads.
> >>
> >> Maintaining this patch for a month or so is no problem, but as the tree
> >> changes it will get harder.
> >
> > The only thing I'm afraid of is that it'll get pushed onto the
> > back-burner once it's in CVS, and we'll have a mad scramble to fix it
> > when it's time for 7.0.  That's not a show-stopper for it going in, as
> > there are also numerous benefits.  It's just something that needs to be
> > tracked and worked on.
>
> I should be able to support/improve UNIX domain sockets moving forward
> without a problem -- the maintenance issue is maintaining it in P4
> indefinitely, not in the tree indefinitely, as the patch basically touches
> every line in the file, so any change in the vendor branch (FreeBSD CVS)
> will put the entire file into conflict.  To be specific: I'll track and own
> this, but want to avoid having it in P4 indefinitely, because it will get
> stale :-).
>
> Robert N M Watson

Your patch makes other bottlenecks more visible than before, for example
file descriptor locking, but that is not a problem with your patch, so I think
it is fine to commit it.

David Xu


Re: Initial 6.1 questions

2006-06-12 Thread David Xu
On Tuesday 13 June 2006 04:32, Kris Kennaway wrote:
> On Mon, Jun 12, 2006 at 09:08:12PM +0100, Robert Watson wrote:
> > On Mon, 12 Jun 2006, Scott Long wrote:
> > >I run a number of high-load production systems that do a lot of network
> > >and filesystem activity, all with HZ set to 100.  It has also been shown
> > > >in the past that certain things in the network area were not fixed to
> > >deal with a high HZ value, so it's possible that it's even more
> > >stable/reliable with an HZ value of 100.
> > >
> > > >My personal opinion is that HZ should go back down to 100 in 7-CURRENT
> > >immediately, and only be incremented back up when/if it's proven to be
> > > the right thing to do. And, I say that as someone who (errantly) pushed
> > > for the increase to 1000 several years ago.
> >
> > I think it's probably a good idea to do it sooner rather than later.  It
> > may slightly negatively impact some services that rely on frequent timers
> > to do things like retransmit timing and the like.  But I haven't done any
> > measurements.
>
> As you know, but for the benefit of the list, restoring HZ=100 is
> often an important performance tweak on SMP systems with many CPUs
> because of all the sched_lock activity from statclock/hardclock, which
> scales with HZ and NCPUS.
>
> Kris

sched_lock is another big bottleneck: if you have 32 CPUs, in theory
you have 32X the context switch speed, but now it still has only 1X speed.
There is also code abusing sched_lock. The M:N bits dynamically insert
a thread into the thread list at context switch time; this is a bug, and it
means the thread list in a proc has to be protected by the scheduler lock. 
Delivering a signal to a process has to hold the scheduler lock and
find a thread, and if the proc has many threads, this introduces
long scheduler latency. That a proc lock is not enough to find a thread
is a bug. There is other code abusing the scheduler lock which
really could use its own lock.

David Xu
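
The scaling with HZ and NCPUS mentioned in the quoted text can be put into rough numbers. This back-of-the-envelope sketch assumes one clock interrupt per CPU per tick, each contending for the same global lock once. It ignores statclock's separate rate, and the function name is made up for the example.

```python
def clock_interrupts_per_second(hz, ncpus):
    """System-wide clock ticks per second, each a potential sched_lock acquisition."""
    return hz * ncpus

if __name__ == "__main__":
    for hz in (100, 1000):
        print("HZ =", hz, "NCPUS = 32 ->", clock_interrupts_per_second(hz, 32), "per second")
```

Under these assumptions, moving from HZ=1000 to HZ=100 cuts the tick-driven acquisitions of the single global lock by a factor of ten for any CPU count.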


Re: Updated fine-grain locking patch for UNIX domain sockets

2006-07-02 Thread David Xu
On Friday 30 June 2006 07:14, Robert Watson wrote:
> Attached, and at the below URL, find an updated copy of the UNIX domain
> socket fine-grained locking patch.  Since the last revision, I've updated
> the patch to close several race conditions historically present in UNIX
> domain sockets (which should be merged regardless of the rest of the
> patch), as well as move to an rwlock for the global lock.
>
> 
> http://www.watson.org/~robert/freebsd/netperf/20060630-uds-fine-grain.diff
>
> This patch increases locking overhead, but decreases contention.  Depending
> on the number of CPUs, it may improve (or not) performance to varying
> degrees; very good reports on sun4v; middling reports on 2-proc, etc. 
> Stability and performance results for UNIX domain socket sensitive
> workloads, such as MySQL, X11, etc, would be appreciated.  Micro-benchmark
> performance should show a small loss, but load under contention and
> scalability are (ideally) improved.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>

I found a 5% performance decrease on a dual P4; maybe the P4 is quite bad at
doing atomic operations. ;-)
Thanks,

David Xu


Re: Is the fsync() fake on FreeBSD6.1?

2006-07-02 Thread David Xu
On Tuesday 27 June 2006 11:34, Greg 'groggy' Lehey wrote:

> This is not the case for Linux, where fsync syncs the entire file
> system.  That could explain some of the performance difference, but
> not all of it.  I suppose it's worth noting that, in general, people
> report much better performance with MySQL on Linux than on FreeBSD.
>

I have recently tested SCHED_CORE. The scheduler has the same dynamic
priority algorithm as Linux 2.6, and it can give a 10% performance boost
for super-smack on my dual PIII (I tested it on the local host), but 
its user interaction is quite bad under heavy load. The scheduling
algorithm makes sense, but 4BSD is still the best scheduler for me.

> > I mean than the data is only written to the drives memory and so can
> > be lost if power goes down.
>
> I don't believe that fsync is required to flush the drive buffers.  It
> would be nice to have a function that did, though.
>
> > And how I can confirm this?
>
> Trial and error?
>
> Greg
> --
> See complete headers for address and phone numbers.


Re: MySQL 5.0.22 , FreeBSD 6.1-STABLE: Benchmark

2006-07-02 Thread David Xu

Hugo Silva wrote:


Today I decided to benchmark MySQL 5 performance on FreeBSD 6.1-STABLE.
This server is a Dual Xeon 2.8GHz, 4GB of RAM and 2x73GB SCSI disks that 
do 320MB/s


For all the tests, I restarted mysqld prior to starting the test,  
waited for about 1 minute for it to settle down, and ran super smack. 
For the consecutive runs, I executed super-smack right after the 
previous run ended.


Switching from HTT to no HTT was achieved by 
machdep.hyperthreading_allowed, and switching from/to libpthread/libthr 
was done via libmap.conf.


System:

FreeBSD ?? 6.1-STABLE FreeBSD 6.1-STABLE #3: Mon Jul  3 03:10:35 UTC 
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DATABASE  i386


Here are the results:


MySQL 5.0.22, built with BUILD_OPTIMIZED=yes and WITH_PROC_SCOPE_PTH=yes




Please don't run mysql in PROC_SCOPE with libthr; it has no benefit and
can only hurt performance. You can force it off with:

sysctl kern.threads.thr_scope=2

Proc scope support may be dropped from libthr in the near future. Thanks for 
your evaluation.


David Xu



Re: MySQL 5.0.22 , FreeBSD 6.1-STABLE: Benchmark

2006-07-03 Thread David Xu

Michael Vince wrote:


Hugo Silva wrote:


Today I decided to benchmark MySQL 5 performance on FreeBSD 6.1-STABLE.
This server is a Dual Xeon 2.8GHz, 4GB of RAM and 2x73GB SCSI disks 
that do 320MB/s


For all the tests, I restarted mysqld prior to starting the test,  
waited for about 1 minute for it to settle down, and ran super smack. 
For the consecutive runs, I executed super-smack right after the 
previous run ended.


Switching from HTT to no HTT was achieved by 
machdep.hyperthreading_allowed, and switching from/to 
libpthread/libthr was done via libmap.conf.


System:

FreeBSD ?? 6.1-STABLE FreeBSD 6.1-STABLE #3: Mon Jul  3 03:10:35 UTC 
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DATABASE  i386


Here are the results:


MySQL 5.0.22, built with BUILD_OPTIMIZED=yes and WITH_PROC_SCOPE_PTH=yes


=== 4BSD + libthr + HTT on ===

Run #1
connect: max=4ms  min=1ms avg= 3ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    20              0               0               20405.86


I think this shows impressive scaling: performance actually increases
with HTT enabled. From the benchmarks I have seen on many hardware sites
testing on MS Windows, the best you get on average is an extra 5%
performance out of HTT per core.
I don't have any quad core machines either, but my dual CPU Dells that
are around 3.[46]GHz get a score of around 25,000.


The other promising benchmark I saw on per-CPU scaling was posted a few
months ago: a super-smack run on a -current box that scored around 60,000
on a slightly better quad-core AMD64 machine. That shows consistent
scaling per core and, as far as my memory goes, good scaling when
entering the 4+ core arena on MySQL.


Mike



Actually, with proper scheduling behaviour, HTT is useful.
I saw very large performance boosts when running sysbench:

sysbench --test=oltp --oltp-table-size=100 
--mysql-host=192.168.82.170 --mysql-user=test --mysql-db=test 
--oltp-read-only --num-threads=256 --max-requests=1 run


This benchmark ran on my dual Xeon (2.8GHz, HTT enabled). With the
SCHED_CORE scheduler it takes only 30 seconds, while the 4BSD
scheduler needs 52 seconds. Last time I mistakenly removed some code from
SCHED_CORE (the version that is now in the tree), so its performance is
degraded. I need some time to make the scheduler work properly.

David Xu



Re: MySQL 5.0.22 , FreeBSD 6.1-STABLE: Benchmark

2006-07-03 Thread David Xu
On Tuesday 04 July 2006 02:40, Mike Jakubik wrote:
> Michael Vince wrote:
> > HTT was Intels best early stab to help path the way for their multi
> > core technologies to come into use as quickly as possible for the
> > masses over just the server end.
>
> Exactly, thats why i wouldn't spend too much time bothering with HTT. It
> was a transitional technology for multi core CPUs, which are now the
> standard. It will be interesting how the new Conroe processors fair on
> FreeBSD, the early benchmarks show better performance than AMDs offerings.

For Conroe, google "fair-cache"; you may find what should be done
in the scheduler. That is one of many reasons why I was saying libpthread
should be stopped, unless Conroe is very special and does not need this
work.





Re: MySQL 5.0.22 , FreeBSD 6.1-STABLE: Benchmark

2006-07-03 Thread David Xu
On Tuesday 04 July 2006 06:21, David Xu wrote:

> For conroe, google the "fair-cache", you may find what should be done
> in scheduler, that's one of many reasons why I was saying libpthread should
> be stopped. Unless conroe is very special and does not need this
> work.

Here is one such interesting paper:
http://people.freebsd.org/~davidxu/doc/osdi-2006-submission.pdf



Re: Help with improving mysql performance on 6.2PRE

2006-10-05 Thread David Xu
On Friday 06 October 2006 07:24, Jerry Bell wrote:
> I always thougt that compiling something static increased performance, but
> then that's probably true for things that have to startup and shutdown
> frequently.
>
> Thanks again.
>
> Jerry
>

Static compiling will link libpthread but not libthr.
I found that setting larger buffers in /etc/my.cnf yields much better
results than the default configuration; turning off the log-bin option
also makes a difference on my machine.
MySQL 5.x is much better than 4.1 on the select benchmark: I get almost
an extra 25% performance on an Athlon64 X2 3800+ running in
64-bit mode.

David Xu


Re: DNS Performance Numbers

2006-10-24 Thread David Xu
On Wednesday 25 October 2006 10:59, [EMAIL PROTECTED] wrote:
> I am running some performance tests on named to see how it performs
> with different configurations on FreeBSD and figured I would share the
> first results.  The first tests are  for serving up static data.
>
> System:
>   Supermicro PDSMi Motherboard
>   1G Memory
>   Intel Pentium D CPU 3.40GHz
>   Intel Gigibit NIC
>   Bind 9.2.3
>
> OS               UP     UP+P   MP     MP+P   MP+TP  MP+TT  MP+TP+P  MP+TT+P
> ---------------------------------------------------------------------------
> FreeBSD 4.11     28455  28370  28976  X      X      X      X        X
> FreeBSD 6.1      29074  34260  34635  35730  17846  38780  19776    44188
> FreeBSD Stable   30190  34707  33294  36651  18893  39374  19449    44169
> FreeBSD Current  30707  34029  32300  33689  15535  40554  13886    42071
> Ubuntu 6.06      X      X      X      X      X      37294  X        X
>
> UP = Uni-processor Kernel
> MP = Multi-processor Kernel
> P  = Device Polling
> TP = Threaded Bind using libpthread
> TT = Threaded Bind using libthr
>
> --
> Dave

Thanks for your benchmark results! They blow away the rumor that our
thread library is slow.

David Xu


Re: Cached file read performance

2006-12-21 Thread David Xu

Mark Kirkwood wrote:
I recently did some testing on the performance of cached reads using two 
(almost identical) systems, one running FreeBSD 6.2PRE and the other 
running Gentoo Linux - the latter acting as a control. I initially 
started a thread of the same name on -stable, but it was suggested I 
submit a mail here.


My background for wanting to examine this is that I work with developing 
database software (postgres internals related) and cached read 
performance is pretty important - since we typically try hard to 
encourage cached access whenever possible.


Anyway on to the results: I used the attached program to read a cached 
781MB file sequentially and randomly with a specified block size (see 
below). The conclusion I came to was that our (i.e FreeBSD) cached read 
performance (particularly for smaller block sizes) could perhaps be 
improved... now I'm happy to help in any way - the machine I've got 
running STABLE can be upgraded to CURRENT in order to try out patches 
(or in fact to see if CURRENT is faster at this already!)...


Best wishes

Mark



I suspect that in such a test memory copying speed will be a key factor.
I don't have numbers to back up my idea, but I think Linux has lots
of tweaks, such as using MMX instructions to copy data.

Regards,
David Xu



Re: Cached file read performance

2006-12-22 Thread David Xu
On Saturday 23 December 2006 03:29, Alexander Leidinger wrote:

> I want to point out http://www.freebsd.org/projects/ideas/#p-memcpy
> here. Just in case someone wants to play around a little bit.
>
> Bye,
> Alexander.

I have read the code: if a buffer is not aligned on a 16-byte boundary,
it will not use the FPU to copy data, and user buffers are not always
16-byte aligned.
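
To make the alignment condition concrete, here is a small sketch of how a 16-byte boundary test and a round-up can be written in portable C. The helper names is_aligned16 and align_up16 are made up for illustration and are not taken from the kernel copy routines discussed here.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    /* The low four address bits are zero exactly when p is a multiple of 16. */
    return ((uintptr_t)p & 15u) == 0;
}

/*
 * Hypothetical helper: round p up to the next 16-byte boundary.
 * The caller must own at least 15 spare bytes after p.
 */
void *align_up16(void *p)
{
    assert(p != NULL);
    return (void *)(((uintptr_t)p + 15u) & ~(uintptr_t)15u);
}
```

A caller that controls its own buffers could over-allocate by 15 bytes and pass align_up16(buf) to read(), so that a copy routine with this restriction can take its fast path.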

David Xu


Re: Massive performance loss from OS::sleep hack

2007-09-16 Thread David Xu

Kris Kennaway wrote:

Hi,

I have been running the volano java benchmark 
(http://www.volano.com/benchmarks.html) on an 8-core i386 system, and 
out of the box jdk15 on FreeBSD performs extremely poorly.  The system 
is more than 90% idle, and profiling shows that the ~800 threads in the 
benchmark are spending most of their time doing short nanosleep() calls.



I traced it to the following FreeBSD-specific hack in the jdk:

// XXXBSD: understand meaning and workaround related to yield
...
// XXXBSD: done differently in 1.3.1, take a look
int os::sleep(Thread* thread, jlong millis, bool interruptible) {
  assert(thread == Thread::current(),  "thread consistency check");
...

  if (millis <= 0) {
// NOTE: workaround for bug 4338139
if (thread->is_Java_thread()) {
  ThreadBlockInVM tbivm((JavaThread*) thread);
// BSDXXX: Only use pthread_yield here and below if the system thread
// scheduler gives time slices to lower priority threads when yielding.
#ifdef __FreeBSD__
  os_sleep(MinSleepInterval, interruptible);
#else
  pthread_yield();
#endif

When I removed this hack (i.e. revert to pthread_yield()) I got an 
immediate 7-fold performance increase, which brings FreeBSD performance 
on par with Solaris.


What is the reason why this code is necessary?  Does FreeBSD's 
sched_yield() really have different semantics to the other operating 
systems, or was this a libkse bug that was being worked around?


Kris


Yeah, our sched_yield() really works well. In the kernel, if the thread's
scheduling policy is SCHED_OTHER (time-sharing), the scheduler temporarily
lowers its priority to PRI_MAX_TIMESHARE, which is enough to give some CPU
time to other threads. Why doesn't UNIX time-sharing work for Java?
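
As a plain C illustration of the two behaviours being compared, the sketch below yields the CPU for a non-positive sleep request and only really sleeps for positive ones. The function name sleep_or_yield is hypothetical; it stands in for the millis <= 0 branch of the JDK code quoted above and is not the JDK's implementation.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() under strict -std modes */
#include <assert.h>
#include <sched.h>
#include <time.h>

/*
 * Hypothetical stand-in for the zero-length sleep case:
 * millis <= 0 gives up the CPU with sched_yield() rather than
 * issuing a short nanosleep(); positive values sleep for that long.
 * Returns 0 on success, -1 on error.
 */
int sleep_or_yield(long millis)
{
    struct timespec ts;

    if (millis <= 0)
        return sched_yield();

    ts.tv_sec = millis / 1000;
    ts.tv_nsec = (millis % 1000) * 1000000L;
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return nanosleep(&ts, NULL);
}
```

With hundreds of threads calling this in a loop, the yield branch lets a runnable thread be picked again as soon as the CPU is free, whereas a minimum-length sleep always waits for a timer.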

Regards,
David Xu



Re: ULE vs. 4BSD in RELENG_7

2007-10-23 Thread David Xu

Kris Kennaway wrote:

One major difference is that your workload is 100% user.  Also you were 
reporting ULE had more idle time, which looks like a bug since I would 
expect it be basically 0% idle on such a workload.


Kris


We cannot ignore this performance bug. I have also found that ULE is
slower than 4BSD when testing super-smack's update benchmark on my
dual-core machine.

Regards,
David Xu



Re: TTY task group scheduling

2010-11-18 Thread David Xu

Lucius Windschuh wrote:

2010/11/18 Bruce Cran :

Have you tried increasing kern.sched.preempt_thresh? According to
http://groups.google.com/group/mailing.freebsd.stable/browse_thread/thread/05a39f816fd8acc6/82affa9f195b747d?lnk=raot&fwc=1&pli=1
a good value for desktop use would be 224.


Hmm, I thought I tried this -- but this helps indeed. :-)
The browser, movie player etc. behave much better when a "make -j4
buildworld" is running on my 2-core machine in the background. Thank
you.

2010/11/18 Bruce Cran :

If you're using UFS, I've found it to be quite a bottleneck when
doing parallel IO: I even ran a "svn up" in one terminal and tried to
login on another a couple of days ago only to find the motd took over 5
seconds to appear! That may be excessive since I was running a kernel
with WITNESS and INVARIANTS, but I've found ZFS to be far better if you
want good interactivity when reading/writing to disks.


This is indeed another issue, which I also encountered, but explicitly
left out since I don't blame the task scheduler for that. ;)

Unfortunately, I don't know how much SCHED_ULE's inability to cope
with more runnable threads than cores, as Steve mentioned, contributes to
the problem I observe. Time to switch back to SCHED_4BSD? *sigh*

Lucius


Sometimes I think that our thread scheduler should be split into two
layers, as Solaris did. sched_ule should really only be responsible for
CPU dispatching: it should only care where a thread is dispatched, based
on CPU affinity, each CPU's load, and so on.
The other layer is how to calculate a thread's priority for time-sharing
threads: you could specify which priority algorithm to use, static
or dynamic priority scheduling, ULE's algorithm or 4BSD's.
With cpuset, one could even bind all real-time processes to a specific
CPU group, so they need not be superuser to run real-time threads.
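
The two-layer idea can be pictured with a toy model in C. Everything below is hypothetical: the type and function names are invented for illustration, and the two priority policies are simple placeholders, not the ULE or 4BSD algorithms.

```c
#include <assert.h>
#include <stddef.h>

/* Layer 1 (hypothetical): pluggable priority calculation for a thread. */
typedef int (*prio_fn)(int base_prio, int recent_cpu_ticks);

/* Static policy: the priority never changes. */
int prio_static(int base_prio, int recent_cpu_ticks)
{
    (void)recent_cpu_ticks;
    return base_prio;
}

/* Toy dynamic policy: CPU-hungry threads drift towards a numerically
 * higher (worse) priority, capped at 255. */
int prio_dynamic(int base_prio, int recent_cpu_ticks)
{
    int p = base_prio + recent_cpu_ticks / 4;
    return p > 255 ? 255 : p;
}

struct toy_thread {
    int base_prio;
    int recent_cpu_ticks;
    prio_fn policy;         /* chosen per thread, independent of dispatch */
};

int thread_priority(const struct toy_thread *t)
{
    assert(t != NULL && t->policy != NULL);
    return t->policy(t->base_prio, t->recent_cpu_ticks);
}

/* Layer 2 (hypothetical): the dispatcher only decides placement.
 * Here it returns the index of the least loaded CPU. */
int dispatch_cpu(const int *load, int ncpu)
{
    int i, best = 0;

    assert(load != NULL && ncpu > 0);
    for (i = 1; i < ncpu; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}
```

The point of the split is that swapping prio_static for prio_dynamic does not touch dispatch_cpu at all, and vice versa.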

Regards,
David Xu






Re: PostgreSQL performance scaling

2010-11-22 Thread David Xu

Mark Felder wrote:

I recommend posting this on the Postgres performance list, too.




Regards,


Mark


If PostgreSQL uses semaphores for inter-process locking, it might be
a good idea to use the POSIX semaphores that exist in our head
branch. The new POSIX semaphore implementation now supports
process-shared semaphores and is more lightweight than SYSV semaphores:
if there is no contention, a process need not enter the kernel to
acquire or release a lock. Note that I have just fixed a bug in the head
branch. However, RELENG_8 does not support process-shared semaphores
yet.
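
For readers who have not used process-shared POSIX semaphores, here is a minimal sketch using only standard calls (sem_init with pshared set to 1, mmap, fork). The function shared_sem_demo and the struct shared_area are invented for this example and have nothing to do with PostgreSQL's actual locking code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* assumption: exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared_area {
    sem_t lock;     /* unnamed semaphore living in shared memory */
    int counter;    /* data protected by the semaphore */
};

/* Wait on the semaphore, retrying if a signal interrupts the call. */
static void lock_area(struct shared_area *sa)
{
    while (sem_wait(&sa->lock) == -1 && errno == EINTR)
        continue;
}

/*
 * Hypothetical demo: parent and child each increment a shared counter
 * `rounds` times under the semaphore. Returns the final counter value,
 * or -1 on error.
 */
int shared_sem_demo(int rounds)
{
    struct shared_area *sa;
    pid_t pid;
    int i, status, result = -1;

    assert(rounds >= 0);
    sa = mmap(NULL, sizeof(*sa), PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sa == MAP_FAILED)
        return -1;
    sa->counter = 0;

    /* Second argument 1 = shared between processes; initial value 1. */
    if (sem_init(&sa->lock, 1, 1) != 0) {
        munmap(sa, sizeof(*sa));
        return -1;
    }

    pid = fork();
    if (pid >= 0) {
        /* Parent and child both run this loop. */
        for (i = 0; i < rounds; i++) {
            lock_area(sa);
            sa->counter++;
            sem_post(&sa->lock);
        }
        if (pid == 0)
            _exit(0);
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = sa->counter;
    }

    sem_destroy(&sa->lock);
    munmap(sa, sizeof(*sa));
    return result;
}
```

The same pattern with SYSV semaphores would need semget/semop and a system call for every operation, which is the overhead the message above is concerned with.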

Regards,
David Xu



Re: PostgreSQL performance scaling

2010-11-22 Thread David Xu

Ivan Voras wrote:

On 11/23/10 01:26, Ivan Voras wrote:

On 11/22/10 17:37, David Xu wrote:

Mark Felder wrote:

I recommend posting this on the Postgres performance list, too.




Regards,


Mark


If PostgreSQL uses semaphores for inter-process locking, it might be
a good idea to use the POSIX semaphores that exist in our head
branch. The new POSIX semaphore implementation now supports
process-shared semaphores and is more lightweight than SYSV semaphores:
if there is no contention, a process need not enter the kernel to
acquire or release a lock. Note that I have just fixed a bug in the head
branch. However, RELENG_8 does not support process-shared semaphores
yet.


Another thing might be that, despite that they appear to try to avoid
it, they possibly have a large number of processes hanging on the same
semaphore, leading to thundering herd problem.

There already is code for POSIX semaphores in PostgreSQL. It requires
some manual fiddling with the configuration to enable
(USE_UNNAMED_POSIX_SEMAPHORES).

However, I've just tried it on 9-CURRENT and it doesn't work:

Nov 23 01:23:02 biggie postgres[1515]: [1-1] FATAL: sem_init failed: No
space left on device


Ok, I've found the p1003_1b.sem_nsems_max sysctl.

It seems to help when used instead of sysv semaphores, but very little:

sysv semaphores:

-c#   result
4     33549
8     64864
12    79491
16    79887
20    66957
24    52576
28    50406
32    49491
40    45535
50    39499
75    29415

posix semaphores:

16    79125
20    70061
24    55620

After 20 clients, sys time goes sharply up like before

 procs      memory      page                        disks   faults            cpu
 r  b w     avm    fre   flt  re  pi  po  fr  sr mf0 mf1   in     sy     cs us sy id
27 32 0  11887M  3250M 62442   0   0   0   0   0   0   0   10 255078 109047 18 73 10
30 32 0  11887M  3162M 58165   0   0   0  12   0   0   1    7 272540 114416 17 75  9
29 32 0  11887M  3105M 57487   0   0   0   0   0   0   0    8 279475 117891 15 75 10
16 31 0  11887M  3063M 59215   0   0   0   0   0   0   0    6 295342 121090 16 70 13



and the overall behaviour is similar - the processes spend a lot of time 
in "sbwait" and "ksem" states.



Strange. The POSIX semaphore in the head branch does not use ksem; it is
based on umtx. There is no limit on POSIX semaphores; the only limit
is the process's address space, which bounds how many semaphores can be
used.
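
Why a userland semaphore can avoid the kernel when uncontended is easiest to see in a sketch. The code below uses C11 atomics to show only the fast path; the names sem_try_acquire and sem_release are hypothetical, and the sleeping and waking that a real implementation does through the kernel is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical fast path: take one unit from the semaphore count entirely
 * in user space. Returns true on success. A false return means the count
 * was zero; a real implementation would now make a system call to sleep.
 */
bool sem_try_acquire(atomic_int *count)
{
    int cur;

    assert(count != NULL);
    cur = atomic_load(count);
    while (cur > 0) {
        /* On failure, cur is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(count, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Release one unit. Waking any sleeping waiter is left out of this sketch. */
void sem_release(atomic_int *count)
{
    assert(count != NULL);
    atomic_fetch_add(count, 1);
}
```

As long as the count stays above zero, acquire and release are a handful of atomic instructions on shared memory and no system call is involved.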






Re: PostgreSQL performance scaling

2010-11-22 Thread David Xu

Ivan Voras wrote:

On 23 November 2010 10:35, David Xu  wrote:

Ivan Voras wrote:



and the overall behaviour is similar - the processes spend a lot of time
in "sbwait" and "ksem" states.


Strange. The POSIX semaphore in the head branch does not use ksem; it is
based on umtx. There is no limit on POSIX semaphores; the only limit
is the process's address space, which bounds how many semaphores can be
used.


*shrug*; I don't know how it could be wrong - this PostgreSQL was
built from ports after I upgraded & booted 9-current.

If it didn't use POSIX semaphores from HEAD, shared semaphores
wouldn't have worked, right?



It may work, but even if it is shared in memory, it still enters the
kernel to do P/V operations.





Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems

2013-06-13 Thread David Xu

On 2013/06/13 20:01, Remy Nonnenmacher wrote:


On 06/13/13 13:32, Mark Felder wrote:

On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien 
wrote:


We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
considerably better than both on the same machine.


http://svnweb.freebsd.org/base?view=revision&revision=241246

The above link is likely why 8.4 is better than 9.1 on the same machine.


We've tried various things and haven't been able to explain why FreeBSD
isn't scaling on the new hardware.  Nor why it performs so much worse
than FreeBSD on the older "M2" machines.


The CPUs between those machines are quite different. I'm sure we're
looking at different cache sizes, different behavior for the
hyperthreading, etc. I'm sure others would be greatly interested in you
providing the same benchmark results for a recent snapshot of HEAD as
well.


We had the same problem on 4x12-core (AMD) machines. After investigating
with hwpmc, it appeared that performance was killed by a scheduler
function trying to find the "least used cpu", which unfortunately works on
contended structures (i.e. lots of cores are fighting to get work). A
solution was found by using an artificially long queue of stuck processes
(steal_thresh bumped to over 8) and by crafting CPU affinity.

That was a year ago and is from memory. I guess you may give it a try to
see if it helps.

Disregard if a scheduler specialist contradicts this.

Thanks.



AMD's cache is very different from Intel's. AFAIK, before Bulldozer
AMD's L3 was an exclusive cache; with Bulldozer, AMD describes the L3
cache as a “non-inclusive victim cache”, which is still different from
Intel's, which is inclusive.


"- In sched_pickcpu() change general logic of CPU selection. First
look for idle CPU, sharing last level cache with previously used one,
skipping SMT CPU groups. If none found, search all CPUs for the least loaded
one, where the thread with its priority can run now. If none found, search
just for the least loaded CPU."

With an exclusive cache, the L3 holds second-hand data, not hot data, so
migrating a thread has a negative effect: its hot data is lost.

I'd prefer to search for an idle CPU sharing L2 first, then L3.
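
The preferred search order can be written down as a toy model. The struct cpu_info and the function pick_cpu below are invented for illustration; they are a simplified sketch of "idle CPU sharing L2 first, then L3, then least loaded" and do not reflect the data structures or the code of sched_pickcpu().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified per-CPU record. */
struct cpu_info {
    int l2_group;   /* CPUs with the same value share an L2 cache */
    int l3_group;   /* CPUs with the same value share an L3 cache */
    int load;       /* number of runnable threads; 0 means idle */
};

/*
 * Pick a CPU for a thread that last ran on CPU `last`:
 *   0. `last` itself if it is idle,
 *   1. else an idle CPU sharing L2 with `last`,
 *   2. else an idle CPU sharing L3 with `last`,
 *   3. else the least loaded CPU overall.
 */
int pick_cpu(const struct cpu_info *cpus, int ncpu, int last)
{
    int i, best = 0;

    assert(cpus != NULL && ncpu > 0 && last >= 0 && last < ncpu);

    if (cpus[last].load == 0)
        return last;

    for (i = 0; i < ncpu; i++)
        if (cpus[i].load == 0 && cpus[i].l2_group == cpus[last].l2_group)
            return i;

    for (i = 0; i < ncpu; i++)
        if (cpus[i].load == 0 && cpus[i].l3_group == cpus[last].l3_group)
            return i;

    for (i = 1; i < ncpu; i++)
        if (cpus[i].load < cpus[best].load)
            best = i;
    return best;
}
```

On a part with an exclusive L3, step 1 matters most, since only the L2 sibling is likely to still hold the thread's hot data.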

