On Sun, Nov 10, 2002 at 06:52:02PM +0200, Eran Tromer wrote:

> Hmmm. Then if the scheduler is unaware of SMT, then even on a
> single-processor box SMT may degrade performance due to memory cache
> issues -- when two unrelated threads are executed in parallel, the
> effective size of the L1 and L2 caches is halved. With today's processor
> vs. DRAM speed difference, this may be significant. So even large
> caches, such as Xeon's, may be better used without SMT for certain
> workloads (namely, CPU-intensive computation with large work sets).

According to this,
http://www.arstechnica.com/paedia/h/hyperthreading/hyperthreading-5.html,
for some workloads, HT _will_ degrade performance. 

Quoting: 

"Each of the Xeon's caches--the trace cache, L1, L2, and L3--is
SMT-unaware, and each treats all loads and stores the same regardless
of which logical processor issued the request. So none of the caches
know the difference between one logical processor and another, or
between code from one thread or another. This means that one executing
thread can monopolize virtually the entire cache if it wants to, and
the cache, unlike the processor's scheduling queue, has no way of
forcing that thread to cooperate intelligently with the other
executing thread. The processor itself will continue trying to run
both threads, though, issuing fetches from each one. This means that,
in a worst-case scenario where the two running threads have two
completely different memory reference patterns (i.e. they're accessing
two completely different areas of memory and sharing no data at all)
the cache will begin thrashing as data for each thread is alternately
swapped in and out and bus and cache bandwidth are maxed out."

> It would be neat if the scheduler knew it's better to run my Apache
> processes on the first CPU and my MySQL processes on the second
> (assuming the loads are equal), thereby avoiding duplication of the
> shared memory pages. Does it?

I don't know, but I couldn't find any code that says it does. The only
relevant thing I saw is in kernel/sched.c, in load_balance():

[when looking to balance the lists of tasks, one list per CPU]

        /*
         * We do not migrate tasks that are:
         * 1) running (obviously), or
         * 2) cannot be migrated to this CPU due to cpus_allowed, or
         * 3) are cache-hot on their current CPU.
         */
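
Roughly, that comment amounts to a three-way test like the one sketched
below. This is a userspace model, not the kernel's code: struct
task_model, its fields, CACHE_DECAY_TICKS and can_migrate() are all
hypothetical names made up for illustration, and treating "cache-hot" as
"ran within the last few ticks" is an assumption on my part.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a task; NOT the kernel's task_struct. */
struct task_model {
    bool running;               /* currently executing on some CPU */
    unsigned long cpus_allowed; /* bitmask: bit n set => may run on CPU n */
    unsigned long last_ran;     /* tick at which the task last ran */
};

/* Assumed threshold: after this many ticks off-CPU, a task's cache
 * footprint is treated as cold. The value is arbitrary. */
#define CACHE_DECAY_TICKS 10UL

/* Returns true if the task may be pulled over to this_cpu at tick `now`. */
bool can_migrate(const struct task_model *t, int this_cpu, unsigned long now)
{
    if (t->running)                              /* 1) running */
        return false;
    if (!(t->cpus_allowed & (1UL << this_cpu)))  /* 2) not allowed on this CPU */
        return false;
    if (now - t->last_ran <= CACHE_DECAY_TICKS)  /* 3) still cache-hot */
        return false;
    return true;
}
```

Note that nothing in the test looks at which tasks share memory with
which, only at the task itself and the destination CPU.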

So neither shared page tables nor a "common parent" seems to enter the
equation. You could do it manually by restricting each task's
'cpus_allowed' mask, though.
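
For the manual route, here is a minimal sketch of pinning a process to a
single CPU from user space. It assumes a kernel and glibc that provide
sched_setaffinity(2), sched_getaffinity(2) and the cpu_set_t macros; the
helper names (pin_to_cpu, first_allowed_cpu, count_allowed_cpus) are
mine, made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <sys/types.h>

/* Restrict pid (0 means the calling thread) to a single CPU.
 * Returns 0 on success, -1 with errno set on failure. */
int pin_to_cpu(pid_t pid, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(pid, sizeof(set), &set);
}

/* Lowest-numbered CPU that pid is allowed to run on, or -1 on error. */
int first_allowed_cpu(pid_t pid)
{
    cpu_set_t set;

    if (sched_getaffinity(pid, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Number of CPUs that pid is allowed to run on, or -1 on error. */
int count_allowed_cpus(pid_t pid)
{
    cpu_set_t set;

    if (sched_getaffinity(pid, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

With something like this you would call pin_to_cpu() on each Apache pid
with one CPU number and on each MySQL pid with the other. Children
inherit the mask across fork(), so pinning the parent before it spawns
its workers should be enough.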
-- 
Muli Ben-Yehuda                             http://www.mulix.org/
[EMAIL PROTECTED]:~$ sctrace strace /bin/foo  http://syscalltrack.sf.net/
Quis custodes ipsos custodiet?              http://www.mulix.org/cv.html
