Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-06 Thread Jeremie Le Hen
On Fri, Sep 06, 2013 at 12:46:27AM +0200, Olivier Cochard-Labbé wrote:
> On Thu, Sep 5, 2013 at 11:38 PM, Alexander Motin wrote:
> > I've found and fixed a possible double request completion, which could
> > cause such symptoms if it happened. The updated patch is located as usual:
> > http://people.freebsd.org/~mav/camlock_patches/camlock_20130905.patch
> >
> 
> Good catch!
> This new patch (applied to r255188) fixes the problem on my laptop.

With this new one I cannot boot any more (I also updated the source
tree).  This is a hand-transcribed version:

Trying to mount root from zfs:zroot/root []...
panic: Batch flag already set
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper()
kdb_backtrace()
vpanic()
kassert_panic()
xpt_batch_start()
ata_interrupt()
softclock_call_cc()
softclock()
ithread_loop()
fork_exit()
fork_trampoline()


-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Again about pbuf_mtx

2013-09-06 Thread Alexander Motin

On 05.09.2013 15:40, Alexander Motin wrote:
> Some may remember that not so long ago I complained about high lock
> contention on pbuf_mtx. At that time switching the mutex to padalign
> reduced the problem. But now, after improving scalability in CAM and GEOM
> and doing more than half a million IOPS on a 32-core system, I am again
> hitting that problem heavily -- hwpmc shows about 30% of CPU time spent
> spinning on that mutex and another 30% spent on threads attempting to go
> to sleep on that mutex and getting more collisions there.
>
> Trying to mitigate that, I've made a patch
> (http://people.freebsd.org/~mav/pcpu_pbuf.patch) to split the single queue
> of pbufs into several. That definitely costs some amount of KVA and
> memory, but in my tests it fixes the problem radically, removing any
> measurable contention there. The patch is not complete and doesn't even
> boot on i386 now, but I would like to hear opinions about the approach,
> or maybe some better proposals.
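
To illustrate the split-queue idea, here is a minimal Python sketch. It is
not taken from the patch: the class name ShardedPbufPool, the round-robin
distribution of buffers, and the fallback to neighbouring queues when the
local one is empty are all assumptions made for the example. The point is
only that each queue has its own lock, so CPUs working on different queues
do not contend on a single mutex.

```python
import threading


class ShardedPbufPool:
    """Hypothetical model: one free list and one lock per shard (e.g. per CPU)."""

    def __init__(self, total, shards):
        self._shards = []
        for s in range(shards):
            # Assumption: buffers are dealt out round-robin across the shards.
            free_ids = list(range(s, total, shards))
            self._shards.append((threading.Lock(), free_ids))

    def alloc(self, cpu):
        """Try the caller's own shard first, then the others (assumed fallback)."""
        n = len(self._shards)
        for step in range(n):
            lock, free_ids = self._shards[(cpu + step) % n]
            with lock:
                if free_ids:
                    return free_ids.pop()
        return None  # every shard is empty

    def free(self, cpu, buf):
        """Return a buffer to the shard of the CPU that frees it."""
        lock, free_ids = self._shards[cpu % len(self._shards)]
        with lock:
            free_ids.append(buf)


pool = ShardedPbufPool(total=8, shards=4)
first = pool.alloc(cpu=0)
others = [pool.alloc(cpu=1) for _ in range(7)]
```

In this model the extra cost mentioned above shows up as buffers parked in
queues that the busy CPUs do not look at first.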


At kib@'s suggestion I've tried to reimplement that patch using vmem(9).
The code indeed looks much better (or at least it did before the workarounds):

http://people.freebsd.org/~mav/pbuf_vmem.patch

It works fast, but I have found a number of problems:
 - Currently we have only 256 (or even fewer) pbufs, while the UMA zones
that vmem uses for its quick caches tend to allocate up to 256 items per CPU
and never release them back. I've partially worked around that by passing a
fake MAGIC_SIZE value to vmem, and down to UMA, as the size to make the
initial bucket sizes smaller, but that is a hack and not always sufficient,
since the size may grow under contention and again never shrink back.
 - UMA panics with "uma_zalloc: Bucket pointer mangled." if I give vmem
zero as a valid pointer. I've worked around that by adding an offset to the
value, but I think that assertion in UMA should be removed if we are going
to use it for abstract values now.
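
The offset workaround from the second item can be pictured with a toy model
in Python. This is not the UMA or vmem code; NullCheckingCache, OFFSET and
PbufIndexPool are hypothetical names, and the cache below only imitates a
layer that treats zero as an invalid (NULL-like) item. The pool stores
index + OFFSET in the cache and subtracts the offset again on allocation,
so index 0 remains usable.

```python
class NullCheckingCache:
    """Toy cache that rejects 0, imitating a NULL-pointer sanity check."""

    def __init__(self):
        self._items = []

    def put(self, item):
        if item == 0:
            raise ValueError("zero looks like a NULL pointer to this cache")
        self._items.append(item)

    def get(self):
        return self._items.pop() if self._items else None


OFFSET = 1  # hypothetical bias so that index 0 is never stored as 0


class PbufIndexPool:
    """Hands out abstract indices 0..count-1 on top of the checking cache."""

    def __init__(self, count):
        self._cache = NullCheckingCache()
        for index in range(count):
            self._cache.put(index + OFFSET)

    def alloc(self):
        biased = self._cache.get()
        return None if biased is None else biased - OFFSET

    def free(self, index):
        self._cache.put(index + OFFSET)


pool = PbufIndexPool(4)
allocated = sorted(pool.alloc() for _ in range(4))
```

The bias keeps the lower layer's check intact, which is why it is a
workaround rather than a fix for the assertion itself.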



Another patch I've made
(http://people.freebsd.org/~mav/si_threadcount.patch) removes lock
acquisition from dev_relthread() by using atomics for reference
counting. That fixes another point of contention I see. This patch looks
fine to me, and the only contention I see after that is on HBA driver
locks, but maybe I am missing something?


--
Alexander Motin


Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-06 Thread Alexander Motin

On 06.09.2013 11:06, Jeremie Le Hen wrote:
> On Fri, Sep 06, 2013 at 12:46:27AM +0200, Olivier Cochard-Labbé wrote:
>> On Thu, Sep 5, 2013 at 11:38 PM, Alexander Motin wrote:
>>> I've found and fixed a possible double request completion, which could
>>> cause such symptoms if it happened. The updated patch is located as usual:
>>> http://people.freebsd.org/~mav/camlock_patches/camlock_20130905.patch
>
> With this new one I cannot boot any more (I also updated the source
> tree).  This is a hand-transcribed version:
>
> Trying to mount root from zfs:zroot/root []...
> panic: Batch flag already set
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper()
> kdb_backtrace()
> vpanic()
> kassert_panic()
> xpt_batch_start()
> ata_interrupt()
> softclock_call_cc()
> softclock()
> ithread_loop()
> fork_exit()
> fork_trampoline()


Thank you for the report. I see my mistake. It is probably specific to
the ata(4) driver only. I've worked around it in the new patch version,
but that area probably needs some rethinking.


http://people.freebsd.org/~mav/camlock_patches/camlock_20130906.patch

--
Alexander Motin


need hint about build environment screwup

2013-09-06 Thread Gary Aitken
In trying to build
  /usr/ports/sysutils/lsof 
I get the following:

In file included from /usr/include/_ctype.h:94,
                 from /usr/include/ctype.h:46,
                 from lsof.h:49,
                 from dmnt.c:39:
/usr/include/runetype.h:92: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
/usr/include/runetype.h: In function '__getCurrentRuneLocale':
/usr/include/runetype.h:96: error: '_ThreadRuneLocale' undeclared (first use in this function)
/usr/include/runetype.h:96: error: (Each undeclared identifier is reported only once
/usr/include/runetype.h:96: error: for each function it appears in.)

I'm running 9.1-RELEASE on amd64 with a recently updated ports tree, and I
recently switched to pkgng. /etc/make.conf contains:
  WITH_PKGNG=yes

I suspect something is wrong with the build environment, but haven't a clue.
I'm trying to avoid rebuilding everything, and would like to find out what's
screwed up.
Since lsof basically doesn't depend on anything else, I'm baffled as to 
what got messed up.

Can someone give me a hint?

Thanks,

Gary


Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-06 Thread Jeremie Le Hen
On Fri, Sep 06, 2013 at 11:29:11AM +0300, Alexander Motin wrote:
> On 06.09.2013 11:06, Jeremie Le Hen wrote:
> > On Fri, Sep 06, 2013 at 12:46:27AM +0200, Olivier Cochard-Labbé wrote:
> >> On Thu, Sep 5, 2013 at 11:38 PM, Alexander Motin wrote:
> >>> I've found and fixed a possible double request completion, which could
> >>> cause such symptoms if it happened. The updated patch is located as usual:
> >>> http://people.freebsd.org/~mav/camlock_patches/camlock_20130905.patch
> >>>
> > With this new one I cannot boot any more (I also updated the source
> > tree).  This is a hand-transcribed version:
> >
> > Trying to mount root from zfs:zroot/root []...
> > panic: Batch flag already set
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper()
> > kdb_backtrace()
> > vpanic()
> > kassert_panic()
> > xpt_batch_start()
> > ata_interrupt()
> > softclock_call_cc()
> > softclock()
> > ithread_loop()
> > fork_exit()
> > fork_trampoline()
> 
> Thank you for the report. I see my mistake. It is probably specific to
> the ata(4) driver only. I've worked around it in the new patch version,
> but that area probably needs some rethinking.
> 
> http://people.freebsd.org/~mav/camlock_patches/camlock_20130906.patch

I'm not sure you needed a confirmation, but it boots.  Thanks :).

I didn't quite understand the thread; is direct dispatch enabled for
amd64?  ISTR you said only i386 but someone else posted the macro for
amd64.

-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.


Glitch in ctfconvert

2013-09-06 Thread Shrikanth Kamath
There is a glitch in how ctfconvert builds the .SUNW_ctf section. It affects
debugging kernel modules with the FBT provider of DTrace.

I observe that the CTF sections built for kernel modules have a problem if
the module's symtab is stripped or if the symbol table has its symbols
reordered. This messes up the FBT probes, which then show the wrong function
name against a set of arguments.

After looking at the ctfdump code, I presume the CTF mapping of a function to
its arguments is done this way:

func_name from symtab          arguments dump from ctf_data_t
        ^                                  ^
        |_______ symidx ____ ctfdump ______|

The details are fetched from two different places. So when ctfconvert is run,
the function arguments are mapped to a particular symbol order.

After the linker stage the symbols may get reordered. Or, if a strip utility
is run, the symtab may be removed completely.

When ctfconvert is first run on module.kld:
symbol_X (idx 1) <-> [args set a in ctf_data_t]
symbol_Y (idx 2) <-> [args set b in ctf_data_t]

If the symbols get rearranged after the linker stage:

symbol_Y (idx 1) <-> [args set a in ctf_data_t]
symbol_X (idx 2) <-> [args set b in ctf_data_t]

which means symbol_Y now has the args set of symbol_X.
If 'strip' is run instead, total junk is shown against symbol_Y and symbol_X.

Overall, this affects Function Boundary Tracing whenever we inspect function
arguments.
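
The mismatch can be reproduced with a simplified model in Python. This is a
sketch of the positional pairing only, not of the real CTF on-disk format;
resolve_by_index and the list contents are names made up for the example.
Argument sets are stored in the order the symbols had when they were
recorded, and are later paired with whatever symbol sits at the same index.

```python
def resolve_by_index(symtab, ctf_args):
    """Pair each symbol with the argument set stored at the same index."""
    return {name: ctf_args[idx] for idx, name in enumerate(symtab)}


# Argument sets are recorded once, in the symbol order seen at convert time.
ctf_args = ["args set a", "args set b"]

symtab_at_convert = ["symbol_X", "symbol_Y"]
symtab_after_link = ["symbol_Y", "symbol_X"]  # same symbols, reordered

before = resolve_by_index(symtab_at_convert, ctf_args)
after = resolve_by_index(symtab_after_link, ctf_args)

print(before)
print(after)
```

After the reordering, symbol_Y is paired with the argument set that was
recorded for symbol_X, which is the wrong pairing described above.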


Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-06 Thread Alexander Motin

On 07.09.2013 02:02, Jeremie Le Hen wrote:
> On Fri, Sep 06, 2013 at 11:29:11AM +0300, Alexander Motin wrote:
>> On 06.09.2013 11:06, Jeremie Le Hen wrote:
>>> On Fri, Sep 06, 2013 at 12:46:27AM +0200, Olivier Cochard-Labbé wrote:
>>>> On Thu, Sep 5, 2013 at 11:38 PM, Alexander Motin wrote:
>>>>> I've found and fixed a possible double request completion, which could
>>>>> cause such symptoms if it happened. The updated patch is located as usual:
>>>>> http://people.freebsd.org/~mav/camlock_patches/camlock_20130905.patch
>>>
>>> With this new one I cannot boot any more (I also updated the source
>>> tree).  This is a hand-transcribed version:
>>>
>>> Trying to mount root from zfs:zroot/root []...
>>> panic: Batch flag already set
>>> cpuid = 1
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper()
>>> kdb_backtrace()
>>> vpanic()
>>> kassert_panic()
>>> xpt_batch_start()
>>> ata_interrupt()
>>> softclock_call_cc()
>>> softclock()
>>> ithread_loop()
>>> fork_exit()
>>> fork_trampoline()
>>
>> Thank you for the report. I see my mistake. It is probably specific to
>> the ata(4) driver only. I've worked around it in the new patch version,
>> but that area probably needs some rethinking.
>>
>> http://people.freebsd.org/~mav/camlock_patches/camlock_20130906.patch
>
> I'm not sure you needed a confirmation, but it boots.  Thanks :).
>
> I didn't quite understand the thread; is direct dispatch enabled for
> amd64?  ISTR you said only i386 but someone else posted the macro for
> amd64.


Yes, it is enabled for amd64. I said x86, meaning both i386 and amd64.

--
Alexander Motin