unofficial pkgng repository at http://mirror.exonetric.net/pub/pkgng

2013-02-03 Thread Mark Blackman
Hi,

I'm just writing to let anyone who might find it useful know that we
have set up an unofficial but public pkgng repository available at

http://mirror.exonetric.net/pub/pkgng

until official pkgng format packages are available.

With the able assistance of Gavin Atkinson, who managed all the package
building for us, we currently have pkgng format packages for FreeBSD-8
and FreeBSD-9, for both i386 and amd64 kernels. These packages have no
particular blessing from the FreeBSD project, but as far as we know,
they don't mind either.

To use these packages, just set your PACKAGESITE variable in 
/usr/local/etc/pkg.conf like so, 

PACKAGESITE : http://mirror.exonetric.net/pub/pkgng/${ABI}/latest

Gavin is also working on a FreeBSD-10 set of packages, but those are a
few days or so away at this point.

Please let us know about any issues you find as well, of course.

Cheers,
Mark Blackman 

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-03 Thread Chris Rees
On 3 February 2013 03:55, Kimmo Paasiala  wrote:
> On Sun, Feb 3, 2013 at 4:06 AM, Mark Linimon  wrote:
>> On Fri, Feb 01, 2013 at 11:53:03AM -0600, Brooks Davis wrote:
>>> I'm not sure why I'm being jumped on in this weeks old report of a
>>> now-fixed problem.
>>
>> I'm sorry, I'm that far behind in email.  I did not realize the problem
>> had already been solved.
>>
>> More often than not the problem is simply "thrown over the fence" for
>> the ports team to deal with.
>>
>> mcl
>
> There is no PR yet with my fix and therefore no commit to ports tree
> that would fix the problem. I'll file a PR soon (TM).

The problem was in base, and is fixed there.

Chris


Re: Ports and WITH_LIBCPLUSPLUS

2013-02-03 Thread Dimitry Andric

On 2013-02-03 11:39, Andreas Nilsson wrote:

> I wanted to try the new c++ stuff, ie clang-3.2, libc++ and libcxxrt, so I
> used poudriere to build a jail setup for that ( WITH_LIBCPLUSPLUS=yes in
> src.conf, CXXFLAGS+=-stdlib=libc++ and libsupc++.so.1  libcxxrt.so.1 in
> libmap.conf ), and started to build my normal set of packages ( see
> desktop.list ). Please note that I also have WITH_NEW_XORG=yes and
> WITH_KMS=yes, as well as using the devel xorg repo.
>
> First Great work for moving FreeBSD towards a more modern c++ world!
>
> Results:
> Some stuff works, some don't ;) Some may be due to clang and not just
> libc++. Here is a list of packages that fails:
>
> audio/libofa
> databases/akonadi
> devel/binutils
> devel/kBuild
> devel/libftdi
> devel/libplist
> devel/qca
> devel/qt4-qdbusviewer
> devel/qt4-script
> emulators/qemu-devel
> graphics/cairo
> graphics/graphite2
> graphics/libfpx
> graphics/opencv-core
> graphics/podofo
> graphics/vigra
> lang/sbcl
> math/cln
> net-im/libmsn
> net-p2p/libtorrent
> net/hupnp
> net/ns3
> net/xmlrpc-c-devel
> science/openbabel
> security/pinentry
> sysutils/fusefs-kmod
> sysutils/synergy
> textproc/clucene
> x11/nvidia-driver

Thanks for trying this out.  Is there also a list of ports that *do*
compile (and hopefully run) successfully? :-)

I expect there will be quite a few ports that are very difficult to get
building and running properly with libc++.  Some programs rely on
undocumented (or half-documented) libstdc++ internals, or on behaviour
specific to libstdc++.

Also, it would not surprise me if there are programs that even depend on
a very specific version of libstdc++, and its accompanying version of
gcc.

In short, we need some sort of system to specify the general C++ library
to use by default (e.g. in make.conf or ports.conf), and for ports that
only work with a very specific version, a variable similar to USE_GCC,
for example:

USE_LIBSTDCXX=4.6+



> where at least ns3 can be ignored ( I created that port myself ). I think
> nvidia-driver and fusefs-kmod now fails due to -stdlib=libc++ in LDFLAGS,
> normally fusefs-kmod just fails install-phase due to missing pkg-message
> file.
>
> I saved the workdirs for poudriere, and they are available at
> http://benriach.widell.net/~andrnils/libc++/ , both as individual tarballs
> and one tarball that includes all the others. There is also the lists of
> packages I try to build, as well as the ones that fails.
>
>
> One general question: How am I supposed to include -stdlib=libc++ in
> LDFLAGS just for c++? Having -stdlib=libc++ in LDFLAGS causes c compiles to
> fail linking with "ld: unrecognized option '-stdlib=libc++'"


This is as yet an unsolved problem, as LDFLAGS is the same for both C
and C++ link jobs.  I think the easiest way would be to set your CXX
variable to:

CXX=cc -stdlib=libc++

The -std=c++11 flag should only have to be specified in CXXFLAGS.  I
don't think it has any influence on linking.


Re: Ports and WITH_LIBCPLUSPLUS

2013-02-03 Thread Andreas Nilsson
On Sun, Feb 3, 2013 at 2:28 PM, Dimitry Andric  wrote:

> On 2013-02-03 11:39, Andreas Nilsson wrote:
>
>> I wanted to try the new c++ stuff, ie clang-3.2, libc++ and libcxxrt, so I
>> used poudriere to build a jail setup for that ( WITH_LIBCPLUSPLUS=yes in
>> src.conf, CXXFLAGS+=-stdlib=libc++ and libsupc++.so.1  libcxxrt.so.1 in
>> libmap.conf ), and started to build my normal set of packages ( see
>> desktop.list ). Please note that I also have WITH_NEW_XORG=yes and
>> WITH_KMS=yes, as well as using the devel xorg repo.
>>
>> First Great work for moving FreeBSD towards a more modern c++ world!
>>
>> Results:
>> Some stuff works, some don't ;) Some may be due to clang and not just
>> libc++. Here is a list of packages that fails:
>>
>> audio/libofa
>> databases/akonadi
>> devel/binutils
>> devel/kBuild
>> devel/libftdi
>> devel/libplist
>> devel/qca
>> devel/qt4-qdbusviewer
>> devel/qt4-script
>> emulators/qemu-devel
>> graphics/cairo
>> graphics/graphite2
>> graphics/libfpx
>> graphics/opencv-core
>> graphics/podofo
>> graphics/vigra
>> lang/sbcl
>> math/cln
>> net-im/libmsn
>> net-p2p/libtorrent
>> net/hupnp
>> net/ns3
>> net/xmlrpc-c-devel
>> science/openbabel
>> security/pinentry
>> sysutils/fusefs-kmod
>> sysutils/synergy
>> textproc/clucene
>> x11/nvidia-driver
>>
>
> Thanks for trying this out.  Is there also a list of ports that *do*
> compile (and hopefully run) successfully? :-)
>
I didn't save the output for successful packages, but going through just
the files gives the list libcpp.built, at the same URL.

>
> I expect there will be quite a few ports that are very difficult to get
> building and running properly with libc++.  Some programs rely on
> undocumented (or half-documented) libstdc++ internals, or on behaviour
> specific to libstdc++.
>
> Also, it would not surprise me if there are programs that even depend on
> a very specific version of libstdc++, and its accompanying version of
> gcc.
>
> In short, we need some sort of system to specify the general C++ library
> to use by default (e.g. in make.conf or ports.conf), and for ports that
> only work with a very specific version, a variable similar to USE_GCC,
> for example:
>
> USE_LIBSTDCXX=4.6+


Sounds good :) Perhaps something like
.if (defined(CLANG_IS_CC) || ${CC} == "clang") && defined(WITH_LIBCPLUSPLUS)
CXXFLAGS+=-stdlib=libc++
.endif
could be added to the appropriate mk file.

>
>
>
>> where at least ns3 can be ignored ( I created that port myself ). I think
>> nvidia-driver and fusefs-kmod now fails due to -stdlib=libc++ in LDFLAGS,
>> normally fusefs-kmod just fails install-phase due to missing pkg-message
>> file.
>>
>> I saved the workdirs for poudriere, and they are available at
>> http://benriach.widell.net/~andrnils/libc++/ , both as individual tarballs
>> and one tarball that includes all the others. There is also the lists of
>> packages I try to build, as well as the ones that fails.
>>
>>
>> One general question: How am I supposed to include -stdlib=libc++ in
>> LDFLAGS just for c++? Having -stdlib=libc++ in LDFLAGS causes c compiles to
>> fail linking with "ld: unrecognized option '-stdlib=libc++'"
>>
>
> This is as yet an unsolved problem, as LDFLAGS is the same for both C
> and C++ link jobs.  I think the easiest way would be to set your CXX
> variable to:
>
> CXX=cc -stdlib=libc++
>
Wouldn't CXX=CC -stdlib=libc++ be more appropriate, as cc is for c and CC
for c++, or has that convention gone away?


>
> The -std=c++11 flag should only have to be specified in CXXFLAGS.  I
> don't think it has any influence on linking.
>

My adding them to LDFLAGS comes from https://wiki.freebsd.org/NewC++Stack
where it says "Add -stdlib=libc++ to your compile and link flags..." It
actually made a bunch of the qt4- packages build, they wouldn't without it.

I guess a wiki page tracking the failing packages would be good, but I
couldn't get the hang of creating a page there :(

Best regards
Andreas


Re: Ports and WITH_LIBCPLUSPLUS

2013-02-03 Thread Dimitry Andric

On 2013-02-03 14:56, Andreas Nilsson wrote:

> On Sun, Feb 3, 2013 at 2:28 PM, Dimitry Andric  wrote:

...

>> This is as yet an unsolved problem, as LDFLAGS is the same for both C
>> and C++ link jobs.  I think the easiest way would be to set your CXX
>> variable to:
>>
>> CXX=cc -stdlib=libc++
>
> Wouldn't CXX=CC -stdlib=libc++ be more appropriate, as cc is for c and CC
> for c++, or has that convention gone away?


Sorry, I should have taken one more cup of coffee. :-)  'CC' is not
really recommended anymore, just use:

CXX=c++ -stdlib=libc++



> My adding them to LDFLAGS comes from
> https://wiki.freebsd.org/NewC++Stack where it says "Add -stdlib=libc++
> to your compile and link flags..."


Yes, that advice is just fine, but in some cases you cannot influence
the link flags used only for C++ linking.  In those cases, you will have
to trick the build system into doing so.

In most cases (but probably not all), this can be done by adding the
required flags to ${CXX}.



> It actually made a bunch of the qt4- packages build, they wouldn't without it.
>
> I guess a wiki page tracking the failing packages would be good, but I
> couldn't get the hang of creating a page there :(


This would better be done by a normal exp-run procedure, but those are
offline for the time being.


Re: Ports and WITH_LIBCPLUSPLUS

2013-02-03 Thread Andreas Nilsson
On Sun, Feb 3, 2013 at 3:25 PM, Dimitry Andric  wrote:

> On 2013-02-03 14:56, Andreas Nilsson wrote:
>
>> On Sun, Feb 3, 2013 at 2:28 PM, Dimitry Andric  wrote:
>>
> ...
>
>>> This is as yet an unsolved problem, as LDFLAGS is the same for both C
>>> and C++ link jobs.  I think the easiest way would be to set your CXX
>>> variable to:
>>>
>>> CXX=cc -stdlib=libc++
>>>
>> Wouldn't CXX=CC -stdlib=libc++ be more appropriate, as cc is for c and CC
>> for c++, or has that convention gone away?
>>
>
> Sorry, I should have taken one more cup of coffee. :-)  'CC' is not
> really recommended anymore, just use:
>
> CXX=c++ -stdlib=libc++
>

Ah, yes. Too little coffee here as well.


>
>> My adding them to LDFLAGS comes from
>> https://wiki.freebsd.org/NewC++Stack where it says "Add -stdlib=libc++
>> to your compile and link flags..."
>>
>
> Yes, that advice is just fine, but in some cases you cannot influence
> the link flags used only for C++ linking.  In those cases, you will have
> to trick the build system into doing so.
>
> In most cases (but probably not all), this can be done by adding the
> required flags to ${CXX}.


Good point.

>
>
>
>> It actually made a bunch of the qt4- packages build, they wouldn't without
>> it.
>>
>> I guess a wiki page tracking the failing packages would be good, but I
>> couldn't get the hang of creating a page there :(
>>
>
> This would better be done by a normal exp-run procedure, but those are
> offline for the time being.
>

On a side note: I forgot the actual logs. They are now available as
logs.tgz at the same base URL.

Best regards
Andreas


Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-03 Thread Kimmo Paasiala
On Sun, Feb 3, 2013 at 11:57 AM, Chris Rees  wrote:
> On 3 February 2013 03:55, Kimmo Paasiala  wrote:
>> On Sun, Feb 3, 2013 at 4:06 AM, Mark Linimon  wrote:
>>> On Fri, Feb 01, 2013 at 11:53:03AM -0600, Brooks Davis wrote:
>>>> I'm not sure why I'm being jumped on in this weeks old report of a
>>>> now-fixed problem.
>>>
>>> I'm sorry, I'm that far behind in email.  I did not realize the problem
>>> had already been solved.
>>>
>>> More often than not the problem is simply "thrown over the fence" for
>>> the ports team to deal with.
>>>
>>> mcl
>>
>> There is no PR yet with my fix and therefore no commit to ports tree
>> that would fix the problem. I'll file a PR soon (TM).
>
> The problem was in base, and is fixed there.
>
> Chris

I missed that fix if it was posted here, can someone point me to the
commit that fixed the issue?

-Kimmo


Re: NFS-exported ZFS instability

2013-02-03 Thread Andriy Gapon
on 30/01/2013 00:44 Andriy Gapon said the following:
> on 29/01/2013 23:44 Hiroki Sato said the following:
>>   http://people.allbsd.org/~hrs/FreeBSD/pool-20130130.txt
>>   http://people.allbsd.org/~hrs/FreeBSD/pool-20130130-info.txt
> 
[snip]
> See tid 100153 (arc reclaim thread), tid 100105 (pagedaemon) and tid 100639
> (nfsd in kmem_back).
> 

I decided to write a few more words about this issue.

I think that the root cause of the problem is that ZFS ARC code performs memory
allocations with M_WAITOK while holding some ARC lock(s).

If a thread runs into such an allocation when a system is very low on memory
(even for a very short period of time), then the thread is going to be blocked
(to sleep in more exact terms) in VM_WAIT until a certain amount of memory is
freed.  To be more precise until v_free_count + v_cache_count goes above 
v_free_min.
And quoting from the report:
db> show page
cnt.v_free_count: 8842
cnt.v_cache_count: 0
cnt.v_inactive_count: 0
cnt.v_active_count: 169
cnt.v_wire_count: 6081952
cnt.v_free_reserved: 7981
cnt.v_free_min: 38435
cnt.v_free_target: 161721
cnt.v_cache_min: 161721
cnt.v_inactive_target: 242581

In this case tid 100639 is the thread:
Tracing command nfsd pid 961 tid 100639 td 0xfe0027038920
sched_switch() at sched_switch+0x17a/frame 0xff86ca5c9c80
mi_switch() at mi_switch+0x1f8/frame 0xff86ca5c9cd0
sleepq_switch() at sleepq_switch+0x123/frame 0xff86ca5c9d00
sleepq_wait() at sleepq_wait+0x4d/frame 0xff86ca5c9d30
_sleep() at _sleep+0x3d4/frame 0xff86ca5c9dc0
kmem_back() at kmem_back+0x1a3/frame 0xff86ca5c9e50
kmem_malloc() at kmem_malloc+0x1f8/frame 0xff86ca5c9ea0
uma_large_malloc() at uma_large_malloc+0x4a/frame 0xff86ca5c9ee0
malloc() at malloc+0x14d/frame 0xff86ca5c9f20
arc_get_data_buf() at arc_get_data_buf+0x1f4/frame 0xff86ca5c9f60
arc_read_nolock() at arc_read_nolock+0x208/frame 0xff86ca5ca010
arc_read() at arc_read+0x93/frame 0xff86ca5ca090
dbuf_read() at dbuf_read+0x452/frame 0xff86ca5ca150
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x16a/frame 0xff86ca5ca1e0
dmu_buf_hold_array() at dmu_buf_hold_array+0x67/frame 0xff86ca5ca240
dmu_read_uio() at dmu_read_uio+0x3f/frame 0xff86ca5ca2a0
zfs_freebsd_read() at zfs_freebsd_read+0x3e9/frame 0xff86ca5ca3b0
nfsvno_read() at nfsvno_read+0x2db/frame 0xff86ca5ca490
nfsrvd_read() at nfsrvd_read+0x3ff/frame 0xff86ca5ca710
nfsrvd_dorpc() at nfsrvd_dorpc+0xc9/frame 0xff86ca5ca910
nfssvc_program() at nfssvc_program+0x5da/frame 0xff86ca5caaa0
svc_run_internal() at svc_run_internal+0x5fb/frame 0xff86ca5cabd0
svc_thread_start() at svc_thread_start+0xb/frame 0xff86ca5cabe0

Sleeping in VM_WAIT while holding the ARC lock(s) means that other ARC
operations may get blocked.  And pretty much all ZFS I/O goes through the ARC.
So that's why we see all those stuck nfsd threads.

Another factor greatly contributing to the problem is that currently the page
daemon blocks (sleeps) in arc_lowmem (a vm_lowmem hook) waiting for the ARC
reclaim thread to make a pass.  This happens before the page daemon makes its
own pageout pass.

But because tid 100639 holds the ARC lock(s), ARC reclaim thread gets blocked
and can not make any forward progress.  Thus the page daemon also gets blocked.
And thus the page daemon can not free up any pages.


So, this situation is not a true deadlock.  E.g. it is theoretically possible
that some other threads would free some memory at their own will and the
condition would clear up.  But in practice this is highly unlikely.

Some possible resolutions that I can think of.

The best one is probably doing ARC memory allocations without holding any locks.

Also, maybe we should make a rule that no vm_lowmem hooks should sleep.  That
is, arc_lowmem should signal the ARC reclaim thread to do some work, but should
not wait on it.

Perhaps we could also provide a mechanism to mark certain memory allocations as
"special" and use that mechanism for ARC allocations.  So that VM_WAIT unblocks
sooner: in this case we had 8842 free pages (~35MB), but thread 100639 was not
woken up.
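
The numbers from the "show page" output can be checked against the wake-up
condition described earlier. This is a minimal userland sketch, not kernel
code: the struct and both helper names are hypothetical, and the 4096-byte
page size is an assumption (the report does not state it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three counters that matter here, as printed by "show page". */
struct vm_counters {
    uint64_t v_free_count;
    uint64_t v_cache_count;
    uint64_t v_free_min;
};

/*
 * Hypothetical helper: true when a thread sleeping in VM_WAIT would be
 * released, i.e. when v_free_count + v_cache_count is above v_free_min.
 */
static bool
vm_wait_would_wake(const struct vm_counters *c)
{
    return (c->v_free_count + c->v_cache_count > c->v_free_min);
}

/* Convert a page count to bytes; page_size is an assumption (4096). */
static uint64_t
pages_to_bytes(uint64_t pages, uint64_t page_size)
{
    assert(page_size > 0);
    return (pages * page_size);
}
```

With the reported values (8842 free pages, 0 cached pages, v_free_min of
38435) the condition is false, so the sleeper stays blocked even though
8842 pages of 4096 bytes amount to roughly 35MB of free memory.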

I think that ideally we should do something about all the three directions.
But even one of them might turn out to be sufficient.
As I've said, the first one seems to be the most promising, but it would require
some tricky programming (flags and retries?) to move memory allocations out of
locked sections.
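
To illustrate what "flags and retries" could look like, here is a minimal
userland sketch using a pthread mutex and malloc(3). It is not ZFS or ARC
code: struct slot, slot_fill() and the field names are hypothetical, and the
real ARC locking is considerably more involved. The idea shown is only that
the allocation happens with no lock held and the state is re-checked after
the lock is re-taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cache slot protected by a mutex; not a ZFS structure. */
struct slot {
    pthread_mutex_t lock;
    void *buf;      /* NULL until a buffer has been attached */
    size_t size;    /* size the attached buffer must have */
};

/*
 * Attach a buffer to the slot without ever allocating under the lock.
 * Returns 0 on success and -1 if the allocation failed.
 */
static int
slot_fill(struct slot *s)
{
    void *newp = NULL;
    size_t want = 0;

    for (;;) {
        pthread_mutex_lock(&s->lock);
        if (s->buf != NULL) {
            /* Another thread filled the slot while we were unlocked. */
            pthread_mutex_unlock(&s->lock);
            break;
        }
        if (newp != NULL && want == s->size) {
            /* Our pre-allocated buffer still fits: hand it over. */
            s->buf = newp;
            newp = NULL;
            pthread_mutex_unlock(&s->lock);
            break;
        }
        /* Record what is needed, then drop the lock before allocating. */
        want = s->size;
        pthread_mutex_unlock(&s->lock);
        assert(want > 0);

        free(newp);             /* discard a stale, wrongly sized buffer */
        newp = malloc(want);    /* may block; no lock is held here */
        if (newp == NULL)
            return (-1);
    }
    free(newp);                 /* non-NULL only if we lost the race */
    return (0);
}
```

The cost of this shape is an occasional wasted allocation when another
thread wins the race, in exchange for never sleeping in the allocator with
the lock held.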
-- 
Andriy Gapon


Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-03 Thread Stefan Bethke

Am 03.02.2013 um 10:57 schrieb Chris Rees :

> On 3 February 2013 03:55, Kimmo Paasiala  wrote:
>> 
>> There is no PR yet with my fix and therefore no commit to ports tree
>> that would fix the problem. I'll file a PR soon (TM).
> 
> The problem was in base, and is fixed there.

Huh? With -current r246283, I still get a segfault from sudo unless I have 
Kimmo's patch.

Is there some confusion about which problem is addressed by Kimmo's patch?


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811





Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-03 Thread Chris Rees
On 3 February 2013 17:15, Stefan Bethke  wrote:
>
> Am 03.02.2013 um 10:57 schrieb Chris Rees :
>
>> On 3 February 2013 03:55, Kimmo Paasiala  wrote:
>>>
>>> There is no PR yet with my fix and therefore no commit to ports tree
>>> that would fix the problem. I'll file a PR soon (TM).
>>
>> The problem was in base, and is fixed there.
>
> Huh? With -current r246283, I still get a segfault from sudo unless I have 
> Kimmo's patch.
>
> Is there some confusion about which problem is addressed by Kimmo's patch?
>

Hm, perhaps it might be necessary then.

Kimmo, please would you submit the patch you had as a PR?  I'm sure
Wesley would appreciate the hint.

Chris


Re: NFS-exported ZFS instability

2013-02-03 Thread Rick Macklem
Andriy Gapon wrote:
> on 30/01/2013 00:44 Andriy Gapon said the following:
> > on 29/01/2013 23:44 Hiroki Sato said the following:
> >>   http://people.allbsd.org/~hrs/FreeBSD/pool-20130130.txt
> >>   http://people.allbsd.org/~hrs/FreeBSD/pool-20130130-info.txt
> >
> [snip]
> > See tid 100153 (arc reclaim thread), tid 100105 (pagedaemon) and tid 100639
> > (nfsd in kmem_back).
> >
> 
> I decided to write a few more words about this issue.
> 
> I think that the root cause of the problem is that ZFS ARC code performs
> memory allocations with M_WAITOK while holding some ARC lock(s).
> 
> If a thread runs into such an allocation when a system is very low on memory
> (even for a very short period of time), then the thread is going to be blocked
> (to sleep in more exact terms) in VM_WAIT until a certain amount of memory is
> freed. To be more precise until v_free_count + v_cache_count goes above
> v_free_min.
> And quoting from the report:
> db> show page
> cnt.v_free_count: 8842
> cnt.v_cache_count: 0
> cnt.v_inactive_count: 0
> cnt.v_active_count: 169
> cnt.v_wire_count: 6081952
> cnt.v_free_reserved: 7981
> cnt.v_free_min: 38435
> cnt.v_free_target: 161721
> cnt.v_cache_min: 161721
> cnt.v_inactive_target: 242581
> 
> In this case tid 100639 is the thread:
> Tracing command nfsd pid 961 tid 100639 td 0xfe0027038920
> sched_switch() at sched_switch+0x17a/frame 0xff86ca5c9c80
> mi_switch() at mi_switch+0x1f8/frame 0xff86ca5c9cd0
> sleepq_switch() at sleepq_switch+0x123/frame 0xff86ca5c9d00
> sleepq_wait() at sleepq_wait+0x4d/frame 0xff86ca5c9d30
> _sleep() at _sleep+0x3d4/frame 0xff86ca5c9dc0
> kmem_back() at kmem_back+0x1a3/frame 0xff86ca5c9e50
> kmem_malloc() at kmem_malloc+0x1f8/frame 0xff86ca5c9ea0
> uma_large_malloc() at uma_large_malloc+0x4a/frame 0xff86ca5c9ee0
> malloc() at malloc+0x14d/frame 0xff86ca5c9f20
> arc_get_data_buf() at arc_get_data_buf+0x1f4/frame 0xff86ca5c9f60
> arc_read_nolock() at arc_read_nolock+0x208/frame 0xff86ca5ca010
> arc_read() at arc_read+0x93/frame 0xff86ca5ca090
> dbuf_read() at dbuf_read+0x452/frame 0xff86ca5ca150
> dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x16a/frame 0xff86ca5ca1e0
> dmu_buf_hold_array() at dmu_buf_hold_array+0x67/frame 0xff86ca5ca240
> dmu_read_uio() at dmu_read_uio+0x3f/frame 0xff86ca5ca2a0
> zfs_freebsd_read() at zfs_freebsd_read+0x3e9/frame 0xff86ca5ca3b0
> nfsvno_read() at nfsvno_read+0x2db/frame 0xff86ca5ca490
> nfsrvd_read() at nfsrvd_read+0x3ff/frame 0xff86ca5ca710
> nfsrvd_dorpc() at nfsrvd_dorpc+0xc9/frame 0xff86ca5ca910
> nfssvc_program() at nfssvc_program+0x5da/frame 0xff86ca5caaa0
> svc_run_internal() at svc_run_internal+0x5fb/frame 0xff86ca5cabd0
> svc_thread_start() at svc_thread_start+0xb/frame 0xff86ca5cabe0
> 
> Sleeping in VM_WAIT while holding the ARC lock(s) means that other ARC
> operations may get blocked. And pretty much all ZFS I/O goes through the ARC.
> So that's why we see all those stuck nfsd threads.
> 
> Another factor greatly contributing to the problem is that currently the page
> daemon blocks (sleeps) in arc_lowmem (a vm_lowmem hook) waiting for the ARC
> reclaim thread to make a pass. This happens before the page daemon makes its
> own pageout pass.
> 
> But because tid 100639 holds the ARC lock(s), ARC reclaim thread gets blocked
> and can not make any forward progress. Thus the page daemon also gets blocked.
> And thus the page daemon can not free up any pages.
> 
> 
> So, this situation is not a true deadlock. E.g. it is theoretically possible
> that some other threads would free some memory at their own will and the
> condition would clear up. But in practice this is highly unlikely.
> 
> Some possible resolutions that I can think of.
> 
> The best one is probably doing ARC memory allocations without holding any
> locks.
> 
> Also, maybe we should make a rule that no vm_lowmem hooks should sleep. That
> is, arc_lowmem should signal the ARC reclaim thread to do some work, but should
> not wait on it.
> 
> Perhaps we could also provide a mechanism to mark certain memory allocations as
> "special" and use that mechanism for ARC allocations. So that VM_WAIT unblocks
> sooner: in this case we had 8842 free pages (~35MB), but thread 100639 was not
> woken up.
> 
> I think that ideally we should do something about all the three directions.
> But even one of them might turn out to be sufficient.
> As I've said, the first one seems to be the most promising, but it would require
> some tricky programming (flags and retries?) to move memory allocations out of
> locked sections.

For the NFSv4 stuff, I pre-allocate any structures that I might need
using malloc(..M_WAITOK) before going into the locked region. If I
don't need them, I just free() them at the end. (I assign "newp"
the allocation and set "newp" NULL if it is used. If "newp" != NULL
at the end, then free(