Attilio Rao wrote on 29.10.2012 23:02 (localtime):
> On Mon, Oct 29, 2012 at 7:37 PM, Harald Schmalzbauer
> <h.schmalzba...@omnilan.de> wrote:
>> Attilio Rao wrote on 27.10.2012 23:07 (localtime):
>>> On Sat, Oct 27, 2012 at 9:46 PM, Attilio Rao <atti...@freebsd.org> wrote:
>>>> On Sat, Sep 8, 2012 at 12:48 AM, Attilio Rao <atti...@freebsd.org> wrote:
>>>>> On Thu, Sep 6, 2012 at 4:52 PM, Harald Schmalzbauer
>>>>> <h.schmalzba...@omnilan.de> wrote:
>>>>>> Attilio Rao wrote on 09.08.2012 20:26 (localtime):
>>>>>>> On 8/8/12, Harald Schmalzbauer <h.schmalzba...@omnilan.de> wrote:
>>>>>>>> Pavel Polyakov wrote on 06.03.2012 11:20 (localtime):
>>>>>>>>>>> mount -t unionfs -o noatime /usr /mnt
>>>>>>>>>>>
>>>>>>>>>>> insmntque: mp-safe fs and non-locked vp: 0xfffffe01d96704f0 is not exclusive locked but should be
>>>>>>>>>>> KDB: enter: lock violation
>>>>>>>>>> Pavel,
>>>>>>>>>> can you give a spin to this patch?:
>>>>>>>>>> http://www.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch
>>>>>>>>>>
>>>>>>>>>> I think that the unlocking is due at that point as the vnode lock can
>>>>>>>>>> be switched later on.
>>>>>>>>>>
>>>>>>>>>> Let me know what you think about it and what the test does.
>>>>>>>>> Thanks!
>>>>>>>>> This patch fixes the problem with lock violation. Sorry I've tested
>>>>>>>>> it so late.
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> this patch still applies cleanly to RELENG_9_1. Was there another fix
>>>>>>>> for the issue or has it just not been PR-sent and thus forgotten?
>>>>>>> Can you and Pavel try the attached patch? Unfortunately I had no time
>>>>>>> to test it, I just made it in 5 free mins from a non-FreeBSD workstation,
>>>>>> Sorry, couldn't test earlier, but now I did:
>>>>>> With this patch applied the machine hangs without debug kernel and the
>>>>>> latter gives the following panic:
>>>>>> System call nmount returning with the following locks held:
>>>>>> exclusive lockmgr ufs (ufs) r = 0 (0xc5438278) locked @ src/sys/fs/unionfs/union_vnops.c:1938
>>>>>> panic: witness_warn
>>>>>> cpuid = 0
>>>>>> KDB: stack backtrace:
>>>>>> db_trace_self_wrapper(c0a04f7f,c0c112c4,d1de3bb4,c097aa8c,fc,...) at db_trace_self_wrapper+0x26
>>>>>> kdb_backtrace(c0a4965f,0,c09c2ede3c1c,0,...) at kdb_backtrace+0x2a
>>>>>> witness_warn(2,0,c0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4
>>>>>> syscall(d1de3d08) at syscall+0x415
>>>>>> Xint0x80_syscall() at Xint0x80_syscall+0x21
>>>>>> --- syscall (0, FreeBSD ELF32, nosys), eip = 0x280b883f, esp = 0xbfbfe46c, ebp = 0xbfbfede8 ---
>>>>>> KDB: enter: panic
>>>>>> [ thread pid 86 tid 100054 ]
>>>>>> Stopped at      kdb_enter+0x3a: movl $0,kdb_why
>>>>>> db> bt
>>>>>> Tracing pid 86 tid 100054 td 0xc541b000
>>>>>> kdb_enter(c0a00d16,c0a09130,0,0,0,...) at panic+0x190
>>>>>> witness_warn(2,0,c0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4
>>>>>> syscall(d1de3d08) at syscall+0x415
>>>>>> Xint0x80_syscall() at Xint0x80_syscall+0x21
>>>>>>
>>>>>> Hmm, I guess I forgot to install kernel debug symbols...
>>>>>> Coming back if I have more
>>>>> Unfortunately unionfs does very wrong things with the insmntque() locking.
>>>>> It basically expects the vnode to return locked in the same way
>>>>> requested by the preceding namei() (when that happens), but when you do
>>>>> insmntque() you can only have an LK_EXCLUSIVE lock on the vnode.
>>>> Hello,
>>>> the following patch should work out the issues around unionfs_nodeget() a bit:
>>>> http://www.freebsd.org/~attilio/unionfs_nodeget2.patch
>>>>
>>>> Unfortunately the unionfs code is rather messy in the lookup path about
>>>> locking requirements, so following what needs to be done there is a bit
>>>> difficult.
>>>> I have no way to test this patch, so it is just test-compiled at the
>>>> moment, but I would need you to also test the lookup path (so directory
>>>> "ls", find(1) on the whole unionfs volume, etc.) to validate it
>>>> somehow.
>>> On second thought, I think that locking in lookup (and also other
>>> operations) is so fragile and difficult to follow that it makes all
>>> vnops real locking landmines.
>>> I think that the following patch fixes the insmntque insertion and
>>> follows the old approach well enough to be committed separately:
>>> http://www.freebsd.org/~attilio/unionfs_nodeget3.patch
>>>
>> Unfortunately I have no idea about all those locking strategies and
>> implementations.
>> Applying unionfs_nodeget3.patch results in:
>>         sys/fs/unionfs/union_subr.c: In function 'unionfs_nodeget':
>>         sys/fs/unionfs/union_subr.c:332: error: expected statement before ')' token
>>         *** [union_subr.o] Error code 1
>>
>> I guess there is a typo in this chunk:
>> @@ -317,11 +328,11 @@ unionfs_nodeget(struct mount *mp, struct vnode *up
>>
>>                 vref(vp);
>>         } else
>>                 *vpp = vp;
>> -
>> -unionfs_nodeget_out:
>> -       if (lkflags & LK_TYPE_MASK)
>> -               vn_lock(vp, lkflags | LK_RETRY);
>> -
>> +       if (lkflags & LK_TYPE_MASK) {
>> +               if (lkflags == LK_SHARED))
>> ---------------------------------------- ^
>> +                       vn_lock(vp, LK_DOWNGRADE | LK_RETRY);
>> +       } else
>> +               VOP_UNLOCK(vp, LK_RELEASE);
>>         return (0);
>>  }
>>
>> After removing the second right parenthesis the kernel compiles.
>> But it still crashes:
>> panic: Lock (lockmgr) ufs not locked @ sys/kern/vfs_default.c:512
>> cpuid = 1
>> KDB: stack backtrace:
>> ...
>> If you can use the bt info I'll transcribe - no serial console available :-(
>>
>> Am I right that I should only apply _one_ unionfs-patchX.patch
>> (unionfs_nodeget3.patch in that case)?
> Yes, only that one.
> Can you please do "bt" from DDB and take a picture of your screen with a
> camera?

Ok, now I had a reason to take some time finding out how ESXi handles
serial ports ;-) It's quite easy and very flexible, so no problem
setting up a debug console.
Please find attached the backtrace.
Do I have to load any symbols? It's not very informative what I see, right?

Thanks,

-Harry
panic: Lock (lockmgr) ufs not locked @ /usr/local/share/deploy-tools/RELENG_9_1/src/sys/kern/vfs_default.c:512.
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x1cd
witness_assert() at witness_assert+0x225
__lockmgr_args() at __lockmgr_args+0xb65
vop_stdunlock() at vop_stdunlock+0x43
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x9b
unionfs_unlock() at unionfs_unlock+0xe1
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x9b
unionfs_nodeget() at unionfs_nodeget+0x5a9
unionfs_domount() at unionfs_domount+0x4ab
vfs_donmount() at vfs_donmount+0x960
sys_nmount() at sys_nmount+0x66
amd64_syscall() at amd64_syscall+0x2fa
Xfast_syscall() at Xfast_syscall+0xf7
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80087798c, rsp = 0x7fffffffd328, rbp = 0x7fffffffd750 ---
KDB: enter: panic
[ thread pid 72 tid 100072 ]
Stopped at      kdb_enter+0x3b: movq    $0,0x64cd52(%rip)
db> bt
Tracing pid 72 tid 100072 td 0xfffffe0007344470
kdb_enter() at kdb_enter+0x3b
panic() at panic+0x1c6
witness_assert() at witness_assert+0x225
__lockmgr_args() at __lockmgr_args+0xb65
vop_stdunlock() at vop_stdunlock+0x43
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x9b
unionfs_unlock() at unionfs_unlock+0xe1
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x9b
unionfs_nodeget() at unionfs_nodeget+0x5a9
unionfs_domount() at unionfs_domount+0x4ab
vfs_donmount() at vfs_donmount+0x960
sys_nmount() at sys_nmount+0x66
amd64_syscall() at amd64_syscall+0x2fa
Xfast_syscall() at Xfast_syscall+0xf7
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80087798c, rsp = 0x7fffffffd328, rbp = 0x7fffffffd750 ---
db>
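
For anyone following the locking discussion, here is a minimal userland sketch in C of the lock hand-off at the end of unionfs_nodeget(), shaped like the quoted unionfs_nodeget3.patch chunk with the stray parenthesis removed: a shared request downgrades the lock, an exclusive request keeps it, and no request releases it. All model_* names and MODEL_* constants are hypothetical stand-ins, not kernel APIs and not the kernel's LK_* values. The sketch assumes the vnode arrives exclusively locked, as insmntque() requires. It only illustrates why releasing a lock that is not held is an error; it does not try to explain why the ufs lock was not held in the backtrace above.

```c
#include <assert.h>

/* Hypothetical model types; NOT the kernel's lock states or LK_* flags. */
enum model_lock { MODEL_UNLOCKED, MODEL_SHARED, MODEL_EXCLUSIVE };
enum model_req  { MODEL_REQ_NONE, MODEL_REQ_SHARED, MODEL_REQ_EXCLUSIVE };

/* Release the lock. Returns -1 if it was not held, which is the
 * condition the "not locked" assertion in the panic reports. */
int model_unlock(enum model_lock *l)
{
    if (*l == MODEL_UNLOCKED)
        return -1;
    *l = MODEL_UNLOCKED;
    return 0;
}

/* Downgrade exclusive -> shared. Returns -1 unless held exclusively. */
int model_downgrade(enum model_lock *l)
{
    if (*l != MODEL_EXCLUSIVE)
        return -1;
    *l = MODEL_SHARED;
    return 0;
}

/* Same branch structure as the patched tail of unionfs_nodeget():
 * lock type requested -> downgrade if shared, otherwise keep;
 * no lock type requested -> unlock. */
int model_nodeget_tail(enum model_lock *l, enum model_req req)
{
    if (req != MODEL_REQ_NONE) {
        if (req == MODEL_REQ_SHARED)
            return model_downgrade(l);
        return 0;
    }
    return model_unlock(l);
}

/* Walks the three request kinds plus the failing "already unlocked" case. */
void model_selfcheck(void)
{
    enum model_lock l = MODEL_EXCLUSIVE;
    int rc = model_nodeget_tail(&l, MODEL_REQ_SHARED);
    assert(rc == 0 && l == MODEL_SHARED);

    l = MODEL_EXCLUSIVE;
    rc = model_nodeget_tail(&l, MODEL_REQ_EXCLUSIVE);
    assert(rc == 0 && l == MODEL_EXCLUSIVE);

    l = MODEL_EXCLUSIVE;
    rc = model_nodeget_tail(&l, MODEL_REQ_NONE);
    assert(rc == 0 && l == MODEL_UNLOCKED);

    /* Entering the tail without the lock held makes the unlock fail. */
    l = MODEL_UNLOCKED;
    rc = model_nodeget_tail(&l, MODEL_REQ_NONE);
    assert(rc == -1);
}
```

Calling model_selfcheck() from a small main() exercises all four cases. The last one is the interesting one: the tail is only correct if the precondition (lock held exclusively on entry) actually holds on every path that reaches it.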
