Re: Reproducable Infiniband panic
On 06/06/2013 08:57 PM, John Baldwin wrote:
> On Thursday, June 06, 2013 9:54:35 am Andriy Gapon wrote:
[...]
>> The problem seems to be in incorrect interaction between devfs_close_f and
>> linux_file_dtor. The latter expects curthread->td_fpop to have a valid
>> reasonable value. But the former sets curthread->td_fpop to fp only around
>> vnops.fo_close() call and then restores it back to some (what?) previous
>> value before calling devfs_fpdrop->devfs_destroy_cdevpriv. In this case
>> the previous value is NULL.
>
> It is normally NULL in this case. Why does linux_file_dtor even look at
> td_fpop?
>
> Ah. I think it should not do that and make the data it uses in the dtor more
> self-contained:
>
> Index: sys/ofed/include/linux/linux_compat.c
> ===
> --- linux_compat.c (revision 251465)
> +++ linux_compat.c (working copy)
> @@ -212,7 +212,7 @@ linux_file_dtor(void *cdp)
>      struct linux_file *filp;
>
>      filp = cdp;
> -    filp->f_op->release(curthread->td_fpop->f_vnode, filp);
> +    filp->f_op->release(filp->f_vnode, filp);
>      kfree(filp);
>  }
>
> @@ -232,6 +232,7 @@ linux_dev_open(struct cdev *dev, int oflags, int d
>      filp->f_dentry = &filp->f_dentry_store;
>      filp->f_op = ldev->ops;
>      filp->f_flags = file->f_flag;
> +    filp->f_vnode = file->f_vnode;
>      if (filp->f_op->open) {
>          error = -filp->f_op->open(file->f_vnode, filp);
>          if (error) {
>

Doesn't compile for me. Did you forget to add the f_vnode member to
struct linux_file?

sys/ofed/include/linux/linux_compat.c: In function 'linux_file_dtor':
sys/ofed/include/linux/linux_compat.c:214: error: 'struct linux_file' has no member named 'f_vnode'
sys/ofed/include/linux/linux_compat.c: In function 'linux_dev_open':
sys/ofed/include/linux/linux_compat.c:234: error: 'struct linux_file' has no member named 'f_vnode'

Julian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: ZFS crashing while zfs recv in progress
First, I'd like to thank you for your time and effort.

> - Disk da3 has a different drive firmware (A580) than the A800
> drives.

Somehow I missed that. I can replace this disk with an A800 one, although I don't think this will change much.

> - I have not verified if any of these disks use 4KByte sectors (dmesg is
> not going to tell you the entire truth). I would appreciate seeing
> "smartctl -x" output from {da0,da1,da3} so I could get an idea. Your
> pools use gpt labelling so I am left with the hope that your labels
> refer to the partition with proper 4KB alignment regardless.

The 'tank' disks are real 512-byte-sector disks. The zpool currently in use is ashift=9. I've also tried ashift=12 in the past, but it didn't help. You'll find the output of smartctl in the attachment.

> Can you tell me what exact disk (e.g. daXX) in the above list you used
> for swap, and what kind of both system and disk load were going on at
> the time you saw the swap message?
>
> I'm looking for a capture of "gstat -I500ms" output (you will need a
> VERY long/big terminal window to capture this given how many disks you
> have) while I/O is happening, as well as "top -s 1" in another window.
> I would also like to see "zpool iostat -v 1" output while things are
> going on, to help possibly narrow down if there is a single disk causing
> the entire I/O subsystem for that controller to choke.

The swap disk in use is da28.
The last output of top -s 1 that could be written to disk was:
---
last pid:  3653;  load averages:  0.03,  0.19,  0.30  up 0+15:55:50  03:04:33
43 processes:  1 running, 41 sleeping, 1 zombie
CPU:  0.3% user,  0.0% nice,  0.6% system,  0.1% interrupt, 99.0% idle
Mem: 7456K Active, 27M Inact, 6767M Wired, 3404K Cache, 9053M Free
Swap: 256G Total, 5784K Used, 256G Free

  PID USERNAME  THR PRI NICE    SIZE    RES STATE   C    TIME   WCPU COMMAND
 1917 root        1  22    0  33420K  2356K piperd  2   41:24  3.96% zfs
 1913 root        1  21    0  71980K  5248K select  4  288:50  3.27% sshd
 1853 root        1  20    0  29484K  2788K nanslp  0    3:13  0.00% gstat
 1803 root        1  20    0  35476K  2128K nanslp  1    2:44  0.00% zpool
 1798 root        1  20    0  16560K  2240K CPU0    7    1:07  0.00% top
 1780 root        1  20    0  67884K  1792K select  2    0:23  0.00% sshd
 1800 root        1  20    0  12052K  1484K select  6    0:17  0.00% script
 1747 root        1  20    0  71980K  1868K select  1    0:13  0.00% sshd
 3148 root        1  20  -20  21140K  8956K pause   7    0:11  0.00% atop
 1850 root        1  20    0  12052K  1412K select  4    0:06  0.00% script
 1784 root        1  20    0  67884K  1772K select  7    0:05  0.00% sshd
 1652 nagios      1  20    0  12012K  1044K select  7    0:02  0.00% nrpe2
 1795 root        1  20    0  12052K  1408K select  1    0:02  0.00% script
 1538 root        1  20    0  11996K   960K nanslp  1    0:01  0.00% ipmon
 1670 root        1  20    0  20272K  1876K select  1    0:01  0.00% sendmail
 1677 root        1  20    0  14128K  1548K nanslp  2    0:00  0.00% cron
 1547 root        1  20    0  12052K  1172K select  5    0:00  0.00% syslogd
---

The last output of zpool iostat -v 1:

                  capacity     operations    bandwidth
pool            alloc   free   read  write   read  write
--------------  -----  -----  -----  -----  -----  -----
tank            1.19T  63.8T     95      0   360K      0
  raidz2         305G  16.0T     25      0  92.2K      0
    gpt/disk3       -      -     16      0  8.47K      0
    gpt/disk9       -      -     17      0  18.9K      0
    gpt/disk15      -      -     12      0  6.98K      0
    gpt/disk19      -      -     12      0  6.48K      0
    gpt/disk23      -      -     21      0  14.0K      0
    gpt/disk27      -      -     18      0  10.5K      0
    gpt/disk31      -      -     18      0  9.47K      0
    gpt/disk36      -      -     16      0  18.4K      0
    gpt/disk33      -      -     12      0  15.5K      0
  raidz2         305G  16.0T     25      0   103K      0
    gpt/disk1       -      -     16      0  8.47K      0
    gpt/disk4       -      -     24      0  16.0K      0
    gpt/disk7       -      -     17      0  10.5K      0
    gpt/disk10      -      -     17      0  8.97K      0
    gpt/disk13      -      -     25      0  15.5K      0
    gpt/disk16      -      -     15      0  8.97K      0
    gpt/disk24      -      -     15      0  7.98K      0
    gpt/disk32      -      -     25      0  16.9K      0
    gpt/disk37      -      -     16      0  9.47K      0
  raidz2         305G  16.0T     20      0  81.3K      0
    gpt/disk2       -      -      9      0  4.98K      0
    gpt/disk5       -      -     20      0  14.0K      0
    gpt/disk8       -      -     18      0  10.5K      0
    gpt/disk11      -      -     18      0  9.47K      0
    gpt/disk17      -      -     20      0  11.5K      0
    gpt/disk21      -      -     12      0  6.48K      0
    gpt/disk25      -      -     12
Re: ZFS crashing while zfs recv in progress
On Fri, 7 Jun 2013, Pascal Braun, Continum wrote:

[snip]

> > If you could put a swap disk on a dedicated controller (and no other
> > disks on it), that would be ideal. Please do not use USB for this task
> > (the USB stack may introduce its own set of complexities pertaining to
> > interrupt usage).
>
> I can't easily do this in the current setup. I would have to recreate the
> primary pool differently.

Don't you have room for a disk (possibly a 2.5" one) inside the case, so
that you can connect it directly to the motherboard's AHCI controller?

-- 
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: Reproducable Infiniband panic
On Friday, June 07, 2013 5:07:34 am Julian Stecklina wrote:
> On 06/06/2013 08:57 PM, John Baldwin wrote:
> > On Thursday, June 06, 2013 9:54:35 am Andriy Gapon wrote:
> [...]
> >> The problem seems to be in incorrect interaction between devfs_close_f and
> >> linux_file_dtor. The latter expects curthread->td_fpop to have a valid
> >> reasonable value. But the former sets curthread->td_fpop to fp only around
> >> vnops.fo_close() call and then restores it back to some (what?) previous
> >> value before calling devfs_fpdrop->devfs_destroy_cdevpriv. In this case
> >> the previous value is NULL.
> >
> > It is normally NULL in this case. Why does linux_file_dtor even look at
> > td_fpop?
> >
> > Ah. I think it should not do that and make the data it uses in the dtor more
> > self-contained:
> >
> > Index: sys/ofed/include/linux/linux_compat.c
> > ===
> > --- linux_compat.c (revision 251465)
> > +++ linux_compat.c (working copy)
> > @@ -212,7 +212,7 @@ linux_file_dtor(void *cdp)
> >      struct linux_file *filp;
> >
> >      filp = cdp;
> > -    filp->f_op->release(curthread->td_fpop->f_vnode, filp);
> > +    filp->f_op->release(filp->f_vnode, filp);
> >      kfree(filp);
> >  }
> >
> > @@ -232,6 +232,7 @@ linux_dev_open(struct cdev *dev, int oflags, int d
> >      filp->f_dentry = &filp->f_dentry_store;
> >      filp->f_op = ldev->ops;
> >      filp->f_flags = file->f_flag;
> > +    filp->f_vnode = file->f_vnode;
> >      if (filp->f_op->open) {
> >          error = -filp->f_op->open(file->f_vnode, filp);
> >          if (error) {
> >
>
> Doesn't compile for me. Did you forget to add the f_vnode member to
> struct linux_file?
>
> sys/ofed/include/linux/linux_compat.c: In function 'linux_file_dtor':
> sys/ofed/include/linux/linux_compat.c:214: error: 'struct linux_file'
> has no member named 'f_vnode'
> sys/ofed/include/linux/linux_compat.c: In function 'linux_dev_open':
> sys/ofed/include/linux/linux_compat.c:234: error: 'struct linux_file'
> has no member named 'f_vnode'

Oof it's in another header:

Index: sys/ofed/include/linux/fs.h
===
--- fs.h (revision 251494)
+++ fs.h (working copy)
@@ -73,6 +73,7 @@ struct linux_file {
     struct dentry   f_dentry_store;
     struct selinfo  f_selinfo;
     struct sigio    *f_sigio;
+    struct vnode    *f_vnode;
 };

 #define file linux_file

-- 
John Baldwin