[Bug 246279] ciss device driver not allowing more than 48 drives to be detected by the CAM layer
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246279 Mark Linimon changed: What|Removed |Added Assignee|b...@freebsd.org|i...@freebsd.org Flags||mfc-stable14?, ||mfc-stable13? -- You are receiving this mail because: You are on the CC list for the bug.
[Bug 271238] mpr (LSI SAS3816) driver not finding ses devices in HP D6020 enclosures
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271238

Peter Eriksson changed:

           What   |Removed     |Added
           Version|12.4-RELEASE|13.4-RELEASE
[Bug 271238] mpr (LSI SAS3816) driver not finding ses devices in HP D6020 enclosures
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271238

--- Comment #5 from Peter Eriksson ---
Still seeing strange behaviour with the SAS3816 controller and mpr driver on FreeBSD 14.1-RELEASE-p5. Now all the SES controllers are detected and accessible, but it doesn't correctly detect all disks.

The setup: one HBA9500-16e (SAS3816) and two HP D6020 external SAS enclosures, each with two drawers of 35 drives (and one expander in each drawer).

Things we've tried so far:

1. Connect one drawer to a single port - finds all disks (in that drawer).
2. Move the cable to each of the other drawers - finds all disks (in each drawer).
3. Move the cable to each of the other ports - finds all disks (in each drawer).

This rules out bad ports on the HBA, bad cables and bad ports on the drawers.

4. Connect all four drawers to separate ports - misses about 50% of the disks in two of the drawers.
5. Daisy-chain two drawers behind two other drawers that are connected to two ports on the HBA - misses about 50% of the disks again.
All disks are seen by the expanders:

# mprutil show expanders | egrep 'SAS Target' | wc -l
127
# camcontrol devlist | egrep '12000|1' | wc -l
100
# mprutil show expanders | fgrep 500
495001438041bb92bd0017 0001 0002 1
495001438041bb003d0018 0009 0003 1
49500143803089763d005e 0017 0004 2
495001438030889abd005f 0018 0005 2

This one is interesting though:

# mprutil show devices
B____T    SAS Address       Handle  Parent  Device       Speed  Enc   Slot  Wdt
          5001438041bb92bd  0017    0001    SMP Target   12     0002  00    4
          5001438041bb003d  0018    0009    SMP Target   12     0003  00    4
00 85     5000cca2912d0a39  0019    0017    SAS Target   12     0002  69    1
00 86     5000cca2912d8f41  001a    0017    SAS Target   12     0002  70    1
00 87     5000cca2912d1645  001b    0017    SAS Target   12     0002  71    1
00 136    5000cca2912ceb9d  0038    0018    SAS Target   12     0003  84    1
00 101    5000cca2912db98d  0039    0017    SAS Target   12     0002  85    1
00 102    5000cca2912e3619  003a    0017    SAS Target   12     0002  86    1
          5000cca2912d19ed  003b    0017    SAS Target   12     0002  87    1
          5000cca2912d8bc5  003c    0017    SAS Target   12     0002  88    1
          5000cca2912dfd49  003d    0017    SAS Target   12     0002  89    1
          5000cca2912da57d  003e    0017    SAS Target   12     0002  90    1
          5000cca29126fd89  003f    0017    SAS Target   12     0002  91    1
          5000cca291257965  0040    0017    SAS Target   12     0002  92    1
...

The missing disks are there but without entries for "B T".
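A quick way to enumerate the affected disks, sketched under the assumption that the `mprutil show devices` column layout matches the listing above, is to flag the SAS targets whose lines carry no leading bus/target pair:

```shell
# Sketch: print the SAS addresses of targets that "mprutil show devices"
# lists without a leading B/T (bus/target) pair, i.e. disks the driver
# sees behind the expander but that never got a CAM bus/target mapping.
# The field count (9 fields when B/T is absent) is an assumption based
# on the output quoted above.
list_unmapped() {
    awk '/SAS Target/ && NF == 9 { print $1 }'
}

# In practice: mprutil show devices | list_unmapped
# Offline example with one mapped and one unmapped disk:
printf '%s\n%s\n' \
    '00 85   5000cca2912d0a39  0019  0017  SAS Target  12  0002  69  1' \
    '        5000cca2912d19ed  003b  0017  SAS Target  12  0002  87  1' \
    | list_unmapped
# prints: 5000cca2912d19ed
```

Diffing that list against `camcontrol devlist` output should confirm whether the BT-less entries are exactly the disks CAM never attached.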
[Bug 271238] mpr (LSI SAS3816) driver not finding ses devices in HP D6020 enclosures
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271238

Peter Eriksson changed:

           What    |Removed        |Added
           Severity|Affects Only Me|Affects Some People
[Bug 271238] mpr (LSI SAS3816) driver not finding ses devices in HP D6020 enclosures
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271238

Peter Eriksson changed:

           What   |Removed     |Added
           Version|13.4-RELEASE|14.1-RELEASE
Re: NFSd not registering on 14.2.
On Thu, Nov 21, 2024 at 9:16 PM Zaphod Beeblebrox wrote:
>
> lo0 has 127.0.0.1, ::1 (both first in their lists). It also has a pile of
> other IPs that are used by jails. This has not changed.

I just did a trivial setup with the most recent snapshot for 14.2 and it
worked ok. So, I have no idea what your problem is, but I'd start by looking
at your network setup.

Maybe reposting to freebsd-net@ might help. Make sure you mention in the
subject line that not registering with rpcbind is the problem. (If you just
mention "registering nfsd", most are liable to ignore the email, from what
I've seen.)

Good luck with it, rick

> On Thu, Nov 21, 2024 at 6:35 PM Rick Macklem wrote:
>>
>> On Thu, Nov 21, 2024 at 1:22 PM Zaphod Beeblebrox wrote:
>> >
>> > I've tried a lot of different combinations of rc variables. On 13.3 and
>> > 14.1, nfsd in most (non-v4-only) configurations registers with rpcbind
>> > as expected. This is true of restarting nfsd and of using nfsd -r.
>> >
>> > However, on 14.2, I can't contrive any configuration that registers
>> > with rpcbind. Minimally, on one fairly quiet 14.1 server, I simply have
>> >
>> > nfs_server_enable="YES"
>> > mountd_enable="YES"
>> > mountd_flags="-h -S"
>> >
>> > on another, I have more:
>> >
>> > mountd_enable="YES"
>> > nfs_client_enable="YES"
>> > nfs_server_enable="YES"
>> > nfsv4_server_enable="NO"
>> > #nfs_server_flags="-u -t -n 12" # Flags to nfsd (if enabled).
>> > nfsuserd_enable="YES"
>> > nfsuserd_flags="-domain daveg.ca"
>> > nfscbd_enable="YES"
>> > rpc_lockd_enable="YES"
>> > rpc_statd_enable="YES"
>> >
>> > readup for what the 14.2 server has --- but I've tried configurations
>> > going from the former to the latter. None of them register.
>> >
>> All I can suggest is checking lo0 to make sure it is using 127.0.0.1.
>> See what
>> # ifconfig -a
>> shows.
>>
>> If lo0 is not 127.0.0.1, that would explain it, since the rpcbind stuff
>> uses 127.0.0.1.
>>
>> Note that 127.0.0.1 gets added automatically when "-h" is used.
>>
>> Btw, I don't think I changed anything w.r.t. this between 14.1 and 14.2,
>> so it is likely some other network-related change.
>>
>> rick
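For anyone chasing the same symptom, here is a minimal check of whether nfsd actually registered. It is a sketch that parses `rpcinfo -p` output (NFS is RPC program 100003); the helper reads stdin so it can be exercised offline:

```shell
# Sketch: decide from "rpcinfo -p" output whether nfsd (RPC program
# 100003) is registered with rpcbind. Reads the listing on stdin so it
# can be tested without a live rpcbind; on a server, feed it rpcinfo.
nfsd_registered() {
    awk '$1 == 100003 { found = 1 } END { exit !found }'
}

# On the server (rpcbind listens on 127.0.0.1 by default):
#   rpcinfo -p 127.0.0.1 | nfsd_registered && echo registered || echo missing
```

If program 100003 never shows up after `service nfsd restart`, the failure is on the registration path rather than in nfsd's request handling.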
[Bug 271238] mpr (LSI SAS3816) driver not finding all devices in HP D6020 enclosures
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271238

Peter Eriksson changed:

           What   |Removed                    |Added
           Summary|mpr (LSI SAS3816) driver   |mpr (LSI SAS3816) driver
                  |not finding ses devices in |not finding all devices in
                  |HP D6020 enclosures        |HP D6020 enclosures
possible NVMe DMA buffer management issue in 14-stable
Hi,

After updating from 13.4-stable to 14.2-stable earlier today, I've started seeing a few batches of entries like the following in my syslog:

Nov 24 16:17:52 corona kernel: DMAR4: Fault Overflow
Nov 24 16:17:52 corona kernel: nvme0: WRITE sqid:15 cid:121 nsid:1 lba:1615751416 len:256
Nov 24 16:17:52 corona kernel: DMAR4: nvme0: pci7:0:0 sid 700 fault acc 1 adt 0x0 reason 0x6 addr 42d000
Nov 24 16:17:52 corona kernel: nvme0: DATA TRANSFER ERROR (00/04) crd:0 m:1 dnr:1 p:1 sqid:15 cid:121 cdw0:0
Nov 24 16:17:52 corona kernel: (nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 cdw=604e68f8 0 ff 0 0 0
Nov 24 16:17:52 corona kernel: (nda0:nvme0:0:0:1): CAM status: Unknown (0x420)
Nov 24 16:17:52 corona kernel: (nda0:nvme0:0:0:1): Error 5, Retries exhausted
Nov 24 16:17:52 corona ZFS[11614]: vdev I/O failure, zpool=zroot path=/dev/nda0p4 offset=824843563008 size=131072 error=5

I've had Intel DMAR enabled on this machine for a long time and haven't seen anything like this before. The sequence of events (a DMAR fault first, followed by the NVMe transfer error), combined with the fact that I haven't yet seen DMAR faults for anything besides NVMe, and that I upgraded from 13 to 14 only a few hours ago, makes me suspect some nvme change between 13 and 14 introduced a subtle DMA buffer management bug that the IOMMU is catching.

Has anyone else seen anything similar?

Thanks,
Jason
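To test the "every DMAR fault pairs with an NVMe error" hypothesis across a whole log, a small scan can print each fault line together with the nvme error that follows it. This is a sketch; the match patterns are taken from the messages above and are assumptions for other hardware:

```shell
# Sketch: walk a syslog file and print each DMAR fault line together
# with the first nvme ERROR line that follows it, to confirm the DMAR
# faults correlate with NVMe I/O. The regexes are based on the log
# excerpt quoted above and may need adjusting on other systems.
dmar_nvme_pairs() {
    awk '
        /DMAR[0-9]+:.*fault/                { fault = $0; next }
        fault != "" && /nvme[0-9]+:.*ERROR/ { print fault; print $0; fault = "" }
    '
}

# Usage: dmar_nvme_pairs < /var/log/messages
```

If the scan turns up DMAR faults with no trailing nvme error (or faults naming other sid values), that would weaken the NVMe-specific theory.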