Re: nvd->nda switch and blocksize changes for ZFS

2023-09-25 Thread Eugene Grosbein
25.09.2023 13:42, Frank Behrens wrote:

> With this information I'm not sure if I really have a problem with the
> native blocksize.
> Does anybody know how the stripesize is determined?

It is reported by the underlying driver, which is nda or nvd in your case.

Eugene




Re: nvd->nda switch and blocksize changes for ZFS

2023-09-25 Thread Dimitry Andric
On 25 Sep 2023, at 08:42, Frank Behrens  wrote:
> 
> Hi Dimitry, Yuri and also Mark, thanks for your fast responses!
> 
> On 23.09.2023 at 20:58, Yuri Pankov wrote:
...
> # smartctl -a /dev/nvme0
> Namespace 1 Formatted LBA Size: 512
> ...
> Supported LBA Sizes (NSID 0x1)
> Id Fmt  Data  Metadt  Rel_Perf
>  0 + 512   0 0

This is the default compatibility sector size of 512 bytes, so it is not 
relevant.


> # nvmecontrol identify nda0 and # nvmecontrol identify nvd0 (after 
> hw.nvme.use_nvd="1" and reboot) give the same result:
> Number of LBA Formats:   1
> Current LBA Format:  LBA Format #00
> LBA Format #00: Data Size:   512  Metadata Size: 0  Performance: Best
> ...
> Optimal I/O Boundary:0 blocks
> NVM Capacity:1000204886016 bytes
> Preferred Write Granularity: 32 blocks
> Preferred Write Alignment:   8 blocks
> Preferred Deallocate Granul: 9600 blocks
> Preferred Deallocate Align:  9600 blocks
> Optimal Write Size:  256 blocks

My guess is that the "Preferred Write Granularity" is the optimal size, in this 
case 32 'blocks' of 512 bytes, so 16 kiB. This also matches the stripe size 
reported by geom, as you showed.

The "Preferred Write Alignment" is 8 * 512 = 4 kiB, so you should align 
partitions etc to at least this. However, it cannot hurt to align everything to 
16 kiB either, which is an integer multiple of 4 kiB.


> The recommended blocksize for ZFS is GEOM's stripesize and there I see a 
> difference:
> 
> # diff -w -U 10  gpart_list_nvd.txt gpart_list_nda.txt
> -Geom name: nvd0
> +Geom name: nda0
>  modified: false
>  state: OK
>  fwheads: 255
>  fwsectors: 63
>  last: 1953525127
>  first: 40
>  entries: 128
>  scheme: GPT
>  Providers:
> -1. Name: nvd0p1
> +1. Name: nda0p1
> Mediasize: 272629760 (260M)
> Sectorsize: 512
> -   Stripesize: 4096
> -   Stripeoffset: 0
> +   Stripesize: 16384
> +   Stripeoffset: 4096

Yeah, I suspect that nda reports the "stripesize" from the NVMe "Preferred
Write Granularity" and the "stripeoffset" from the NVMe "Preferred Write
Alignment". I think Warner's the resident expert on NVMe drivers, so maybe
he's got some clue. :)

-Dimitry





Re: nvd->nda switch and blocksize changes for ZFS

2023-09-25 Thread Frank Behrens

On 25.09.2023 at 13:58, Dimitry Andric wrote:
>> # nvmecontrol identify nda0 and # nvmecontrol identify nvd0 (after
>> hw.nvme.use_nvd="1" and reboot) give the same result:
>>
>> Number of LBA Formats:   1
>> Current LBA Format:  LBA Format #00
>> LBA Format #00: Data Size:   512  Metadata Size: 0  Performance: Best
>> ...
>> Optimal I/O Boundary:0 blocks
>> NVM Capacity:1000204886016 bytes
>> Preferred Write Granularity: 32 blocks
>> Preferred Write Alignment:   8 blocks
>> Preferred Deallocate Granul: 9600 blocks
>> Preferred Deallocate Align:  9600 blocks
>> Optimal Write Size:  256 blocks
>
> My guess is that the "Preferred Write Granularity" is the optimal size, in
> this case 32 'blocks' of 512 bytes, so 16 kiB. This also matches the stripe
> size reported by geom, as you showed.
>
> The "Preferred Write Alignment" is 8 * 512 = 4 kiB, so you should align
> partitions etc. to at least this. However, it cannot hurt to align everything
> to 16 kiB either, which is an integer multiple of 4 kiB.


Eugene gave me a tip, so I looked into the drivers.

dev/nvme/nvme_ns.c:

uint32_t
nvme_ns_get_stripesize(struct nvme_namespace *ns)
{
	uint32_t ss;

	if (((ns->data.nsfeat >> NVME_NS_DATA_NSFEAT_NPVALID_SHIFT) &
	    NVME_NS_DATA_NSFEAT_NPVALID_MASK) != 0) {
		ss = nvme_ns_get_sector_size(ns);
		if (ns->data.npwa != 0)
			return ((ns->data.npwa + 1) * ss);
		else if (ns->data.npwg != 0)
			return ((ns->data.npwg + 1) * ss);
	}
	return (ns->boundary);
}

cam/nvme/nvme_da.c:

	if (((nsd->nsfeat >> NVME_NS_DATA_NSFEAT_NPVALID_SHIFT) &
	    NVME_NS_DATA_NSFEAT_NPVALID_MASK) != 0 && nsd->npwg != 0)
		disk->d_stripesize = ((nsd->npwg + 1) * disk->d_sectorsize);
	else
		disk->d_stripesize = nsd->noiob * disk->d_sectorsize;

So it seems that nvd uses sectorsize * "Preferred Write Alignment" as the
stripesize (npwa is checked first), while nda uses sectorsize * "Preferred
Write Granularity".


My current interpretation is that the nvd driver reports the wrong value
for maximum performance and reliability. I should make a backup and
re-create the pool.
Maybe we should note in the 14.0 release notes that the switch to nda is
not a "nop".


--
Frank Behrens
Osterwieck, Germany




Re: [UPDATE] FreeBSD 14.0-BETA3 Now Available

2023-09-25 Thread Ronald Klop

On 9/25/23 22:18, James Comfort wrote:

Just out of curiosity, is there a reason that virtualbox-ose-additions don’t 
show up in a pkg search? I’ve tried compiling it from ports, but keep getting 
errors.




Yes. The pkg build was failing on FreeBSD 14 and 15 for some time. Recently a 
patch was committed which fixes this. The first pkg build containing this patch 
just finished. The pkgs will probably appear on the download servers tomorrow 
or the day after if nothing unexpected happens.

This is the log of the successful build of virtualbox-ose-additions on 
releng/14:
https://pkg-status.freebsd.org/beefy12/data/140releng-amd64-default/e88d010d0a2b/logs/virtualbox-ose-additions-6.1.46.log

Regards,
Ronald.




Jim Comfort


*From: *Nuno Teixeira 
*Sent: *Saturday, September 23, 2023 5:11 AM
*To: *Kurt Jaeger 
*Cc: *Glen Barber ; freebsd-curr...@freebsd.org 
; freebsd-sta...@freebsd.org 
; FreeBSD Release Engineering Team 

*Subject: *Re: [UPDATE] FreeBSD 14.0-BETA3 Now Available

Same error but it seems that upgrade completed and pkgs cleaned:

---

=>> Building lang/rust
build started at Sat Sep 23 12:26:01 WEST 2023
port directory: /usr/ports/lang/rust
package name: rust-1.72.0
building for: FreeBSD 140amd64-main-job-02 14.0-BETA3 FreeBSD 14.0-BETA3 amd64
maintained by: r...@freebsd.org
Makefile datestamp: -rw-r--r--  1 1001 1001 11640 Sep  9 08:02 
/usr/ports/lang/rust/Makefile
Poudriere version: poudriere-git-3.3.99.20220831
Host OSVERSION: 151
Jail OSVERSION: 1400097
Job Id: 02

---

Kurt Jaeger <p...@freebsd.org> wrote on Saturday, 23/09/2023 at 12:33:

Hi!

> On Fri, Sep 22, 2023 at 10:50:08PM +, Glen Barber wrote:
> > === Upgrading ===
> >
> > Due to a known delay, freebsd-update(8) binary update builds are not yet
> > ready for BETA3.  A separate email will be sent once they are available.
> >
>
> Binary updates via freebsd-update(8) are now available for systems
> already running 14.0-BETA.

If I try to update my poudriere 14.0-BETA2 jail with this command:

poudriere jail -u -j 140 -t 14.0-BETA3

it fails, see here:

https://people.freebsd.org/~pi/logs/pou-fail.txt 

(short, only 1550 bytes)

I have:

140   14.0-BETA2      amd64 http    2023-09-16 08:20:01 /pou/jails/140

with

/usr/local/bin/poudriere installed by package 
poudriere-devel-3.3.99.20220831

-- 
p...@freebsd.org         +49 171 3101372                  Now what ?




--

Nuno Teixeira
FreeBSD Committer (ports)






ifconfig -v ix0 output delay

2023-09-25 Thread mike tancsa

Hi All,

    A small annoyance, but I was wondering why "ifconfig -v ix0" seems
to take a "long time" compared to other 10G NICs, e.g.:



0(nfs3b2)# time ifconfig -v ix0
ix0: flags=8863 metric 0 mtu 1500
options=4e53fbb
    ether 0c:c4:7a:6f:20:a0
    inet6 fe80::ec4:7aff:fe6f:20a0%ix0 prefixlen 64 scopeid 0x1
    inet6 2607:f3e0:0:6:ec4:7aff:fe6f:20a0 prefixlen 64 autoconf
    inet 10.255.255.132 netmask 0xff00 broadcast 10.255.255.255
    media: Ethernet autoselect (1000baseT )
    status: active
    nd6 options=23
0.000u 1.251s 0:01.25 100.0%    167+198k 0+0io 0pf+0w
0(nfs3b2)#

vs

% time ifconfig -v cxl0
cxl0: flags=8843 metric 0 mtu 1500
options=6ec07bb
    ether 00:07:43:60:c4:b0
    inet 10.251.12.1 netmask 0xff00 broadcast 10.251.12.255
    media: Ethernet 10Gbase-Twinax 
    status: active
    nd6 options=29
    plugged: SFP/SFP+/SFP28 Unknown (Copper pigtail)
    vendor: OEM PN: SFP-H10GB-CU5M SN: S220304060201 DATE: 2022-03-24
0.000u 0.002s 0:00.02 0.0%  0+0k 0+0io 0pf+0w


Going through truss, the delay seems to be after 
"ioctl(3,SIOCGIFSTATUS,0x65d9929d0f0) ERR#22 'Invalid argument'"



socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0)    = 6 (0x6)
ioctl(6,SIOCGIFINDEX,0x65d9929cbf0)  = 0 (0x0)
close(6) = 0 (0x0)
ioctl(5,SIOCGDEFIFACE_IN6,0x65d9929cca0) = 0 (0x0)
close(5) = 0 (0x0)
    nd6 options=23
write(1,"\tnd6 options=23