Libressl issue verifying self-signed certs with tls-auth and Openvpn

2017-06-20 Thread Andrew Lemin
Hi Misc,

Has anyone else come across any issues recently with Openvpn, Libressl and
TLS on OpenBSD 6.1?

I am using an .ovpn file, with the tls-auth static key and certificate inline
in the file, to connect to a VPN service. I am running the openvpn binary from
the command line with no special parameters, just the .ovpn file.

I have tested that the same config works fine on a Linux server (using
OpenSSL), so the server side, CA and certificate are fine.

On the Linux server I see the line "Control Channel Authentication:
tls-auth using INLINE static key file", but this debug output does not appear
on the OpenBSD side, which makes me wonder whether LibreSSL is not negotiating
TLS properly.


I have since found CVE-2017-8301, which I believe is related, and have
confirmed that OpenBSD 6.1 is running LibreSSL 2.5.2.

The CVE is reported against versions 2.5.1 through 2.5.3, and looking at the
OpenBSD trees I can see 2.5.4 was cut around the 1st of May.

I used M:Tier to grab all major patches, but LibreSSL is not in the patch list
yet; openvpn did have a minor update.

So I downloaded the LibreSSL 2.5.4 source, then compiled and installed it as
per INSTALL. However, I notice that openvpn is still linking against 2.5.2.

It would be great if someone could confirm whether this CVE is indeed the same
issue, and whether 2.5.4 includes the relevant fixes for it.

And if yes, a gentle nudge as to how to get openvpn to link against the 2.5.4
install?
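For what it's worth, here is how I am checking which libssl the binary actually resolves at run time, and trying to force it to the self-built libraries (a sketch; the /usr/local paths are my assumption based on the default INSTALL prefix, and 'client.ovpn' is a placeholder):

```shell
# Show which libssl/libcrypto the packaged openvpn binary links against
ldd /usr/local/sbin/openvpn | grep -e ssl -e crypto

# Ask the run-time linker to prefer the locally built LibreSSL 2.5.4
# (assumes 'make install' placed the libraries under /usr/local/lib)
LD_LIBRARY_PATH=/usr/local/lib openvpn --config client.ovpn
```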

Thanks for your time.
Kind regards, Andy Lemin



Sent from a teeny tiny keyboard, so please excuse typos


Re: Libressl issue verifying self-signed certs with tls-auth and Openvpn

2017-06-20 Thread Andrew Lemin
Hi,

Sadly, my testing shows that CVE-2017-8301
(http://seclists.org/oss-sec/2017/q2/145) is still broken with the
latest LibreSSL (2.5.4) and OpenVPN 2.4.2.

Here is someone else reporting the same issue;
https://discourse.trueos.org/t/libre-openssl-tls-error-when-using-openvpn/1358/4

Of course I may have gotten this wrong somewhere, but for now it seems it is
not possible on OpenBSD to use OpenVPN as a client against a server that uses
tls-auth with a static certificate.

Hope this helps clarify things for anyone else hitting the same issue until
some clever person produces a fix.


The error is the same with the latest versions;

Tue Jun 20 22:51:15 2017 OpenVPN 2.4.2 x86_64-unknown-openbsd6.1 [SSL
(OpenSSL)] [LZO] [LZ4] [MH/RECVDA] [AEAD] built on Jun 20 2017

Tue Jun 20 22:51:15 2017 library versions: LibreSSL 2.5.4, LZO 2.10

[...]

Tue Jun 20 22:52:08 2017 VERIFY ERROR: depth=0, error=self signed
certificate: < Cert Info >

Tue Jun 20 22:52:08 2017 OpenSSL: error:14007086:SSL
routines:CONNECT_CR_CERT:certificate verify failed

Tue Jun 20 22:52:08 2017 TLS_ERROR: BIO read tls_read_plaintext error

Tue Jun 20 22:52:08 2017 TLS Error: TLS object -> incoming plaintext read
error

Tue Jun 20 22:52:08 2017 TLS Error: TLS handshake failed

Tue Jun 20 22:52:08 2017 SIGUSR1[soft,tls-error] received, process
restarting

On Tue, Jun 20, 2017 at 8:49 PM, Andy Lemin  wrote:

> I've just found this hint on GitHub for the Openvpn compile options for
> Libressl;
> https://gist.github.com/gsora/2b3e9eb31c15a356c7662b0f960e2995
>
> So will try a build later tonight and share back here if that CVE is fixed.
>
> Would prefer to rebuild with the same options as the packaged binary, and
> it occurred to me that I don't know how to find that on OpenBSD?
>
> Thanks again :)
>
>
> Sent from a teeny tiny keyboard, so please excuse typos
>
> On 20 Jun 2017, at 20:23, Andrew Lemin  wrote:
>
> Hi Misc,
>
> Has anyone else come across any issues recently with Openvpn, Libressl and
> TLS on OpenBSD 6.1?
>
> I am using an .ovpn file with TLS auth static key and cert inline within
> the file, to connect to VPN service. Running openvpn binary from command
> line without any special params, just .ovpn file.
>
> I have tested this is working fine on a Linux server with same config
> (using Openssl), so the server side, CA and cert are fine etc.
>
> I noticed on the Linux server the line; "Control Channel Authentication:
> tls-auth using INLINE static key file", but I do not see this debug on the
> OpenBSD version. Wondered if Libressl is not negotiating tls properly.
>
>
> I have since found CVE-2017-8301 which I believe is related. And confirmed
> that OpenBSD 6.1 seems to be running LibreSSL version 2.5.2
>
> The CVE shows issue known between 2.5.1 and 2.5.3, and looking at the
> OpenBSD trees I can see 2.5.4 was cut around 1st of May..
>
> I used MTier to grab all major patches etc, but LibreSSL not in patch list
> yet. openvpn did have a minor.
>
> So downloaded Libressl 2.5.4 source, compiled and installed as per INSTALL
> etc.. However notice that openvpn is still linking to 2.5.2.
>
> It would be great if someone would be kind enough to confirm if this CVE
> is indeed the same issue, and if 2.5.4 includes the relevant fixes for it?
>
> And if yes, a gentle nudge as to how to get openvpn to link to the 2.5.4
> install?
>
> Thanks for your time.
> Kind regards, Andy Lemin
>
>
>
> Sent from a teeny tiny keyboard, so please excuse typos
>
>


PF Outbound traffic Load Balancing over multiple tun/openvpn interfaces/tunnels

2018-09-11 Thread Andrew Lemin
Hi list,

I use an OpenVPN based internet access service (like NordVPN, AirVPN etc).

The issue with these public VPN services is that the servers are always 
congested; the most I'll get is maybe 10 Mbit/s through one server.

My local connection is a few hundred Mbit/s.

So I had the idea of running multiple openvpn tunnels to different servers, and 
load balancing outbound traffic across the tunnels.

Sounds simple enough..

However, every VPN tunnel uses the same subnet and next-hop gateway, which of 
course won't work with normal routing.

So my question:
How can I use rdomains or rtables with openvpn clients, so that each VPN is 
started in its own logical VRF?

And is it then just a case of using PF to push outbound packets into the 
various rdomains/rtables randomly (maintaining state, of course)? The LAN 
interface would be in the default rdomain/rtable.

My confusion is that an interface needs to be bound to the logical VRF, but the 
tunX interfaces are created dynamically by openvpn.

So I am not sure how to configure this via hostname.tunX etc., or whether I'm 
even approaching this correctly?
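For what it's worth, here is the rough shape I have in mind, as an untested sketch (the rdomain numbers, interface names, config file names and the pf macro $lan are all placeholders): pre-create each tun interface in its own rdomain, start each openvpn instance inside that routing table, and use match rules in pf.conf to spread outbound states across the rtables.

```
# /etc/hostname.tun0 -- pre-create the tunnel in routing domain 1
rdomain 1
up

# Start each client inside its routing table, reusing the pre-made tun:
#   route -T 1 exec openvpn --config vpn1.ovpn --dev tun0
#   route -T 2 exec openvpn --config vpn2.ovpn --dev tun1

# pf.conf -- later matches override earlier ones, so roughly half of the
# new outbound states land in each rtable (state is kept per flow):
#   match in on $lan from $lan:network rtable 2
#   match in on $lan from $lan:network probability 50% rtable 1
```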

Thanks, Andy.



Cannot mount install.fs disk image to create custom auto_install.conf based USB flash drive

2018-11-11 Thread Andrew Lemin
Hi list,
I really need some help mounting an install.fs disk image, and hope someone
can help :)
I have been trying and failing to create an auto-installing USB flash drive
for OpenBSD.

All of the below steps are being performed using an existing OpenBSD VM

1) Create /auto_install.conf file
https://man.openbsd.org/autoinstall
http://eradman.com/posts/autoinstall-openbsd.html
- Done

2) Install 'upobsd' package
pkg_add -i upobsd
- Done

3) Inject newly created 'auto_install.conf' into a local 'bsd.rd' RAM disk
upobsd -u /auto_install.conf -o /tmp/bsd.rd
- Done

4) Add updated 'bsd.rd' file into 'install.fs'
4a) Associate image with a vnd device so disk image can be mounted as a
filesystem image
vnconfig vnd1 /home/sysadmin/install64.fs
- Done

4b) Mount new vnd1c device (this is where I'm stuck)

** Here is where I get lost. All the guides refer only to using
install.iso (whose 'a:' and 'c:' partitions are ISO9660 filesystems, for CD
based installs), but I need to use install.fs (for USB based installs) **

fw1# mount /dev/vnd1c /mnt
mount_ffs: /dev/vnd1c on /mnt: Invalid argument
fw1# mount -t cd9660 /dev/vnd1c /mnt
mount_cd9660: /dev/vnd1c on /mnt: Invalid argument
fw1# mount -t msdos /dev/vnd1c /mnt
mount_msdos: /dev/vnd1c on /mnt: not an MSDOS filesystem
fw1# mount -t ext2fs /dev/vnd1c /mnt
mount_ext2fs: /dev/vnd1c on /mnt: Input/output error

As you can see, none of the types I know about are working.

bsd1# disklabel vnd1
# /dev/rvnd1c:
type: vnd
disk: vnd device
label: fictitious
duid: e5445c1e269855f0
flags:
bytes/sector: 512
sectors/track: 100
tracks/cylinder: 1
sectors/cylinder: 100
cylinders: 7382
total sectors: 738240
boundstart: 1024
boundend: 737280
drivedata: 0
16 partitions:
#size   offset  fstype [fsize bsize   cpg]
  a:   736256 1024  4.2BSD   2048 16384 16142
  c:   738240    0  unused
  i:  960   64   MSDOS

I cannot work out what the filesystem type should be; it shows as 'unused' here.

NB; If I try with the 'install.iso' disk image the vnd mount works fine
(with '-t cd9660').
But I need this to work for a flash drive?
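Looking again at the disklabel output above, the 'c' partition spans the whole image with no filesystem (hence fstype 'unused'), while the FFS filesystem is on the 'a' partition, so mounting 'a' may be the missing step (untested sketch):

```
vnconfig vnd1 /home/sysadmin/install64.fs
mount /dev/vnd1a /mnt   # 'a' is the 4.2BSD (FFS) partition; 'c' is the raw image
cp /tmp/bsd.rd /mnt/
umount /mnt
vnconfig -u vnd1
```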



Assuming I could get past this, I think I would then need to do the
following;

4c) Copy in bsd.rd
cp /tmp/bsd.rd /mnt/

4d) Unmount /mnt
umount /mnt

4e) Disassociate vnd1
vnconfig -u /dev/vnd1

5) Copy the modified install.fs image to the USB flash drive
dd if=install*.fs of=/dev/rsd6c bs=1m

Thanks in advance for your time and help.
Andy.


Intel Celeron SoC support

2018-11-14 Thread Andrew Lemin
Hi,

I am running an ASRock J4105B-ITX board and wanting to run OpenBSD on this.
https://www.asrock.com/MB/Intel/J4105B-ITX/index.asp#BIOS

It boots up, and at the 'boot>' prompt I can use the keyboard fine.

However, after it boots, the keyboard stops working and no disks are found by
the installer (I used auto_install to send test commands).
From what I can work out, there appears to be no chipset support for the
Intel Celeron J4105 CPU.

To verify the hardware is fine and it is just OpenBSD that is not working,
I installed Linux and have included the dmesg below (from Linux).
I cannot capture a dmesg from the OpenBSD installer as I cannot use the
keyboard.

Will support come for this SoC architecture? Or am I better off selling this
board?

I think it's a Gemini Lake SoC chipset;

[0.00] Linux version 4.9.0-8-amd64 (debian-ker...@lists.debian.org)
(gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian
4.9.130-2 (2018-10-27)
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.9.0-8-amd64
root=/dev/mapper/virt1--vg-root ro quiet intel_iommu=on
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point
registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds
registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[0.00] x86/fpu: xstate_offset[3]:  576, xstate_sizes[3]:   64
[0.00] x86/fpu: xstate_offset[4]:  640, xstate_sizes[4]:   64
[0.00] x86/fpu: Enabled xstate features 0x1b, context size is 704
bytes, using 'compacted' format.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0003dfff] usable
[0.00] BIOS-e820: [mem 0x0003e000-0x0003]
reserved
[0.00] BIOS-e820: [mem 0x0004-0x0009dfff] usable
[0.00] BIOS-e820: [mem 0x0009e000-0x000f]
reserved
[0.00] BIOS-e820: [mem 0x0010-0x0fff] usable
[0.00] BIOS-e820: [mem 0x1000-0x12150fff]
reserved
[0.00] BIOS-e820: [mem 0x12151000-0x76d93fff] usable
[0.00] BIOS-e820: [mem 0x76d94000-0x7963dfff]
reserved
[0.00] BIOS-e820: [mem 0x7963e000-0x7968efff] usable
[0.00] BIOS-e820: [mem 0x7968f000-0x796b6fff] ACPI
NVS
[0.00] BIOS-e820: [mem 0x796b7000-0x799eafff]
reserved
[0.00] BIOS-e820: [mem 0x799eb000-0x79a9bfff] type
20
[0.00] BIOS-e820: [mem 0x79a9c000-0x7a4c1fff] usable
[0.00] BIOS-e820: [mem 0x7a4c2000-0x7a56dfff]
reserved
[0.00] BIOS-e820: [mem 0x7a56e000-0x7abf] usable
[0.00] BIOS-e820: [mem 0x7ac0-0x7fff]
reserved
[0.00] BIOS-e820: [mem 0xd000-0xd0ff]
reserved
[0.00] BIOS-e820: [mem 0xd3709000-0xd3709fff]
reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff]
reserved
[0.00] BIOS-e820: [mem 0xfe042000-0xfe044fff]
reserved
[0.00] BIOS-e820: [mem 0xfe90-0xfe902fff]
reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff]
reserved
[0.00] BIOS-e820: [mem 0xfed01000-0xfed01fff]
reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff]
reserved
[0.00] BIOS-e820: [mem 0xff00-0x]
reserved
[0.00] BIOS-e820: [mem 0x0001-0x00017fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.60 by American Megatrends
[0.00] efi:  ACPI 2.0=0x7968f000  ACPI=0x7968f000
SMBIOS=0x79948000  SMBIOS 3.0=0x79947000  ESRT=0x75cce798
MEMATTR=0x73b5e098
[0.00] SMBIOS 3.1.1 present.
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x18 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00FF00 mask 7FFF00 write-combining
[0.00]   1 base 00 mask 7F8000 write-back
[0.00]   2 base 007B00 mask 7FFF00 uncachable
[0.00]   3 base 007C00 mask 7FFC00 uncachable
[0.00]   4 base 01 mask 7F8000 write-back
[0.00]   5 base 009000 mask 7FF000 write-combining
[0.00]   6 disabled
[0.00]   7 disabled
[0.00]   8 disabled
[0.00]   9 disabled
[0.00] x86/PAT: Configuration 

Re: Disable ftp in pkg_add syspatch sysupgrade

2019-10-30 Thread Andrew Lemin
Hi gents,

Sorry for the slow reply, and thank you for all your responses! :D

Raf, you are correct: the ftp client is performing the http(s) downloads.
To me this seems unusual (I was expecting 'curl' or 'wget' etc. to avoid code
duplication) and confusing. What do you think?

Stuart, thanks for your suggestion. This confirmed the ftp client is using
http(s);
[HOME]root@testbsd1:/local#pgrep -lf ftp
40379 /usr/bin/ftp -o -
http://mirror.bytemark.co.uk/pub/OpenBSD/6.5/packages-stable/amd64/quirks-3.124.tgz

Tom/PJ, understood. I was just very confused about why ftp was getting involved.


Anyway, I have tested this some more, and it looks like the issue is
related to using "flavors", and that some sort of timeout may be
occurring.

- We can see that pkg_add is working fine when specifying packages
explicitly;
[HOME]root@testbsd1:/local#pkg_add sudo--gettext bash htop
quirks-3.124 signed on 2019-10-16T20:27:45Z
[HOME]root@testbsd1:/local#pkg_add vim--no_x11-perl-python3-ruby
unzip--iconv bzip2 git fzf
quirks-3.124 signed on 2019-10-16T20:27:45Z

- But it throws errors when I try to use flavors, which is critical for
installing Python, for example (NB: this is a different error to before,
when I was getting 'timeout' instead of 'Invalid argument');
[HOME]root@testbsd1:/local#pkg_add python%2 py-pip python%3 py3-pip
py3-setuptools
quirks-3.124 signed on 2019-10-16T20:27:45Z
http://mirror.bytemark.co.uk/pub/OpenBSD/6.5/packages/amd64/py3-setuptools-40.0.0v0.tgz:
ftp: Receiving HTTP reply: Invalid argument
signify: gzheader truncated
Couldn't install py3-setuptools-40.0.0v0

- This package is accessible as seen here;
[HOME]root@testbsd1:/local#wget
http://mirror.bytemark.co.uk/pub/OpenBSD/6.5/packages/amd64/py3-setuptools-40.0.0v0.tgz
/tmp/
--2019-10-30 14:29:28--
http://mirror.bytemark.co.uk/pub/OpenBSD/6.5/packages/amd64/py3-setuptools-40.0.0v0.tgz
Resolving mirror.bytemark.co.uk (mirror.bytemark.co.uk)... 80.68.83.150,
212.110.163.12, 2001:41c8:20:5e6::150, ...
Connecting to mirror.bytemark.co.uk (mirror.bytemark.co.uk)|80.68.83.150|:80...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 731604 (714K) [application/x-gzip]
Saving to: ‘py3-setuptools-40.0.0v0.tgz’

py3-setuptools-40.0.0v0.tgz  100%[===>]  714.46K  270KB/s  in 2.6s

- And works if specified on its own;
[HOME]root@testbsd1:/local#pkg_add py3-setuptools
quirks-3.124 signed on 2019-10-16T20:27:45Z

If I try the line with flavors again ("pkg_add python%2 py-pip python%3
py3-pip py3-setuptools"), it works..

As others would be crying about this too if it were a widespread issue, I
thought this was maybe a bad mirror...
So I have now tried every mirror in the UK, and they all do the same thing:
intermittent issues accessing packages when using flavors.

I am not running squid or any kind of web proxy, http and https are being
passed out with nothing more than standard NAT and a pass rule.
I will try and figure out what is going on. Leave this with me. If I find
anything meaningful and useful I will let you know. For now, consider this an
issue with my setup..

PS: has anyone managed to get ftp-proxy working in an rdomain?

Thanks for your time and responses.. :)
Andy.

On Wed, Oct 30, 2019 at 9:17 AM PJ  wrote:

> Am 30.10.19 um 07:32 schrieb tom ryan:
> > On 2019-10-29 20:19, PJ wrote:
> >> Am 28.10.19 um 23:52 schrieb Stuart Henderson:
> >>> On 2019-10-28, Andy Lemin  wrote:
>  Hi guys,
> 
>  Does anyone know if it is possible to completely disable ftp in the
> package management utilities; pkg_add, syspatch, sysupgrade etc?
> 
>  My PKG_PATH references http:// urls, as does /etc/install. But I
> cannot stop these tools trying to use ftp which does not work! :(
> >>> Can you show some example URLs, for example from "pgrep -lf ftp" while
> >>> trying to use one of these utilities?
> >>>
> >>> The only place I would expect to see ftp:// URLs used
> >> grep ftp /usr/sbin/sysupgrade
> > $ grep -ne ftp -e URL -e MIRROR /usr/sbin/sysupgrade
> > 102:0)  MIRROR=$(sed 's/#.*//;/^$/d' /etc/installurl) 2>/dev/null ||
> > 103:MIRROR=https://cdn.openbsd.org/pub/OpenBSD
> > 105:1)  MIRROR=$1
> > 117:URL=${MIRROR}/snapshots/${ARCH}/
> > 119:URL=${MIRROR}/${NEXT_VERSION}/${ARCH}/
> > 136:unpriv -f SHA256.sig ftp -Vmo SHA256.sig ${URL}SHA256.sig
> > 176:unpriv -f $f ftp -Vmo ${f} ${URL}${f}
> >
> > Your point?
>
> I understand that I misread the question, sorry.
>
>
> >>> is when fetching
> >>> certain distfiles while building some things from ports (and they would
> >>> usually fallback to http://ftp.openbsd.org/pub/OpenBSD/distfiles if
> >>> the ftp fetch failed)..
>
>


PF queue bandwidth limited to 32bit value

2023-09-11 Thread Andrew Lemin
Hi all,
Hope this finds you well.

I have discovered that PF's queueing is still limited to 32bit bandwidth
values.

I don't know if this is a regression or not. I am sure one of the
objectives of the ALTQ rewrite into the new queueing system we have in
OpenBSD today was to allow bandwidth values larger than 4294M. Maybe I am
imagining it.

Anyway, I am trying to use OpenBSD PF to perform and filter inter-VLAN routing
with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
32-bit value.

Setting the bandwidth value to 4295M results in a value overflow: 'systat
queues' shows it wrapped, starting from 0 again. Traffic is indeed restricted
to the wrapped value, so this does not appear to be just a cosmetic
'systat queues' issue.
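For illustration, the wrap-around matches the value being truncated to an unsigned 32-bit count of bits per second (my assumption about the internal representation; I have not checked the source). It also shows why counting bytes instead of bits would buy roughly 8x headroom:

```python
# Model a bandwidth value stored in a uint32 (assumed representation)
def stored_bandwidth(bps: int) -> int:
    return bps & 0xFFFFFFFF  # truncate to 32 bits

print(stored_bandwidth(4_294_000_000))  # 4294M fits unchanged
print(stored_bandwidth(4_295_000_000))  # 4295M wraps around to 32704
print((2**32 - 1) * 8)                  # max if the field held bytes/s: ~34.3 Gbit/s
```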

I am sure this must be a bug/regression, 10Gbps on OpenBSD is trivial and
common nowadays..

Tested on OpenBSD 7.3
Thanks for checking my sanity :)
Andy.


Re: PF queue bandwidth limited to 32bit value

2023-09-12 Thread Andrew Lemin
Hi Stuart.

On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson 
wrote:

> On 2023-09-12, Andrew Lemin  wrote:
> > Hi all,
> > Hope this finds you well.
> >
> > I have discovered that PF's queueing is still limited to 32bit bandwidth
> > values.
> >
> > I don't know if this is a regression or not.
>
> It's not a regression, it has been capped at 32 bits afaik forever
> (certainly was like that when the separate classification via altq.conf
> was merged into PF config, in OpenBSD 3.3).
>

Ah ok, it was talked about so much I thought it was part of it. Thanks for
clarifying.


>
> >  I am sure one of the
> > objectives of the ALTQ rewrite into the new queuing system we have in
> > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I
> am
> > imagining it..
>
> I don't recall that though there were some hopes expressed by
> non-developers.
>

Haha, it is definitely still wanted and needed. prio-only ordering is
too limited.


>
> > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN
> routing
> > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
> > 32bit value?
> >
> > Setting the bandwidth value to 4295M results in a value overflow where
> > 'systat queues' shows it wrapped and starts from 0 again. And traffic is
> > indeed restricted to such values, so does not appear to be just a
> cosmetic
> > 'systat queues' issue.
> >
> > I am sure this must be a bug/regression,
>
> I'd say a not-implemented feature (and I have a feeling it is not
> going to be all that simple a thing to implement - though changing
> scales so the uint32 carries bytes instead of bits per second might
> not be _too_ terrible).
>

Following the great work to SMP-unlock the VLAN interface, and recent
NIC optimisations (offloading and interrupt handling) in various drivers,
you can now push packet-filtered 10Gbps with modern CPUs without breaking a
sweat.

Ah, that's clever! Having bandwidth queues up to 34,352M would definitely
provide runway for the next decade :)

Do you think your idea is worth circulating on tech@ for further
discussion? Queueing at bits-per-second resolution is rather redundant
nowadays, even on the very slowest links.


> >  10Gbps on OpenBSD is trivial and
> > common nowadays..
>
> While using interfaces with 10Gbps link speed on OpenBSD is trivial,
> actually pushing that much traffic (particularly with more complex
> processing e.g. things like bandwidth controls, and particularly with
> smaller packet sizes) not so much.
>
>
> --
> Please keep replies on the mailing list.
>
>


Re: PF queue bandwidth limited to 32bit value

2023-09-12 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 3:43 AM Andrew Lemin  wrote:

> Hi Stuart.
>
> On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson <
> stu.li...@spacehopper.org> wrote:
>
>> On 2023-09-12, Andrew Lemin  wrote:
>> > Hi all,
>> > Hope this finds you well.
>> >
>> > I have discovered that PF's queueing is still limited to 32bit bandwidth
>> > values.
>> >
>> > I don't know if this is a regression or not.
>>
>> It's not a regression, it has been capped at 32 bits afaik forever
>> (certainly was like that when the separate classification via altq.conf
>> was merged into PF config, in OpenBSD 3.3).
>>
>
> Ah ok, it was talked about so much I thought it was part of it. Thanks for
> clarifying.
>
>
>>
>> >  I am sure one of the
>> > objectives of the ALTQ rewrite into the new queuing system we have in
>> > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I
>> am
>> > imagining it..
>>
>> I don't recall that though there were some hopes expressed by
>> non-developers.
>>
>
> Haha, it is definitely still wanted and needed. prio-only based ordering
> is too limited
>

I have noticed another issue while trying to implement a 'prio'-only
workaround (using only prio ordering for inter-VLAN traffic, and HFSC
queueing for internet traffic):
It is not possible to have internal inter-VLAN traffic be solely priority
ordered with 'set prio', because the existence of 'queue' definitions on the
same internal VLAN interfaces (required for internet flows) demands that one
leaf queue be set as 'default'. This forces all inter-VLAN traffic into the
'default' queue even though queueing is not wanted for it, unintentionally
clamping all internal traffic to 4294M just because full queueing is needed
for internet traffic.
In fact 'prio' is irrelevant: with or without 'prio', because queues are
required for internet traffic, all internal traffic becomes bound by the
'default' HFSC queue.

So I would propose that the mandate on the 'default' keyword be relaxed (or
that a new keyword be provided for match/pass rules to force flows to bypass
queueing), and/or that the uint32 scale be changed to bytes instead of bits.

I personally believe both are valid and needed.
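To make the constraint concrete, a minimal pf.conf sketch (interface names, addresses and bandwidth figures are invented for illustration):

```
# Defining any queue on vlan10 obliges one leaf queue to be 'default',
# and everything sent out vlan10 then passes through some queue.
queue main on vlan10 bandwidth 4294M
queue inet parent main bandwidth 900M default

pass out on vlan10 to !10.0.0.0/8 set queue inet  # internet: queueing wanted
pass out on vlan10 to 10.0.0.0/8 set prio 5       # inter-vlan: still falls into
                                                  # 'default', capped at 4294M
```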


>
>
>>
>> > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN
>> routing
>> > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
>> > 32bit value?
>> >
>> > Setting the bandwidth value to 4295M results in a value overflow where
>> > 'systat queues' shows it wrapped and starts from 0 again. And traffic is
>> > indeed restricted to such values, so does not appear to be just a
>> cosmetic
>> > 'systat queues' issue.
>> >
>> > I am sure this must be a bug/regression,
>>
>> I'd say a not-implemented feature (and I have a feeling it is not
>> going to be all that simple a thing to implement - though changing
>> scales so the uint32 carries bytes instead of bits per second might
>> not be _too_ terrible).
>>
>
> Following the great work to SMP unlock in the VLAN interface, and recent
> NIC optimisations (offloading and interrupt handling) in various drivers,
> you can now push packet filtered 10Gbps with modern CPUs without breaking a
> sweat..
>
> A, thats clever! Having bandwidth queues up to 34,352M would
> definitely provide runway for the next decade :)
>
> Do you think your idea is worth circulating on tech@ for further
> discussion? Queueing at bps resolution is rather redundant nowadays, even
> on the very slowest links.
>
>
>> >  10Gbps on OpenBSD is trivial
>> and
>> > common nowadays..
>>
>> While using interfaces with 10Gbps link speed on OpenBSD is trivial,
>> actually pushing that much traffic (particularly with more complex
>> processing e.g. things like bandwidth controls, and particularly with
>> smaller packet sizes) not so much.
>>
>>
>> --
>> Please keep replies on the mailing list.
>>
>>
Thanks again, Andy.


Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 8:22 PM Stuart Henderson 
wrote:

> On 2023-09-12, Andrew Lemin  wrote:
> > A, thats clever! Having bandwidth queues up to 34,352M would
> definitely
> > provide runway for the next decade :)
> >
> > Do you think your idea is worth circulating on tech@ for further
> > discussion? Queueing at bps resolution is rather redundant nowadays, even
> > on the very slowest links.
>
> tech@ is more for diffs or technical questions rather than not-fleshed-out
> quick ideas. Doing this would solve some problems with the "just change it
> to 64-bit" mooted on the freebsd-pf list (not least with 32-bit archs),
> but would still need finding all the places where the bandwidth values are
> used and making sure they're updated to cope.
>
>
Yes, good point :) I am not in a position to undertake this myself at the
moment.
If none of the generous developers feels inclined to do this despite the
broad value, I might have a go myself at some point (though probably not until
next year, sadly).

Regarding the "just change it to 64-bit" idea mooted on the freebsd-pf list:
I have been unable to find this conversation. Do you have a link?


>
> --
> Please keep replies on the mailing list.
>
>


Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson 
wrote:

> On 2023-09-13, Andrew Lemin  wrote:
> > I have noticed another issue while trying to implement a 'prio'-only
> > workaround (using only prio ordering for inter-VLAN traffic, and HSFC
> > queuing for internet traffic);
> > It is not possible to have internal inter-vlan traffic be solely priority
> > ordered with 'set prio', as the existence of 'queue' definitions on the
> > same internal vlan interfaces (required for internet flows), demands one
> > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
> > the 'default' queue despite queuing not being wanted, and so
> > unintentionally clamping all internal traffic to 4294M just because full
> > queuing is needed for internet traffic.
>
> If you enable queueing on an interface all traffic sent via that
> interface goes via one queue or another.
>

Yes, that is indeed the very problem. Queueing is enabled on the inside
interfaces, with bandwidth values set slightly below the ISP capacities
(there are multiple ISP links as well), so that everything works well for all
internal users.
However, this means that inter-VLAN traffic from client networks to server
networks is restricted to 4294Mbps for no reason. It would make a huge
difference to be able to let local traffic flow without being
queued/restricted.

>
> (also, AIUI the correct place for queues is on the physical interface
> not the vlan, since that's where the bottleneck is... you can assign
> traffic to a queue name as it comes in on the vlan but I believe the
> actual queue definition should be on the physical iface).
>

Hehe, yes I know. Thanks for sharing though.
I actually have very specific reasons for doing this (queues on the VLAN
ifaces rather than the phy): there are multiple ISP connections for multiple
VLANs, so each VLAN queue is set to restrict to the relevant ISP link.


>
> "required for internet flows" - depends on your network layout.. the
> upstream feed doesn't have to go via the same interface as inter-vlan
> traffic.


I'm not sure what you mean. All the internal networks/VLANs are connected
to local switches, and the switches have a trunk to the firewall, which hosts
the default gateway for the VLANs and does inter-VLAN routing.
So all the clients go through the same VLANs/trunk/gateway for inter-VLAN
traffic as they do for internet. Strict L3/4 filtering is required on
inter-VLAN traffic.
I am honestly looking for recognition that this is a correct, valid and
common setup, and that there is a genuine need to allow flows to not be
queued on interfaces that have queues (which has many potential applications
beyond my use case, so should be of interest to the developers).

Do you know why there has to be a default queue? Yes, I know that traffic
excluded from queues would take capacity from the same interface the queueing
is trying to manage, potentially causing congestion. But with 10Gbps
networking, which is beyond common now, this does not matter when the queues
are stuck at 4294Mbps.

I am desperately trying to find workarounds that appeal. Surely the need is a
no-brainer, and it is just a case of encouraging interest from a developer?

Thanks :)


Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Thu, Sep 14, 2023 at 7:23 PM Andrew Lemin  wrote:

>
>
> On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson <
> stu.li...@spacehopper.org> wrote:
>
>> On 2023-09-13, Andrew Lemin  wrote:
>> > I have noticed another issue while trying to implement a 'prio'-only
>> > workaround (using only prio ordering for inter-VLAN traffic, and HSFC
>> > queuing for internet traffic);
>> > It is not possible to have internal inter-vlan traffic be solely
>> priority
>> > ordered with 'set prio', as the existence of 'queue' definitions on the
>> > same internal vlan interfaces (required for internet flows), demands one
>> > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
>> > the 'default' queue despite queuing not being wanted, and so
>> > unintentionally clamping all internal traffic to 4294M just because full
>> > queuing is needed for internet traffic.
>>
>> If you enable queueing on an interface all traffic sent via that
>> interface goes via one queue or another.
>>
>
> Yes, that is indeed the very problem. Queueing is enabled on the inside
> interfaces, with bandwidth values set slightly below the ISP capacities
> (multiple ISP links as well), so that all things work well for all internal
> users.
> However this means that inter-vlan traffic from client networks to server
> networks are restricted to 4294Mbps for no reason.. It would make a huge
> difference to be able to allow local traffic to flow without being
> queued/restircted.
>
>
>>
>> (also, AIUI the correct place for queues is on the physical interface
>> not the vlan, since that's where the bottleneck is... you can assign
>> traffic to a queue name as it comes in on the vlan but I believe the
>> actual queue definition should be on the physical iface).
>>
>
> Hehe yes I know. Thanks for sharing though.
> I actually have very specific reasons for doing this (queues on the VLAN
> ifaces rather than phy) as there are multiple ISP connections for multiple
> VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc.
>

Also, separate to the multiple ISPs (I won't bore you with why, as it is not
relevant here), the other reason for queueing on the VLANs is that it
allows you to get closer to the 10Gbps figure.
I.e., if you have queues on the 10Gbps PHY, you can only egress 4294Mbps to
_all_ VLANs. But if you have queues per-VLAN iface, you can egress multiple
times 4294Mbps in aggregate.
E.g., with vlans 10,11,12,13 on a single mcx0 trunk, 10->11 can do 4294Mbps and
12->13 can do 4294Mbps, giving over 8Gbps egress in total on the PHY. It is
dirty, but like I said, desperate for workarounds... :(
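To make the workaround concrete, here is a minimal pf.conf-style sketch of the per-VLAN layout described above (interface names and bandwidth figures are hypothetical, not taken from the original setup):

```
# Queues defined per vlan iface instead of on the mcx0 trunk,
# so each vlan gets its own 4294M ceiling rather than sharing one.
queue q10 on vlan10 bandwidth 4000M max 4294M
  queue q10_def parent q10 bandwidth 3900M default
queue q11 on vlan11 bandwidth 4000M max 4294M
  queue q11_def parent q11 bandwidth 3900M default

match out on vlan10 set queue q10_def
match out on vlan11 set queue q11_def
```

With this shape, egress on each vlan is capped independently, so concurrent inter-vlan flows can exceed a single queue's 4294M ceiling in aggregate on the trunk.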


>
>
>>
>> "required for internet flows" - depends on your network layout.. the
>> upstream feed doesn't have to go via the same interface as inter-vlan
>> traffic.
>
>
> I'm not sure what you mean. All the internal networks/vlans are connected
> to local switches, and the switches have trunk to the firewall which hosts
> the default gateway for the VLANs and does inter-vlan routing.
> So all the clients go through the same VLANs/trunk/gateway for inter-vlan
> as they do for internet. Strict L3/4 filtering is required on inter-vlan
> traffic.
> I am honestly looking for support to recognise that this is a correct,
> valid and common setup, and so there is a genuine need to allow flows to
> not be queued on interfaces that have queues (which has many potential
> applications for many use cases, not just mine - so should be of interest
> to the developers?).
>
> Do you know why there has to be a default queue? Yes, I know that traffic
> excluded from queues would take from the same interface the queueing is
> trying to manage, and potentially cause congestion. However, with 10Gbps
> networking, which is beyond common now, this does not matter when the queues
> are stuck at 4294Mbps.
>
> Desperately trying to find workarounds that appeal.. Surely the need is a
> no brainer, and it is just a case of trying to encourage interest from a
> developer?
>
> Thanks :)
>


OpenBSD Wireguard implementation not copying ToS from inner to outer WG header

2023-09-17 Thread Andrew Lemin
Hi,

I have been testing the Wireguard implementation on OpenBSD and noticed
that the ToS field is not being copied from the inner unencrypted header to
the outer Wireguard header, resulting in ALL packets going into the same PF
Prio / Queue.

For example, ACKs (for Wireguard encrypted packets) end up in the first
queue (not the priority queue) despite PF rules;

queue ext_iface on $extif bandwidth 1000M max 1000M
  queue pri on $extif parent ext_iface flows 1000 bandwidth 25M min 5M
  queue data on $extif parent ext_iface flows 1000 bandwidth 100M default

match on $extif proto tcp set prio (3, 6) set queue (data, pri)

All unencrypted SYNs and ACKs etc correctly go into the 'pri' queue, and
payload packets go into 'data' queue.
However for Wireguard encrypted packets, _all_ packets (including SYNs and
ACKs) go into the 'data' queue.

I thought maybe you need to force the ToS/prio/queue values, so I also
tried sledgehammer approach;
match proto tcp flags A/A set tos lowdelay set prio 7 set queue pri
match proto tcp flags S/S set tos lowdelay set prio 7 set queue pri

But sadly all encrypted SYNs and ACKs etc still only go into the data queue
no matter what.
This can be confirmed with Wireshark: all ToS bits are lost.

This results in poor Wireguard performance on OpenBSD.

OpenVPN has the --passtos directive to copy the ToS Bits, which means
OpenVPN is faster than Wireguard on OpenBSD.

Thanks, Andy.


Re: OpenBSD Wireguard implementation not copying ToS from inner to outer WG header

2023-09-19 Thread Andrew Lemin
On Mon, Sep 18, 2023 at 10:59 PM Stuart Henderson 
wrote:

> On 2023-09-17, Andrew Lemin  wrote:
> > I have been testing the Wireguard implementation on OpenBSD and noticed
> > that the ToS field is not being copied from the inner unencrypted header
> to
> > the outer Wireguard header, resulting in ALL packets going into the same
> PF
> > Prio / Queue.
> >
> > For example, ACKs (for Wireguard encrypted packets) end up in the first
> > queue (not the priority queue) despite PF rules;
> >
> > queue ext_iface on $extif bandwidth 1000M max 1000M
> >   queue pri on $extif parent ext_iface flows 1000 bandwidth 25M min 5M
> >   queue data on $extif parent ext_iface flows 1000 bandwidth 100M default
> >
> > match on $extif proto tcp set prio (3, 6) set queue (data, pri)
> >
> > All unencrypted SYNs and ACKs etc correctly go into the 'pri' queue, and
> > payload packets go into 'data' queue.
> > However for Wireguard encrypted packets, _all_ packets (including SYNs
> and
> > ACKs) go into the 'data' queue.
> >
> > I thought maybe you need to force the ToS/prio/queue values, so I also
> > tried sledgehammer approach;
> > match proto tcp flags A/A set tos lowdelay set prio 7 set queue pri
> > match proto tcp flags S/S set tos lowdelay set prio 7 set queue pri
> >
> > But sadly all encrypted SYNs and ACKs etc still only go into the data
> queue
> > no matter what.
> > This can be confirmed with wireshark that all ToS bits are lost
> >
> > This results in poor Wireguard performance on OpenBSD.
>
> Here's a naive untested diff that might at least use the prio internally
> in OpenBSD...
>

Awesome! Thank you so much Stuart :D
I will test this weekend..


>
> Index: if_wg.c
> ===
> RCS file: /cvs/src/sys/net/if_wg.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 if_wg.c
> --- if_wg.c 3 Aug 2023 09:49:08 -   1.29
> +++ if_wg.c 18 Sep 2023 12:47:02 -
> @@ -1525,6 +1525,8 @@ wg_encap(struct wg_softc *sc, struct mbu
>  */
> mc->m_pkthdr.ph_flowid = m->m_pkthdr.ph_flowid;
>
> +   mc->m_pkthdr.pf.prio = m->m_pkthdr.pf.prio;
> +
> res = noise_remote_encrypt(&peer->p_remote, &data->r_idx, &nonce,
>data->buf, plaintext_len);
> nonce = htole64(nonce); /* Wire format is little endian. */
>
>
>


Re: OpenBSD Wireguard implementation not copying ToS from inner to outer WG header

2023-09-21 Thread Andrew Lemin
On Fri, Sep 22, 2023 at 12:27 PM David Gwynne  wrote:

> On Mon, Sep 18, 2023 at 12:47:52PM -, Stuart Henderson wrote:
> > On 2023-09-17, Andrew Lemin  wrote:
> > > I have been testing the Wireguard implementation on OpenBSD and noticed
> > > that the ToS field is not being copied from the inner unencrypted
> header to
> > > the outer Wireguard header, resulting in ALL packets going into the
> same PF
> > > Prio / Queue.
> > >
> > > For example, ACKs (for Wireguard encrypted packets) end up in the first
> > > queue (not the priority queue) despite PF rules;
> > >
> > > queue ext_iface on $extif bandwidth 1000M max 1000M
> > >   queue pri on $extif parent ext_iface flows 1000 bandwidth 25M min 5M
> > >   queue data on $extif parent ext_iface flows 1000 bandwidth 100M
> default
> > >
> > > match on $extif proto tcp set prio (3, 6) set queue (data, pri)
> > >
> > > All unencrypted SYNs and ACKs etc correctly go into the 'pri' queue,
> and
> > > payload packets go into 'data' queue.
> > > However for Wireguard encrypted packets, _all_ packets (including SYNs
> and
> > > ACKs) go into the 'data' queue.
> > >
> > > I thought maybe you need to force the ToS/prio/queue values, so I also
> > > tried sledgehammer approach;
> > > match proto tcp flags A/A set tos lowdelay set prio 7 set queue pri
> > > match proto tcp flags S/S set tos lowdelay set prio 7 set queue pri
> > >
> > > But sadly all encrypted SYNs and ACKs etc still only go into the data
> queue
> > > no matter what.
> > > This can be confirmed with wireshark that all ToS bits are lost
> > >
> > > This results in poor Wireguard performance on OpenBSD.
> >
> > Here's a naive untested diff that might at least use the prio internally
> > in OpenBSD...
> >
> > Index: if_wg.c
> > ===
> > RCS file: /cvs/src/sys/net/if_wg.c,v
> > retrieving revision 1.29
> > diff -u -p -r1.29 if_wg.c
> > --- if_wg.c   3 Aug 2023 09:49:08 -   1.29
> > +++ if_wg.c   18 Sep 2023 12:47:02 -
> > @@ -1525,6 +1525,8 @@ wg_encap(struct wg_softc *sc, struct mbu
> >*/
> >   mc->m_pkthdr.ph_flowid = m->m_pkthdr.ph_flowid;
> >
> > + mc->m_pkthdr.pf.prio = m->m_pkthdr.pf.prio;
> > +
> >   res = noise_remote_encrypt(&peer->p_remote, &data->r_idx, &nonce,
> >  data->buf, plaintext_len);
> >   nonce = htole64(nonce); /* Wire format is little endian. */
> >
> >
>
> i think this should go in, ok by me.
>
> implementing txprio and rxprio might be useful too, but requires more
> plumbing than i have the energy for now.
>

Hi David,
Just to make sure I understand correctly: you mean also copying the
priority to/from txprio/rxprio for the VLAN/CoS priorities?

Thanks. I plan to test Stuart's patch this weekend and will confirm here,
but I'm confident it will work first time knowing him :)


Re: PF Outbound traffic Load Balancing over multiple tun/openvpn interfaces/tunnels

2018-11-27 Thread Andrew Lemin
# NAT all outbound traffic
match out on $if_ext from any to any nat-to ($if_ext)
match out on tun1 from any to any nat-to (tun1) rtable 1
match out on tun2 from any to any nat-to (tun2) rtable 2

#Allow outbound traffic on egress for vpn tunnel setup etc
pass out quick on { $if_ext } from self to any set prio (3,6)

#Load balance outbound traffic from internal network across tun1 and tun2 -
THIS IS NOT WORKING - IT ONLY USES FIRST TUNNEL
pass in quick on { $if_int } to any route-to { (tun1 10.8.8.1), (tun2
10.8.8.1) } round-robin set prio (3,6)

#Allow outbound traffic over vpn tunnels
pass out quick on tun1 to any set prio (3,6)
pass out quick on tun2 to any set prio (3,6)


# Verify which tunnels are being used
systat ifstat

*This command shows that all the traffic is only flowing over the first
tun1 interface, and the second tun2 is never ever used.*


# NB; I have tried with and without 'set state-policy if-bound'.

I have tried all the load balancing policies; round-robin, random,
least-states and source-hash

If I change the 'route-to' pool to "{ (tun2 10.8.8.1), (tun1 10.8.8.1) }",
then only tun2 is used instead.. :(

So 'route-to' seems to only use the first tunnel in the pool.

Any advice on what is going wrong here. I am wondering if I am falling
victim to some processing-order issue with PF, or if this is a real bug?

Thanks, Andy.


On Wed, Sep 12, 2018 at 5:58 PM Stuart Henderson 
wrote:

> On 2018-09-11, Andrew Lemin  wrote:
> > Hi list,
> >
> > I use an OpenVPN based internet access service (like NordVPN, AirVPN
> etc).
> >
> > The issue with these public VPN services, is the VPN servers are always
> congested. The most I’ll get is maybe 10Mbits through one server.
> >
> > Local connection is a few hundred mbps..
> >
> > So I had the idea of running multiple openvpn tunnels to different
> servers, and load balancing outbound traffic across the tunnels.
> >
> > Sounds simple enough..
> >
> > However every vpn tunnel uses the same subnet and nexthop gw. This of
> course won’t work with normal routing.
>
> rtable/rdomain with openvpn might be a bit complex, I think it may need
> persist-tun and create the tun device in advance with the wanted rdomain.
> (you need the VPN to be in one, but the UDP/TCP connection in another).
>
> Assuming you are using tun (and so point-to-point connections) rather
> than tap, try one or other of these:
>
> - PF route-to and 'probability', IIRC it works to just use a junk
> address as long as the interface is correct ("route-to 10.10.10.10@tun0",
> "route-to 10.10.10.10@tun1").
>
> - ECMP (net.inet.ip.multipath=1) and multiple route entries with
> the same priority. Use -ifp to set the interface ("route add
> default -priority 8 -ifp $interface $dest").
>
> The "destination address" isn't really very relevant for routing
> on point-to-point interfaces (though current versions of OpenBSD
> do require that it matches the destination address on the interface,
> otherwise they won't allow the route to be added).
>
>
>
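For reference, the ECMP variant Stuart describes can be sketched as follows (the 10.8.8.1 peer address is taken from the rules above; the tunnel interface names are assumptions):

```
# Allow multiple default routes to be used simultaneously
sysctl net.inet.ip.multipath=1

# One default route per tunnel, same priority, pinned with -ifp
route add default -priority 8 -ifp tun1 10.8.8.1
route add default -priority 8 -ifp tun2 10.8.8.1
```

The kernel then hashes each flow's src/dst pair to pick one of the equal-priority routes, rather than PF picking a pool member per state.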


Re: Intel Celeron SoC support

2018-11-30 Thread Andrew Lemin
Hi Chris,

I decided to sell the board and get a different one..
But for others wanting to use this board in the future.

I tried both USB and PS2 Native (no adapter) keyboards. Neither work after
the installer starts.
Bearing in mind none of the SATA ports are detected either..

Cheers, Andy.

On Wed, Nov 21, 2018 at 3:42 AM Chris Cappuccio  wrote:

> Andrew Lemin [andrew.le...@gmail.com] wrote:
> > Hi,
> >
> > I am running an ASRock J4105B-ITX board and wanting to run OpenBSD on
> this.
> > https://www.asrock.com/MB/Intel/J4105B-ITX/index.asp#BIOS
> >
> > It boots up, and at the 'boot>' prompt I can use the keyboard find.
> >
> > However after it boots up, the keyboard stops working, and no disks are
> > found by the installer (used auto_install to send test commands).
> > It appears that there is no chipset support, for the Intel Celeron J4105
> > CPU from what I can work out.
> >
> > To test that it was working fine and is just OpebBSD which is not
> working,
> > I installed Linux and have included the dmesg below (from Linux).
> > I cannot run a dmesg from the OpenBSD installer as I cannot use the
> > keyboard etc.
> >
>
> The ASRock J4205-ITX (Apollo Lake) works fine, so does the J3710-ITX
> (Braswell).
>
> I use them both headless, but they work fine when I plug in a USB keyboard.
>
> The J4105-ITX (Gemini Lake) is newer than either.
>
> What kind of keyboard are you using? If it's not USB, plug in a USB
> keyboard.
> Although it may not work at the boot> prompt, it will work once you are
> booted
> up.
>
> For fun, here are dmesg for the older versions of your board. They both
> work
> with USB input devices.
>
> Braswell
> 
>
> OpenBSD 6.3-current (GENERIC.MP) #21: Fri Jun 29 17:32:47 PDT 2018
> ch...@r8.nmedia.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8023584768 (7651MB)
> avail mem = 7771283456 (7411MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xecec0 (18 entries)
> bios0: vendor American Megatrends Inc. version "P1.30" date 03/30/2016
> bios0: ASRock J3710-ITX
> acpi0 at bios0: rev 2
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC FPDT FIDT AAFT MCFG HPET SSDT SSDT SSDT UEFI
> LPIT CSRT
> acpi0: wakeup devices UAR1(S4) XHC1(S4) HDEF(S4) PXSX(S4) RP01(S4)
> PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) BRCM(S0) PWRB(S4)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Pentium(R) CPU J3710 @ 1.60GHz, 1600.37 MHz
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu0: 1MB 64b/line 16-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 79MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.0.0.0.0.3.3, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Pentium(R) CPU J3710 @ 1.60GHz, 1600.00 MHz
> cpu1:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu1: 1MB 64b/line 16-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Pentium(R) CPU J3710 @ 1.60GHz, 1600.00 MHz
> cpu2:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu2: 1MB 64b/line 16-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Pentium(R) CPU J3710 @ 1.60GHz, 1600.00 MHz
> cpu3:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,RDRAND,NXE,RDTSCP,LONG,LAHF,3DNOWP,PERF,ITSC,SMEP,ERMS,SENSOR,ARAT,MELTDOWN
> cpu3: 1MB 64b/line 16-way L2 cache
> cpu3: smt 0, core 3, package 0
> ioapic0 at mainbus0: apid 1 pa 0xfec

Re: PF Outbound traffic Load Balancing over multiple tun/openvpn interfaces/tunnels

2021-09-28 Thread Andrew Lemin
Hi. Sorry for the extremely slow reply!
Did you add the return routes for your internal subnets into each of the
per-tun rdomains?

To test your tunnels are setup correctly;
Once you have the external interface in rdomain 0, and each VPN instance's
tun interface is bound to different rdomains etc, you can test that your
tunnel setup is working within the rdomain with "ping -V1 1.1.1.1" (to
originate a ping within rdomain 1 for example).

If the ping works, but gets lost when routing through the interface pair (
https://man.openbsd.org/pair), then check the routing table in rdomain 1
with "route -T1 show".

Your tunnel will be the default gateway within that rdomain, but you will
still need routes in the rdomain to get the return packets back to your
internal networks.
For this in my /etc/hostname.pair1 interface (pair interface that sits in
rdomain 1), I add the line "!/sbin/route -T1 add 172.16.0.0/12
192.168.251.2" (where 192.168.251.2 is the IP for the peer-pair interface
that sits in my internal rdomain 1).
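Putting that together, a hypothetical /etc/hostname.pair1 might look like the sketch below (the 192.168.251.x addresses match the example above; everything else is illustrative, not a real config):

```
# pair1 lives in rdomain 1 alongside tun1
rdomain 1
inet 192.168.251.1 255.255.255.252
patch pair0
up
# Return route so replies reach the internal networks via the peer pair iface
!/sbin/route -T1 add 172.16.0.0/12 192.168.251.2
```

The `patch pair0` line cross-connects this interface with its peer in the other rdomain, and the `!` command adds the return route inside rdomain 1 at interface bring-up.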




On Wed, May 8, 2019 at 12:09 AM mike42  wrote:

> Trying to replicate same setup with pairs and different rdomains for each
> tun
> and also external interface, after a packet goes through pair interfaces
> it's just disapears.
>
> Any ideas?
>
> routing in rdomain  is set like:
>
> route -T add default tun
> route -T add  
>
>
>
>
>
> --
> Sent from:
> http://openbsd-archive.7691.n7.nabble.com/openbsd-user-misc-f3.html
>
>


Mellanox driver support details https://man.openbsd.org/mcx.4

2021-09-28 Thread Andrew Lemin
Hi. I hope everyone is well and having a great day :)

Just a quick question about the mcx (Mellanox 5th generation Ethernet
device) drivers
https://man.openbsd.org/mcx.4

The man page says nothing more than it supports;
ConnectX-4 Lx EN
ConnectX-4 EN
ConnectX-5 EN
ConnectX-6 EN

I am looking for some clarity on what features and performance
characteristics mcx boasts?

For example are the following basic hardware features supported by this
driver?
IPv4 receive IP/TCP/UDP checksum offload
IPv4 transmit TCP/UDP checksum offload
VLAN tag insertion and stripping
interrupt coalescing

And what other features does it support?

I also came across a comment in some forum a while back (so high quality
information 😉) that mentioned Mellanox drivers in OpenBSD are SMP safe and
so not giant-locked. Is this true?

Thanks, Andy,


Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

2021-09-28 Thread Andrew Lemin
I see this question died on its arse! :)

This is still an issue for outbound load-balancing over multiple internet
links.

PF's 'sticky-address' parameter only works on source IPs (because it was
originally designed for use when hosting your own server pools - inbound
load balancing).
I.e. There is no way to configure 'sticky-address' to consider destination
IPs for outbound load balancing, so all subsequent outbound connections to
the same target IP originate from the same internet connection.

The reason why this is desirable is because an increasing number of
websites use single sign on mechanisms (quite a few different architectures
expose the issue described here). After a users outbound connection is
initially randomly load balanced onto an internet connection, their browser
is redirected into opening multiple additional sockets towards the
website's load balancers / cloud gateways, which redirect the connections
to different internal servers for different parts of the site/page, and the
SSO authentication/cookies passed on the additional sockets must
originate from the same IP as the original socket. As a result outbound
load-balancing does not work for these sites.

The ideal functionality would be for 'sticky-address' to consider both
source IP and destination IP after initially being load balanced by
round-robin or random.

Thanks again, Andy.

On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin  wrote:

> Hi smart people :)
>
> The current implementation of ‘sticky-address‘ relates only to a sticky
> source IP.
> https://www.openbsd.org/faq/pf/pools.html
>
> This is used for inbound server load balancing, by ensuring that all
> socket connections from the same client/user/IP on the internet goes to the
> same server on your local server pool.
>
> This works great for ensuring simplified memory management of session
> artefacts on the application being hosted (the servers do not have to
> synchronise the users session data as extra sockets from that user will
> always connect to the same local server)
>
> However sticky-address does not have an equivalent for sticky destination
> IPs. For example when doing outbound load balancing over multiple ISP
> links, every single socket is load balanced randomly. This causes many
> websites to break (especially cookie login and single-sign-on style
> enterprise services), as the first outbound socket will originate randomly
> from one of the local ISP IPs, and the users login session/SSO (on the
> server side) will belong to that first random IP.
>
> When the user then browses to or uses another part of that same website
> which requires additional sockets, the additional sockets will pass the SSO
> credentials from the first socket, but the extra socket connection will
> again be randomly load-balanced, and so the remote server will reject the
> connection as it is originating from the wrong source IP etc.
>
> Therefore can I please propose a “sticky-address for destination IPs” as
> an analogue to the existing sticky-address for source IPs?
>
> This is now such a problem that we have to use sticky-address even on
> outbound load-balancing connections, which causes internal user1 to always
> use the same ISP for _everthing_ etc. While this does stop the breakage, it
> does not result in evenly distributed balancing of traffic, as users are
> locked to one single transit, for all their web browsing for the rest of
> the day after being randomly balanced once first-thing in the morning,
> rather than all users balancing over all transits throughout the day.
>
> Another pain; using the current source-ip sticky-address for outbound
> balancing, makes it hard to drain transits for maintenance. For example
> without source sticky-address balancing, you can just remove the transit
> from the Pf rule, and after some time, all traffic will eventually move
> over to the other transits, allowing the first to be shut down for whatever
> needs. But with the current source-ip sticky-address, that first transit
> will take months to drain in a real-world situations..
>
> lastly just as a nice-to-have, how feasible would a deterministic load
> balancing algorithm be? So that balancing selection is done based on the
> “least utilised” path?
>
> Thanks for your time and consideration,
> Kindest regards Andy
>
>
>
> Sent from a teeny tiny keyboard, so please excuse typos.
>


Re: Mellanox driver support details https://man.openbsd.org/mcx.4

2021-09-28 Thread Andrew Lemin
Hi Theo :)

Ok sure, I will put on my cape-of-courage and start reading the source.. I
may be some time!

On Wed, Sep 29, 2021 at 1:56 PM Theo de Raadt  wrote:

> We tend to keep our driver manual pages without detailed promises.
> They do ethernet, they do it best effort, etc.
>
> What you want to know can be found by reading the source, or the
> commit logs.  Since this is a locally written driver, the code is
> surprisingly approachable.
>
> Andrew Lemin  wrote:
>
> > Hi. I hope everyone is well and having a great day :)
> >
> > Just a quick question about the mcx (Mellanox 5th generation Ethernet
> > device) drivers
> > https://man.openbsd.org/mcx.4
> >
> > The man page says nothing more than it supports;
> > ConnectX-4 Lx EN
> > ConnectX-4 EN
> > ConnectX-5 EN
> > ConnectX-6 EN
> >
> > I am looking for some clarity on what features and performance
> > characteristics mcx boasts?
> >
> > For example are the following basic hardware features supported by this
> > driver?
> > IPv4 receive IP/TCP/UDP checksum offload
> > IPv4 transmit TCP/UDP checksum offload
> > VLAN tag insertion and stripping
> > interrupt coalescing
> >
> > And what other features does it support?
> >
> > I also came across a comment in some forum a while back (so high quality
> > information 😉) that mentioned Mellanox drivers in OpenBSD are SMP safe
> and
> > so not giant-locked. Is this true?
> >
> > Thanks, Andy,
>


Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

2021-09-29 Thread Andrew Lemin
Hi Claudio,

So you probably guessed I am using 'route-to { GW1, GW2, GW3, GW4 } random'
(and was wanting to add 'sticky-address' to this) based on your reply :)

"it will make sure that selected default routes are sticky to source/dest
pairs" - Are you saying that even though multipath routing uses hashing to
select the path (https://www.ietf.org/rfc/rfc2992.txt - "The router first
selects a key by performing a hash (e.g., CRC16) over the packet header
fields that identify a flow."), subsequent new sessions to the same dest IP
with different source ports will still get the same path? I thought a new
session with a new tuple to the same dest IP would get a different hashed
path with multipath?

"On rerouting the multipath code reshuffles the selected routes in a way to
minimize the affected sessions." - Are you saying, in the case where one
path goes down, it will migrate all the entries only for that failed path
onto the remaining good paths (like ecmp-fast-reroute ?)

Thanks for your time, Andy.

On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker 
wrote:

> On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > I see this question died on its arse! :)
> >
> > This is still an issue for outbound load-balancing over multiple internet
> > links.
> >
> > PF's 'sticky-address' parameter only works on source IPs (because it was
> > originally designed for use when hosting your own server pools - inbound
> > load balancing).
> > I.e. There is no way to configure 'sticky-address' to consider
> destination
> > IPs for outbound load balancing, so all subsequent outbound connections
> to
> > the same target IP originate from the same internet connection.
> >
> > The reason why this is desirable is because an increasing number of
> > websites use single sign on mechanisms (quite a few different
> architectures
> > expose the issue described here). After a users outbound connection is
> > initially randomly load balanced onto an internet connection, their
> browser
> > is redirected into opening multiple additional sockets towards the
> > website's load balancers / cloud gateways, which redirect the connections
> > to different internal servers for different parts of the site/page, and
> the
> > SSO authentication/cookies passed on the additional sockets must
> > originate from the same IP as the original socket. As a result outbound
> > load-balancing does not work for these sites.
> >
> > The ideal functionality would be for 'sticky-address' to consider both
> > source IP and destination IP after initially being load balanced by
> > round-robin or random.
>
> Just use multipath routing, it will make sure that selected default routes
> are sticky to source/dest pairs. You may want the states to be interface
> bound if you need to nat-to on those links.
>
> On rerouting the multipath code reshuffles the selected routes in a way to
> minimize the affected sessions. All this is done without any extra memory
> usage since the hashing function is smart.
>
> --
> :wq Claudio
>
>
> > Thanks again, Andy.
> >
> > On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin 
> wrote:
> >
> > > Hi smart people :)
> > >
> > > The current implementation of ‘sticky-address‘ relates only to a sticky
> > > source IP.
> > > https://www.openbsd.org/faq/pf/pools.html
> > >
> > > This is used for inbound server load balancing, by ensuring that all
> > > socket connections from the same client/user/IP on the internet goes
> to the
> > > same server on your local server pool.
> > >
> > > This works great for ensuring simplified memory management of session
> > > artefacts on the application being hosted (the servers do not have to
> > > synchronise the users session data as extra sockets from that user will
> > > always connect to the same local server)
> > >
> > > However sticky-address does not have an equivalent for sticky
> destination
> > > IPs. For example when doing outbound load balancing over multiple ISP
> > > links, every single socket is load balanced randomly. This causes many
> > > websites to break (especially cookie login and single-sign-on style
> > > enterprise services), as the first outbound socket will originate
> randomly
> > > from one of the local ISP IPs, and the users login session/SSO (on the
> > > server side) will belong to that first random IP.
> > >
> > > When the user then browses to or uses another part of that same website
> > > which requires additiona

Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

2021-09-29 Thread Andrew Lemin
Ah,

Your diagram makes perfect sense now :) Thank you - So it does not have to
undergo a full rehashing of all links (which breaks _lots_ of sessions when
NAT is involved), but also does not have to explicitly track anything in
memory like you say 👍 So better than full re-hashing and cheaper than
tracking.

PS; Thank you for confirming; "It therefor routes the same src/dst pair
over the same nexthop as long as there are no changes to the route".
I was getting hung up on the bit in the RFC that says "hash over the packet
header fields that identify a flow", so I was imagining the hashing was
using a lot of entropy including the ports. I guess I should have thought
around that more and read it as "hash over the IP packet header fields that
identify a flow" ;)

I shall go and experiment :)
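Before experimenting, a toy sketch of the RFC 2992 hash-threshold selection being described (a simplified illustration of the idea, not the kernel's actual hash function):

```python
import zlib

def select_nexthop(src, dst, nexthops):
    """Hash-threshold (RFC 2992): hash only the fields identifying the
    flow (src/dst IPs here, no ports) and map the hash value into
    equal-width regions, one region per nexthop."""
    h = zlib.crc32(f"{src}>{dst}".encode()) & 0xffff  # 16-bit hash, like CRC16
    region = (0xffff // len(nexthops)) + 1            # width of each region
    return nexthops[h // region]

gws = ["GW1", "GW2", "GW3", "GW4"]
# The same src/dst pair maps to the same gateway no matter how many
# sockets (source ports) a client opens, so SSO-style sites stay sticky.
first = select_nexthop("10.0.0.5", "203.0.113.7", gws)
again = select_nexthop("10.0.0.5", "203.0.113.7", gws)
print(first == again)  # always True for a fixed src/dst pair
```

On rerouting, the real implementation recomputes the region boundaries, so only sessions whose hash lands in a reassigned region migrate to another link.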


On Wed, Sep 29, 2021 at 8:45 PM Claudio Jeker 
wrote:

> On Wed, Sep 29, 2021 at 08:07:43PM +1000, Andrew Lemin wrote:
> > Hi Claudio,
> >
> > So you probably guessed I am using 'route-to { GW1, GW2, GW3, GW4 }
> random'
> > (and was wanting to add 'sticky-address' to this) based on your reply :)
> >
> > "it will make sure that selected default routes are sticky to source/dest
> > pairs" - Are you saying that even though multipath routing uses hashing
> to
> > select the path (https://www.ietf.org/rfc/rfc2992.txt - "The router
> first
> > selects a key by performing a hash (e.g., CRC16) over the packet header
> > fields that identify a flow."), subsequent new sessions to the same dest
> IP
> > with different source ports will still get the same path? I thought a new
> > session with a new tuple to the same dest IP would get a different hashed
> > path with multipath?
>
> OpenBSD multipath routing implements gateway selection by Hash-Threshold
> from RFC 2992. It therefor routes the same src/dst pair over the same
> nexthop as long as there are no changes to the route. If one of your
> links drops then some sessions will move links but the goal of
> hash-threshold is to minimize the affected session.
>
> > "On rerouting the multipath code reshuffles the selected routes in a way
> to
> > minimize the affected sessions." - Are you saying, in the case where one
> > path goes down, it will migrate all the entries only for that failed path
> > onto the remaining good paths (like ecmp-fast-reroute ?)
>
> No, some session on good paths may also migrate to other links, this is
> how the hash-threshold algorithm works.
>
> Split with 4 nexthops; now let's assume link 2 dies and stuff gets
> reshuffled:
>
>   before:  |   link 1   |   link 2   |   link 3   |   link 4   |
>   after:   |     link 1      |     link 3      |     link 4    |
>
> Unaffected sessions: most of the key space of links 1, 3 and 4.
> Affected sessions:   all of link 2, plus the slices next to the region
> boundaries that shifted.
>
> Using other ways to split the hash into buckets (e.g. a simple modulo)
> causes more change.
>
> Btw. using route-to with 4 gw will not detect a link failure and 25% of
> your traffic will be dropped. This is another advantage of multipath
> routing.
>
> Cheers
> --
> :wq Claudio
>
> > Thanks for your time, Andy.
> >
> > On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker 
> > wrote:
> >
> > > On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > > > I see this question died on its arse! :)
> > > >
> > > > This is still an issue for outbound load-balancing over multiple
> > > > internet links.
> > > >
> > > > PF's 'sticky-address' parameter only works on source IPs (because it
> > > > was originally designed for use when hosting your own server pools -
> > > > inbound load balancing).
> > > > I.e. There is no way to configure 'sticky-address' to consider
> > > > destination IPs for outbound load balancing, so all subsequent
> > > > outbound connections to the same target IP originate from the same
> > > > internet connection.
> > > >
> > > > The reason why this is desirable is because an increasing number of
> > > > websites use single sign on mechanisms (quite a few different
> > > > architectures expose the issue described here). After a users
> > > > outbound connec

Re: Mellanox driver support details https://man.openbsd.org/mcx.4

2021-09-29 Thread Andrew Lemin
So I think I have figured out some things, Theo, browsing through
https://github.com/openbsd/src/blob/master/sys/dev/pci/if_mcx.c.

I can see that some offloading is supported, but have not yet figured out
how much is implemented. It looks like the offloading capability in
these cards is much more granular than I had understood from previous
hardware.
I was able to decipher some of it using this
https://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf
(this is very well written).

And I was quite excited to see what looks like the RDMA access support in
the mcx driver! So we should be able to see the super low latency
capabilities with this card :)

I will keep pushing myself.. Thanks again Theo

On Wed, Sep 29, 2021 at 2:21 PM Andrew Lemin  wrote:

> Hi Theo :)
>
> Ok sure, I will put on my cape-of-courage and start reading the source.. I
> may be some time!
>
> On Wed, Sep 29, 2021 at 1:56 PM Theo de Raadt  wrote:
>
>> We tend to keep our driver manual pages without detailed promises.
>> They do ethernet, they do it best effort, etc.
>>
>> What you want to know can be found by reading the source, or the
>> commit logs.  Since this is a locally written driver, the code is
>> surprisingly approachable.
>>
>> Andrew Lemin  wrote:
>>
>> > Hi. I hope everyone is well and having a great day :)
>> >
>> > Just a quick question about the mcx (Mellanox 5th generation Ethernet
>> > device) drivers
>> > https://man.openbsd.org/mcx.4
>> >
>> > The man page says nothing more than it supports;
>> > ConnectX-4 Lx EN
>> > ConnectX-4 EN
>> > ConnectX-5 EN
>> > ConnectX-6 EN
>> >
>> > I am looking for some clarity on what features and performance
>> > characteristics mcx boasts?
>> >
>> > For example are the following basic hardware features supported by this
>> > driver?
>> > IPv4 receive IP/TCP/UDP checksum offload
>> > IPv4 transmit TCP/UDP checksum offload
>> > VLAN tag insertion and stripping
>> > interrupt coalescing
>> >
>> > And what other features does it support?
>> >
>> > I also came across a comment in some forum a while back (so high quality
>> > information 😉) that mentioned Mellanox drivers in OpenBSD are SMP safe
>> and
>> > so not giant-locked. Is this true?
>> >
>> > Thanks, Andy,
>>
>


Re: Mellanox driver support details https://man.openbsd.org/mcx.4

2021-09-29 Thread Andrew Lemin
And to answer my last question about SMP capabilities, it looks like the
only locking going on is when the driver is talking to the kernel itself
through kstat, which would make sense. So yes, it looks like mcx does have
SMP support :) Well, it's enough for me to buy a card from eBay to play
with, as the ConnectX-4 Lx cards are pretty cheap now.

Warning to others reading my comments, me poking around in kernel code is
akin to a blind person in a library before learning braille, so take
nothing I say as fact, merely optimistic opinion :)

On Wed, Sep 29, 2021 at 9:08 PM Andrew Lemin  wrote:

> So I think I have figured out some things, Theo, browsing through
> https://github.com/openbsd/src/blob/master/sys/dev/pci/if_mcx.c.
>
> I can see that some offloading is supported, but have not yet figured out
> how much is implemented. It looks like the offloading capability in
> these cards is much more granular than I had understood from previous
> hardware.
> I was able to decipher some of it using this
> https://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf
> (this is very well written).
>
> And I was quite excited to see what looks like the RDMA access support in
> the mcx driver! So we should be able to see the super low latency
> capabilities with this card :)
>
> I will keep pushing myself.. Thanks again Theo
>
> On Wed, Sep 29, 2021 at 2:21 PM Andrew Lemin 
> wrote:
>
>> Hi Theo :)
>>
>> Ok sure, I will put on my cape-of-courage and start reading the source..
>> I may be some time!
>>
>> On Wed, Sep 29, 2021 at 1:56 PM Theo de Raadt 
>> wrote:
>>
>>> We tend to keep our driver manual pages without detailed promises.
>>> They do ethernet, they do it best effort, etc.
>>>
>>> What you want to know can be found by reading the source, or the
>>> commit logs.  Since this is a locally written driver, the code is
>>> surprisingly approachable.
>>>
>>> Andrew Lemin  wrote:
>>>
>>> > Hi. I hope everyone is well and having a great day :)
>>> >
>>> > Just a quick question about the mcx (Mellanox 5th generation Ethernet
>>> > device) drivers
>>> > https://man.openbsd.org/mcx.4
>>> >
>>> > The man page says nothing more than it supports;
>>> > ConnectX-4 Lx EN
>>> > ConnectX-4 EN
>>> > ConnectX-5 EN
>>> > ConnectX-6 EN
>>> >
>>> > I am looking for some clarity on what features and performance
>>> > characteristics mcx boasts?
>>> >
>>> > For example are the following basic hardware features supported by this
>>> > driver?
>>> > IPv4 receive IP/TCP/UDP checksum offload
>>> > IPv4 transmit TCP/UDP checksum offload
>>> > VLAN tag insertion and stripping
>>> > interrupt coalescing
>>> >
>>> > And what other features does it support?
>>> >
>>> > I also came across a comment in some forum a while back (so high
>>> quality
>>> > information 😉) that mentioned Mellanox drivers in OpenBSD are SMP
>>> safe and
>>> > so not giant-locked. Is this true?
>>> >
>>> > Thanks, Andy,
>>>
>>


OpenBSD 7.1 - hangs after userland upgrade on server hardware

2022-05-01 Thread Andrew Lemin
Hi all,

I am totally stumped with issues while upgrading/installing 7.1 and I need
some help!

Server; Supermicro X10SLV-Q (Intel Q87 Express), Xeon E3-1280 v3, 8G RAM,
Mellanox 10G NIC

This server has been running OpenBSD flawlessly for years. I followed the
upgrade instructions and was able to reboot fine onto the 7.1 kernel (I
rebooted a couple of times on the 7.1 kernel, in fact). However, after I ran
'pkg_add -u' to upgrade all of userland to 7.1, the machine started hanging
during boot.

The hang looked like an I/O problem, as it would always hang around the
disk setup stages.
I went into the BIOS and tried optimised defaults and failsafe defaults,
but no luck.

I also downloaded a fresh copy and tried installing 7.1 from flash;
however, the 7.1 installer also hangs, in the same place every time, right
after selecting 'done' at the networking config step.
As I have a Mellanox card in here, I removed the NIC, but the hang
continues, so it's not that.

I get nothing to debug; it just freezes. I have reinstalled 7.0, which is
still working perfectly, so this is not a hardware fault.

Is there anything I can do to increase the verbosity to see what driver it
is trying to load before the hang?
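One way to narrow down which driver is involved is the in-kernel boot-time config editor, reached from the boot loader. A rough sketch of such a session (the driver named here is only an example of a suspect to disable, not a known culprit):

```
boot> boot -c
[... kernel loads, then drops into the UKC prompt ...]
UKC> disable mcx    # example: skip a suspect driver at attach time
UKC> quit           # continue booting with the change applied
```

If the install then gets past the hang, the disabled driver is a good lead for a bug report.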

Other information, this is a totally headless machine, with a Xeon CPU
without any onboard GPU. It has a console connection with
console-redirection in the bios, and I have to set the tty params during
boot to interact over console. Otherwise everything else is standard.
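Since the tty parameters otherwise have to be typed at each boot, they can be made persistent in /etc/boot.conf. A sketch (the 115200 speed is an assumption; it should match the BIOS console-redirection setting):

```
# /etc/boot.conf - read by the bootloader before the kernel starts
stty com0 115200
set tty com0
```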

Thanks for your time,
Best regards Andy.


Re: OpenBSD 7.1 - hangs after userland upgrade on server hardware

2022-05-01 Thread Andrew Lemin
,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus 2 (RP01)
acpiprt5 at acpi0: bus -1 (RP02)
acpiprt6 at acpi0: bus -1 (RP03)
acpiprt7 at acpi0: bus 3 (RP04)
acpiec0 at acpi0: not present
acpipci0 at acpi0 PCI0: 0x0010 0x0011 0x
acpicmos0 at acpi0
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
acpicpu0 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: PG00, resource for PEG0
acpipwrres1 at acpi0: PG01, resource for PEG1
acpipwrres2 at acpi0: PG02, resource for PEG2
acpipwrres3 at acpi0: WRST
acpipwrres4 at acpi0: WRST
acpipwrres5 at acpi0: WRST
acpipwrres6 at acpi0: WRST
acpipwrres7 at acpi0: FN00, resource for FAN0
acpipwrres8 at acpi0: FN01, resource for FAN1
acpipwrres9 at acpi0: FN02, resource for FAN2
acpipwrres10 at acpi0: FN03, resource for FAN3
acpipwrres11 at acpi0: FN04, resource for FAN4
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
cpu0: using VERW MDS workaround (except on vmm entry)
cpu0: Enhanced SpeedStep 3600 MHz: speeds: 3601, 3600, 3400, 3200, 3000,
2800, 2600, 2400, 2200, 2000, 1800, 1600, 1400, 1200, 1000, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Xeon E3-1200 v3 Host" rev 0x06
ppb0 at pci0 dev 1 function 0 "Intel Core 4G PCIE" rev 0x06: msi
pci1 at ppb0 bus 1
mcx0 at pci1 dev 0 function 0 "Mellanox ConnectX-4 Lx" rev 0x00: FW
14.28.2006, msix, address ec:0d:9a:83:1f:3a
mcx1 at pci1 dev 0 function 1 "Mellanox ConnectX-4 Lx" rev 0x00: FW
14.28.2006, msix, address ec:0d:9a:83:1f:3b
xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x05: msi, xHCI
1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev
3.00/1.00 addr 1
em0 at pci0 dev 25 function 0 "Intel I217-V" rev 0x05: msi, address
00:25:90:e1:e5:46
ehci0 at pci0 dev 26 function 0 "Intel 8 Series USB" rev 0x05: apic 8 int 16
usb1 at ehci0: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev
2.00/1.00 addr 1
ppb1 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xd5: msi
pci2 at ppb1 bus 2
ppb2 at pci0 dev 28 function 3 "Intel 8 Series PCIE" rev 0xd5: msi
pci3 at ppb2 bus 3
em1 at pci3 dev 0 function 0 "Intel I210" rev 0x03: msi, address
00:25:90:e1:e5:47
ehci1 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x05: apic 8 int 23
usb2 at ehci1: USB revision 2.0
uhub2 at usb2 configuration 1 interface 0 "Intel EHCI root hub" rev
2.00/1.00 addr 1
pcib0 at pci0 dev 31 function 0 "Intel Q87 LPC" rev 0x05
ahci0 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x05: msi, AHCI
1.3
ahci0: port 4: 6.0Gb/s
ahci0: port 5: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 4 lun 0: 
naa.500a075103185123
sd0: 61057MB, 512 bytes/sector, 125045424 sectors, thin
sd1 at scsibus1 targ 5 lun 0: 
naa.55cd2e438062
sd1: 171705MB, 512 bytes/sector, 351651888 sectors, thin
ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x05: apic 8
int 18
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 4GB DDR3 SDRAM PC3-12800 SO-DIMM
spdmem1 at iic0 addr 0x52: 4GB DDR3 SDRAM PC3-12800 SO-DIMM
"Intel 8 Series Thermal" rev 0x05 at pci0 dev 31 function 6 not configured
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
com2 at isa0 port 0x3e8/8 irq 5: ns16550a, 16 byte fifo
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: NCT5104D rev 0x52
wbsio0 port 0x290/2 not configured
vmm0 at mainbus0: VMX/EPT
dt: 445 probes
uhub3 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
rev 2.00/0.05 addr 2
uhub4 at uhub2 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
rev 2.00/0.05 addr 2
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (2d1c73e93d7b352e.a) swap on sd0b dump on sd0b
WARNING: / was not properly unmounted

Re: pair(4) question

2025-03-30 Thread Andrew Lemin
I had a similar issue years ago which I solved by putting 'up' as the first
line in the hostname.pairX files, so the pair interfaces come up without
any config first.

But that was probably even before the ordering improvements mentioned by
David above, and is probably not ideal anymore.

I used one rdomain for internal clients/VLAN, which has multipath default
routes pointing to a bunch of pair tunnels/patches. Each patch connects to
a different rdomain (with no physical interfaces attached) where I have
wireguard tunnel endpoints. This allows load balancing over multiple
wireguard or openvpn tunnels where tunnel addresses might overlap.
The tricky part was getting the tunnel daemon to use rdomain 0 for the
outer encrypted connection, but place the tunnel endpoint into different
rdomains for the clients.

So it does work, and it works really well. But I remember spending weeks
getting it to work ;)
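For anyone trying the same trick, a minimal sketch of the "up first" hostname.pairX files described above (the interface numbers, rdomains and addresses here are invented for illustration, not taken from my actual config):

```
# /etc/hostname.pair0 - client-facing end, lives in rdomain 1
up
rdomain 1
inet 10.99.0.1 255.255.255.252

# /etc/hostname.pair1 - tunnel-facing end, lives in rdomain 2
up
rdomain 2
inet 10.99.0.2 255.255.255.252
patch pair0
```

The bare `up` line brings the interface up before any addressing, and `rdomain` comes before `inet` because moving an interface between rdomains clears its addresses.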

Never knew about rport! will have to try that :)

Good luck



On Mon, 31 Mar 2025 at 14:57, Philipp Buehler <
e1c1bac6253dc54a1e89ddc046585...@posteo.net> wrote:

> Am 31.03.2025 03:49 schrieb David Gwynne:
> > you can also try rport(4) to replace pair(4) for p2p links between
> > rdomains.
>
> It has been some years since I dug through all this - and rport is
> pretty brand new, thanks for the hint. Unsure why there is no .Xr ..
>
>
> PS: I would debate whether I want a failed IP config leading to an "up
> anyway", but as an option, sure.
>
> --
> pb
>
>