Hi.

On 2023-10-10 (Tue) 09:08, Willy Tarreau wrote:
Hi Tristan,

On Sun, Oct 08, 2023 at 12:15:00PM +0000, Tristan wrote:
Since this was brought up,

On 7 Oct 2023, at 14:34, Willy Tarreau <w...@1wt.eu> wrote:

[...]

Maybe this will then bring up SPOE to a level where the body of a request
can be scanned and bring it to a full WAF level or as WASM filter.

Any thoughts on the feasibility of a WASM based alternative to the current
LUA platform?

From what I looked there are a few WASM runtimes set up for being embedded in
C applications, though I'm not expert enough on the tradeoffs of each to know
if there are dealbreakers.

I've never had a look yet. I can understand there are pros and cons.
When we added Lua, the goal was to be able to script a little bit more
what was too complicated to implement in rules. And I must say this has
served its purpose well, with dashboards, let's encrypt, authentication,
session management and whatever being done in Lua. Scripting languages
have a great advantage in field, they're easy to adapt or fix. Granted
Lua's syntax is not exactly what I would call awesome, but it's modular
and extensible enough to allow to do lots of things easily and at a low
execution cost.

WASM on the other hand would provide more performance and compile-time
checks but I fear that it could also bring new classes of issues such as
higher memory usage, higher latencies, and would make it less convenient
to deploy updates since these would require to be rebuilt. Also we don't
want to put too much of the application into the load balancer. But as I
said I haven't had a look at the details so I don't know if we can yield
like in Lua, implement our own bindings for internal functions, or limit
the memory usage per request being processed.

Hm, how could WASM be integrated into HAP if not with SPOE? Right now I have no idea what the best way would be.

Willy, please take a stable seat :-)

How about using HTTP/(1/2/3), gRPC or FCGI as filter protocols to be able to handle the body, instead of SPOE?

One option could be, as Alex suggested, to move that to an external
agent accessed via SPOE, but I must confess that I'm having an issue
with that: Since I drafted the basic needs in 2016 and Christopher
implemented a first experimental and limited version the same year, it
has not really taken off. It has become a chicken-and-egg problem. It
doesn't support streaming yet so it's not used by content inspection/
wafs/image compressors/on-disk caches etc, so it basically sees zero
adoption. And since it sees zero adoption, it has never been on anyone's
priority list to rework it. Such a rework does require particular knowledge
of the internals and good architectural skills to be able to implement a
v2 that would address all the current design's shortcomings by relying on
the muxes and idle connections, but the rare people who are able to work
on such a thing among the core team are constantly busy on much more
useful and important stuff, and I doubt anyone would have any interest
in working on this thankless thing.

So I feel like it's here to stay with its design limitations making it
unsuitable to many of the tasks it was imagined for, and that it could
actually be much less effort to simply remove it. Of course that's not
something to do between an odd and an even version, but maybe it's not
even too late to drop it from 2.9 if nobody cares anymore.

Well, this could be an option, from my point of view.

@Community: Could you be so kind as to tell us which use cases you use SPOE for, similar to Norman ( https://www.mail-archive.com/haproxy@formilux.org/msg44127.html ), and how big the effort would be to migrate to a LUA filter?

Or to put it in a blunt way: does anyone want that 2.9 still supports
SPOE whose necessary redesign never happened in 7 years despite trying
to find time for this, and will likely never happen ? Or can we just
remove it ? I have nothing against preserving it a little bit more if
there really are users, but it would be nice if their use cases,
successes or issues were known, and even more if the effort could be
spread over multiple persons.

I think it would be nice if some use cases that are actually running on SPOE were written up in https://github.com/haproxy/wiki/wiki/SPOE:-Stream-Processing-Offloading-Engine so we can see how often this feature is used in HAP.


I also realize that a lot of work went into the current LUA support (a look
at the frighteningly long .c file for it speaks volumes).

My understanding is that many of the recent changes were attempts to
address certain design limitations and dirty corner cases.

But on one hand I find it rather difficult to use correctly in its current
state, in part because of the complete absence (to my knowledge) of something
equivalent to C headers for validation ahead of deployment, and also in part
(and more personally) because I never understood what anyone could possibly
like about LUA itself...

I don't think people "like Lua" but they like its lightness, low
general footprint, low intrusiveness, availability by default in their
OS, and the ability to script what cannot be scripted in the config.
You can easily iterate, compare strings, store/retrieve in memory etc.
Generally when you go to Lua, it starts from a frustration because you
cannot figure how to do something differently. It's never that fun of an
experience but once you're done, you figure that it remains performant,
easy to maintain because it's just a few lines to a few tens of lines,
reasonably well integrated, and allows your configs to be simplified. I
think those of us using it are never happy to do it but at the end feel
proud that they managed to work around a limitation. During the Lua
integration we used to say that it would teach us new use cases that
we're not aware of and that could ultimately end up as native actions/
sample fetches/converters for some of them if they were popular. I don't
have many examples of this to be honest, but for example I remember about
JWT which started in Lua and is now native. So I'd say that Lua brings you
this: this ability to refine your feature in field until it's good enough,
and if there is demand for it, this can become a compelling argument to
port it natively.

Well, to make it simpler I would vote to remove SPOE and migrate the SPOE workload to LUA, but as I currently don't use SPOE it would be really great to hear from users how SPOE is used and how big the work would be to migrate to LUA.

This would move the flow from

HAP
  => SPOE
    => Server
      => SPOA
        => App

to

HAP
  => LUA

right?


[...]

Are there any plans to have something similar to XDS (
https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol ) for
dynamic configs at runtime, similar to the socket api and Data Plane API?

I used to have such plans a long time ago and even participated to a
few udpapi meetings. But at this point I think it's not haproxy's job
to perform service discovery

And that's a very fair point. I wonder however how feasible it will
realistically be from dpapi's perspective to add that to its remit.

I don't know, and I don't know if it should be inside the component
itself or an agent that talks to haproxy via it. That's definitely
not my area of expertise to say the least.

That said I'd definitely be very interested as well. As much as handcrafted
configurations are nice, one quickly reaches their maintainability limits.
And if we're to stop abusing DNS again and again, proper service discovery is
the way.

Yes I think so. I remember Marko telling us at HaproxyConf 2022 that
the dpapi can now consume and produce about everything that's valid
from an haproxy standpoint and can do it from other representations
(I think YAML was mentioned). This can also help some users maintain
and generate their configs in a way they find more convenient based
on the tools available to them.

Well, does this imply that a dpapi should always run together with HAProxy if you want something like DNS resolution for servers or anything else? I don't think that's a good approach, but I understand that some parts have to be cleaned up. Difficult decision.
I think that the DNS stuff should be kept there and maybe be enhanced, as
it looks to me like some new security topics are using DNS more and more, like
ESNI, ECH, SVCB, ...
Should this be handled by dpapi and configured via socket api or any upcoming api in HAProxy?

Willy

Wow, this discussion raises some interesting points :-)

Regards
Alex
