Hi Pulsar Community,

Here are the meeting notes from today's community meeting. Thanks to all who participated! We had 3 detailed conversations, as the length of this email demonstrates.
Disclaimer: If something is misattributed or misrepresented, please send a correction to this list.

Source google doc: https://docs.google.com/document/d/19dXkVXeU2q_nHmkG8zURjKnYlvD96TbKf5KjYyASsOE

Thanks,
Michael

2022/06/09 (8:30 AM PST)

- Attendees:
  - Matteo Merli
  - Michael Marshall
  - Mattison Chao
  - Heesung Sohn
  - Lari Hotari
  - Aaron Williams
  - Enrico Olivelli
  - Christophe Bornet
  - Andrey Yegorov
  - Saumitra Srivastav
- Discussions
  - Matteo: The load balancer email from Heesung is not a proposal yet. It is a discussion; there is no design doc. We just want to start the conversation early. It will touch a lot of Pulsar, so consensus is essential. There are many things we could change. The biggest challenge is that no one really understands how the current load balancer behaves in practice. It works most of the time, but when it doesn't, it's not clear why. Let's document what is there from the code. Many things are involved, including keeping track of the historical rates of bundles and all the metadata. Then we can start with a fresh implementation so that we don't break the current one. There isn't an answer yet, and we don't necessarily even have all of the questions.
  - Lari: There was a proposal on the mailing list related to the load balancer; it seemed valuable.
  - Matteo: We need more drastic changes. There isn't any one problem.
  - Lari: I agree. That proposal included some problem statements that could be valuable.
  - Matteo: Some of the assumptions embedded in the current model: it is always reactive (there is no broker-to-broker coordination; the interaction with the client is the only driver), and it tries to find a perfect placement (should the model instead be more robust at rearranging bundles across brokers?).
  - Michael: The bundle logic is something interesting to consider. What happens when there are many topics and unloading takes longer?
  - Matteo: The primary issue with moving bundles is that the transfer is not broker to broker. The bundle just gets unloaded, and then the manager places it somewhere. The broker should know the target broker for the bundle.
  - Michael: That target could be sent to the client to decrease latency during bundle moves.
  - Matteo: A tool to measure the time to transfer a bundle could be valuable. Another detail: there is no way to do bulk lookups.
  - Michael: That doesn't even include redirects.
  - Matteo: We should get rid of redirects completely; it should be handled on the broker side. Another one: when we added bundles, they were meant to be transparent to clients to preserve design flexibility. The design hasn't actually changed in almost 10 years, so we could expose the bundles to the client and speed up the lookups.
- PIPs
  - Christophe: PIP-173: Create a built-in Function implementing the most common basic transformations.
  - Matteo: I am not a fan of SMT (Kafka's Single Message Transforms). The way you express the syntax is convoluted. It does not feel natural.
  - Christophe: Is the configuration the problem, or the feature itself?
  - Matteo: There are multiple ways of doing that, not a single one.
  - Christophe: What we see from users is that there always seem to be transformations like renaming a field or removing a field, and there is no easy way to do that at the moment. The idea is to find something that is available, easy to configure, and does not require writing code. It's language agnostic.
  - Matteo: I don't disagree with that. I disagree with the way to expose it. Yes, you don't write code, but you're trying to express logic in a configuration file. That is probably not the best way to express that logic.
  - Enrico: Two points. Most users need a very simple transformation, like dropping a field. A line with "drop field" in a config file is straightforward.
  - Matteo: It depends on the case. A more general approach gets more complicated. Maybe you want to filter, route, or apply more complex logic; there are many things you can do. The problem is that expressing those in a config file is a nightmare. If you look at an example of SMT, it is complicated.
  - Matteo: More complex scenarios make it harder.
  - Christophe: Did you see the proposed configuration design? I feel it is better than what SMT offers. We could do something with bash-like logic that could be compatible with the JSON configuration. How would you do this?
  - Matteo: A SQL interface is easier for users to grasp. This is how you do aggregation, or call a function like upper-case; all of it comes naturally.
  - Enrico: Christophe may start a standalone project, but the main point is to give Pulsar users the functionality.
  - Matteo: The main point is that a new API has to be supported. I'm not saying we shouldn't do this; my point is that we should find a good way to do it. You could do it outside of Pulsar.
  - Enrico: These are functions that we want to give the community, for users to have access to out of the box. Being in Pulsar gives confidence that breaking changes will be caught and/or prevented.
  - Matteo: If you put it there, SMT becomes an API. The question is: is it the right API? Is it the right abstraction? To me, it doesn't sound like a good one. The problem is how it is accessible to a user and how users will be excited to write the actual logic.
  - Lari: There are many use cases where users face high barriers to coding. This lowers the barrier for those users. It's not a stream processing solution; it's an extension.
  - Christophe: It's just meant for simple transformations. Schemas have a steep learning curve.
  - Matteo: I understand the no-code thing. However, a config file is not the place to express logic. A DSL could capture this. A custom one would be tough to adopt, but one like SQL could be easy. (There was some discussion about complexity and how SQL gives more selection; see the recording for more info.)
  - Christophe: How would you do routing?
  - Matteo: That is something you can do if you look at Flink SQL, ksqlDB, or Snowflake SQL. They are all slightly different, but the basics are the same. My point is that SQL gives you filtering, routing, dropping fields, etc.
  - Lari: In this case, the model is more like multiple steps, each doing a transformation. There can be components that do these transformations, and then you add the steps to a list (the config), where each step gets its own configuration. It's not meant for all use cases. A good example is a topic with messages in Avro format that you want on another topic in JSON format. Adding a function that does that is helpful.
  - Matteo: Chaining multiple things still adds complexity. Now you have 5 different config files to inspect to see how they apply to each other.
  - Lari: These models do have limitations, but if it is defined and documented as being for simple message transformations, that makes a lot of sense to me. I would find adding SQL problematic, because adding tooling support would be much harder. I see this integrating with a web UI; in that case, it's better that there isn't an arbitrary language that needs to be parsed.
  - Matteo: A SQL dialect is a very specific language to parse. Transformations are one small part; routing, grouping, and others are features that would be valuable too. A web UI may be helpful, but the moment you want something serious, you need the underlying code model. We saw this at Splunk, where there was a DSL and then a UI to compose pipelines that got translated into Flink jobs. Ultimately, only demos were done in the UI; anything serious needed the underlying DSL.
  - Lari: I agree that typically happens. That's why it's not meant as a generic stream processing solution.
  - Matteo: It's not black and white. We say Pulsar Functions are not a stream processing framework, but we want to give users the ability to process data easily without managing their own consumers. There are multiple shades in between, and there are concepts that overlap. Flink, for example, is very powerful and complicated.
  - Matteo: The whole point is that we don't need to re-invent a stream processing framework, but we can expose many of the constructs that will work with the Pulsar Function model.
  - Lari: In most cases, you do have to actually write code anyway.
  - Matteo: My only worry is: if this becomes the official API, is this the preferred model we tell users to use to define basic functions? I would be resistant to that.
  - Lari: A similar critique could be made of the Pulsar Function API, too. There isn't a completely optimal solution. This would solve the very specific entry-level case, but it wouldn't be for all functions.
  - Matteo: My point is that we could have a more general solution that does the transformations and also handles more complicated logic.
  - Lari: The risk is that it doesn't do any one thing very well.
  - Matteo: I disagree; that is implementation dependent. With a SQL-like DSL, we can express all of this logic and implement it in the function runtime. It won't be a full stream processor, which would need the right runtime (which functions are not). It would let you do most of the things you could do in functions without writing code. Even if we call it no-code, it's still code; it's logic.
  - Lari: I agree that something like that would be useful. How realistic is it that we would have it in the short term? Christophe already has it implemented this way.
  - Matteo: Let's talk about that in a couple of weeks or less.
  - Michael: On the release plan (PIP-175), we need to define what will force a new RC.
  - Matteo: I agree, and we need to set a date and hit our dates. We should cut a branch 3 weeks before the release. We have different kinds of bugs. The first is a new regression; those must be fixed. The other is a bug in a new, shiny feature. Should we fix it or not? In order to keep the date, we have 3 weeks to discover and fix the bug. If we can't fix it in that time, the feature is broken, and that should be communicated to users. If we find it at the last moment, we ship with the bug (and still tell users).
  - Michael: That could simplify our patch releases too, which can take a while to get out.
  - Matteo: We should formalize when we do patch releases as well. As Dave mentioned, we have a cherry-picking process that needs documenting. If you merge something and tag it to be cherry-picked, you need to cherry-pick it when it is merged (immediately). Otherwise, the release manager will skip that change, regardless of whether it fixed a bug. The release manager has a lot of work there. Also, when do we want to do patch releases? How many bugs do we wait for? Is there a time limit? We need guidelines. I don't know of any good model, but we should have something there.
  - Michael: What about performance regressions? Can those be release blockers?
  - Matteo: There are cases. A simple bug that is easy to fix can definitely trigger a rebuild of the RC. Another case is a regression from a large refactor, but fixing that might take more than 3 weeks. Or it could come from a new feature that we want to keep. We should run this performance testing at code-freeze time. If 3 weeks is not enough, we should move forward with the release. Matteo's rough cut-off is a 30% regression, but he doesn't want a fixed percentage; different features have different requirements.
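As an aside, the "cherry-pick immediately when merged" rule Matteo describes can be sketched with plain git. This is a hypothetical illustration (the repo, branch name `branch-2.10`, and commit message are stand-ins, not details from the meeting):

```shell
# Throwaway repo standing in for apache/pulsar; branch-2.10 stands in
# for whatever release branch is currently active.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -qb master
git config user.email "dev@example.com"
git config user.name "Dev"
echo base > file.txt
git add file.txt
git commit -qm "base"
git branch branch-2.10            # release branch cut from master

# A bug fix is merged to master...
echo fix >> file.txt
git commit -qam "Fix consumer leak"
fix_sha=$(git rev-parse HEAD)

# ...and is cherry-picked to the release branch right away. The -x flag
# records the original SHA, so the release manager can trace the change
# instead of hunting for un-picked commits later.
git checkout -q branch-2.10
git cherry-pick -x "$fix_sha" >/dev/null
git log -1 --pretty=%s            # the fix is now on the release branch
```

The key habit is the immediacy: the author picks the commit onto the release branch in the same sitting as the merge, rather than leaving it to the release manager.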