Hi Claude,

I think this is great work. Speeding up the Authorizer will be a big win for us.

I don't think we need to add additional interfaces for this, though. Just get 
rid of the old slower implementation that I wrote, and replace it with your 
newer, faster one.

Also, I think we should continue the discussion about globs... after all, it is 
the topic of this KIP! (You are calling it "wildcards" but I don't think that 
is good terminiology since it will cause confusion with the existing "*" 
wildcard). Anyway, earlier I requested that you add PatternType.GLOB and use it 
for these things. Does that make sense? I don't see another path to doing it 
compatibly. I certainly wouldn't want to create a "Python 2 vs. Python 3" type 
situation where people get stuck on an older authorizer fork because the new 
one requires globs and they can't handle that. No authorizer forks (or at 
least, not for things like this.)

best,
Colin


On Fri, Aug 30, 2024, at 01:32, Claude Warren, Jr wrote:
> I made it easier to replace the existing StandardAuthorizerData with a
> different implementation in order show the Trie implementation met all the
> requirements of the StandardAuthorizerData and could be replaced without
> changing the StandardAuthorizer implementation.
>
> Replacing the current StandardAuthorizerData with a Trie
> implementation makes sense because it is an order of magnitude faster.
> This means that when we go to implement the Wildcard types we can perform
> the search in times that are equivalent to the literal search times in the
> current StandardAuthorizerData implementation.
>
> The changes for the first pull request is to create an interface for
> AuthorizerData and create an "authorizeByResourceType" method within that
> interface.  This adds an initial implementation in StandardAuthorizerData
> that mirrors the default implementation found in the Authorizer interface.
>
> The Trie implementation is not dependent upon other classes local to
> StandardAuthorizerData other than the MatchingRule and some of its
> implementations which are now found in the AuthorizerData interface instead
> of the StandardAuthorizerData implementation.  Obviously it is dependent
> upon AuthorizationResult, AclBinding, KafkaPrincipal and other classes that
> are defined elsewhere.
>
> In addition, abstract test cases are implemented to validate that any
> replacement implementations yield the same results as the original
> StandardAuthorizerData.
>
> I hope that this aswages any concerns that you may have had,
> Claude
>
> On Thu, Aug 29, 2024 at 6:51 PM Colin McCabe <cmcc...@apache.org> wrote:
>
>> On Thu, Aug 29, 2024, at 01:34, Claude Warren, Jr wrote:
>>
>> Colin,
>>
>> Thanks for your insightful comments.  I came to the same conclusion.
>>
>> I do have 2 Jira tickets to simplify some of this.
>>
>> 1)  KAFKA-17316 <https://issues.apache.org/jira/browse/KAFKA-17316> -
>> Makes developing a new Authorizer by creating a new implementation of the
>> StandardAuthorizerData easier.  Basically adds interfaces, abstract classes
>> and lots of tests.
>>
>> 2) KAFKA-17423 <https://issues.apache.org/jira/browse/KAFKA-17423> -
>> Builds on KAFKA-17316 by creating a Trie implementation (without the new
>> wildcards from KIP-1042).  This provides an order of magnitude faster
>> processing of ACL requests.
>>
>> My plan is to get these through the process, so I have a better
>> understanding of the Kafka development process, and then rework KIP-1042 to
>> create a new PatternType with wildcard processing.
>>
>> I would appreciate your feedback on the above tickets if you have the time.
>>
>>
>> Hi Claude,
>>
>> Thanks for the response. However, I don't want to fork the
>> StandardAuthorizer. I would be -1 on that since it would greatly enlarge
>> our maintenance burden, for no good reason. And if we did fork it, I
>> wouldn't want the forked version to depend on the internal classes of the
>> original one, since that would make evolving either one much more difficult.
>>
>> We should be able to add wildcards to the existing authorizer without too
>> much trouble, just by adding a new type alongside LITERAL, PREFIXED.
>> Perhaps GLOB ? We always planned on adding more types, to cover exactly
>> this scenario.
>>
>> best,
>> Colin
>>
>> Claude
>>
>> On Fri, Aug 23, 2024 at 9:43 PM Colin McCabe <cmcc...@apache.org> wrote:
>>
>> On Sat, Jul 27, 2024, at 04:20, Claude Warren, Jr wrote:
>> > I have updated the KIP with results from the Trie implementation and they
>> > are dramatic to say the least.  For most searches they are at least an
>> > order of magnitude faster and use less memory.  The wildcard search is
>> not
>> > a regular expression but rather a file type wild card (*=1 or more
>> > character, ?=1 character).
>> >
>>
>> Hi Claude,
>>
>> One issue here is that your change is incompatible. For example, if I have
>> a consumer group named '?', and an ALLOW ACL  for '?', this change alters
>> the meaning of that ACL. Previously it just ALLOWed '?' Now it allows any
>> single-character group. This obviously could be a major security issue for
>> people doing upgrades.
>>
>> (This wouldn't be an issue if we restricted consumer group names to a
>> sensible set of characters, like we did with topics. But that ship has
>> sailed...)
>>
>> If you want to add new behavior in a backwards-compatible fashion, your
>> best bet probably is to create a new PatternType. Unfortunately this makes
>> the complexity issue worse, but I don't see a way around it. Perhaps we can
>> deprecate LITERAL and PREFIXED at some future date, if this wildcard thing
>> makes them unecessary.
>>
>> Another issue I see, somewhat related, is that Authorizers are pluggable,
>> and probably won't be updated all at once to support this new type. That
>> should be fine, but we should document the error that is returned when
>> someone tries to create your new-style ACLs with an authorizer that does
>> not support them.
>>
>> > Code is available on my personal branch [1].  It is not fully documented
>> > and does not meet the checkstyle requirements yet.  I also modified the
>> jmh
>> > tests to run the Trie tests and limit the test to the single case
>> mentioned
>> > in the KIP documentation.
>> >
>> > If there are no issues with this code, I will complete the documentation
>> > and fix the checkstyle and then open a pull request.
>>
>> It's fine to open a pull request. You can link it on the KIP page. As
>> always, we don't want to commit it until the KIP is accepted.
>>
>> best,
>> Colin
>>
>> >
>> > Claude
>> >
>> > [1]
>> https://github.com/Claudenw/kafka/pull/new/KIP-1042_Trie_Implementation
>> >
>> >
>> > On Wed, Jul 3, 2024 at 2:21 PM Claude Warren, Jr <claude.war...@aiven.io
>> >
>> > wrote:
>> >
>> >> I think that if we put in a trie based system we should be able to halve
>> >> the normal searhc times and still be able to locate wild card matches
>> very
>> >> quickly.  Users should be warned that "head wildcard" matches are slow
>> and
>> >> to use them sparingly.  I am going to see if I can work out how to do
>> >> wildcard matches within the trie.
>> >>
>> >> But in all cases can show that the trie is faster than the current
>> >> implementation.
>> >>
>> >> Claude
>> >>
>> >>
>> >>
>> >> On Wed, Jun 19, 2024 at 7:53 PM Muralidhar Basani
>> >> <muralidhar.bas...@aiven.io.invalid> wrote:
>> >>
>> >>> There are some test results mentioned in the Test Plan section of the
>> Kip,
>> >>> but we need to do more testing with various patterns and permission
>> types.
>> >>> As mentioned in the discuss thread, the trie implementation could
>> >>> potentially surpass the current speed of ACL match.
>> >>>
>> >>> However, we can only accurately assess the results after updating the
>> >>> actual classes and analysing them with AuthorizerBenchmark.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Murali
>> >>>
>> >>> On Mon, 17 Jun 2024 at 20:39, Colin McCabe <cmcc...@apache.org> wrote:
>> >>>
>> >>> > My concern is that the extra complexity may actually slow us down. In
>> >>> > general people already complain about the speed of ACL matches, and
>> >>> adding
>> >>> > another "degree of freedom" seems likely to make things worse.
>> >>> >
>> >>> > It would be useful to understand how much faster or slower the code
>> is
>> >>> > with the propsed changes, versus without them.
>> >>> >
>> >>> > best,
>> >>> > Colin
>> >>> >
>> >>> >
>> >>> > On Mon, Jun 17, 2024, at 01:26, Muralidhar Basani wrote:
>> >>> > > Hi all,
>> >>> > >
>> >>> > > I would like to call a vote on KIP-1042 which extends creation of
>> acls
>> >>> > with
>> >>> > > MATCH pattern type.
>> >>> > >
>> >>> > > KIP -
>> >>> > >
>> >>> >
>> >>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1042%3A+Support+for+wildcard+when+creating+new+acls
>> >>> > >
>> >>> > > Discussion thread -
>> >>> > > https://lists.apache.org/thread/xx3lcg60kp4v34x0j9p6xobby8l4cfq2
>> >>> > >
>> >>> > > Thanks,
>> >>> > > Murali
>> >>> >
>> >>>
>> >>
>>
>>
>>

Reply via email to