Many of the utilities in the tools directory (BulkLoader, SSTableExpoert, etc) already use apache.commons.cli.
On Mon, Jul 8, 2024 at 4:28 PM Dinesh Joshi <djo...@apache.org> wrote: > I agree about picking libraries on their merit but a major factor for any > open source project should consider today is the possibility of > unfavorable/hostile licensing changes. > > On Mon, Jul 8, 2024 at 1:15 PM Jon Haddad <j...@jonhaddad.com> wrote: > >> Without getting into the pros and cons of both libraries, I have to point >> out there's something unsettling about making decisions about libraries we >> used based on arbitrary rules an employer has put into place on its >> employees. The project isn't governed by Apple, it's governed by >> individual contributors to open source. >> >> We need to pick libraries based on their merits. Apple's draconian rules >> should not prevent us from using the best option available. >> >> Jon >> >> >> On Mon, Jul 8, 2024 at 12:57 PM Dinesh Joshi <djo...@apache.org> wrote: >> >>> I agree, having a DISCUSS thread with a specific subject line is less >>> likely to be overlooked. >>> >>> One thing I'd like to note here is PicoCLI and Airline 2 are independent >>> projects that are ALv2 licensed. A subset of the Cassandra contributors may >>> have difficulty contributing to such projects due to preexisting policies >>> that their employers may have in place. >>> >>> I am concerned about hostile licensing changes in the future which will >>> necessitate another migration for us. That said, is there a specific reason >>> to not consider Apache Commons CLI[1]? >>> >>> Dinesh >>> >>> [1] https://commons.apache.org/proper/commons-cli/ >>> >>> On Mon, Jul 8, 2024 at 10:22 AM David Capwell <dcapw...@apple.com> >>> wrote: >>> >>>> I don't think that a separate thread would add extra visibility >>>> >>>> >>>> Disagree. This thread is about adding a feature branch, so many could >>>> ignore if they don’t care. The fact you are switching the library (and >>>> which one) is something we have to hunt for. By having a new DISCUSS >>>> thread it makes it clear which library you wish to add, and people can sign >>>> off if they care or not. >>>> >>>> I wouldn’t create this thread until you settle on which one you wish to >>>> move forward with. >>>> >>>> Is adding the PicoCLI library as a project dependency getting any >>>> objections >>>> from the Community? >>>> >>>> >>>> Thats the point of the new DISCUSS thread. By being very clear you >>>> wish to add PicoCLI people can either validate we are allowed to, or raise >>>> any objections. I have not really seen any pushback so far outside of 1 >>>> case that wasn’t legally allowed to be used. >>>> >>>> Take a look at previous threads about adding different libraries. >>>> >>>> On Jul 8, 2024, at 7:58 AM, Caleb Rackliffe <calebrackli...@gmail.com> >>>> wrote: >>>> >>>> +1 on picocli >>>> >>>> RE the feature branch, I would just maintain the feature branch in your >>>> own fork to break out whatever "reviewable units" of code you want. When >>>> all the incremental review is done (I have no problem going back and >>>> forth), squash everything together, do whatever additional testing you >>>> need, and commit. >>>> >>>> On Fri, Jul 5, 2024 at 10:40 AM Maxim Muzafarov <mmu...@apache.org> >>>> wrote: >>>> >>>>> > Once you are happy with your chosen library, we need a DISCUSS >>>>> thread to add this new library (current protocol). >>>>> >>>>> Thanks, David. This is a good point, do we need a separate DISCUSS >>>>> thread or can we just use this one? I'm in favour of keeping the >>>>> discussion in one place, especially when topics are closely related. I >>>>> don't think that a separate thread would add extra visibility, but if >>>>> that is the way the community has adopted - no problem at all, I'll >>>>> repost. >>>>> >>>>> >>>>> The reasons for replacing the Airlift/Airline [1] with the PicoCli [2] >>>>> are as follows (in order of priority): >>>>> >>>>> 1. The library is under the Apache-2.0 License >>>>> https://github.com/remkop/picocli?tab=Apache-2.0-1-ov-file#readme >>>>> >>>>> 2. The project is active and well-maintained (last release on 8 May >>>>> 2024) >>>>> https://github.com/remkop/picocli/releases >>>>> >>>>> 3. The library has ZERO dependencies, in some of the cases a single >>>>> file can just be dropped into the sources (it's even pointed out in >>>>> the documentation) >>>>> https://picocli.info/#_add_as_source >>>>> >>>>> 4. Compared to the Airlift library, the PicoCLI uses the same markup >>>>> design concepts, so we don't have to rewrite our command or make >>>>> complex changes, which in turn minimizes the migration. >>>>> >>>>> >>>>> Is adding the PicoCLI library as a project dependency getting any >>>>> objections from the Community? Please, share your thoughts. >>>>> >>>>> There are a few other alternatives (commons-cli, airline2, jcommander) >>>>> but they are not as well known and/or not as elegantly suited to our >>>>> needs based on what we have now. >>>>> >>>>> >>>>> [1] https://github.com/airlift/airline >>>>> [2] https://github.com/remkop/picocli >>>>> >>>>> >>>>> On Wed, 3 Jul 2024 at 22:27, David Capwell <dcapw...@apple.com> wrote: >>>>> > >>>>> > I don't personally think there is a strong need for a feature >>>>> branch. If it makes it easy for you, please go ahead with a feature >>>>> branch. >>>>> > >>>>> > >>>>> > Agree, I don’t see the reason for a feature branch… feature branch >>>>> just means the branch lives in apache domain rather than your own fork. >>>>> You won’t be able to merge until you are done and you will need to keep >>>>> rebasing over and over again. Even if multiple people are working on this >>>>> you can work in your fork just fine (assuming you grant permissions). >>>>> > >>>>> > Another issue is that feature branches require the same level of >>>>> commit process as every other main branch, where as personal branches >>>>> don’t. This actually will slow you down as each commit now must be a >>>>> JIRA, >>>>> you go through review of each, must show a success CI, etc. >>>>> > >>>>> > Now, if you wish to split this into multiple steps that is fine, but >>>>> the list of places is basically node tool (kinda has to go in at once) and >>>>> small CLIs. If you wish to migrate the small ones in isolation first, I >>>>> am >>>>> cool with that merging to w/e branch the logic is targeting, but you won’t >>>>> be able to break up node tool without breaking everything… but if you did >>>>> this in your own fork then no one cares. >>>>> > >>>>> > I won’t block a feature branch, but just don’t see a clear “why” and >>>>> only see cons. >>>>> > >>>>> > We are changing the command markup library, so there are two extra >>>>> > things to be checked: >>>>> > - We parse CLI arguments in the same way (as the parser is different >>>>> > in a new library); >>>>> > - The command help output is the same so that the user won't see any >>>>> difference; >>>>> > >>>>> > >>>>> > Personally I would POC a limited node tool change with JVM dtest as >>>>> we require passing the output to the test (the prototypes you listed >>>>> doesn’t include JVM Dtest integration). If one library makes this more >>>>> annoying, then do we care about fancy new features we don’t use when it >>>>> makes the features we do use harder? If you start with the smaller tools >>>>> first then spend a ton of time migrating node tool then find JVM dtest is >>>>> broken, then you will spend so much more time fixing this, I would >>>>> strongly >>>>> recommend doing some throw away POC to make sure w/e way you go won’t >>>>> break >>>>> JVM Dtest’s node tool support. >>>>> > >>>>> > Once you are fine with your selected library, we will need a DISCUSS >>>>> thread to add that new library (current protocol). This mostly just makes >>>>> the pick more visible, and normally we only check simple things like “are >>>>> we legally allowed to use” and “is this project dead?”. >>>>> > >>>>> > >>>>> > On Jul 3, 2024, at 6:06 AM, Maxim Muzafarov <mmu...@apache.org> >>>>> wrote: >>>>> > >>>>> > Thank you all for your comments, >>>>> > >>>>> > I want to stress, that these changes won't affect the input/output >>>>> > formatting of commands, ensuring everything is the same. >>>>> > >>>>> > We are changing the command markup library, so there are two extra >>>>> > things to be checked: >>>>> > - We parse CLI arguments in the same way (as the parser is different >>>>> > in a new library); >>>>> > - The command help output is the same so that the user won't see any >>>>> difference; >>>>> > >>>>> > Additional tests cover both cases. >>>>> > >>>>> > On Mon, 1 Jul 2024 at 20:08, Dinesh Joshi <djo...@apache.org> wrote: >>>>> > >>>>> > >>>>> > I don't personally think there is a strong need for a feature >>>>> branch. If it makes it easy for you, please go ahead with a feature >>>>> branch. >>>>> > >>>>> > One thing I had raised in the past was the desire to have a flag >>>>> that would generate machine readable output for nodetool commands. If this >>>>> can be done with a minor incremental effort, it would definitely reduce >>>>> the >>>>> burden on operators / integrations that rely on the nodetool output. As I >>>>> have earlier indicated in the past, relying on human readable output for >>>>> CLI tools like nodetool is fragile and providing a JSON output as an >>>>> alternative is a great first step in eliminating that dependency. I'm just >>>>> curious about the level of effort. If it is too much or too invasive, we >>>>> can consider producing JSON output for inclusion in the next major >>>>> release. >>>>> > >>>>> > On Fri, Jun 28, 2024 at 6:47 AM Maxim Muzafarov <mmu...@apache.org> >>>>> wrote: >>>>> > >>>>> > >>>>> > Hello everyone, >>>>> > >>>>> > >>>>> > The nodetool relies on the airlift/airline library to mark up the CLI >>>>> > commands used to manage Cassandra, which are part of our public API. >>>>> > This library is no longer maintained, so we need to update it anyway, >>>>> > and the good news is that we already have several good alternatives: >>>>> > airline-2 [3] or picocli [2]. >>>>> > >>>>> > In this message, I'm mainly talking about CASSANDRA-17445 [4], which >>>>> > refers to the problem and is a prerequisite for a larger CEP-38 CQL >>>>> > Management API [5]. It doesn't make sense to use annotations from the >>>>> > deprecated library to build a new API, so this is another reason to >>>>> > update the library as soon as possible and do some inherently small >>>>> > code refactoring required for the CEP-38. >>>>> > >>>>> > In addition to being widely used and well supported, the Picocli >>>>> > library offers the following advantages for us: >>>>> > - We can detach the jmx-specific parameters from the commands so that >>>>> > they can be reused in other APIs (e.g. without host, port) while >>>>> > remaining backwards compatible; >>>>> > - We can set up nodetool's autocompletion after the migration with >>>>> > minimal effort; >>>>> > - There is a good Picocli ecosystem of tools that we can use to >>>>> > simplify our codebase, e.g. generate man pages tool to make our CLIs >>>>> > more Unix friendly [7]; >>>>> > >>>>> > >>>>> > = Prototype = >>>>> > >>>>> > I have a working prototype [8] that shows what the result will look >>>>> > like. The prototype includes: >>>>> > - Tests between the execution of commands via the nodetool and >>>>> nodtoolv2; >>>>> > - 5 out of 164 nodetool commands have been moved so far, to show the >>>>> > refactoring we need to do to the command's body; >>>>> > - The command help output under for the nodetoolv2 is the same as it >>>>> > is currently for the nodetool and this is the default, however a >>>>> > "cassandra.cli.picocli.layout" is added to switch to the Picocli >>>>> > defaults; >>>>> > - You can also see that the colour scheme is applied by the Picocli >>>>> > out of the box, and this is how it looks [9]; >>>>> > - The nodetoolv2 is called first when the shell is triggered, and if >>>>> > the nodetoolv2 doesn't contain the command it needs yet, it falls >>>>> back >>>>> > to the nodetool and the old argument parser; >>>>> > >>>>> > >>>>> > Since the number of commands is quite large (164), I'd like to create >>>>> > a feature branch and move all the commands one at a time, while >>>>> > keeping the output backwards by applying additional tests at the same >>>>> > time and checking that the CI is always green. I think the "feature >>>>> > branch" approach will be less stressful for us since it focuses on >>>>> > requiring a review of only tedious changes to the feature branch, >>>>> > rather than reviewing the 15k line patch. >>>>> > >>>>> > >>>>> > Anyway, I am open to any suggestions and advice based on your >>>>> > experience and best practices for this case. Looking forward to your >>>>> > thoughts and suggestions. >>>>> > >>>>> > >>>>> > >>>>> > [1] https://github.com/airlift/airline >>>>> > [2] https://picocli.info/ >>>>> > [3] https://github.com/rvesse/airline >>>>> > [4] https://issues.apache.org/jira/browse/CASSANDRA-17445 >>>>> > [5] >>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-38%3A+CQL+Management+API >>>>> > [6] >>>>> https://github.com/apache/cassandra/pull/2497/files#diff-acdd5f29d28df5c02f4bfc933528f084508b4923112e312e68a4aff7df973bce >>>>> > [7] https://picocli.info/man/gen-manpage.html >>>>> > [8] https://github.com/apache/cassandra/pull/2497/files >>>>> > [9] >>>>> https://github.com/apache/cassandra/assets/3415046/57b14ae0-ff59-43d2-b542-10d3218ae075 >>>>> > >>>>> > >>>>> >>>> >>>>