Re: [DISCUSS] Change the useage of nodetool tablehistograms

Bowen Song via dev Thu, 23 Mar 2023 05:07:32 -0700

I don't think the nodetool tablestats command's parameters should beused as a reference implementation for the nodetool tablehistogramscommand. Because:


 * the tablehistograms command can take the keyspace and table as two
   separate parameters, but the tablestats command can't.
 * the tablestats command can take keyspace (without table) as a
   parameter, but the tablehistograms command can't.

The introduction of the -ks and -tbs options are unnecessary for thetablestats command, because it's parameters are:


   nodetool tablestats [<keyspace.table>|<keyspace>
   [<keyspace.table>|<keyspace>[...]]]

Which means any positional parameter without a dot is treated as akeyspace name, otherwise it's treated as dot-separated keyspace andtable name. That, however, does not apply to the nodetooltablehistograms command, which led to your workaround - the addition ofthe -ks and -tbs options.

But if you could just forget about the nodetool tablestats command for amoment, and look at the nodetool tablehistograms command alone, you willsee that it's unnecessary to introduce the -ks and -tbs options, becausethe command already takes keyspace name and table name, just in adifferent format.

In addition to that, I would be interested to know how often do peopleuse the -i option in the nodetool tablestats command. My best guess is,very very rarely.

If my guess is correct, we should keep the nodetool tablehistogramscommand as simple as:


   nodetool tablehistograms [<keyspace> <table> [<table> [...]] |
   <keyspace.table> [<keyspace.table>[...]]]

It's good enough if the above can cover the majority of use cases. Theremaining use cases can be dealt with individually, by multipleinvocations of the same command or providing it with a script-generatedlist of tables in the <keyspace.table> format.

TL;DR: The KISS principle <https://en.wikipedia.org/wiki/KISS_principle>should apply here - keep it simple.



On 23/03/2023 03:05, guo Maxwell wrote:

Maybe I didn't describe the usage of option "-i" clearly, The reasonwhy I think the command argument should be like this :



    1. nodetool tablehistograms ks.tb1 or ks tb1  ... //this is *one
    of the old way *of using tablehistogram. will print out the
    histograms of tabke ks.tb1 , we keep the old format to print out
    the table histograms,besides if more than two arguments is
    provied, suchu as nodetool tablehistograms system.local
    system_schema.columns system_schema.tables then all tables's
    histograms will be printed out (I think this is a bug that not as
    excepted in the document's decription, we should remind the user
    that this is an incorrenct usage)

    2. nodetool tablehistograms -tbs ks.tb1 ks.tb2 .... //print out
    list of tables' histograms with format keyspace.table
    3.nodetool tablehistograms -ks ks1 ks2 ks3 ... //print out list of
    keyspaces histograms
    4.nodetool tablehistograms -i -ks ks1 ks2 .... //print out list of
    table histograms except for the keyspaces list behind the option -i
    5.nodetool tablehistograns -i -tbs ks.tb1 ks.tb2 // print out list
    tables' histograms except for table in ks.tb1 ks.tb2
    6.nodetool tablehistograns -i -tbs ks.tb1 ks.tb2 -ks ks1 // print
    out list tables' histograms except for table in ks.tb1 ks.tb2 and
    all tables in ks1
    6.none option specified ,then all tables histograms will be print
    out.// this is *another one of the old way* of using tablehistogram.

is to make the command format to be consistent with the format ofnodetool tablestats, so for users, there will be a unified awarenessof using these two commands, rather than different commands requiringdifferent usage awareness , we can see the description ofthe tablestats doc for option "-i "


    Ignore the list of tables and display the remaining tables

that is to say if -i appears all the lists of tables and kespaceswill be ignored and will display the remaining tables.


    For example, for this command:

        (1)nodetool tablehistograns -ks ks1 -i -tbs ks1.tb1 ks2.tb2

    Which one of the following should it do?

     1. all tables in the keyspace ks1,  except the table tb1; or
     2. all tables in all keyspaces, except any table in the keyspace
        ks1 and the table tb2 in the keyspace ks2

    A more complex and possibly confusing option could be:

        (2)nodetool tablehistograms ks1 -i ks1.tb1 -i ks1.tb2  # all
        tables in the keyspace ks1, except the table tb1 and tb2

        (3)nodetool tablehistograms -i ks1.tb1 -i ks1.tb2 ks1  #
        identical as above, as -i takes only one parameter

In my mind it is better to use -i option only once (though it is rightto use before every ks and tbs lists ) , so (1) means all tables inks1 (including ks1.tb1) and ks2.tb2 will be ignored and display theremaining (2) will ignore all tables in ks1 (including ks1.tb1,ks1.tb2) and display remaing (3) will show the same result with (2)

the newly added options' behavior is same with nodetool tablestats ,the difference is I displayed parameters specifying option -ks and-tbs , but tablestats don't.





Josh McKenzie <jmcken...@apache.org> 于2023年3月22日周三 23:35写道：

    Agree w/Bowen. I think the straight forward simplicity of "clear
    inclusion and exclusion semantics, default to include all in scope
    excepting things that are explicitly ignored" would be ideal.


    On Wed, Mar 22, 2023, at 8:45 AM, Bowen Song via dev wrote:


    TBH, the syntax looks unnecessarily complex and confusing to me.

    For example, for this command:

        nodetool tablehistograns -ks ks1 -i -tbs ks1.tb1 ks2.tb2

    Which one of the following should it do?

     1. all tables in the keyspace ks1,  except the table tb1; or
     2. all tables in all keyspaces, except any table in the keyspace
        ks1 and the table tb2 in the keyspace ks2


    I personally would prefer the simplicity of this approach:

        nodetool tablehistograms ks1 tb1 tb2 tb3

        nodetool tablehistograms ks1.tb1 ks1.tb2 ks2.tb3

        nodetool tablehistograms -i ks1 -i ks2

        nodetool tablehistograms -i ks1.tb1 -i ks2.tb2


    They are self-explanatory. You don't need to read comments to
    understand what do they do, as long as you know that "-i" means
    "exclude".

    A more complex and possibly confusing option could be:



        nodetool tablehistograms ks1 -i ks1.tb1 -i ks1.tb2  # all
        tables in the keyspace ks1, except the table tb1 and tb2

        nodetool tablehistograms -i ks1.tb1 -i ks1.tb2 ks1  #
        identical as above, as -i takes only one parameter

    To avoid the above confusion, the command could enforce that the
    "-i" option may only be used after any positional options, thus
    makes the 2nd command a syntax error.


    Beyond that, I don't see why the user can't make multiple
    invocations of the nodetool tablehistograms command if they have
    more complex or specific need.

    For example, in this case:

        /> 6.nodetool tablehistograns -i -tbs ks.tb1 ks.tb2 -ks ks1
        // print out list tables' histograms except for table in
        ks.tb1 ks.tb2 and all tables in ks1/

    The same result can be achieved by concatenating the outputs of
    the following two commands:

        nodetool tablehistograms -i ks -i ks1

        nodetool tablehistograms ks -i ks.tb1 -i ks.tb2


    On 22/03/2023 05:12, guo Maxwell wrote:

    Thanks everyone , So It seems that it is better to add new
    parameter options to meet our needs, while keeping the original
    parameter functions unaffected to achieve backward compatibility.
    So the new options are :
    1. nodetool tablehistograms ks.tb1 or ks tb1  ... //this is *one
    of the old way *of using tablehistogram. will print out the
    histograms of tabke ks.tb1 , we keep the old format to print out
    the table histograms,besides if more than two arguments is
    provied, suchu as nodetool tablehistograms system.local
    system_schema.columns system_schema.tables then all tables's
    histograms will be printed out (I think this is a bug that not
    as excepted in the document's decription, we should remind the
    user that this is an incorrenct usage)

    2. nodetool tablehistograms -tbs ks.tb1 ks.tb2 .... //print out
    list of tables' histograms with format keyspace.table
    3.nodetool tablehistograms -ks ks1 ks2 ks3 ... //print out list
    of keyspaces histograms
    4.nodetool tablehistograms -i -ks ks1 ks2 .... //print out list
    of table histograms except for the keyspaces list behind the
    option -i
    5.nodetool tablehistograns -i -tbs ks.tb1 ks.tb2 // print out
    list tables' histograms except for table in ks.tb1 ks.tb2
    6.nodetool tablehistograns -i -tbs ks.tb1 ks.tb2 -ks ks1 //
    print out list tables' histograms except for table in ks.tb1
    ks.tb2 and all tables in ks1
    6.none option specified ,then all tables histograms will be
    print out.// this is *another one of the old way* of using
    tablehistogram.

    So we add some more options like "-i", "-ks", "-tbs" , we can
    combine these options  and we can also use any of them
    individually, besides, we can also use the tool through old way
    if a table with format ks.tb is provied.


    Jeremiah D Jordan <jeremiah.jor...@gmail.com> 于2023年3月16日周四
    23:14写道：

        -1 on any change which breaks the previously documented usage.
        +1 any additions to what the tool can do without breaking
        previously documented behavior.

        On Mar 16, 2023, at 7:42 AM, Josh McKenzie
        <jmcken...@apache.org> wrote:

        We could also consider augmenting the tool with new named
        arguments with the functionality you described and leave
        the positional usage intact.

        On Thu, Mar 16, 2023, at 6:43 AM, Bowen Song via dev wrote:


        The documented command options are:

            nodetool tablehistograms [<keyspace> <table> |
            <keyspace.table>]



        That means one parameter will be treated as dot separated
        keyspace and table. Alternatively, two parameters will be
        treated as the keyspace and table respectively.

        To remain compatible with the documented behaviour, my
        suggestion is to change the command options to:

            nodetool tablehistograms [<keyspace> <table> [<table2>
            [...]] | <keyspace.table> [<keyspace2.table2>[...]]]

        Feel free to add the "all except ..." feature to the above.

        This doesn't break backward compatibility in documented
        ways. It only changes the undocumented behaviour. If
        someone is using the undocumented behaviour, they must
        know things may break when the software is upgraded. We
        can just add a line to the NEWS.txt and let them update
        their scripts.


        On 16/03/2023 08:53, guo Maxwell wrote:

        Hello everyone :
        The nodetool tablehistograms have one argument which you
        can fill with only one table name with the format
        "keyspace_name.table_name /keyspace_name table_name", so
        that you can get the table histograms of the specied table.

        And if none arguments is set, all the tables' histograms
        will be print out.And if more than 2 arguments (nomatter
        the format is right or wrong) are set , all the tables'
        histograms will also be print out too(Which is a bug In
        my mind).

        So the usage of nodetool tablehistograms has some usage
        restrictions, That is either output one , or all
        informations.

        As CASSANDRA-18296
        <https://issues.apache.org/jira/browse/CASSANDRA-18296> described
        , I will change the usage of nodetool tablehistograms,
        which support the feature below:
        1. nodetool tablehistograms ks.tb1 ks.tb2 .... //print
        out list of tables' histograms with format keyspace.table
        2.nodetool tablehistograms ks1 ks2 ks3 ... //print out
        list of keyspaces histograms
        3.nodetool tablehistograms -i ks1 ks2 .... //print out
        list of table histograms except for the keyspaces list
        behind the option -i
        4.nodetool tablehistograns -i ks ks.tb // print out list
        tables' histograms except for table in keyspace ks and
        ks.tb table.
        5.none option specified ,then all tables histograms will
        be print out.

        The usage will breaks compatibility with how it was done
        previously, and as this is a user facing tool.

        So, What do you think?

        Thanks~~~




    --
    you are the apple of my eye !




--
you are the apple of my eye !

Re: [DISCUSS] Change the useage of nodetool tablehistograms

Reply via email to