Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
......................................................................


Patch Set 37:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.h@57
PS34, Line 57: ///
             : /// Parameters:
             : ///   `rec` - `QueryStateExpanded` object, an
> Mention the default option chosen here.
I'd rather not list the actual query options in the comments, because that sort 
of comment tends to get out of sync with the code. I did add a comment 
explaining where to find the actual query options.


http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.cc@197
PS34, Line 197:
> Can drop std:: here and below since already using namespace std?
Copy-pasta error.  Done


http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.cc@211
PS34, Line 211:   TQueryOptions opts;
              :
> This can be removed, and INSERT_QUERY_OPTS simply do the right hand sid
Done


http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/service/workload-management.cc@443
PS34, Line 443:
> I think quote is unnecessary for logging here and below.
I prefer to keep the quotes because they make it easier for log-processing 
tools to parse table and query_id out of this log message as fields.
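
As a rough illustration of why quoted values are easier to parse, here is a 
minimal sketch of pulling a quoted field out of a log line. The function name 
ExtractQuotedField and the sample message layout used with it are hypothetical 
and are not taken from workload-management.cc. The sketch also assumes the key 
contains no regex metacharacters.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of the patch: returns the text between the
// double quotes that follow `key=` in `line`, or an empty string if the key
// is not present. Assumes `key` contains no regex metacharacters.
std::string ExtractQuotedField(const std::string& line, const std::string& key) {
  // Matches key="...": the quotes mark exactly where the value starts and
  // ends, even if the value contains spaces or punctuation.
  const std::regex pattern(key + "=\"([^\"]*)\"");
  std::smatch match;
  if (std::regex_search(line, match, pattern)) {
    return match[1].str();
  }
  return "";
}
```

Without the quotes, a parser has to guess where a value ends, which gets 
fragile once a value can contain spaces or punctuation.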


http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc@195
PS36, Line 195: });
> gflags can't have been set from the CLI during static initialization. Which
Good catch!  I was not aware that static initialization happened before the 
flags were set.  I have fixed this issue and added a test to ensure the 
query_log_table_db and query_log_table_name flags actually change the completed 
queries db/table.
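
For anyone else who hits this, here is a minimal sketch of the pitfall. It 
does not use gflags: FLAGS_demo_table_name is a plain global standing in for a 
flag, and SimulateFlagParsing stands in for the command-line parsing that runs 
later at startup. All names are hypothetical.

```cpp
#include <string>

// Stand-in for a flag variable; real code would define it with gflags.
// At this point it only holds its compiled-in default.
std::string FLAGS_demo_table_name = "default_table";

// Problem: this copy is made during static initialization, before any
// command-line parsing has run, so it always sees the default value.
static const std::string kEagerTableName = FLAGS_demo_table_name;

// Fix: read the flag when the value is needed rather than at static
// initialization time.
const std::string& LazyTableName() {
  return FLAGS_demo_table_name;
}

// Stand-in for what flag parsing does once the process has started.
void SimulateFlagParsing(const std::string& value) {
  FLAGS_demo_table_name = value;
}
```

After SimulateFlagParsing("custom_table"), kEagerTableName still holds 
"default_table" while LazyTableName() returns "custom_table".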


http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/util/sql-util.cc
File be/src/util/sql-util.cc:

http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/util/sql-util.cc@29
PS34, Line 29: ace i
> Can drop std:: here and below since already using namespace std?
Done


http://gerrit.cloudera.org:8080/#/c/20770/34/be/src/util/sql-util.cc@36
PS34, Line 36: };
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/20770/35/be/src/util/sql-util.cc
File be/src/util/sql-util.cc:

http://gerrit.cloudera.org:8080/#/c/20770/35/be/src/util/sql-util.cc@57
PS35, Line 57:         ret << c;
> A switch statement is a lot faster than hash map lookup. It wasn't that ver
Interesting, I would have expected similar performance since a hash map lookup 
is O(1).  Either way, I switched back to the switch statement since the 
hash map was only used as part of a different approach I initially tried.

I also switched back to the function accepting a const string& instead of a 
begin and end iterator.  It makes both the calling code and this code a lot 
cleaner.
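
To make the shape of that approach concrete, here is a minimal sketch of a 
switch-based escaping function that takes a const string&. It is not the code 
from sql-util.cc: the name EscapeSqlSketch and the set of escaped characters 
are assumptions chosen for illustration. A hash map lookup is O(1) but still 
has to hash the key and follow a pointer on every character, while a switch 
over a few char values typically compiles down to a handful of comparisons or 
a jump table.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not the function from sql-util.cc. The characters
// handled here are assumptions made for the example.
std::string EscapeSqlSketch(const std::string& input) {
  std::ostringstream ret;
  for (const char c : input) {
    switch (c) {
      case '\\':
        ret << "\\\\";  // backslash becomes two backslashes
        break;
      case '\'':
        ret << "\\'";   // single quote is prefixed with a backslash
        break;
      case '\n':
        ret << "\\n";   // newline becomes the two characters \ and n
        break;
      default:
        ret << c;       // everything else is copied through unchanged
        break;
    }
  }
  return ret.str();
}
```

Callers pass the string directly, for example EscapeSqlSketch(sql_text), 
instead of a begin and end iterator pair.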


http://gerrit.cloudera.org:8080/#/c/20770/36/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/20770/36/common/thrift/metrics.json@3765
PS36, Line 3765:     "key": "impala-server.completed-queries.written"
> Use dashes in query keys, not underscores.
Done.

I initially found instances of both dashes and underscores in the metric names, 
so I just picked one approach and ran with it.


http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py@68
PS31, Line 68:
> nit: most of the loops of profile_lines could instead use
I was able to use this pattern in three of the places where the code looped 
through the profile lines.


http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py@411
PS33, Line 411:
> I've seen some cases in the live query test suite where this is hundreds (n
I hit the same situation and made this change.


http://gerrit.cloudera.org:8080/#/c/20770/34/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/34/tests/custom_cluster/test_query_log.py@827
PS34, Line 827: for i in range(query_count):
> Can this be simplified by just looping 30 times?
The purpose of this test is to assert that the 15 second flush interval is 
correctly implemented.  I was able to simplify this test quite a bit though 
based on your suggestion.


http://gerrit.cloudera.org:8080/#/c/20770/34/tests/custom_cluster/test_query_log.py@836
PS34, Line 836: client.close()
> Isn't it better if total_time is as close to 10s? So maybe just asserting <
Part of this test is to assert the queries are not written to the completed 
queries table too soon.  Thus, I like having the minimum as part of the assert.

That being said, I was able to simplify this test quite a bit based on your 
suggestions, and this assertion is now gone.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 37
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Fri, 08 Mar 2024 21:09:20 +0000
Gerrit-HasComments: Yes