Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 )
Change subject: IMPALA-12426: Query History Table ...................................................................... Patch Set 42: (34 comments) http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc File be/src/service/query-state-record.cc: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc@208 PS39, Line 208: base_state->num_rows_fetched = exec_state.num_rows_fetched_counter(); > Nice, I'd noticed base could be null and fixed it in my patch. I only fixed this because I saw in your patch that you found and fixed it there! http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@40 PS41, Line 40: > Can this renamed with something more explanatory? Perhaps "management" or " Yup, renamed to "workload_management" http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@102 PS41, Line 102: the comple > Looks like most columns are primitive types. So this can be replaced with T Done. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@248 PS41, Line 248: : // Query Options set by Configuration : FieldDefinition("query_opts_config", TPrimitiveType::STRING, : [](FieldParserContext& ctx){ : const std::string opts_str = DebugQueryOptions(ctx.record->query_options); : ctx.sql << "'" << EscapeSql(opts_str) << "'"; : }), : : // Resource Pool : FieldDefinition("resource_pool", TPrimitiveType::STRING, : [](FieldParserContext& ctx){ : ctx.sql << "'" << EscapeSql(ctx.record->base_state->resource_pool) << "'"; : }), : : // Per-host Memory Estimate : FieldDefinition("per_host_mem_estimate", TPrimitiveType::BIGINT, : [](FieldParserContext& ctx){ : ctx.sql << ctx.record->per_host_mem_estimate; : }), : : // Dedi > This can be feed directly to StringStream sql. This the how generated Types Done. That's much nicer! http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@42 PS39, Line 42: > This is going to allocate storage every place the header is included. Do we Nope, fixed. I left the constant declaration here but declared it extern. http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@51 PS39, Line 51: /// object that is passed to each parser. > This should still be extern, or it won't use the right value in all cases. Done http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@105 PS39, Line 105: /// `ctx` The field parse context object. > These should be `std::move`'d in to place. Right now it's making 2 copies o Done http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc@61 PS36, Line 61: > Decided this didn't need to be configurable? Yeah, after reviewing your active queries patch where the database was hardcoded to "sys", I decided to make the completed queries also be hardcoded to "sys" to match. http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@206 PS39, Line 206: namespace workload_management { > This is still evaluated during static initialization, before query_log_writ Done http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@354 PS39, Line 354: completed_queries_.emplace_back(make_pair(move(exp_rec), 0)); > Does this actually need to be a global? Looks like it's only used in this f nope, it does not. fixed http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@63 PS41, Line 63: "the query log table where completed queries will be stored."); : : DEFINE_validator(query_log_table_name, [](const char* name, const string& val) { : if (regex_match(val, alphanum_underscore_dash)) return true; : : LOG(ERROR) << "Invalid value for --" << name << ": must only contain alphanumeric " : "characters, underscores, or dashes and must not be empty"; : return false; : }); : > What if this constrained further to only contain alphanumeric and underscor Done http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@79 PS41, Line 79: return false; : } : : if (val.find_first_of('"') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain double quotes."; : return false; : } : : if (val.find_first_of('\n') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain newlines."; : return false; : } : : return true; : }); : > Is this mainly validate if val is a legal POSIX path? This parameter gets passed directly to the "LOCATION" portion of the create table statement. Thus, it could be a URL or a file path. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@140 PS41, Line 140: "will be recorded in the completed queries table. If a sql statement with a length " : "longer than this value is executed, the sql inserted into the completed queries " : "table will be trimmed to this length. Any characters that need escaping will have " : "their backslash character counted towards this limit."); : > Should this be validated against query_log_write_interval_s? It does not need to since a graceful shutdown will interrupt the wait and force the completed queries queue to be drained to the completed queries table. Thus, this value is actually the timeout for the drain process. I updated this flag to be named "query_log_shutdown_timeout_s" instead to better reflect its purpose. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@186 PS41, Line 186: DEFINE_int32_hidden(query_log_max_insert_attempts, 3, "Maximum number of times to " > nit: this could all be static variables local to this file since they're no I keep going back and forth on this topic, and I'm settling on making variables local static unless they are needed outside the cc file in which case they will go into the header file. My original thought was to make these constant in case other code needed them in the future, but it's trivial to move from a static local to a header file global in the unlikely event that the need actually arises. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@368 PS41, Line 368: > We should use this and the batch size to SET MAX_STATEMENT_LENGTH_BYTES for I reworked this code. There are now coordinator startup flags that control the max length of sql and plan statements. If a sql or plan exceeds that max length, they are truncated to the specified max length. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@382 PS41, Line 382: tringStreamPop fields; : fields << "INSERT INTO " << table_name << "("; : for (const auto& field : FIELD_DEFINITIONS) { : fields << field.db_column_name << > Should this moved inside if, right after L379? There really is no harm in initializing this variable if the previous condition evaluated to false, but to make the code cleaner, I updated the previous if statement to return when the condition evaluates to false. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@391 PS41, Line 391: he initialization code only works when run in a separat > Add DCHECK after this statement: Done http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@452 PS41, Line 452: / > After this loop, DCHECK that COMPLETED_QUERIES_QUEUED value >= size of quer Done http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@474 PS41, Line 474: max_row_size = row.size(); : } > DCHECK that COMPLETED_QUERIES_QUEUED >= 0 after this. Done http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@479 PS41, Line 479: : DCHECK(ImpaladMetrics::COMPLETED_QUERIES_QUEUED->GetValue() >= > COMPLETED_QUERIES_QUEUED is not decremented here. Are they enqueued again a Yes, each completed query is tried three times to be inserted into the completed queries table. The COMPLETED_QUERIES_QUEUED metric is incremented when a completed query is added to the completed_queries_ queue and that metric is decremented when the completed query is either successfully written to the completed queries table or else fails three times. The completed_queries_ list contains a std::pair<std::shared_ptr<QueryStateExpanded>, uint8_t>. The unit8_t is a counter of the number of attempts to insert the completed query into the completed queries table. Since the attempts is tracked on each individual completed query, to determine in any particular batch how many succeeded, how many failed but will be tried again, and how many failed but won't be tried again requires looping back through the queries_to_insert list and tracking each value. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@481 PS41, Line 481: eries_to_insert.size()); > Unnecessary extra indentation. Done http://gerrit.cloudera.org:8080/#/c/20770/42/be/src/util/thread.h File be/src/util/thread.h: http://gerrit.cloudera.org:8080/#/c/20770/42/be/src/util/thread.h@228 PS42, Line 228: Fixed in next patch. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/util/thread.h File be/src/util/thread.h: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/util/thread.h@229 PS41, Line 229: } : > There are some specific *ThreadState defined already such as ScannerThreadS Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@80 PS41, Line 80: .format(self.QUERY_TBL, query_id)) : assert res.success > Add assert message, mention what indices 0 and 1 are about. Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@87 PS41, Line 87: == self.MA > For with_args, consider adding line break after open parentheses. There is not a well defined pattern for custom cluster tests. I like this format simply because it makes it easy to see what arguments are being provided to the test. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@168 PS41, Line 168: "--shutdown_grace_period_s=10 " : "--shutdown_deadline_s=60", : catalogd_args="--enable_workload_mgmt", : > Can this merged with test_query_log_table_ddl? This looks like a superset o It possibly could. The test_query_log_table_ddl test is specifically to test a ddl statement while this test is for a dml statement. There are enough differences that I would prefer to keep them separate. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@199 PS41, Line 199: finally: > Move this constant above, near class name. Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@411 PS41, Line 411: m > Move this constant above, near class name. Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@473 PS41, Line 473: uster.ge > Explain what data[42] is. Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@835 PS41, Line 835: > nit: "timeout=7", for clarity. I didn't use named arguments anywhere else, thus I don't want to include it here and have a one-off pattern. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@839 PS41, Line 839: # At least 10 seconds have already elapsed, wait up to 7 more seconds for the : # queries t > Move these constant above, near class name. Maybe name them specific to tes Moved the constants. This test does not need a cluster_id, so I removed it. The other test selects from the completed queries table based on cluster_id, thus each run of the test must have it's own unique cluster_id or else records from previous tests get picked up and cause the assertions to fail. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@908 PS41, Line 908: assert impalad.service.get_metric_value("impala-server.comp > Move constant above, near class name. Maybe name them specific to test meth Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/util/workload_management.py File tests/util/workload_management.py: http://gerrit.cloudera.org:8080/#/c/20770/41/tests/util/workload_management.py@74 PS41, Line 74: EAK_MEM_MIN > Array data is returned to caller of assert_query() function. Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/util/workload_management.py@259 PS41, Line 259: end_time_obj = datetime.strptime(end_time.group(1)[:-3], "%Y-%m-%d %H:%M:%S.%f") > This doesn't work when I try to write tests using some existing tables, lik Interesting. I switched my tests to this same regex and will see what happens. I'm also looking at using the functional tables in these tests instead of setting up new tables. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 42 Gerrit-Owner: Jason Fehr <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Tue, 12 Mar 2024 19:32:00 +0000 Gerrit-HasComments: Yes
