gortiz commented on issue #15057: URL: https://github.com/apache/pinot/issues/15057#issuecomment-3150085110
Hi there! We have improved o11y in MSE over the last few months:

> Query failures [without metrics]

We have centralized where errors are reported in MSE. It is now done in https://github.com/apache/pinot/blob/a09d01faae93e5156884d9429decdd43aa8f5582/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/MultiStageBrokerRequestHandler.java#L252

> No logs or stats in response metadata that can be used to identify slow instances in any stage of executing a query. Or no way of correlating broker request IDs to the logs or stats.

There is some good news and some bad news here. Originally, MSE stats were only collected when a query finished successfully, but we can now collect them when a query fails as well. The bad news:

1. When a query fails, execution is aborted, so the returned stats only cover what had been executed up to that point.
2. The stats will not help find slow instances, because they are aggregated by stage, not by instance.

> Timeouts are difficult to diagnose without taking an approach such as increasing the timeout, rerunning, and then profiling the query but once again that does not enable retrospective debugging. All that is available retrospectively are a high volume of logs across many instances such as:

This is fixed. Now that the stats are returned along with the error, you should be able to see where the time was spent.

The logs are now enriched with the query's correlation id (called cid) and the stage. These properties are added to the slf4j context and are printed by the default log4j2.xml provided with Pinot. If you use a custom log4j2.xml, you will need to modify your pattern to include the cid and the stage. You can take inspiration from https://github.com/apache/pinot/blob/a09d01faae93e5156884d9429decdd43aa8f5582/pinot-tools/src/main/resources/log4j2.xml#L31
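
For a custom log4j2.xml, the pattern change could look roughly like the sketch below. It is not copied from Pinot's file: the context key names `cid` and `stage`, the appender name, and the rest of the layout are assumptions for illustration, so check the linked log4j2.xml for the keys Pinot actually sets. The only Log4j2 feature relied on is the `%X{key}` converter, which prints a value from the logging context.

```xml
<!-- Sketch only: "cid" and "stage" are ASSUMED context key names, not confirmed Pinot keys. -->
<Configuration>
  <Properties>
    <!-- %X{key} prints the value stored under "key" in the logging context (empty if unset). -->
    <Property name="LOG_PATTERN">%d{yyyy/MM/dd HH:mm:ss.SSS} %p [%c{1}] [%t] [cid=%X{cid} stage=%X{stage}] %m%n</Property>
  </Properties>
  <Appenders>
    <Console name="console" target="SYSTEM_OUT">
      <PatternLayout pattern="${LOG_PATTERN}"/>
    </Console>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="console"/>
    </Root>
  </Loggers>
</Configuration>
```

With a pattern like this, all log lines for one query can be pulled together by searching for its cid, and then split by stage.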
