Tanya-W opened a new pull request, #16569: URL: https://github.com/apache/doris/pull/16569
# Proposed changes

Issue Number: close #xxx

## Problem summary

At the storage layer, the raw data of an indexed column is still read after an index (bitmap_index or inverted_index) has been applied, even when that column is not among the result columns returned by the query. This adds unnecessary overhead for seeking and reading data.

In addition, a multi-table join query pushes many in or not_in runtime filter predicates down to the storage layer. According to our tests, applying those predicates through the inverted index degrades performance, because each in_predicate contains many conditions. Therefore, the inverted index is not applied to in or not_in predicates produced by runtime_filter.

Based on that, this PR does the following:

1. Reduce seek and read overhead for indexed columns that appear only in the WHERE clause. This optimizes queries such as:

```
sql 1: SELECT timestamp FROM tb WHERE log MATCH 'error';
sql 2: SELECT timestamp FROM tb WHERE log MATCH 'error' ORDER BY timestamp LIMIT 2;
sql 3: SELECT timestamp FROM tb WHERE log MATCH 'error' AND status = 404;
sql 4: SELECT timestamp FROM tb WHERE log MATCH 'error' AND status = 404 ORDER BY timestamp DESC LIMIT 10;
sql 5: SELECT count() FROM tb WHERE log MATCH 'error';
```

Columns `log` and `status` have an inverted index or a bitmap index, so the queries above only need to seek and read the data of column `timestamp`.

2. Do not apply the inverted index to in or not_in predicates produced by runtime_filter.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
4. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
5. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
6. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
7. 
Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
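
For illustration, the two changes in the problem summary can be modeled with a small Python sketch. This is not the code in this PR and does not mirror Doris internals. The names `Predicate`, `should_apply_inverted_index`, and `columns_to_read`, and the `resolved_by_index` flag, are hypothetical and exist only to show the intended decisions under simplified assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """Hypothetical model of a predicate pushed down to the storage layer."""
    column: str
    kind: str                         # e.g. "match", "eq", "in", "not_in"
    from_runtime_filter: bool = False
    resolved_by_index: bool = False   # assumed: an index fully evaluated it


def should_apply_inverted_index(pred):
    """Change 2: skip the inverted index for in/not_in runtime filter predicates."""
    if pred.from_runtime_filter and pred.kind in ("in", "not_in"):
        return False
    return True


def columns_to_read(output_columns, predicates):
    """Change 1: read raw data only for columns that are still needed.

    A column is needed if the query returns it, or if at least one
    predicate on it was not fully resolved by an index.
    """
    needed = list(output_columns)
    for pred in predicates:
        if not pred.resolved_by_index and pred.column not in needed:
            needed.append(pred.column)
    return needed


# Modeled on "sql 3": both predicates are assumed to be resolved by an index,
# so only the returned column has to be read.
sql3_predicates = [
    Predicate("log", "match", resolved_by_index=True),
    Predicate("status", "eq", resolved_by_index=True),
]
print(columns_to_read(["timestamp"], sql3_predicates))
```

In this model, a runtime filter predicate such as `Predicate("status", "in", from_runtime_filter=True)` is rejected by `should_apply_inverted_index`, so it stays unresolved by the index and its column is still read and filtered the ordinary way.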