Re: [PR] [fix](Nereids )fill up miss slot of order having project [doris]
jackwener merged PR #27480: URL: https://github.com/apache/doris/pull/27480 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
(doris) branch master updated: [fix](Nereids): fill up miss slot of order having project (#27480)
This is an automated email from the ASF dual-hosted git repository.

jakevin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new cbdb886b6e8 [fix](Nereids): fill up miss slot of order having project (#27480)
cbdb886b6e8 is described below

commit cbdb886b6e846e5f73ad9bd60bad65b928eaf767
Author: 谢健
AuthorDate: Mon Nov 27 16:00:29 2023 +0800

    [fix](Nereids): fill up miss slot of order having project (#27480)

    Fill up the missing slots of a sort over a having over a project, such as
    ```
    select a + 1 as c from t having c > 2 order by a
    ```
---
 .../org/apache/doris/nereids/rules/RuleType.java     |  1 +
 .../nereids/rules/analysis/FillUpMissingSlots.java   | 20 ++++++++++++++++++++
 .../rules/analysis/FillUpMissingSlotsTest.java       |  7 +++++++
 3 files changed, 28 insertions(+)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
index 7afc0123aae..657189f931f 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
@@ -65,6 +65,7 @@ public enum RuleType {
     FILL_UP_HAVING_AGGREGATE(RuleTypeClass.REWRITE),
     FILL_UP_HAVING_PROJECT(RuleTypeClass.REWRITE),
     FILL_UP_SORT_AGGREGATE(RuleTypeClass.REWRITE),
+    FILL_UP_SORT_HAVING_PROJECT(RuleTypeClass.REWRITE),
     FILL_UP_SORT_HAVING_AGGREGATE(RuleTypeClass.REWRITE),
     FILL_UP_SORT_PROJECT(RuleTypeClass.REWRITE),
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlots.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlots.java
index 84d0dbae40a..afee320f6f5 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlots.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlots.java
@@ -120,6 +120,26 @@ public class FillUpMissingSlots implements AnalysisRuleFactory {
                     });
                 })
             ),
+            RuleType.FILL_UP_SORT_HAVING_PROJECT.build(
+                logicalSort(logicalHaving(logicalProject())).then(sort -> {
+                    Set childOutput = sort.child().getOutputSet();
+                    Set notExistedInProject = sort.getOrderKeys().stream()
+                            .map(OrderKey::getExpr)
+                            .map(Expression::getInputSlots)
+                            .flatMap(Set::stream)
+                            .filter(s -> !childOutput.contains(s))
+                            .collect(Collectors.toSet());
+                    if (notExistedInProject.size() == 0) {
+                        return null;
+                    }
+                    LogicalProject project = sort.child().child();
+                    List projects = ImmutableList.builder()
+                            .addAll(project.getProjects())
+                            .addAll(notExistedInProject).build();
+                    Plan child = sort.withChildren(sort.child().withChildren(project.withProjects(projects)));
+                    return new LogicalProject<>(ImmutableList.copyOf(project.getOutput()), child);
+                })
+            ),
             RuleType.FILL_UP_HAVING_AGGREGATE.build(
                 logicalHaving(aggregate()).then(having -> {
                     Aggregate agg = having.child();
diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlotsTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlotsTest.java
index 2e479c05953..7b0c8095ba6 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlotsTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/FillUpMissingSlotsTest.java
@@ -579,4 +579,11 @@ public class FillUpMissingSlotsTest extends AnalyzeCheckTestBase implements Memo
         PlanChecker.from(connectContext).analyze(sql)
                 .matches(logicalFilter());
     }
+
+    @Test
+    void testSortHaving() {
+        String sql = "SELECT (pk + 1) as c FROM t1 HAVING c > 1 ORDER BY a1 + pk";
+        PlanChecker.from(connectContext).analyze(sql)
+                .applyBottomUp(new CheckAfterRewrite());
+    }
 }
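The new rule can be read as a small set computation: collect the slots that the ORDER BY keys reference, keep those the project does not already output, and append them to the project list. The following is only a minimal sketch of that idea, assuming slots can be modelled as plain strings; `FillUpSortHavingProjectSketch`, `missingSlots`, and `widenedProjects` are hypothetical names and not part of Doris.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the rule; slots are modelled as plain strings. */
final class FillUpSortHavingProjectSketch {
    /** Slots referenced by ORDER BY keys that the project below does not output. */
    static Set<String> missingSlots(List<Set<String>> orderKeyInputSlots, Set<String> projectOutput) {
        Set<String> missing = new LinkedHashSet<>();
        for (Set<String> slots : orderKeyInputSlots) {
            for (String slot : slots) {
                if (!projectOutput.contains(slot)) {
                    missing.add(slot);
                }
            }
        }
        return missing;
    }

    /** The widened project list: original projections followed by the missing slots. */
    static List<String> widenedProjects(List<String> projects, Set<String> missing) {
        List<String> widened = new ArrayList<>(projects);
        widened.addAll(missing);
        return widened;
    }

    public static void main(String[] args) {
        // Models: SELECT (pk + 1) AS c FROM t1 HAVING c > 1 ORDER BY a1 + pk
        Set<String> projectOutput = Set.of("c");
        Set<String> orderKey = new LinkedHashSet<>(List.of("a1", "pk"));
        List<Set<String>> orderKeys = List.of(orderKey);

        Set<String> missing = missingSlots(orderKeys, projectOutput);
        System.out.println(missing);
        System.out.println(widenedProjects(List.of("(pk + 1) AS c"), missing));
    }
}
```

In the actual diff the rule returns null when nothing is missing, and otherwise wraps the rewritten sort in a new project over the original output, so the extra slots stay internal to the plan.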
Re: [PR] Pick "[enhance](S3) Print the error detail for every s3 operation (#27572)" [doris]
doris-robot commented on PR #27615: URL: https://github.com/apache/doris/pull/27615#issuecomment-1827322968 TeamCity be ut coverage result: Function Coverage: 37.96% (7984/21032) Line Coverage: 29.67% (64770/218313) Region Coverage: 29.11% (33373/114634) Branch Coverage: 24.97% (17138/68642) Coverage Report: http://coverage.selectdb-in.cc/coverage/a8db1122b14a568df1118aa2b4b6ca74110c0748_a8db1122b14a568df1118aa2b4b6ca74110c0748/report/index.html
[PR] [feature](Nereids): Pushdown TopN-Distinct through Union [doris]
jackwener opened a new pull request, #27628: URL: https://github.com/apache/doris/pull/27628 ## Proposed changes Issue Number: close #xxx ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[PR] [fix](doc)add config for delete timeout job [doris]
DongLiang-0 opened a new pull request, #27629: URL: https://github.com/apache/doris/pull/27629 ## Proposed changes Issue Number: close #xxx
Re: [PR] [fix](doc)add config for delete timeout job [doris]
DongLiang-0 commented on PR #27629: URL: https://github.com/apache/doris/pull/27629#issuecomment-1827336570 run buildall
Re: [I] [Bug] FE frequently exhausts its 64 GB of memory and crashes; only Broker Load jobs run on a schedule in the cluster, and memory fills up after a while [doris]
liugddx commented on issue #27594: URL: https://github.com/apache/doris/issues/27594#issuecomment-1827341661 Using G1GC
Re: [PR] [fix](parquet)fix can not read parquet lz4 compress. [doris]
github-actions[bot] commented on PR #27383: URL: https://github.com/apache/doris/pull/27383#issuecomment-1827344751 PR approved by at least one committer and no changes requested.
(doris) branch master updated: [fix](ci) fix bug that "run build\n" not trigger pipeline (#27617)
This is an automated email from the ASF dual-hosted git repository.

zhangstar333 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 3d0dc94b180 [fix](ci) fix bug that "run build\n" not trigger pipeline (#27617)
3d0dc94b180 is described below

commit 3d0dc94b18081733dbee9398f6cd0caf481de1d2
Author: Dongyang Li
AuthorDate: Mon Nov 27 16:23:42 2023 +0800

    [fix](ci) fix bug that "run build\n" not trigger pipeline (#27617)

    Co-authored-by: stephen
---
 .github/workflows/comment-to-trigger-teamcity.yml | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/comment-to-trigger-teamcity.yml b/.github/workflows/comment-to-trigger-teamcity.yml
index 3f236eb1f5a..92f0d6b962d 100644
--- a/.github/workflows/comment-to-trigger-teamcity.yml
+++ b/.github/workflows/comment-to-trigger-teamcity.yml
@@ -50,6 +50,7 @@ jobs:
             "${COMMENT_BODY}" == *'run tpch'* ]]; then
             echo "comment_trigger=true" | tee -a "$GITHUB_OUTPUT"
           else
+            echo "comment_trigger=false" | tee -a "$GITHUB_OUTPUT"
             echo "find no keyword in comment body, skip this action."
             exit
           fi
@@ -63,17 +64,17 @@ jobs:
           echo "COMMENT_BODY='${COMMENT_BODY}'" | tee -a "$GITHUB_OUTPUT"
           reg="run (buildall|compile|p0|p1|feut|beut|external|clickbench|pipelinex_p0|arm|tpch)( [1-9]*[0-9]+)*"
-          COMMENT_TRIGGER_TYPE="$(echo "${COMMENT_BODY}" | xargs | grep -E "${reg}" | awk -F' ' '{print $2}' | sed -n 1p)"
-          COMMENT_REPEAT_TIMES="$(echo "${COMMENT_BODY}" | xargs | grep -E "${reg}" | awk -F' ' '{print $3}' | sed -n 1p)"
+          COMMENT_TRIGGER_TYPE="$(echo -e "${COMMENT_BODY}" | xargs | grep -E "${reg}" | awk -F' ' '{print $2}' | sed -n 1p)"
+          COMMENT_REPEAT_TIMES="$(echo -e "${COMMENT_BODY}" | xargs | grep -E "${reg}" | awk -F' ' '{print $3}' | sed -n 1p)"
           echo "COMMENT_TRIGGER_TYPE=${COMMENT_TRIGGER_TYPE}" | tee -a "$GITHUB_OUTPUT"
           echo "COMMENT_REPEAT_TIMES=${COMMENT_REPEAT_TIMES}" | tee -a "$GITHUB_OUTPUT"

       - name: "Checkout master"
-        if: ${{ steps.parse.outputs.comment_trigger }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) }}
         uses: actions/checkout@v4

       - name: "Check if pr need run build"
-        if: ${{ steps.parse.outputs.comment_trigger }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) }}
         id: changes
         run: |
           source regression-test/pipeline/common/github-utils.sh
@@ -130,7 +131,7 @@ jobs:
       #   uses: mxschmitt/action-tmate@v3

       - name: "Trigger or Skip feut"
-        if: ${{ steps.parse.outputs.comment_trigger && contains(fromJSON('["feut", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) && contains(fromJSON('["feut", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
         run: |
           source ./regression-test/pipeline/common/teamcity-utils.sh
           set -x
@@ -143,7 +144,7 @@ jobs:

       - name: "Trigger or Skip beut"
-        if: ${{ steps.parse.outputs.comment_trigger && contains(fromJSON('["beut", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) && contains(fromJSON('["beut", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
         run: |
           source ./regression-test/pipeline/common/teamcity-utils.sh
           set -x
@@ -155,7 +156,7 @@ jobs:
             "${{ steps.parse.outputs.COMMENT_REPEAT_TIMES }}"

       - name: "Trigger or Skip compile"
-        if: ${{ steps.parse.outputs.comment_trigger && contains(fromJSON('["compile", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) && contains(fromJSON('["compile", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
         run: |
           source ./regression-test/pipeline/common/teamcity-utils.sh
           set -x
@@ -167,7 +168,7 @@ jobs:
             "${{ steps.parse.outputs.COMMENT_REPEAT_TIMES }}"

       - name: "Trigger or Skip p0"
-        if: ${{ steps.parse.outputs.comment_trigger && contains(fromJSON('["p0", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
+        if: ${{ fromJSON(steps.parse.outputs.comment_trigger) && contains(fromJSON('["p0", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
         run: |
           source ./regression-test/pipeline/common/teamcity-utils.sh
           if [[ ${{ steps.parse.outputs.COMMENT_TRIGGER_TYPE }} == "buildall" ]]; then
@@ -182,7 +183,7 @@ jobs:
             "${{ steps.parse.outputs.COMMENT_REPEAT_TIMES }}"

       - name: "Trigger or Skip p1"
-        if: ${{ steps.parse.outputs.comment_trigger && contains(fromJSON('["p1", "buildall"]'), steps.parse.outputs.COMMENT_TRIGGER_TYPE) }}
+
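The parsing step in the workflow above is a shell pipeline (`echo -e | xargs | grep -E | awk`). As a rough illustration of why `echo -e` matters, here is a sketch in Java rather than bash, assuming the comment body can arrive with a literal backslash-n at the end. `CommentTriggerSketch`, `parse`, and the `interpretEscapes` flag are hypothetical names; only the regular expression is taken from the diff.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical Java mirror of the workflow's comment parsing; not the workflow itself. */
final class CommentTriggerSketch {
    // Same pattern as the workflow's `reg` variable.
    private static final Pattern TRIGGER = Pattern.compile(
            "run (buildall|compile|p0|p1|feut|beut|external|clickbench|pipelinex_p0|arm|tpch)( [1-9]*[0-9]+)*");

    /** Returns {triggerType, repeatTimes}; repeatTimes is "" when absent. */
    static Optional<String[]> parse(String commentBody, boolean interpretEscapes) {
        // Like `echo -e`: turn a literal backslash-n into a real newline.
        String text = interpretEscapes ? commentBody.replace("\\n", "\n") : commentBody;
        // Like `xargs`: collapse whitespace runs (including newlines) into single spaces.
        String line = text.trim().replaceAll("\\s+", " ");
        // Like `grep -E`: keep the line only if the pattern occurs in it.
        if (!TRIGGER.matcher(line).find()) {
            return Optional.empty();
        }
        // Like `awk '{print $2}'` and `'{print $3}'`: second and third fields of the line.
        String[] fields = line.split(" ");
        String type = fields.length > 1 ? fields[1] : "";
        String times = fields.length > 2 ? fields[2] : "";
        return Optional.of(new String[] {type, times});
    }

    public static void main(String[] args) {
        String body = "run buildall\\n"; // ends with a literal backslash followed by 'n'
        System.out.println(parse(body, false).get()[0]); // escape left in the field
        System.out.println(parse(body, true).get()[0]);  // escape interpreted and trimmed away
    }
}
```

Without escape handling the second field keeps the trailing backslash-n, so a later `contains(fromJSON('["feut", "buildall"]'), ...)` comparison cannot match. The other half of the fix relies on step outputs being strings in GitHub Actions: a non-empty string such as `false` is truthy in an `if:` expression, which is why the diff wraps the output in `fromJSON(...)`.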
Re: [PR] [fix](ci) fix bug that "run build\n" not trigger pipeline [doris]
zhangstar333 merged PR #27617: URL: https://github.com/apache/doris/pull/27617
Re: [PR] [Feat](Nereids) support view as a independent unit of leading [doris]
LiBinfeng-01 commented on PR #27378: URL: https://github.com/apache/doris/pull/27378#issuecomment-1827351414 run buildall
[PR] [Improvement](index compaction) improve index compaction perf by prio… [doris-thirdparty]
airborne12 opened a new pull request, #139: URL: https://github.com/apache/doris-thirdparty/pull/139 …rity queue
Re: [PR] Pick "[enhance](S3) Print the error detail for every s3 operation (#27572)" [doris]
doris-robot commented on PR #27615: URL: https://github.com/apache/doris/pull/27615#issuecomment-1827354374 (From new machine)TeamCity pipeline, clickbench performance test result: the sum of best hot time: 45.96 seconds stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s stream load json: 24 seconds loaded 2358488459 Bytes, about 93 MB/s stream load orc: 69 seconds loaded 1101869774 Bytes, about 15 MB/s stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s insert into select: 29.7 seconds inserted 1000 Rows, about 336K ops/s storage size: 17167736968 Bytes
Re: [PR] [Improvement](index compaction) improve index compaction perf by prio… [doris-thirdparty]
airborne12 closed pull request #139: [Improvement](index compaction) improve index compaction perf by prio… URL: https://github.com/apache/doris-thirdparty/pull/139
[PR] [improve](stream_load) add prompt when all fields is null [doris]
HHoflittlefish777 opened a new pull request, #27630: URL: https://github.com/apache/doris/pull/27630

## Proposed changes

before:
```
Reason: All fields is null, this is a invalid row.. src line [{"name":"Name 1","age":21,"agent_id":"5fbfefd2-ea1c-44fd-bc54-6eb2582e1525"} ];
```
after:
```
Reason: All fields is null, this is a invalid row. Table column names:[name, age, agent_id, ], please check columns mapping or data quality. src line [{"name":"Name 1","age":21,"agent_id":"5fbfefd2-ea1c-44fd-bc54-6eb2582e1525"} ];
```
Re: [PR] [regression case](broker laod) add case for without seq [doris]
github-actions[bot] commented on PR #27586: URL: https://github.com/apache/doris/pull/27586#issuecomment-182737 PR approved by anyone and no changes requested.
Re: [PR] [fix](Nereids) non-deterministic expression should not be constant [doris]
morrySnow merged PR #27606: URL: https://github.com/apache/doris/pull/27606
(doris) branch master updated: [fix](Nereids) non-deterministic expression should not be constant (#27606)
This is an automated email from the ASF dual-hosted git repository.

morrysnow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new fde4bab048d [fix](Nereids) non-deterministic expression should not be constant (#27606)
fde4bab048d is described below

commit fde4bab048d2cc8cadcd943d4ebd6c0998b7fd3d
Author: morrySnow <101034200+morrys...@users.noreply.github.com>
AuthorDate: Mon Nov 27 16:40:30 2023 +0800

    [fix](Nereids) non-deterministic expression should not be constant (#27606)
---
 .../java/org/apache/doris/nereids/trees/expressions/Expression.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Expression.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Expression.java
index a10de9282b2..6922f81ca62 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Expression.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Expression.java
@@ -22,6 +22,7 @@ import org.apache.doris.nereids.analyzer.Unbound;
 import org.apache.doris.nereids.exceptions.AnalysisException;
 import org.apache.doris.nereids.trees.AbstractTreeNode;
 import org.apache.doris.nereids.trees.expressions.functions.ExpressionTrait;
+import org.apache.doris.nereids.trees.expressions.functions.Nondeterministic;
 import org.apache.doris.nereids.trees.expressions.literal.Literal;
 import org.apache.doris.nereids.trees.expressions.literal.NullLiteral;
 import org.apache.doris.nereids.trees.expressions.shape.LeafExpression;
@@ -226,7 +227,7 @@ public abstract class Expression extends AbstractTreeNode implements
         if (this instanceof LeafExpression) {
             return this instanceof Literal;
         } else {
-            return children().stream().allMatch(Expression::isConstant);
+            return !(this instanceof Nondeterministic) && children().stream().allMatch(Expression::isConstant);
         }
     }
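The effect of the one-line change is easiest to see on a tiny expression tree: a function marked non-deterministic is no longer reported as constant just because all of its arguments are literals. The sketch below is a simplified model under stated assumptions, not Doris code: `Expr`, `Add`, `Slot`, and `Random` are hypothetical classes, and a leaf is modelled as "has no children" instead of the `LeafExpression` check used in the diff.

```java
import java.util.List;

/** Simplified, hypothetical model of the isConstant() check. */
final class IsConstantSketch {
    /** Marker for functions whose result can change between evaluations. */
    interface Nondeterministic {}

    abstract static class Expr {
        final List<Expr> children;

        Expr(List<Expr> children) {
            this.children = children;
        }

        boolean isConstant() {
            if (children.isEmpty()) {
                // Leaf: only literals are constant.
                return this instanceof Literal;
            }
            // Non-leaf: constant only if deterministic and all children are constant.
            return !(this instanceof Nondeterministic)
                    && children.stream().allMatch(Expr::isConstant);
        }
    }

    static final class Literal extends Expr {
        Literal() { super(List.of()); }
    }

    static final class Slot extends Expr {
        Slot() { super(List.of()); }
    }

    static final class Add extends Expr {
        Add(Expr left, Expr right) { super(List.of(left, right)); }
    }

    /** Hypothetical seeded random function: literal argument, non-deterministic result. */
    static final class Random extends Expr implements Nondeterministic {
        Random(Expr seed) { super(List.of(seed)); }
    }

    public static void main(String[] args) {
        System.out.println(new Add(new Literal(), new Literal()).isConstant());
        System.out.println(new Random(new Literal()).isConstant());
        System.out.println(new Add(new Literal(), new Random(new Literal())).isConstant());
    }
}
```

Because the check is recursive, any expression that contains a non-deterministic call somewhere below it is also treated as non-constant.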
Re: [PR] [improve](stream_load) add prompt when all fields is null [doris]
HHoflittlefish777 closed pull request #27630: [improve](stream_load) add prompt when all fields is null URL: https://github.com/apache/doris/pull/27630
Re: [PR] [improve](stream_load) add prompt when all fields is null [doris]
github-actions[bot] commented on PR #27630: URL: https://github.com/apache/doris/pull/27630#issuecomment-1827373995 clang-tidy review says "All clean, LGTM! :+1:"
Re: [PR] [improve](stream_load) add prompt when all fields is null [doris]
HHoflittlefish777 commented on PR #27630: URL: https://github.com/apache/doris/pull/27630#issuecomment-1827375451 run buildall
[PR] [fix](Nereids) non-deterministic expression should not be constant (#27606) [doris]
morrySnow opened a new pull request, #27631: URL: https://github.com/apache/doris/pull/27631 pick from master #27606 ## Proposed changes Issue Number: close #xxx
Re: [PR] [improve](stream_load) add prompt when all fields is null [doris]
github-actions[bot] commented on PR #27630: URL: https://github.com/apache/doris/pull/27630#issuecomment-1827377296 clang-tidy review says "All clean, LGTM! :+1:"
Re: [PR] [improve](stream_load) add prompt when all fields is null [doris]
github-actions[bot] commented on PR #27630: URL: https://github.com/apache/doris/pull/27630#issuecomment-1827385535 clang-tidy review says "All clean, LGTM! :+1:"
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
jackwener commented on code in PR #27618:
URL: https://github.com/apache/doris/pull/27618#discussion_r1405830706

##
fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java:
##
@@ -485,10 +492,30 @@ private void init(String s, Type type) throws AnalysisException {
             minute = getOrDefault(dateTime, ChronoField.MINUTE_OF_HOUR, 0);
             second = getOrDefault(dateTime, ChronoField.SECOND_OF_MINUTE, 0);
             microsecond = getOrDefault(dateTime, ChronoField.MICRO_OF_SECOND, 0);
-            if (microsecond != 0 && type.isDatetime()) {
-                int dotIndex = s.lastIndexOf(".");
-                int scale = s.length() - dotIndex - 1;
-                type = ScalarType.createDatetimeV2Type(scale);
+
+            if (type != null) {
+                if (microsecond != 0 && type.isDatetime()) {
+                    int dotIndex = s.lastIndexOf(".");
+                    int scale = s.length() - dotIndex - 1;
+                    type = ScalarType.createDatetimeV2Type(scale);
+                }
+            } else {
+                if (hour == 0 && minute == 0 && second == 0 && microsecond == 0) {
+                    type = ScalarType.getDefaultDateType(Type.DATE);
+                } else {
+                    type = ScalarType.getDefaultDateType(Type.DATETIME);
+                    if (type.isDatetimeV2() && microsecond != 0) {
+                        int scale = 6;
+                        for (int i = 0; i < 6; i++) {
+                            if (microsecond % Math.pow(10.0, i + 1) > 0) {
+                                break;
+                            } else {
+                                scale -= 1;
+                            }
+                        }
+                        type = ScalarType.createDatetimeV2Type(scale);
+                    }
+                }

Review Comment:
move it from `fromDateStr(String dateStr)`
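The two ways the hunk above derives a DATETIMEV2 scale can be tried in isolation. This is a minimal sketch, assuming a microsecond value in the range 0 to 999999; `DatetimeScaleSketch`, `scaleFromMicrosecond`, and `scaleFromText` are hypothetical names, and the loop uses an integer divisor where the diff uses `Math.pow(10.0, i + 1)`.

```java
/** Hypothetical helper isolating the scale computations from the review hunk. */
final class DatetimeScaleSketch {
    /**
     * Number of fractional digits needed to represent the microsecond value exactly:
     * start from 6 and drop one digit for every trailing zero.
     */
    static int scaleFromMicrosecond(int microsecond) {
        int scale = 6;
        int divisor = 10;
        for (int i = 0; i < 6; i++) {
            if (microsecond % divisor > 0) {
                break; // a non-zero digit was reached, stop trimming
            }
            scale -= 1;
            divisor *= 10;
        }
        return scale;
    }

    /** Scale read off the literal text: the count of characters after the last '.'. */
    static int scaleFromText(String s) {
        int dotIndex = s.lastIndexOf(".");
        return s.length() - dotIndex - 1;
    }

    public static void main(String[] args) {
        System.out.println(scaleFromMicrosecond(123000));                 // fraction ".123"
        System.out.println(scaleFromMicrosecond(1));                      // fraction ".000001"
        System.out.println(scaleFromText("2023-11-27 16:00:29.123"));
    }
}
```

The text-based variant preserves trailing zeros the user typed (".1230" gives 4), while the microsecond-based variant always yields the shortest exact scale; that difference is presumably why the hunk keeps both branches.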
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
github-actions[bot] commented on PR #27618: URL: https://github.com/apache/doris/pull/27618#issuecomment-1827397094 PR approved by at least one committer and no changes requested.
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
github-actions[bot] commented on PR #27618: URL: https://github.com/apache/doris/pull/27618#issuecomment-1827397167 PR approved by anyone and no changes requested.
Re: [PR] [fix](planner)sort node should materialized required slots for itself [doris]
doris-robot commented on PR #27620: URL: https://github.com/apache/doris/pull/27620#issuecomment-1827397900 (From new machine)TeamCity pipeline, clickbench performance test result: the sum of best hot time: 45.51 seconds stream load tsv: 574 seconds loaded 74807831229 Bytes, about 124 MB/s stream load json: 23 seconds loaded 2358488459 Bytes, about 97 MB/s stream load orc: 67 seconds loaded 1101869774 Bytes, about 15 MB/s stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s insert into select: 29.7 seconds inserted 1000 Rows, about 336K ops/s storage size: 17168073266 Bytes
Re: [PR] [opt](nereids) add distribute specification to plan shape check and add distribute hint tests [doris]
LiBinfeng-01 commented on PR #27537: URL: https://github.com/apache/doris/pull/27537#issuecomment-1827402595 run buildall
Re: [PR] [performance](Nereids): avoid use `getStringValue()` in getTimeFormatter() [doris]
jackwener commented on PR #27625: URL: https://github.com/apache/doris/pull/27625#issuecomment-1827404838 run buildall
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
doris-robot commented on PR #27618: URL: https://github.com/apache/doris/pull/27618#issuecomment-1827414748
(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.51 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 28 seconds loaded 2358488459 Bytes, about 80 MB/s
stream load orc: 73 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.6 seconds inserted 1000 Rows, about 349K ops/s
storage size: 17097758483 Bytes
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
jackwener merged PR #27618: URL: https://github.com/apache/doris/pull/27618
(doris) branch master updated: [refactor](Nereids): unify one DateLiteral init() (#27618)
This is an automated email from the ASF dual-hosted git repository.

jakevin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 66eeafcd484 [refactor](Nereids): unify one DateLiteral init() (#27618)
66eeafcd484 is described below

commit 66eeafcd484570c1cc80e64d45a435214969dca7
Author: jakevin
AuthorDate: Mon Nov 27 17:09:45 2023 +0800

    [refactor](Nereids): unify one DateLiteral init() (#27618)

    `fromDateStr` will parse `date string` into `dateLiteral`, but `init()` already handle it, so we can use `init()` replace it.
---
 .../org/apache/doris/analysis/DateLiteral.java | 179 +
 .../doris/rewrite/RewriteDateLiteralRule.java | 3 +-
 2 files changed, 39 insertions(+), 143 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
index 6253f91fb3c..879795f8737 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
@@ -64,6 +64,7 @@ import java.util.Set;
 import java.util.TimeZone;
 import java.util.regex.Pattern;
 import java.util.stream.Collectors;
+import javax.annotation.Nullable;

 public class DateLiteral extends LiteralExpr {
     private static final Logger LOG = LogManager.getLogger(DateLiteral.class);
@@ -107,9 +108,6 @@ public class DateLiteral extends LiteralExpr {
     private static Map WEEK_DAY_NAME_DICT = Maps.newHashMap();
     private static Set TIME_PART_SET = Sets.newHashSet();
     private static final int[] DAYS_IN_MONTH = new int[] {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
-    private static final int ALLOW_SPACE_MASK = 4 | 64;
-    private static final int MAX_DATE_PARTS = 8;
-    private static final int YY_PART_YEAR = 70;

     static {
         try {
@@ -245,6 +243,12 @@ public class DateLiteral extends LiteralExpr {
         analysisDone();
     }

+    public DateLiteral(String s) throws AnalysisException {
+        super();
+        init(s, null);
+        analysisDone();
+    }
+
     public DateLiteral(long unixTimestamp, TimeZone timeZone, Type type) throws AnalysisException {
         Timestamp timestamp = new Timestamp(unixTimestamp);
@@ -368,9 +372,11 @@ public class DateLiteral extends LiteralExpr {
         return new DateLiteral(type, false);
     }

-    private void init(String s, Type type) throws AnalysisException {
+    private void init(String s, @Nullable Type type) throws AnalysisException {
         try {
-            Preconditions.checkArgument(type.isDateType());
+            if (type != null) {
+                Preconditions.checkArgument(type.isDateType());
+            }
             TemporalAccessor dateTime = null;
             boolean parsed = false;
             int offset = 0;
@@ -442,10 +448,11 @@ public class DateLiteral extends LiteralExpr {
                     builder.appendLiteral(" ");
                 }
                 String[] timePart = s.contains(" ") ? s.split(" ")[1].split(":") : new String[] {};
-                if (timePart.length > 0 && (type.equals(Type.DATE) || type.equals(Type.DATEV2))) {
+                if (timePart.length > 0 && type != null && (type.equals(Type.DATE) || type.equals(Type.DATEV2))) {
                     throw new AnalysisException("Invalid date value: " + s);
                 }
-                if (timePart.length == 0 && (type.equals(Type.DATETIME) || type.equals(Type.DATETIMEV2))) {
+                if (timePart.length == 0 && type != null && (type.equals(Type.DATETIME) || type.equals(
+                        Type.DATETIMEV2))) {
                     throw new AnalysisException("Invalid datetime value: " + s);
                 }
                 for (int i = 0; i < timePart.length; i++) {
@@ -485,10 +492,30 @@ public class DateLiteral extends LiteralExpr {
             minute = getOrDefault(dateTime, ChronoField.MINUTE_OF_HOUR, 0);
             second = getOrDefault(dateTime, ChronoField.SECOND_OF_MINUTE, 0);
             microsecond = getOrDefault(dateTime, ChronoField.MICRO_OF_SECOND, 0);
-            if (microsecond != 0 && type.isDatetime()) {
-                int dotIndex = s.lastIndexOf(".");
-                int scale = s.length() - dotIndex - 1;
-                type = ScalarType.createDatetimeV2Type(scale);
+
+            if (type != null) {
+                if (microsecond != 0 && type.isDatetime()) {
+                    int dotIndex = s.lastIndexOf(".");
+                    int scale = s.length() - dotIndex - 1;
+                    type = ScalarType.createDatetimeV2Type(scale);
+                }
+            } else {
+                if (hour == 0 && minute == 0 && second == 0 && microsecond == 0) {
+                    type = ScalarType.getDefaultDateType(Type.DATE);
+                } else {
+
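The diff above derives the datetime scale from the position of the last `.` in the input string (`s.length() - dotIndex - 1`). A minimal standalone sketch of that arithmetic follows; the class and method names are hypothetical and not part of Doris, and the "no dot means scale 0" guard is an assumption added here (in the diff the computation only runs when `microsecond != 0`).

```java
// Hypothetical helper, not Doris code: isolates the scale arithmetic shown in the diff.
final class DatetimeScaleSketch {
    /**
     * Returns the number of characters after the last '.' in s.
     * Returning 0 when there is no '.' is an assumption of this sketch.
     */
    static int fractionalScale(String s) {
        int dotIndex = s.lastIndexOf('.');
        if (dotIndex < 0) {
            return 0;
        }
        return s.length() - dotIndex - 1;
    }

    public static void main(String[] args) {
        // Three fractional digits follow the dot in the first input.
        System.out.println(fractionalScale("2023-11-27 17:09:45.123"));
        System.out.println(fractionalScale("2023-11-27 17:09:45"));
    }
}
```

Note that the expression counts characters, not validated digits, so it relies on the string having already been parsed successfully.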
Re: [PR] Dnm [doris]
jackwener closed pull request #27373: Dnm URL: https://github.com/apache/doris/pull/27373
Re: [PR] [Fix](statistics)Fix bug and improve auto analyze. [doris]
Jibing-Li commented on PR #27626: URL: https://github.com/apache/doris/pull/27626#issuecomment-1827430903 run buildall
[PR] [regression](partial update) Fix unstable p0 case `test_primary_key_partial_update_parallel` due to conflicting table name [doris]
bobhan1 opened a new pull request, #27633: URL: https://github.com/apache/doris/pull/27633 ## Proposed changes ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
Re: [PR] [fix](parquet)fix can not read parquet lz4 compress. [doris]
hubgeter commented on PR #27383: URL: https://github.com/apache/doris/pull/27383#issuecomment-1827433725 run buildall
Re: [PR] [regression](partial update) Fix unstable p0 case `test_primary_key_partial_update_parallel` due to conflicting table name [doris]
bobhan1 commented on PR #27633: URL: https://github.com/apache/doris/pull/27633#issuecomment-1827433773 run buildall
[PR] [enhancement](stats) Stats auto collector execute in serial [doris]
Kikyou1997 opened a new pull request, #27634: URL: https://github.com/apache/doris/pull/27634 ## Proposed changes Issue Number: close #xxx ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
(doris) branch new_join2 updated (57160b4346f -> d40a449c943)
This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a change to branch new_join2 in repository https://gitbox.apache.org/repos/asf/doris.git

    from 57160b4346f fix
     add d40a449c943 fix

No new revisions were added by this update.

Summary of changes:
 be/src/pipeline/exec/hashjoin_build_sink.cpp | 2 +-
 be/src/vec/common/hash_table/hash_map.h | 12 +---
 be/src/vec/exec/join/process_hash_table_probe_impl.h | 1 -
 3 files changed, 6 insertions(+), 9 deletions(-)
[PR] [Feature](CDC)Support database sync use single sink [doris-flink-connector]
JNSimba opened a new pull request, #245: URL: https://github.com/apache/doris-flink-connector/pull/245

# Proposed changes
Issue Number: close #xxx

## Problem Summary:
Currently, when synchronizing an entire database, one sink is created for each table. When there are too many tables, Flink ends up with too many operators, which puts pressure on building the topology, so a single sink is used for synchronization instead.

Other changes:
1. Upgrade cdc version to 2.4.2
2. Optimize some code

## Checklist(Required)
1. Does it affect the original behavior: (Yes/No/I Don't know)
4. Has unit tests been added: (Yes/No/No Need)
5. Has document been added or modified: (Yes/No/No Need)
6. Does it need to update dependencies: (Yes/No)
7. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
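The single-sink idea in the problem summary above can be pictured as one writer that tags each record with its destination table and keeps a per-table buffer, instead of one operator per table. The sketch below is only an illustration of that routing pattern in plain Java: `SingleSinkSketch`, `TableRecord`, `write`, and `flush` are hypothetical names, it uses no Flink or connector classes, and it is not how the PR itself is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the connector's API and not Flink code.
final class SingleSinkSketch {
    /** A change record tagged with the table it belongs to. */
    static final class TableRecord {
        final String tableId;
        final String payload;

        TableRecord(String tableId, String payload) {
            this.tableId = tableId;
            this.payload = payload;
        }
    }

    // One buffer per destination table, all owned by the same sink instance.
    private final Map<String, List<String>> buffers = new LinkedHashMap<>();

    /** Routes a record to the buffer of its table, creating the buffer on first use. */
    void write(TableRecord record) {
        buffers.computeIfAbsent(record.tableId, k -> new ArrayList<>()).add(record.payload);
    }

    /** Hands back the buffered batches grouped by table and resets the sink's buffers. */
    Map<String, List<String>> flush() {
        Map<String, List<String>> batches = new LinkedHashMap<>(buffers);
        buffers.clear();
        return batches;
    }

    public static void main(String[] args) {
        SingleSinkSketch sink = new SingleSinkSketch();
        sink.write(new TableRecord("db.orders", "row-1"));
        sink.write(new TableRecord("db.users", "row-2"));
        sink.write(new TableRecord("db.orders", "row-3"));
        // Each table ends up with its own batch even though only one sink exists.
        System.out.println(sink.flush());
    }
}
```

The point of the design is that the number of operators no longer grows with the number of tables; only the number of in-memory buffers does.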
Re: [PR] New join2 [doris]
BiteThet commented on PR #27557: URL: https://github.com/apache/doris/pull/27557#issuecomment-1827437789 run buildall
Re: [PR] [enhancement](stats) Stats auto collector execute in serial [doris]
Kikyou1997 commented on PR #27634: URL: https://github.com/apache/doris/pull/27634#issuecomment-1827437873 run buildall
Re: [PR] [fix](parquet)fix can not read parquet lz4 compress. [doris]
github-actions[bot] commented on code in PR #27383: URL: https://github.com/apache/doris/pull/27383#discussion_r1405861182

## be/src/util/block_compression.cpp: ##
@@ -183,6 +185,31 @@ class Lz4BlockCompression : public BlockCompressionCodec {
     static const int32_t ACCELARATION = 1;
 };

+class HadoopLz4BlockCompression : public Lz4BlockCompression {
+public:
+    static HadoopLz4BlockCompression* instance() {
+        static HadoopLz4BlockCompression s_instance;
+        return &s_instance;
+    }
+    Status decompress(const Slice& input, Slice* output) override {
+        RETURN_IF_ERROR(Decompressor::create_decompressor(CompressType::LZ4BLOCK, &_decompressor));
+        size_t input_bytes_read = 0;
+        size_t decompressed_len = 0;
+        size_t more_input_bytes = 0;
+        size_t more_output_bytes = 0;
+        bool stream_end = false;
+        auto st = _decompressor->decompress((uint8_t*)input.data, input.size, &input_bytes_read,
+                                            (uint8_t*)output->data, output->size, &decompressed_len,
+                                            &stream_end, &more_input_bytes, &more_output_bytes);
+        //try decompress use hadoopLz4 ,if failed fall back lz4.
+        return (st != Status::OK() || stream_end != true)

Review Comment: warning: redundant boolean literal supplied to boolean operator [readability-simplify-boolean-expr]
```suggestion
        return (st != Status::OK() || !stream_end)
```
Re: [PR] [fix](multi-catalog)add properties converter fe ut [doris]
wsjz commented on PR #27254: URL: https://github.com/apache/doris/pull/27254#issuecomment-1827444549 run buildall
Re: [PR] [Feat](Nereids) support view as a independent unit of leading [doris]
doris-robot commented on PR #27378: URL: https://github.com/apache/doris/pull/27378#issuecomment-1827450055
(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.56 seconds
stream load tsv: 568 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 29 seconds loaded 2358488459 Bytes, about 77 MB/s
stream load orc: 69 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 1000 Rows, about 346K ops/s
storage size: 17100025052 Bytes
Re: [PR] New join2 [doris]
github-actions[bot] commented on code in PR #27557: URL: https://github.com/apache/doris/pull/27557#discussion_r1405874257

## be/src/vec/exec/join/process_hash_table_probe_impl.h: ##
@@ -412,17 +203,18 @@ Status ProcessHashTableProbe::do_process(HashTableType& hash
     output_block->swap(mutable_block.to_block());

     if constexpr (with_other_conjuncts) {
-        return do_other_join_conjuncts(output_block, is_mark_join, multi_matched_output_row_count,
-                                       is_the_last_sub_block);
+        return do_other_join_conjuncts(output_block, is_mark_join,
+                                       hash_table_ctx.hash_table->get_visited(),
+                                       hash_table_ctx.hash_table->has_null_key());
     }

     return Status::OK();
 }

 template
 Status ProcessHashTableProbe::do_other_join_conjuncts(

Review Comment: warning: function 'do_other_join_conjuncts' has cognitive complexity of 83 (threshold 50) [readability-function-cognitive-complexity]
```cpp
Status ProcessHashTableProbe::do_other_join_conjuncts(
^
```
Additional context

**be/src/vec/exec/join/process_hash_table_probe_impl.h:219:** +1, including nesting penalty of 0, nesting level increased to 1
```cpp
if (!row_count) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:228:** +1, including nesting penalty of 0, nesting level increased to 1
```cpp
RETURN_IF_ERROR(VExprContext::execute_conjuncts(_parent->_other_join_conjuncts, nullptr,
^
```
**be/src/common/status.h:523:** expanded from macro 'RETURN_IF_ERROR'
```cpp
do {\
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:228:** +2, including nesting penalty of 1, nesting level increased to 2
```cpp
RETURN_IF_ERROR(VExprContext::execute_conjuncts(_parent->_other_join_conjuncts, nullptr,
^
```
**be/src/common/status.h:525:** expanded from macro 'RETURN_IF_ERROR'
```cpp
if (UNLIKELY(!_status_.ok())) { \
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:243:** +1, including nesting penalty of 0, nesting level increased to 1
```cpp
if constexpr (JoinOpType == TJoinOp::LEFT_OUTER_JOIN ||
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:250:** +2, including nesting penalty of 1, nesting level increased to 2
```cpp
for (int i = 0; i < row_count; ++i) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:254:** +3, including nesting penalty of 2, nesting level increased to 3
```cpp
if (!join_hit) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:256:** +1, nesting level increased to 3
```cpp
} else {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:259:** +3, including nesting penalty of 2, nesting level increased to 3
```cpp
if (filter_map[i]) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:264:** +2, including nesting penalty of 1, nesting level increased to 2
```cpp
for (size_t i = 0; i < row_count; ++i) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:265:** +3, including nesting penalty of 2, nesting level increased to 3
```cpp
if (filter_map[i]) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:268:** +4, including nesting penalty of 3, nesting level increased to 4
```cpp
if constexpr (JoinOpType == TJoinOp::FULL_OUTER_JOIN) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:274:** +1, nesting level increased to 1
```cpp
} else if constexpr (JoinOpType == TJoinOp::LEFT_ANTI_JOIN ||
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:280:** +2, including nesting penalty of 1, nesting level increased to 2
```cpp
if (is_mark_join) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:285:** +3, including nesting penalty of 2, nesting level increased to 3
```cpp
for (size_t i = 0; i < row_count; ++i) {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:287:** +4, including nesting penalty of 3, nesting level increased to 4
```cpp
if (has_null_in_build_side &&
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:290:** +1, nesting level increased to 4
```cpp
} else {
^
```
**be/src/vec/exec/join/process_hash_table_probe_impl.h:294:** +1, nesting level increased to 2
```cpp
} else {
^
```
Re: [PR] [refactor](Nereids): unify one DateLiteral init() [doris]
doris-robot commented on PR #27618: URL: https://github.com/apache/doris/pull/27618#issuecomment-1827456257

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Tpch sf100 test result on commit feab572a632112a74da6f6aa2ef20bb7d4d64f20, data reload: true

run tpch-sf100 query with default conf and session variables
q1	4881	4613	4616	4613
q2	411	126	126	126
q3	2069	1939	1899	1899
q4	1415	1266	1225	1225
q5	4045	3953	3959	3953
q6	238	126	124	124
q7	1418	874	879	874
q8	2804	2794	2772	2772
q9	10001	9685	9592	9592
q10	3459	3505	3503	3503
q11	376	245	248	245
q12	446	290	299	290
q13	19277	3820	3783	3783
q14	311	299	276	276
q15	567	515	522	515
q16	664	581	580	580
q17	1150	956	936	936
q18	7974	7561	7501	7501
q19	1812	1679	1665	1665
q20	541	306	302	302
q21	4431	4039	4019	4019
q22	469	369	368	368
Total cold run time: 68759 ms
Total hot run time: 49161 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4579	4588	4596	4588
q2	349	220	246	220
q3	4055	4018	4030	4018
q4	2730	2701	2717	2701
q5	9761	9583	9521	9521
q6	247	121	124	121
q7	3048	2505	2517	2505
q8	4433	4450	4462	4450
q9	13220	13316	13180	13180
q10	4060	4203	4179	4179
q11	810	643	647	643
q12	983	813	831	813
q13	4355	3562	3569	3562
q14	377	373	351	351
q15	575	519	516	516
q16	737	686	697	686
q17	3831	4055	3934	3934
q18	9675	9236	9127	9127
q19	1810	1779	1800	1779
q20	2428	2074	2067	2067
q21	8768	8779	8770	8770
q22	948	798	790	790
Total cold run time: 81779 ms
Total hot run time: 78521 ms
```
Re: [PR] [regression](partial update) Fix unstable p0 case `test_primary_key_partial_update_parallel` due to conflicting table name [doris]
github-actions[bot] commented on PR #27633: URL: https://github.com/apache/doris/pull/27633#issuecomment-1827456680 PR approved by anyone and no changes requested.
Re: [PR] [enhancement](stats) Stats auto collector execute in serial [doris]
Jibing-Li commented on code in PR #27634: URL: https://github.com/apache/doris/pull/27634#discussion_r1405874445

## fe/fe-core/src/main/java/org/apache/doris/statistics/StatisticsCollector.java: ##
@@ -83,7 +74,14 @@ protected void createSystemAnalysisJob(AnalysisInfo jobInfo)
             analysisManager.createTableLevelTaskForExternalTable(jobInfo, analysisTasks, false);
         }
         Env.getCurrentEnv().getAnalysisManager().registerSysJob(jobInfo, analysisTasks);
-        analysisTasks.values().forEach(analysisTaskExecutor::submitTask);
+        for (BaseAnalysisTask task : analysisTasks.values()) {
+            try {
+                task.execute();

Review Comment: Not a blocker, but it would be better to write the results of the successfully executed tasks to the column statistics table, instead of failing the whole job when one task fails.
Re: [PR] [enhancement](stats) Stats auto collector execute in serial [doris]
github-actions[bot] commented on PR #27634: URL: https://github.com/apache/doris/pull/27634#issuecomment-1827457672 PR approved by anyone and no changes requested.
Re: [PR] [enhancement](stats) Stats auto collector execute in serial [doris]
Kikyou1997 commented on code in PR #27634: URL: https://github.com/apache/doris/pull/27634#discussion_r1405885284

## fe/fe-core/src/main/java/org/apache/doris/statistics/StatisticsCollector.java: ##
@@ -83,7 +74,14 @@ protected void createSystemAnalysisJob(AnalysisInfo jobInfo)
             analysisManager.createTableLevelTaskForExternalTable(jobInfo, analysisTasks, false);
         }
         Env.getCurrentEnv().getAnalysisManager().registerSysJob(jobInfo, analysisTasks);
-        analysisTasks.values().forEach(analysisTaskExecutor::submitTask);
+        for (BaseAnalysisTask task : analysisTasks.values()) {
+            try {
+                task.execute();

Review Comment: Yes, I agree with you and the logic you mentioned has been implemented in `AnalysisJob`
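The review exchange above is about running analysis tasks one after another while keeping the results of the tasks that succeed, rather than failing the whole job on the first error. The following is a minimal generic sketch of that pattern. It is not the Doris implementation (the reply says the real logic lives in `AnalysisJob`, which is not shown in this thread); `SerialTaskRunner`, `Outcome`, and `runSerially` are hypothetical names, and tasks are modeled as plain `Callable<String>` values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical illustration only; none of these names come from Doris.
final class SerialTaskRunner {
    /** Result of one serial pass: values from tasks that succeeded, errors from tasks that failed. */
    static final class Outcome {
        final Map<String, String> succeeded = new LinkedHashMap<>();
        final Map<String, Exception> failed = new LinkedHashMap<>();
    }

    /** Runs tasks in insertion order; a failing task is recorded and does not stop the rest. */
    static Outcome runSerially(Map<String, Callable<String>> tasks) {
        Outcome outcome = new Outcome();
        for (Map.Entry<String, Callable<String>> entry : tasks.entrySet()) {
            try {
                outcome.succeeded.put(entry.getKey(), entry.getValue().call());
            } catch (Exception e) {
                outcome.failed.put(entry.getKey(), e);
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> tasks = new LinkedHashMap<>();
        tasks.put("col_a", () -> "ndv=10");
        tasks.put("col_b", () -> { throw new IllegalStateException("scan failed"); });
        tasks.put("col_c", () -> "ndv=3");

        Outcome outcome = runSerially(tasks);
        // col_a and col_c can still be persisted even though col_b failed.
        System.out.println("succeeded: " + outcome.succeeded.keySet());
        System.out.println("failed: " + outcome.failed.keySet());
    }
}
```

A caller would then persist `succeeded` and decide separately how to report or retry the entries in `failed`.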
[PR] [profile](bugfix) should not cache profile content because the profile may not be a full profile [doris]
yiguolei opened a new pull request, #27635: URL: https://github.com/apache/doris/pull/27635 … ## Proposed changes Issue Number: close #xxx ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
Re: [PR] [profile](bugfix) should not cache profile content because the profile may not be a full profile [doris]
yiguolei commented on PR #27635: URL: https://github.com/apache/doris/pull/27635#issuecomment-1827471458 run buildall
Re: [PR] New join2 [doris]
doris-robot commented on PR #27557: URL: https://github.com/apache/doris/pull/27557#issuecomment-1827471932

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Tpch sf100 test result on commit 57160b4346fd48c5c597eb37e116f830a8658732, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4924	4664	4642	4642
q2	351	151	149	149
q3	1502	1271	1294	1271
q4	1143	972	948	948
q5	3234	3219	3240	3219
q6	255	131	133	131
q7	1074	602	574	574
q8	2196	2261	2205	2205
q9	7181	7196	7174	7174
q10	3299	3372	3345	3345
q11	341	203	213	203
q12	352	208	209	208
q13	4634	3870	3866	3866
q14	244	219	218	218
q15	578	538	527	527
q16	615	547	553	547
q17	1011	660	586	586
q18	8120	7590	7546	7546
q19	1551	1520	1543	1520
q20	547	317	289	289
q21	8937	8451	8483	8451
q22	367	307	303	303
Total cold run time: 52456 ms
Total hot run time: 47922 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4582	4547	4588	4547
q2	319	187	216	187
q3	3757	3720	3745	3720
q4	2518	2521	2497	2497
q5	6174	6176	6183	6176
q6	250	124	124	124
q7	2825	2191	2149	2149
q8	3699	3708	3680	3680
q9	9842	9803	9793	9793
q10	4041	4126	4139	4126
q11	635	473	509	473
q12	799	632	652	632
q13	4369	3648	3620	3620
q14	285	251	252	251
q15	573	531	517	517
q16	652	616	598	598
q17	2085	2060	2049	2049
q18	9652	9044	9007	9007
q19	1793	1781	1749	1749
q20	2301	1990	1978	1978
q21	49799	49539	49549	49539
q22	637	557	577	557
Total cold run time: 111587 ms
Total hot run time: 107969 ms
```
Re: [PR] [performance](Nereids): avoid use `getStringValue()` in getTimeFormatter() [doris]
doris-robot commented on PR #27625: URL: https://github.com/apache/doris/pull/27625#issuecomment-1827472863
(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.97 seconds
stream load tsv: 563 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 27 seconds loaded 2358488459 Bytes, about 83 MB/s
stream load orc: 70 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 1000 Rows, about 346K ops/s
storage size: 17098825001 Bytes
(doris) branch branch-2.0 updated: [enhance](S3) Print the error detail for every s3 operation (#27572) (#27615)
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/branch-2.0 by this push:
     new c5c6926e15a [enhance](S3) Print the error detail for every s3 operation (#27572) (#27615)
c5c6926e15a is described below

commit c5c6926e15a74c47e77987dd8f21ea89b1aa466a
Author: AlexYue
AuthorDate: Mon Nov 27 17:43:57 2023 +0800

    [enhance](S3) Print the error detail for every s3 operation (#27572) (#27615)
---
 be/src/io/fs/buffered_reader.cpp | 4
 be/src/io/fs/s3_file_reader.cpp | 6 --
 be/src/io/fs/s3_file_system.cpp | 7 ---
 be/src/io/fs/s3_file_writer.cpp | 38 +-
 4 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/be/src/io/fs/buffered_reader.cpp b/be/src/io/fs/buffered_reader.cpp
index fdcba04190f..18e638d6d75 100644
--- a/be/src/io/fs/buffered_reader.cpp
+++ b/be/src/io/fs/buffered_reader.cpp
@@ -476,6 +476,10 @@ void PrefetchBuffer::prefetch_buffer() {
         return;
     }
     if (!s.ok() && _offset < _reader->size()) {
+        // We should print the error msg since this buffer might not be accessed by the consumer
+        // which would result in the status being missed
+        LOG_WARNING("prefetch path {} failed, offset {}, error {}", _reader->path().native(),
+                    _offset, s.to_string());
         _prefetch_status = std::move(s);
     }
     _buffer_status = BufferStatus::PREFETCHED;
diff --git a/be/src/io/fs/s3_file_reader.cpp b/be/src/io/fs/s3_file_reader.cpp
index 2403b2497ed..ceebc683a94 100644
--- a/be/src/io/fs/s3_file_reader.cpp
+++ b/be/src/io/fs/s3_file_reader.cpp
@@ -96,8 +96,10 @@ Status S3FileReader::read_at_impl(size_t offset, Slice result, size_t* bytes_rea
     }
     auto outcome = client->GetObject(request);
     if (!outcome.IsSuccess()) {
-        return Status::IOError("failed to read from {}: {}", _path.native(),
-                               outcome.GetError().GetMessage());
+        return Status::IOError("failed to read from {}: {}, exception {}, error code {}",
+                               _path.native(), outcome.GetError().GetMessage(),
+                               outcome.GetError().GetExceptionName(),
+                               outcome.GetError().GetResponseCode());
     }
     *bytes_read = outcome.GetResult().GetContentLength();
     if (*bytes_read != bytes_req) {
diff --git a/be/src/io/fs/s3_file_system.cpp b/be/src/io/fs/s3_file_system.cpp
index 79f2a324f44..a00a5efc9a9 100644
--- a/be/src/io/fs/s3_file_system.cpp
+++ b/be/src/io/fs/s3_file_system.cpp
@@ -527,9 +527,10 @@ Status S3FileSystem::get_key(const Path& path, std::string* key) const {

 template
 std::string S3FileSystem::error_msg(const std::string& key, const AwsOutcome& outcome) const {
-    return fmt::format("(endpoint: {}, bucket: {}, key:{}, {}), {}", _s3_conf.endpoint,
-                       _s3_conf.bucket, key, outcome.GetError().GetExceptionName(),
-                       outcome.GetError().GetMessage());
+    return fmt::format("(endpoint: {}, bucket: {}, key:{}, {}), {}, error code {}",
+                       _s3_conf.endpoint, _s3_conf.bucket, key,
+                       outcome.GetError().GetExceptionName(), outcome.GetError().GetMessage(),
+                       outcome.GetError().GetResponseCode());
 }

 std::string S3FileSystem::error_msg(const std::string& key, const std::string& err) const {
diff --git a/be/src/io/fs/s3_file_writer.cpp b/be/src/io/fs/s3_file_writer.cpp
index 6bbd076d16a..4a937f52057 100644
--- a/be/src/io/fs/s3_file_writer.cpp
+++ b/be/src/io/fs/s3_file_writer.cpp
@@ -117,8 +117,11 @@ Status S3FileWriter::_create_multi_upload_request() {
         _upload_id = outcome.GetResult().GetUploadId();
         return Status::OK();
     }
-    return Status::IOError("failed to create multipart upload(bucket={}, key={}, upload_id={}): {}",
-                           _bucket, _path.native(), _upload_id, outcome.GetError().GetMessage());
+    return Status::IOError(
+            "failed to create multipart upload(bucket={}, key={}, upload_id={}): {}, exception {}, "
+            "error code {}",
+            _bucket, _path.native(), _upload_id, outcome.GetError().GetMessage(),
+            outcome.GetError().GetExceptionName(), outcome.GetError().GetResponseCode());
 }

 void S3FileWriter::_wait_until_finish(std::string_view task_name) {
@@ -168,8 +171,11 @@ Status S3FileWriter::abort() {
         _aborted = true;
         return Status::OK();
     }
-    return Status::IOError("failed to abort multipart upload(bucket={}, key={}, upload_id={}): {}",
-                           _bucket, _path.native(), _upload_id, outcome.GetError().GetMessage());
+    return Status::IOError(
+            "failed to abort multipart upload(bucket={}, key={}, upload_id={}): {}, exception {}, "
+
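The commit above appends the exception name and the response code to each S3 error message. A minimal standalone sketch of that message shape follows, assuming a hypothetical `S3ErrorInfo` struct and `describe_s3_failure` helper; neither name comes from the Doris sources or the AWS SDK, and plain string concatenation stands in for the formatting library used in the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the error object an S3 client returns.
// The real AWS SDK type is not reproduced here.
struct S3ErrorInfo {
    std::string message;        // human-readable message
    std::string exception_name; // symbolic error name
    int response_code;          // HTTP status of the failed request
};

// Builds a message in the same spirit as the patched call sites:
// "<what failed>: <message>, exception <name>, error code <code>".
std::string describe_s3_failure(const std::string& action, const std::string& path,
                                const S3ErrorInfo& err) {
    return "failed to " + action + " " + path + ": " + err.message +
           ", exception " + err.exception_name +
           ", error code " + std::to_string(err.response_code);
}
```

Keeping the three fields in one line of text means a single log entry is enough to tell, say, a permission problem from a throttling response, which is presumably the motivation for the change.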
Re: [PR] [feat](window_function) support to secondary argument to ignore null values in first_value/last_value [doris]
doris-robot commented on PR #27623:
URL: https://github.com/apache/doris/pull/27623#issuecomment-1827481752

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.55 seconds
stream load tsv: 572 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 25 seconds loaded 2358488459 Bytes, about 89 MB/s
stream load orc: 72 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 1000 Rows, about 346K ops/s
storage size: 17100646185 Bytes

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
xiaokang commented on code in PR #26749:
URL: https://github.com/apache/doris/pull/26749#discussion_r1405894365

## be/src/olap/rowset/segment_creator.cpp: ##
@@ -280,6 +441,17 @@ Status SegmentCreator::add_block(const vectorized::Block* block) {
     size_t row_avg_size_in_bytes = std::max((size_t)1, block_size_in_bytes / block_row_num);
     size_t row_offset = 0;

+    if (_segment_flusher.need_buffering()) {
+        const static int MAX_BUFFER_SIZE = config::flushing_block_buffer_size_bytes; // 400M

Review Comment:
flushing_block_buffer_size_bytes is still used
Re: [PR] Pick "[enhance](S3) Print the error detail for every s3 operation (#27572)" [doris]
dataroaring merged PR #27615:
URL: https://github.com/apache/doris/pull/27615
Re: [PR] [Feat](Nereids) support view as a independent unit of leading [doris]
doris-robot commented on PR #27378:
URL: https://github.com/apache/doris/pull/27378#issuecomment-1827484945

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Tpch sf100 test result on commit b180f13b5866d5d2dc469d8d311e7c7be31b7abf, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5627	4705	4663	4663
q2	361	171	160	160
q3	2048	1925	1969	1925
q4	1398	1286	1265	1265
q5	3957	3954	4022	3954
q6	258	135	128	128
q7	1444	888	886	886
q8	2789	2789	2759	2759
q9	10274	10237	9740	9740
q10	3461	3513	3495	3495
q11	373	252	248	248
q12	434	292	306	292
q13	4550	3795	3793	3793
q14	322	287	288	287
q15	583	535	535	535
q16	664	591	582	582
q17	1135	989	942	942
q18	8000	7567	7436	7436
q19	1666	1679	1672	1672
q20	593	303	289	289
q21	4463	4028	4035	4028
q22	474	380	373	373
Total cold run time: 54874 ms
Total hot run time: 49452 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4566	4578	4550	4550
q2	341	209	244	209
q3	4045	4012	4007	4007
q4	2707	2700	2698	2698
q5	9688	9536	9638	9536
q6	248	120	126	120
q7	3053	2469	2491	2469
q8	4480	4514	4468	4468
q9	13278	13083	13097	13083
q10	4087	4143	4154	4143
q11	756	675	681	675
q12	977	806	810	806
q13	4298	3607	3624	3607
q14	386	340	347	340
q15	567	522	518	518
q16	741	667	702	667
q17	3854	3930	3863	3863
q18	9715	9129	9010	9010
q19	1785	1772	1759	1759
q20	2389	2072	2066	2066
q21	8761	8596	8596
q22	960	860	835	835
Total cold run time: 81682 ms
Total hot run time: 78025 ms
```
Re: [PR] [fix](multi-catalog)add properties converter fe ut [doris]
wsjz commented on PR #27254:
URL: https://github.com/apache/doris/pull/27254#issuecomment-1827486344

run buildall
Re: [PR] [Fix](statistics)Fix bug and improve auto analyze. [doris]
morningman commented on code in PR #27626:
URL: https://github.com/apache/doris/pull/27626#discussion_r1405887622

## fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java: ##
@@ -1366,6 +1369,12 @@ public void setEnableLeftZigZag(boolean enableLeftZigZag) {
                     + "tables larger than huge_table_lower_bound_size_in_bytes are analyzed only once."})
     public long hugeTableAutoAnalyzeIntervalInMillis = TimeUnit.HOURS.toMillis(12);

+    @VariableMgr.VarAttr(name = EXTERNAL_TABLE_AUTO_ANALYZE_INTERVAL_IN_MILLIS, flag = VariableMgr.GLOBAL,
+            description = {"控制对外表的自动ANALYZE的最小时间间隔,在该时间间隔内的外表仅ANALYZE一次",
+                    "This controls the minimum time interval for automatic ANALYZE on external tables."
+                    + "Within this interval, external tables are analyzed only once."})
+    public long externalTableAutoAnalyzeIntervalInMillis = TimeUnit.HOURS.toMillis(240);

Review Comment:
10 days is too long

## fe/fe-core/src/main/java/org/apache/doris/statistics/util/StatisticsUtil.java: ##
@@ -906,6 +906,16 @@ public static long getHugeTableAutoAnalyzeIntervalInMillis() {
         return StatisticConstants.HUGE_TABLE_AUTO_ANALYZE_INTERVAL_IN_MILLIS;
     }

+    public static long getExternalTableAutoAnalyzeIntervalInMillis() {
+        try {
+            return findConfigFromGlobalSessionVar(SessionVariable.EXTERNAL_TABLE_AUTO_ANALYZE_INTERVAL_IN_MILLIS)
+                    .externalTableAutoAnalyzeIntervalInMillis;
+        } catch (Exception e) {
+            LOG.warn("Failed to get value of externalTableAutoAnalyzeIntervalInMillis, return default", e);

Review Comment:
Why is an exception thrown here?
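The "10 days" in the first review comment follows directly from the 240-hour default. A small sketch of that arithmetic, using `std::chrono` in place of Java's `TimeUnit`; the helper names `hours_to_millis` and `millis_to_whole_days` are made up for illustration and do not exist in Doris.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

// Converts an interval in hours to milliseconds, the same quantity that
// TimeUnit.HOURS.toMillis(...) yields in the Java snippet under review.
constexpr std::int64_t hours_to_millis(std::int64_t hours) {
    return std::chrono::duration_cast<std::chrono::milliseconds>(
                   std::chrono::hours(hours))
            .count();
}

// Converts milliseconds back to whole days (any fractional day is truncated).
constexpr std::int64_t millis_to_whole_days(std::int64_t millis) {
    using days = std::chrono::duration<std::int64_t, std::ratio<86400>>;
    return std::chrono::duration_cast<days>(std::chrono::milliseconds(millis)).count();
}
```

With these helpers, 240 hours is 864,000,000 ms, which is exactly 10 days, while the 12-hour default for huge tables is 43,200,000 ms.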
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
xiaokang commented on code in PR #26749:
URL: https://github.com/apache/doris/pull/26749#discussion_r1405899239

## be/src/olap/tablet_schema.cpp: ##
@@ -1015,9 +1145,15 @@ std::vector TabletSchema::get_indexes_for_column(int32_t col
     return indexes_for_column;
 }

-bool TabletSchema::has_inverted_index(int32_t col_unique_id) const {
+bool TabletSchema::has_inverted_index(const TabletColumn& col) const {
     // TODO use more efficient impl
+    int32_t col_unique_id = col.unique_id();
+    const std::string& suffix_path =
+            !col.path_info().empty() ? escape_for_path_name(col.path_info().get_path()) : "";
     for (size_t i = 0; i < _indexes.size(); i++) {
+        if (_indexes[i].get_escaped_index_suffix_path() != suffix_path) {

Review Comment:
Can we move the escape_for_path_name call above into the if branch?
Re: [PR] [information_schema](tables)modify information_schema.tables rows column use cache rows. [doris]
morningman merged PR #27028:
URL: https://github.com/apache/doris/pull/27028
Re: [PR] [opt](nereids)adjust distribution cost for better choice of broadcast join and shuffle join [doris]
doris-robot commented on PR #27113:
URL: https://github.com/apache/doris/pull/27113#issuecomment-1827489765

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.85 seconds
stream load tsv: 565 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 26 seconds loaded 2358488459 Bytes, about 86 MB/s
stream load orc: 71 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 1000 Rows, about 349K ops/s
storage size: 17098825574 Bytes
(doris) branch master updated (66eeafcd484 -> d5a56dc7f4c)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

    from 66eeafcd484 [refactor](Nereids): unify one DateLiteral init() (#27618)
     add d5a56dc7f4c [information_schema](tables)modify information_schema.tables rows column use cache rows. (#27028)

No new revisions were added by this update.

Summary of changes:
 .../main/java/org/apache/doris/catalog/OlapTable.java | 5 +
 .../src/main/java/org/apache/doris/catalog/Table.java | 4
 .../src/main/java/org/apache/doris/catalog/TableIf.java | 4
 .../apache/doris/catalog/external/ExternalTable.java | 4
 .../apache/doris/catalog/external/HMSExternalTable.java | 17 +
 .../doris/catalog/external/JdbcExternalTable.java | 5 +
 .../org/apache/doris/service/FrontendServiceImpl.java | 2 +-
 7 files changed, 40 insertions(+), 1 deletion(-)
[PR] [fix](nereids) temp partition is always pruned [doris]
englefly opened a new pull request, #27636:
URL: https://github.com/apache/doris/pull/27636

## Proposed changes

Issue Number: close #xxx

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
(doris) branch master updated: [DOC](sparkload)add spark load faq (#27455)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 50c442fc6c2 [DOC](sparkload)add spark load faq (#27455)
50c442fc6c2 is described below

commit 50c442fc6c2c3c7507c724e4129e7481c52a5474
Author: wuwenchi
AuthorDate: Mon Nov 27 17:49:52 2023 +0800

    [DOC](sparkload)add spark load faq (#27455)

    add spark load FAQ
---
 .../import/import-way/spark-load-manual.md | 64 ++
 .../import/import-way/spark-load-manual.md | 28 --
 2 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/docs/en/docs/data-operate/import/import-way/spark-load-manual.md b/docs/en/docs/data-operate/import/import-way/spark-load-manual.md
index 76e3f7ed901..37c60a0b4a8 100644
--- a/docs/en/docs/data-operate/import/import-way/spark-load-manual.md
+++ b/docs/en/docs/data-operate/import/import-way/spark-load-manual.md
@@ -680,39 +680,39 @@ Refer to broker load for the meaning of parameters in the returned result set. T

 + State

-The current phase of the load job. After the job is submitted, the status is pending. After the spark ETL is submitted, the status changes to ETL. After ETL is completed, Fe schedules be to execute push operation, and the status changes to finished after the push is completed and the version takes effect.
+  The current phase of the load job. After the job is submitted, the status is pending. After the spark ETL is submitted, the status changes to ETL. After ETL is completed, Fe schedules be to execute push operation, and the status changes to finished after the push is completed and the version takes effect.

-There are two final stages of the load job: cancelled and finished. When the load job is in these two stages, the load is completed. Among them, cancelled is load failure, finished is load success.
+  There are two final stages of the load job: cancelled and finished. When the load job is in these two stages, the load is completed. Among them, cancelled is load failure, finished is load success.

 + Progress

-Progress description of the load job. There are two kinds of progress: ETL and load, corresponding to the two stages of the load process, ETL and loading.
+  Progress description of the load job. There are two kinds of progress: ETL and load, corresponding to the two stages of the load process, ETL and loading.

-The progress range of load is 0 ~ 100%.
+  The progress range of load is 0 ~ 100%.

-```Load progress = the number of tables that have completed all replica imports / the total number of tables in this import task * 100%```
+  ```Load progress = the number of tables that have completed all replica imports / the total number of tables in this import task * 100%```

-**If all load tables are loaded, the progress of load is 99%**, the load enters the final effective stage. After the whole load is completed, the load progress will be changed to 100%.
+  **If all load tables are loaded, the progress of load is 99%**, the load enters the final effective stage. After the whole load is completed, the load progress will be changed to 100%.

-The load progress is not linear. Therefore, if the progress does not change over a period of time, it does not mean that the load is not in execution.
+  The load progress is not linear. Therefore, if the progress does not change over a period of time, it does not mean that the load is not in execution.

 + Type

-Type of load job. Spark load is spark.
+  Type of load job. Spark load is spark.

 + CreateTime/EtlStartTime/EtlFinishTime/LoadStartTime/LoadFinishTime

-These values represent the creation time of the load, the start time of the ETL phase, the completion time of the ETL phase, the start time of the loading phase, and the completion time of the entire load job.
+  These values represent the creation time of the load, the start time of the ETL phase, the completion time of the ETL phase, the start time of the loading phase, and the completion time of the entire load job.

 + JobDetails

-Display the detailed running status of some jobs, which will be updated when ETL ends. It includes the number of loaded files, the total size (bytes), the number of subtasks, the number of processed original lines, etc.
+  Display the detailed running status of some jobs, which will be updated when ETL ends. It includes the number of loaded files, the total size (bytes), the number of subtasks, the number of processed original lines, etc.

-```{"ScannedRows":139264,"TaskNumber":1,"FileNumber":1,"FileSize":940754064}```
+  ```{"ScannedRows":139264,"TaskNumber":1,"FileNumber":1,"FileSize":940754064}```

 + URL

-Copy this url to the browser and jump to the web interface of the corresponding application.
+  Copy this url to the browser and jump to the web interface of
Re: [PR] [DOC](sparkload)add spark load faq [doris]
morningman merged PR #27455:
URL: https://github.com/apache/doris/pull/27455
Re: [PR] [DOC](sparkload)add spark load faq [doris]
github-actions[bot] commented on PR #27455:
URL: https://github.com/apache/doris/pull/27455#issuecomment-1827493201

PR approved by at least one committer and no changes requested.
Re: [PR] [Fix](statistics)Fix bug and improve auto analyze. [doris]
Jibing-Li commented on code in PR #27626:
URL: https://github.com/apache/doris/pull/27626#discussion_r1405902956

## fe/fe-core/src/main/java/org/apache/doris/statistics/util/StatisticsUtil.java: ##
@@ -906,6 +906,16 @@ public static long getHugeTableAutoAnalyzeIntervalInMillis() {
         return StatisticConstants.HUGE_TABLE_AUTO_ANALYZE_INTERVAL_IN_MILLIS;
     }

+    public static long getExternalTableAutoAnalyzeIntervalInMillis() {
+        try {
+            return findConfigFromGlobalSessionVar(SessionVariable.EXTERNAL_TABLE_AUTO_ANALYZE_INTERVAL_IN_MILLIS)
+                    .externalTableAutoAnalyzeIntervalInMillis;
+        } catch (Exception e) {
+            LOG.warn("Failed to get value of externalTableAutoAnalyzeIntervalInMillis, return default", e);

Review Comment:
I didn't dig deep into this; I simply followed the format of all the other variables. It seems that VariableMgr.getValue may throw an exception when the session name does not exist.
Re: [PR] [Opt](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms. [doris]
kaka11chen commented on PR #27542:
URL: https://github.com/apache/doris/pull/27542#issuecomment-1827496955

run buildall
Re: [PR] [performance](Nereids): avoid use `getStringValue()` in getTimeFormatter() [doris]
doris-robot commented on PR #27625:
URL: https://github.com/apache/doris/pull/27625#issuecomment-1827498904

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Tpch sf100 test result on commit 31510030fc01fc80a1d5d8f837be73766450e029, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4952	4659	4672	4659
q2	358	157	160	157
q3	2047	1940	1877	1877
q4	1411	1286	1276	1276
q5	3995	4002	4069	4002
q6	258	133	131	131
q7	1458	895	893	893
q8	2797	2809	2784	2784
q9	9993	10030	9658	9658
q10	3469	3536	3547	3536
q11	393	249	244	244
q12	435	288	292	288
q13	4583	3826	3815	3815
q14	317	309	296	296
q15	599	539	525	525
q16	666	592	583	583
q17	1139	968	970	968
q18	7909	7569	7496	7496
q19	1695	1685	1683	1683
q20	542	278	288	278
q21	4447	4043	4047	4043
q22	482	385	391	385
Total cold run time: 53945 ms
Total hot run time: 49577 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4587	4596	4585	4585
q2	351	225	255	225
q3	4066	4047	4017	4017
q4	2742	2722	2716	2716
q5	9722	9666	9657	9657
q6	244	124	122	122
q7	3046	2498	2505	2498
q8	4484	4423	4452	4423
q9	13229	13154	13162	13154
q10	4042	4159	4171	4159
q11	824	682	644	644
q12	971	806	808	806
q13	4315	3591	3580	3580
q14	389	347	357	347
q15	565	522	524	522
q16	734	673	680	673
q17	3908	3877	3839	3839
q18	9660	9215	9028	9028
q19	1841	1806	1798	1798
q20	2429	2077	2058	2058
q21	8984	8698	8838	8698
q22	899	845	766	766
Total cold run time: 82032 ms
Total hot run time: 78315 ms
```
Re: [PR] [fix](nereids)when range bounder is inifinte, range length should be POSITIVE_INFINITE [doris]
doris-robot commented on PR #27624:
URL: https://github.com/apache/doris/pull/27624#issuecomment-1827499680

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.85 seconds
stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 28 seconds loaded 2358488459 Bytes, about 80 MB/s
stream load orc: 70 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 34 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.8 seconds inserted 1000 Rows, about 347K ops/s
storage size: 17099069457 Bytes
Re: [PR] [Pick-var](inverted index) pick read && seek optimization to variant branch [doris]
xiaokang merged PR #27600:
URL: https://github.com/apache/doris/pull/27600
Re: [PR] [Pick-var](inverted index) pick read && seek optimization to variant branch [doris]
xiaokang commented on PR #27600:
URL: https://github.com/apache/doris/pull/27600#issuecomment-1827502077

https://github.com/apache/doris/pull/26689
Re: [PR] [fix](parquet)fix can not read parquet lz4 compress. [doris]
github-actions[bot] commented on PR #27383:
URL: https://github.com/apache/doris/pull/27383#issuecomment-1827502616

PR approved by at least one committer and no changes requested.
(doris) branch branch-2.0-var updated: [Pick-var](inverted index) pick read && seek optimization to variant branch #26689 (#27600)
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0-var
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/branch-2.0-var by this push:
     new f8e3264d032 [Pick-var](inverted index) pick read && seek optimization to variant branch #26689 (#27600)
f8e3264d032 is described below

commit f8e3264d032c6cde0b40116b169e98e4480145d0
Author: airborne12
AuthorDate: Mon Nov 27 17:54:12 2023 +0800

    [Pick-var](inverted index) pick read && seek optimization to variant branch #26689 (#27600)
---
 be/src/olap/rowset/segment_v2/segment_iterator.cpp | 193 ++---
 be/src/olap/rowset/segment_v2/segment_iterator.h | 3 +-
 2 files changed, 133 insertions(+), 63 deletions(-)

diff --git a/be/src/olap/rowset/segment_v2/segment_iterator.cpp b/be/src/olap/rowset/segment_v2/segment_iterator.cpp
index 65ca6f7e61d..a36536745d7 100644
--- a/be/src/olap/rowset/segment_v2/segment_iterator.cpp
+++ b/be/src/olap/rowset/segment_v2/segment_iterator.cpp
@@ -104,11 +104,12 @@ public:
     explicit BitmapRangeIterator(const roaring::Roaring& bitmap) {
         roaring_init_iterator(&bitmap.roaring, &_iter);
-        _read_next_batch();
     }

     bool has_more_range() const { return !_eof; }

+    [[nodiscard]] static uint32_t get_batch_size() { return kBatchSize; }
+
     // read next range into [*from, *to) whose size <= max_range_size.
     // return false when there is no more range.
     virtual bool next_range(const uint32_t max_range_size, uint32_t* from, uint32_t* to) {
@@ -147,6 +148,11 @@ public:
         return true;
     }

+    // read batch_size of rowids from roaring bitmap into buf array
+    virtual uint32_t read_batch_rowids(rowid_t* buf, uint32_t batch_size) {
+        return roaring::api::roaring_read_uint32_iterator(&_iter, buf, batch_size);
+    }
+
 private:
     void _read_next_batch() {
         _buf_pos = 0;
@@ -171,6 +177,8 @@ class SegmentIterator::BackwardBitmapRangeIterator : public SegmentIterator::Bit
 public:
     explicit BackwardBitmapRangeIterator(const roaring::Roaring& bitmap) {
         roaring_init_iterator_last(&bitmap.roaring, &_riter);
+        _rowid_count = roaring_bitmap_get_cardinality(&bitmap.roaring);
+        _rowid_left = _rowid_count;
     }

     bool has_more_range() const { return !_riter.has_value; }
@@ -194,9 +202,51 @@ public:
         return true;
     }
+    /**
+     * Reads a batch of row IDs from a roaring bitmap, starting from the end and moving backwards.
+     * This function retrieves the last `batch_size` row IDs from the bitmap and stores them in the provided buffer.
+     * It updates the internal state to track how many row IDs are left to read in subsequent calls.
+     *
+     * The row IDs are read in reverse order, but stored in the buffer maintaining their original order in the bitmap.
+     *
+     * Example:
+     *   input bitmap: [0 1 4 5 6 7 10 15 16 17 18 19]
+     *   If the bitmap has 12 elements and batch_size is set to 5, the function will first read [15, 16, 17, 18, 19]
+     *   into the buffer, leaving 7 elements left. In the next call with batch_size 5, it will read [4, 5, 6, 7, 10].
+     *
+     */
+    uint32_t read_batch_rowids(rowid_t* buf, uint32_t batch_size) override {
+        if (!_riter.has_value || _rowid_left == 0) {
+            return 0;
+        }
+
+        if (_rowid_count <= batch_size) {
+            roaring_bitmap_to_uint32_array(_riter.parent,
+                                           buf); // Fill 'buf' with '_rowid_count' elements.
+            uint32_t num_read = _rowid_left;     // Save the number of row IDs read.
+            _rowid_left = 0;                     // No row IDs left after this operation.
+            return num_read;                     // Return the number of row IDs read.
+        }
+
+        uint32_t read_size = std::min(batch_size, _rowid_left);
+        uint32_t num_read = 0; // Counter for the number of row IDs read.
+
+        // Read row IDs into the buffer in reverse order.
+        while (num_read < read_size && _riter.has_value) {
+            buf[read_size - num_read - 1] = _riter.current_value;
+            num_read++;
+            _rowid_left--; // Decrement the count of remaining row IDs.
+            roaring_previous_uint32_iterator(&_riter);
+        }
+
+        // Return the actual number of row IDs read.
+        return num_read;
+    }

 private:
     roaring::api::roaring_uint32_iterator_t _riter;
+    uint32_t _rowid_count;
+    uint32_t _rowid_left;
 };

 SegmentIterator::SegmentIterator(std::shared_ptr segment, SchemaSPtr schema)
@@ -1694,56 +1744,86 @@ void SegmentIterator::_output_non_pred_columns(vectorized::Block* block) {
     }
 }

+/**
+ * Reads columns by their index, handling both continuous and discontinuous rowid scenarios.
+ *
+ * This function is designed to read a specified number of rows (up to nr
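The backward batch read described in the doc comment of `read_batch_rowids` can be tried out without the roaring library. The sketch below is a simplified illustration only: it assumes the row IDs are already available as a sorted `std::vector`, and `BackwardBatchReader` and `next_batch` are hypothetical names that do not appear in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hands out the values of a sorted vector in batches, starting from the
// tail, while keeping ascending order inside each batch.
class BackwardBatchReader {
public:
    explicit BackwardBatchReader(std::vector<std::uint32_t> sorted_ids)
            : _ids(std::move(sorted_ids)), _left(_ids.size()) {}

    // Copies up to batch_size ids into buf and returns how many were written.
    std::size_t read_batch(std::uint32_t* buf, std::size_t batch_size) {
        const std::size_t n = std::min(batch_size, _left);
        // The next batch is the slice [_left - n, _left) of the sorted ids.
        const auto first = _ids.begin() + static_cast<std::ptrdiff_t>(_left - n);
        std::copy(first, first + static_cast<std::ptrdiff_t>(n), buf);
        _left -= n; // Track how many ids remain for later calls.
        return n;
    }

private:
    std::vector<std::uint32_t> _ids;
    std::size_t _left;
};

// Convenience wrapper returning one batch as a vector.
std::vector<std::uint32_t> next_batch(BackwardBatchReader& reader, std::size_t batch_size) {
    std::vector<std::uint32_t> out(batch_size);
    out.resize(reader.read_batch(out.data(), batch_size));
    return out;
}
```

For the bitmap in the comment, `[0 1 4 5 6 7 10 15 16 17 18 19]` with a batch size of 5, successive calls yield `[15, 16, 17, 18, 19]`, then `[4, 5, 6, 7, 10]`, then the remaining `[0, 1]`. The committed code reaches the same ordering by walking a reverse iterator and filling the buffer from its last slot backwards.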
Re: [PR] [Pick](branch-2.0) Pick from branch-2.0 [doris]
eldenmoon merged PR #27602: URL: https://github.com/apache/doris/pull/27602
[PR] [fix](insert) txn insert and group commit should write \N string corr… [doris]
mymeiyi opened a new pull request, #27637: URL: https://github.com/apache/doris/pull/27637

…ectly

## Proposed changes

For `insert into ... values`, we can write the string `\N` to a column.

```
insert into t2 values(1, '"b"', 100);
insert into t2 values(2, '\N', 100);
insert into t2 values(3, '\\N', 100);
mysql> select * from t2 where name is null;
Empty set (0.03 sec)
```

But txn insert and group commit insert use stream load as their inner implementation, where `\N` means `null`, so the written result is not correct.

```
begin;
insert into t2 values(1, '"b"', 100);
insert into t2 values(2, '\N', 100);
insert into t2 values(3, '\\N', 100);
commit;
mysql> select * from t2 where name is null;
+------+------+-------+
| id   | name | score |
+------+------+-------+
|    3 | NULL |   100 |
+------+------+-------+
1 row in set (0.02 sec)
```

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
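The behaviour reported in this pull request can be pictured with a minimal sketch of a text-based load path. It only assumes what the description states, namely that the stream-load path reads the two-character token `\N` as NULL; `decode_text_field` and `run_example` are hypothetical names, and this is not the parser Doris uses nor the fix the PR applies.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical field decoder for a stream-load-like text path: the token \N
// (backslash followed by N) marks SQL NULL, anything else is kept verbatim.
std::optional<std::string> decode_text_field(std::string_view raw) {
    if (raw == "\\N") { // C++ literal for the two characters \ and N
        return std::nullopt;
    }
    return std::string(raw);
}

// Shows why a string value that happens to equal \N is lost on this path.
inline void run_example() {
    // Row 3 in the PR description carries the string \N as its name.
    assert(!decode_text_field("\\N").has_value()); // silently becomes NULL
    // Other values pass through unchanged.
    assert(decode_text_field("\"b\"").value() == "\"b\"");
}
```

A decoder of this shape cannot tell a NULL marker from a user string with the same characters, which matches the reported symptom: a row whose `name` is the string `\N` comes back as `NULL` once the insert is routed through the load path.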
Re: [PR] [fix](doc)add config for delete timeout job [doris]
github-actions[bot] commented on PR #27629: URL: https://github.com/apache/doris/pull/27629#issuecomment-1827510740 PR approved by anyone and no changes requested.
Re: [PR] [fix](nereids) temp partition is always pruned [doris]
github-actions[bot] commented on PR #27636: URL: https://github.com/apache/doris/pull/27636#issuecomment-1827512244 PR approved by at least one committer and no changes requested.
Re: [PR] [fix](nereids) temp partition is always pruned [doris]
github-actions[bot] commented on PR #27636: URL: https://github.com/apache/doris/pull/27636#issuecomment-1827512311 PR approved by anyone and no changes requested.
Re: [PR] [fix](insert) txn insert and group commit should write \N string corr… [doris]
mymeiyi commented on PR #27637: URL: https://github.com/apache/doris/pull/27637#issuecomment-1827512553 run buildall
Re: [PR] [fix](parquet)fix can not read parquet lz4 compress. [doris]
doris-robot commented on PR #27383: URL: https://github.com/apache/doris/pull/27383#issuecomment-1827513171

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.39 seconds
stream load tsv: 583 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.8 seconds inserted 1000 Rows, about 347K ops/s
storage size: 17099609669 Bytes
Re: [PR] [fix](Nereids) non-deterministic expression should not be constant (#27606) [doris]
xiaokang commented on PR #27631: URL: https://github.com/apache/doris/pull/27631#issuecomment-1827515083 run buildall
Re: [PR] [feat](window_function) support to secondary argument to ignore null values in first_value/last_value [doris]
doris-robot commented on PR #27623: URL: https://github.com/apache/doris/pull/27623#issuecomment-1827516227

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

```
Tpch sf100 test result on commit 2c928178414b7a8ae3caea905df74e1937f4, data reload: false

run tpch-sf100 query with default conf and session variables
q1    4913    4682    4648    4648
q2    359     166     158     158
q3    2066    1957    1882    1882
q4    1390    1254    1248    1248
q5    3968    3980    4033    3980
q6    259     133     132     132
q7    1442    885     886     885
q8    2794    2821    2760    2760
q9    9898    9713    9664    9664
q10   3470    3538    3562    3538
q11   381     255     242     242
q12   438     287     288     287
q13   4576    3812    3825    3812
q14   332     293     294     293
q15   578     533     514     514
q16   662     582     579     579
q17   1130    960     924     924
q18   7917    7524    7465    7465
q19   1688    1728    1679    1679
q20   574     299     310     299
q21   4467    4008    4044    4008
q22   474     377     393     377
Total cold run time: 53776 ms
Total hot run time: 49374 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1    4590    4580    4559    4559
q2    349     233     249     233
q3    4050    4040    4021    4021
q4    2727    2704    2707    2704
q5    9821    9811    9686    9686
q6    251     121     126     121
q7    3030    2510    2486    2486
q8    4471    4443    4454    4443
q9    13209   13135   13145   13135
q10   4095    4146    4156    4146
q11   750     654     726     654
q12   964     817     803     803
q13   4297    3607    3608    3607
q14   381     349     370     349
q15   566     511     534     511
q16   724     687     705     687
q17   3871    3880    3910    3880
q18   9564    9115    9061    9061
q19   1812    1781    1776    1776
q20   2420    2085    2026    2026
q21   8949    8844    8624    8624
q22   903     850     754     754
Total cold run time: 81794 ms
Total hot run time: 78266 ms
```
Re: [PR] [Improvement](materialized-view) forbidden mv rewriter when select stmt's from clause not have mv [doris]
BiteThet commented on PR #27638: URL: https://github.com/apache/doris/pull/27638#issuecomment-1827520957 run buildall
Re: [PR] [feature](Nereids): support `alter table t add constraint name constraints` [doris]
doris-robot commented on PR #27627: URL: https://github.com/apache/doris/pull/27627#issuecomment-1827521273

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.67 seconds
stream load tsv: 571 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 1000 Rows, about 343K ops/s
storage size: 17100352885 Bytes
[PR] [Improvement](materialized-view) forbidden mv rewriter when select stmt's from clause not have mv [doris]
BiteThet opened a new pull request, #27638: URL: https://github.com/apache/doris/pull/27638

## Proposed changes

Forbid the MV rewriter when the select statement's FROM clause does not contain an MV.

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
eldenmoon commented on code in PR #26749: URL: https://github.com/apache/doris/pull/26749#discussion_r1405920546

## be/src/olap/rowset/segment_creator.cpp: ##

@@ -280,6 +441,17 @@ Status SegmentCreator::add_block(const vectorized::Block* block) {
     size_t row_avg_size_in_bytes = std::max((size_t)1, block_size_in_bytes / block_row_num);
     size_t row_offset = 0;
+    if (_segment_flusher.need_buffering()) {
+        const static int MAX_BUFFER_SIZE = config::flushing_block_buffer_size_bytes; // 400M

Review Comment: done
Re: [PR] [fix](doc) spell error fixes for FE & BE Config documents [doris]
Nitin-Kashyap commented on code in PR #27619: URL: https://github.com/apache/doris/pull/27619#discussion_r1405921195

## docs/zh-CN/docs/admin-manual/config/be-config.md: ##

@@ -203,7 +203,7 @@ BE 重启后该配置将失效。如果想持久化修改结果,使用如下
 * 类型:int32
 * 描述:配置BE的所属于的集群id。
-- 该值通常由FE通过心跳向BE下发,不需要额外进行配置。当确认某BE属于某一个确定的Drois集群时,可以进行配置,同时需要修改数据目录下的cluster_id文件,使二者相同。
+- 该值通常由FE通过心跳向BE下发,不需要额外进行配置。当确认某BE属于某一个确定的Doris集群时,可以进行配置,同时需要修改数据目录下的cluster_id文件,使二者相同。

Review Comment: fixed

## docs/zh-CN/docs/admin-manual/config/be-config.md: ##

@@ -224,7 +224,7 @@ BE 重启后该配置将失效。如果想持久化修改结果,使用如下
 `es_scroll_keepalive`
-* 描述:es scroll Keeplive保持时间,默认5分钟
+* 描述:es scroll keep-alive保持时间,默认5分钟

Review Comment: fixed
Re: [PR] [fix](doc) spell error fixes for FE & BE Config documents [doris]
Nitin-Kashyap commented on code in PR #27619: URL: https://github.com/apache/doris/pull/27619#discussion_r1405921579

## docs/zh-CN/docs/admin-manual/config/be-config.md: ##

@@ -411,7 +411,7 @@ BE 重启后该配置将失效。如果想持久化修改结果,使用如下
 `enable_prefetch`
 * 类型:bool
-* 描述:当使用PartitionedHashTable进行聚合和join计算时,是否进行HashBuket的预取,推荐设置为true。
+* 描述:当使用PartitionedHashTable进行聚合和join计算时,是否进行HashBucket的预取,推荐设置为true。

Review Comment: fixed
Re: [PR] [pipelineX](bug) Fix timeout [doris]
Gabriel39 commented on PR #27596: URL: https://github.com/apache/doris/pull/27596#issuecomment-1827524225 run buildall
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
eldenmoon commented on code in PR #26749: URL: https://github.com/apache/doris/pull/26749#discussion_r1405925632

## be/src/olap/tablet_schema.cpp: ##

@@ -1015,9 +1145,15 @@ std::vector TabletSchema::get_indexes_for_column(int32_t col
     return indexes_for_column;
 }
-bool TabletSchema::has_inverted_index(int32_t col_unique_id) const {
+bool TabletSchema::has_inverted_index(const TabletColumn& col) const {
     // TODO use more efficient impl
+    int32_t col_unique_id = col.unique_id();
+    const std::string& suffix_path =
+            !col.path_info().empty() ? escape_for_path_name(col.path_info().get_path()) : "";
     for (size_t i = 0; i < _indexes.size(); i++) {
+        if (_indexes[i].get_escaped_index_suffix_path() != suffix_path) {

Review Comment: done
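The lookup in the quoted hunk matches an index on two keys, the column's unique id and its escaped sub-column path. The hunk is cut off at the path comparison, so the sketch below assumes that a mismatching path simply skips the index. `IndexInfo`, its fields, and `run_example` are hypothetical simplifications, not the real `TabletIndex` or `TabletSchema` API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified index descriptor for illustration only.
struct IndexInfo {
    int32_t col_unique_id;
    std::string escaped_suffix_path; // empty for an ordinary column
    bool is_inverted;
};

// Returns true if some inverted index covers the column identified by both
// its unique id and its escaped sub-column path.
bool has_inverted_index(const std::vector<IndexInfo>& indexes, int32_t col_unique_id,
                        const std::string& suffix_path) {
    for (const IndexInfo& idx : indexes) {
        if (idx.escaped_suffix_path != suffix_path) {
            continue; // assumed behaviour: an index on another sub-path never matches
        }
        if (idx.is_inverted && idx.col_unique_id == col_unique_id) {
            return true;
        }
    }
    return false;
}

// "sub_path" is a made-up placeholder for an escaped variant sub-column path.
inline void run_example() {
    const std::vector<IndexInfo> indexes = {{1, "", true}, {2, "sub_path", true}};
    assert(has_inverted_index(indexes, 1, ""));
    assert(!has_inverted_index(indexes, 1, "sub_path"));
    assert(has_inverted_index(indexes, 2, "sub_path"));
}
```

Keying on the path as well as the id is what would let several sub-columns that share one parent column id carry separate indexes, which appears to be the motivation for the signature change in the hunk.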
Re: [PR] [opt](nereids)adjust distribution cost for better choice of broadcast join and shuffle join [doris]
doris-robot commented on PR #27113: URL: https://github.com/apache/doris/pull/27113#issuecomment-1827529996

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

```
Tpch sf100 test result on commit 6360b29f8ad3ea60641a1c7cee869819f40ff9e0, data reload: false

run tpch-sf100 query with default conf and session variables
q1    4935    4644    4697    4644
q2    359     161     161     161
q3    2062    1939    1928    1928
q4    1399    1267    1255    1255
q5    3993    3955    4031    3955
q6    256     137     130     130
q7    1341    851     833     833
q8    2804    2821    2798    2798
q9    9820    9739    9628    9628
q10   3452    3542    3531    3531
q11   384     239     250     239
q12   437     290     301     290
q13   4552    3823    3817    3817
q14   333     284     304     284
q15   589     523     525     523
q16   495     452     459     452
q17   1147    979     966     966
q18   7983    7478    7476    7476
q19   1678    1694    1680    1680
q20   530     314     305     305
q21   4473    4045    4032    4032
q22   475     375     386     375
Total cold run time: 53497 ms
Total hot run time: 49302 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1    4596    4577    4600    4577
q2    340     217     244     217
q3    4036    4041    4035    4035
q4    2718    2736    2708    2708
q5    9595    9650    9672    9650
q6    246     127     125     125
q7    2996    2443    2493    2443
q8    4427    4414    4438    4414
q9    13248   13157   13260   13157
q10   4047    4161    4136    4136
q11   771     663     657     657
q12   975     823     801     801
q13   4289    3554    3560    3554
q14   384     348     360     348
q15   566     525     522     522
q16   586     572     587     572
q17   3935    3859    3816    3816
q18   9610    9161    9216    9161
q19   1787    1777    1782    1777
q20   2424    2069    2047    2047
q21   8901    8677    8709    8677
q22   876     791     795     791
Total cold run time: 81353 ms
Total hot run time: 78185 ms
```
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
amorynan commented on code in PR #26749: URL: https://github.com/apache/doris/pull/26749#discussion_r1405926099

## be/src/vec/data_types/serde/data_type_jsonb_serde.cpp: ##

@@ -199,7 +199,7 @@ void DataTypeJsonbSerDe::write_one_cell_to_json(const IColumn& column, rapidjson
     auto& data = assert_cast(column);
     const auto jsonb_val = data.get_data_at(row_num);
     if (jsonb_val.empty()) {
-        result.SetNull();
+        return;

Review Comment: Why not set null and then return?
Re: [PR] [Feature-Variant](Variant Type) support variant type query and index [doris]
eldenmoon commented on PR #26749: URL: https://github.com/apache/doris/pull/26749#issuecomment-1827529337 run buildall
Re: [PR] [fix](doc) spell error fixes for FE & BE Config documents [doris]
LemonLiTree commented on PR #27619: URL: https://github.com/apache/doris/pull/27619#issuecomment-1827536622 GOOD JOB