[jira] [Commented] (IMPALA-13661) Support parallelism above JDBC tables for joins/aggregates

ASF subversion and git services (Jira) Wed, 12 Nov 2025 15:03:06 -0800


    [ 
https://issues.apache.org/jira/browse/IMPALA-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18037909#comment-18037909
 ]


ASF subversion and git services commented on IMPALA-13661:
----------------------------------------------------------

Commit 5f91838adaf55dd3a5818f5ad680254b292ebe1b in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5f91838ad ]

IMPALA-14545: Don't use absolute hdfs paths for JDBC table driver.url

After IMPALA-13661 merged, S3PlannerTest.testDataSourceTables
has been failing with an error trying to fetch the JDBC driver
for functional.jdbc_decimal_tbl. This particular table's
definition uses a path like 'hdfs://localhost:20500/test-warehouse/...'
which explicitly depends on HDFS rather than relying
on the default filesystem. Changing this to use a path like
'/test-warehouse/...' without the HDFS dependency fixes the
S3PlannerTest. This changes create-ext-data-source-table.sql
to a template using WAREHOUSE_LOCATION_PREFIX and replaces that
variable before executing it. This is important for Ozone, as
Ozone uses a WAREHOUSE_LOCATION_PREFIX set to the Ozone volume.
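
As a rough illustration of the template step described above (not the actual
Impala test tooling), the substitution could be sketched as below. The
placeholder syntax `${WAREHOUSE_LOCATION_PREFIX}`, the table name, the jar
file name, the helper `render_sql`, and the example prefix value are all
assumptions made for this sketch; only the variable name and the
'driver.url' property come from the commit message.

```python
from string import Template

# Hypothetical template line. The real create-ext-data-source-table.sql
# template and its placeholder syntax are not shown in the commit message.
TEMPLATE_SQL = (
    "ALTER TABLE example_jdbc_tbl SET TBLPROPERTIES "
    "('driver.url'='${WAREHOUSE_LOCATION_PREFIX}/test-warehouse/example-driver.jar');"
)


def render_sql(template_text, warehouse_location_prefix):
    """Replace the WAREHOUSE_LOCATION_PREFIX placeholder in a SQL template.

    safe_substitute leaves any other '$' sequences in the SQL untouched
    instead of raising an error.
    """
    return Template(template_text).safe_substitute(
        WAREHOUSE_LOCATION_PREFIX=warehouse_location_prefix
    )


# An empty prefix yields a path that relies on the default filesystem.
default_fs_sql = render_sql(TEMPLATE_SQL, "")

# A non-empty prefix (hypothetical value) prepends the volume to the path.
prefixed_sql = render_sql(TEMPLATE_SQL, "/hypothetical-volume")
```

With an empty prefix the rendered path carries no scheme or authority, so the
driver jar is resolved against whatever default filesystem the test run uses.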

Testing:
 - Ran S3 and regular HDFS fe tests

Change-Id: I3f2c86fcc6c1dee75d7d9a9be04468cb197ae13c
Reviewed-on: http://gerrit.cloudera.org:8080/23658
Reviewed-by: Wenzhe Zhou <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Support parallelism above JDBC tables for joins/aggregates
> ----------------------------------------------------------
>
>                 Key: IMPALA-13661
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13661
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>            Reporter: Wenzhe Zhou
>            Assignee: Pranav Yogi Lodha
>            Priority: Major
>              Labels: flaky
>
> Currently the Impala planner generates a single-node plan with one scanner 
> thread for queries accessing JDBC tables, since table stats are not 
> available for JDBC tables.
> Even though JDBC fetching has to be single-threaded, we could use exchange or 
> row batch threading to parallelize multiple JDBC connections. This would 
> improve performance significantly.
> We need to change the planner to create multiple scanner threads for queries 
> that access multiple JDBC tables with joins or aggregations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

