JeremyXin commented on code in PR #9446:
URL: https://github.com/apache/seatunnel/pull/9446#discussion_r2155055935


##########
seatunnel-connectors-v2/connector-clickhouse/src/main/java/org/apache/seatunnel/connectors/seatunnel/clickhouse/util/ClickhouseProxy.java:
##########
@@ -429,6 +438,100 @@ public void dropDatabase(String database, boolean ignoreIfNotExists) {
         executeSql(ClickhouseCatalogUtil.INSTANCE.getDropDatabaseSql(database, ignoreIfNotExists));
     }
 
+    public List<ClickhousePart> getPartList(
+            String database, String table, Shard shard, List<String> partitionList) {
+
+        String sql =
+                String.format(
+                        "select name from system.parts where database = '%s' and table = '%s'",
+                        database, table);
+
+        if (partitionList != null && !partitionList.isEmpty()) {
+            StringJoiner joiner = new StringJoiner("', '", "('", "')");
+            partitionList.forEach(joiner::add);
+
+            sql += " and partition in " + joiner.toString();
+        }
+
+        sql += " group by name";
+
+        log.debug("get part sql: {}", sql);
+
+        try (ClickHouseResponse response = clickhouseRequest.query(sql).executeAndWait()) {
+            Iterable<ClickHouseRecord> records = response.records();
+            return StreamSupport.stream(records.spliterator(), false)
+                    .map(r -> new ClickhousePart(r.getValue(0).asString(), database, table, shard))
+                    .collect(Collectors.toList());
+        } catch (ClickHouseException e) {
+            throw new ClickhouseConnectorException(
+                    ClickhouseConnectorErrorCode.GET_PART_ERROR,
+                    "Cannot get part name from system.parts",
+                    e);
+        }
+    }
+
+    public List<SeaTunnelRow> getDataFromSplit(
+            ClickhousePart part,
+            SeaTunnelRowType seaTunnelRowType,
+            ClickhouseSourceTable clickhouseSourceTable,
+            int offset) {
+
+        long st = System.currentTimeMillis();
+        List<SeaTunnelRow> seaTunnelRowList = new ArrayList<>();
+        TablePath tablePath = TablePath.of(part.getDatabase(), part.getTable());
+
+        String whereClause = String.format("_part = '%s'", part.getName());
+        if (StringUtils.isNotEmpty(clickhouseSourceTable.getFilterQuery())) {
+            whereClause += " AND (" + clickhouseSourceTable.getFilterQuery() + ")";
+        }
+
+        String sql =
+                String.format(
+                        "select * from %s.%s where %s limit %d, %d",

Review Comment:
   After reading the ClickHouse documentation, I found that ClickHouse supports
the `LIMIT ... WITH TIES` modifier. It ensures that rows whose `ORDER BY` values
equal those of the last row in a batch are returned in that same batch. We could
use the table's `ORDER BY` field (its sorting key) as the sort key when querying
a part. Would this solve the problem?
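
To make the idea concrete, here is a rough sketch of how the batch query might look. It is only an illustration, not the PR's code: the class `WithTiesQuerySketch`, the helpers `buildBatchSql` and `nextOffset`, and the `sortingKey` parameter are hypothetical names. It assumes the sorting key is already known as a comma-separated column list. Since `WITH TIES` can return more rows than the requested batch size, I think the next offset would have to advance by the number of rows actually returned.

```java
class WithTiesQuerySketch {

    // Hypothetical helper: builds the query for one batch of a single part.
    // sortingKey is assumed to be the table's ORDER BY expression, e.g. "id".
    static String buildBatchSql(
            String database,
            String table,
            String partName,
            String sortingKey,
            long offset,
            int batchSize) {
        return String.format(
                "select * from %s.%s where _part = '%s' order by %s limit %d, %d with ties",
                database, table, partName, sortingKey, offset, batchSize);
    }

    // WITH TIES may return more than batchSize rows, so the next batch
    // starts after the rows that were actually read.
    static long nextOffset(long offset, int rowsReturned) {
        return offset + rowsReturned;
    }

    public static void main(String[] args) {
        System.out.println(buildBatchSql("db", "events", "all_1_1_0", "id", 0, 1000));
        System.out.println(nextOffset(0, 1003));
    }
}
```

If the sorting key has many duplicate values, a single batch could become much larger than the batch size, so that case may need a separate check.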



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
