JeremyXin commented on code in PR #9446:
URL: https://github.com/apache/seatunnel/pull/9446#discussion_r2154391814


##########
seatunnel-connectors-v2/connector-clickhouse/src/main/java/org/apache/seatunnel/connectors/seatunnel/clickhouse/util/ClickhouseProxy.java:
##########
@@ -429,6 +438,100 @@ public void dropDatabase(String database, boolean ignoreIfNotExists) {
         executeSql(ClickhouseCatalogUtil.INSTANCE.getDropDatabaseSql(database, ignoreIfNotExists));
     }
 
+    public List<ClickhousePart> getPartList(
+            String database, String table, Shard shard, List<String> partitionList) {
+
+        String sql =
+                String.format(
+                        "select name from system.parts where database = '%s' and table = '%s'",
+                        database, table);
+
+        if (partitionList != null && !partitionList.isEmpty()) {
+            StringJoiner joiner = new StringJoiner("', '", "('", "')");
+            partitionList.forEach(joiner::add);
+
+            sql += " and partition in " + joiner.toString();
+        }
+
+        sql += " group by name";
+
+        log.debug("get part sql: {}", sql);
+
+        try (ClickHouseResponse response = clickhouseRequest.query(sql).executeAndWait()) {
+            Iterable<ClickHouseRecord> records = response.records();
+            return StreamSupport.stream(records.spliterator(), false)
+                    .map(r -> new ClickhousePart(r.getValue(0).asString(), database, table, shard))
+                    .collect(Collectors.toList());
+        } catch (ClickHouseException e) {
+            throw new ClickhouseConnectorException(
+                    ClickhouseConnectorErrorCode.GET_PART_ERROR,
+                    "Cannot get part name from system.parts",
+                    e);
+        }
+    }
+
+    public List<SeaTunnelRow> getDataFromSplit(
+            ClickhousePart part,
+            SeaTunnelRowType seaTunnelRowType,
+            ClickhouseSourceTable clickhouseSourceTable,
+            int offset) {
+
+        long st = System.currentTimeMillis();
+        List<SeaTunnelRow> seaTunnelRowList = new ArrayList<>();
+        TablePath tablePath = TablePath.of(part.getDatabase(), part.getTable());
+
+        String whereClause = String.format("_part = '%s'", part.getName());
+        if (StringUtils.isNotEmpty(clickhouseSourceTable.getFilterQuery())) {
+            whereClause += " AND (" + clickhouseSourceTable.getFilterQuery() + ")";
+        }
+
+        String sql =
+                String.format(
+                        "select * from %s.%s where %s limit %d, %d",

Review Comment:
   This implementation reads each part in batches so that parallel readers do 
not load a large amount of data at once. Each `ClickhousePart` object has an 
`offset` attribute that records how much of the part has already been read, 
so successive batches of the same part are read in order.
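
   To make the batching idea concrete, here is a minimal sketch of how an offset carried on the part could drive `limit offset, count` queries. It is not the PR's actual code: `PartBatchSketch`, `Part`, `fetchBatch`, `readPart`, and the "stop on a short batch" rule are all hypothetical names and assumptions for illustration, and the in-memory list only stands in for a real ClickHouse query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these are NOT the connector's real classes.
class PartBatchSketch {

    /** Stand-in for ClickhousePart: a part name plus the number of rows already read. */
    static final class Part {
        final String name;
        long offset;

        Part(String name) {
            this.name = name;
        }
    }

    /** Builds the query for the next batch, mirroring the "limit %d, %d" form in the diff. */
    static String buildBatchSql(String database, String table, Part part, int batchSize) {
        return String.format(
                "select * from %s.%s where _part = '%s' limit %d, %d",
                database, table, part.name, part.offset, batchSize);
    }

    /** Simulates "LIMIT offset, batchSize" against an in-memory list of rows. */
    static List<String> fetchBatch(List<String> partRows, long offset, int batchSize) {
        int from = (int) Math.min(offset, partRows.size());
        int to = (int) Math.min(offset + batchSize, partRows.size());
        return new ArrayList<>(partRows.subList(from, to));
    }

    /** Reads a part batch by batch, advancing the part's offset after each batch. */
    static List<String> readPart(Part part, List<String> partRows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> result = new ArrayList<>();
        while (true) {
            List<String> batch = fetchBatch(partRows, part.offset, batchSize);
            result.addAll(batch);
            part.offset += batch.size();
            if (batch.size() < batchSize) {
                break; // assumption: a short batch means the part is exhausted
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Part part = new Part("all_1_1_0");
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println(buildBatchSql("db", "events", part, 2));
        System.out.println(readPart(part, rows, 2));
        System.out.println(buildBatchSql("db", "events", part, 2));
    }
}
```

   Because the offset lives on the part object, whichever reader picks the part up next can resume from where the previous batch ended instead of restarting from row 0.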



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
