GitHub user ZTE-EBASE added a comment to the discussion: Extend the gpfdist 
tool to support SFTP/HDFS protocols for high-performance multi-source data 
ingestion

Yes, our implementation relies on libssh together with the Arrow/Parquet 
libraries. It was built for a specific business scenario that involves 
large-scale data, so we use a parallel strategy to achieve high-performance 
data ingestion and querying.
For the HDFS protocol, we have implemented an FDW (Foreign Data Wrapper). 
However, this required a significant amount of code modification, including 
changes to the kernel. If there is a need, we can provide this implementation 
later.

GitHub link: 
https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13638452

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]


