Hi Sutou Kouhei/Team,

*[Background]*

I am working on Intel's gazelle_plugin
<https://github.com/oap-project/gazelle_plugin>,
a C++-based backend for Spark built on the Arrow compute engine.
During scans, i.e. when reading data from HDFS or cloud storage, we currently
use the cloud/HDFS APIs mentioned in your reply (quoted below).
We have now placed Alluxio Cache
<https://docs.alluxio.io/ee/user/stable/en/core-services/Caching.html>
in between for faster data access.

*[Problem]*

HDFS/Cloud ----> Alluxio ----> Arrow FS API ----> Arrow Parquet scan

*[Need help]*

I need help with the connection below, i.e. how to read from Alluxio
through the Arrow filesystem API:
[ Alluxio ----> Arrow FS API ]


On Tue, 6 Sept 2022 at 02:17, Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> Could you try our HDFS support?
>
> *
> https://arrow.apache.org/docs/cpp/dataset.html#reading-from-cloud-storage
> *
> https://arrow.apache.org/docs/cpp/api/filesystem.html#_CPPv4N5arrow2fs16HadoopFileSystemE
>
> (You're using Apache Arrow C++, right?)
>
>
> Thanks,
> --
> kou
>
> In <CAH63-+8ueLN_CqPRJqvAfgsxz_M2Br=syjeqe6jqcqvou_j...@mail.gmail.com>
>   "Alluxio cache read support" on Mon, 5 Sep 2022 16:58:57 +0530,
>   Manoj Kumar <man...@zettabolt.com> wrote:
>
> > Hi Team,
> >
> > Anyone know how to access HDFS/cloud FS backend by Alluxio via the arrow
> > filesystem ?
>
