Hi Viktor,

Could you clarify a bit: do you need to load the data just once/periodically, or do you also want Ignite to write changes back to S3 on updates?

If loading is all you need, then create a custom Java app/class that pulls the data from S3 and streams it into Ignite via IgniteDataStreamer (the fastest loading technique) [1].
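For illustration, here is a minimal sketch of that approach, assuming the S3 objects are line-delimited text read with the AWS SDK for Java (v1); the bucket name, cache name, and key scheme are all illustrative, not part of any Ignite API:

```java
// Sketch: pull objects from an S3 bucket and stream their lines into an
// Ignite cache with IgniteDataStreamer. Assumes ignite-core and the AWS SDK
// for Java v1 on the classpath; "my-bucket" and "myCache" are placeholders.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class S3Loader {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("myCache"); // streamer target must exist

            try (IgniteDataStreamer<String, String> streamer =
                     ignite.dataStreamer("myCache")) {
                for (S3ObjectSummary summary :
                         s3.listObjects("my-bucket").getObjectSummaries()) {
                    try (S3Object obj = s3.getObject("my-bucket", summary.getKey());
                         BufferedReader reader = new BufferedReader(
                             new InputStreamReader(obj.getObjectContent()))) {
                        String line;
                        long lineNo = 0;
                        // One cache entry per line; adjust the key scheme to
                        // whatever is natural for your data.
                        while ((line = reader.readLine()) != null)
                            streamer.addData(summary.getKey() + ":" + lineNo++, line);
                    }
                }
            } // closing the streamer flushes any remaining buffered entries
        }
    }
}
```

For Parquet you would parse each object with a Parquet reader instead of reading lines, but the streaming part stays the same: batch entries through `addData` and let the streamer buffer and route them to the right nodes.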
If write-back is needed, then a 3rd-party store (CacheStore) is the best way to go. GridGain Web Console (a free tool) [2] comes with a model-import feature [3] that should be able to read Drill's schema via JDBC and produce an Ignite configuration.

[1] https://apacheignite.readme.io/docs/data-loading#ignitedatastreamer
[2] https://www.gridgain.com/docs/web-console/latest/web-console-getting-started
[3] https://apacheignite-tools.readme.io/docs/automatic-rdbms-integration

-
Denis

On Thu, Oct 17, 2019 at 6:45 AM viktor <[email protected]> wrote:
> Hi,
>
> I'm currently working on an R&D project where we would like to retrieve
> data files [Parquet, JSON, ...] from S3 buckets and load the data into
> Apache Ignite for machine-learning purposes with TensorFlow.
>
> With the removal of IGFS in the next release, I'm having trouble finding
> a solution.
> What would be an optimal way to make the data available to Apache Ignite?
>
> I'm currently looking into using the 3rd-party store features of Ignite
> to integrate with Apache Drill, as it is able to query these S3 bucket
> data files.
> However, at a glance it doesn't look like a great solution, since every
> table structure has to be defined manually in the Ignite configuration,
> or semi-automatically with the agent.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
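To make the CacheStore route concrete, here is a hedged skeleton of a store backed by Drill's JDBC driver. Ignite calls `load` on cache misses (read-through) and `write`/`delete` on updates (write-through). The JDBC URL, table path, and column names are illustrative assumptions, and note that Drill's JDBC interface does not support per-row INSERT/UPDATE, so the write side would have to target S3 directly rather than go through Drill:

```java
// Skeleton sketch of a 3rd-party store over Apache Drill's JDBC driver.
// Assumes ignite-core and the Drill JDBC driver on the classpath; the URL,
// query, and key/value types (Long -> String) are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.cache.Cache;
import javax.cache.integration.CacheLoaderException;

import org.apache.ignite.cache.store.CacheStoreAdapter;

public class DrillCacheStore extends CacheStoreAdapter<Long, String> {
    // Illustrative Drill JDBC URL; point it at your own drillbit.
    private static final String URL = "jdbc:drill:drillbit=localhost";

    /** Read-through: invoked by Ignite on a cache miss. */
    @Override public String load(Long key) {
        try (Connection conn = DriverManager.getConnection(URL);
             PreparedStatement st = conn.prepareStatement(
                 // Hypothetical query against a JSON file in an S3-backed
                 // Drill storage plugin named "s3".
                 "SELECT val FROM s3.`data.json` WHERE id = ?")) {
            st.setLong(1, key);
            try (ResultSet rs = st.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
        catch (Exception e) {
            throw new CacheLoaderException("Failed to load key: " + key, e);
        }
    }

    /** Write-through: Drill can't update rows over JDBC, so a real
     *  implementation would write to S3 directly (e.g. via the AWS SDK). */
    @Override public void write(Cache.Entry<? extends Long, ? extends String> e) {
        throw new UnsupportedOperationException("Write-back not implemented");
    }

    @Override public void delete(Object key) {
        throw new UnsupportedOperationException("Delete not implemented");
    }
}
```

The store is then plugged into the cache configuration via `CacheConfiguration.setCacheStoreFactory(...)` with read-through/write-through enabled, which is exactly the part the Web Console's model import can generate for you.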