Hi Viktor,

Could you clarify a bit: do you need to load the data just once/periodically, or do you also want Ignite to write changes back to S3 on updates?

If loading is all you need, then create a custom Java app/class that pulls the data from S3 and streams it into Ignite via IgniteDataStreamer (the fastest loading technique) [1].
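For illustration, here is a minimal sketch of that approach, assuming the S3 objects are line-delimited text read with the AWS SDK for Java (v1); the bucket name, cache name, and key scheme are all illustrative, not part of any Ignite API:

```java
// Sketch: pull objects from an S3 bucket and stream their lines into an
// Ignite cache with IgniteDataStreamer. Assumes ignite-core and the AWS SDK
// for Java v1 on the classpath; "my-bucket" and "myCache" are placeholders.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class S3Loader {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("myCache"); // streamer target must exist

            try (IgniteDataStreamer<String, String> streamer =
                     ignite.dataStreamer("myCache")) {
                for (S3ObjectSummary summary :
                         s3.listObjects("my-bucket").getObjectSummaries()) {
                    try (S3Object obj = s3.getObject("my-bucket", summary.getKey());
                         BufferedReader reader = new BufferedReader(
                             new InputStreamReader(obj.getObjectContent()))) {
                        String line;
                        long lineNo = 0;
                        // One cache entry per line; adjust the key scheme to
                        // whatever is natural for your data.
                        while ((line = reader.readLine()) != null)
                            streamer.addData(summary.getKey() + ":" + lineNo++, line);
                    }
                }
            } // closing the streamer flushes any remaining buffered entries
        }
    }
}
```

For Parquet you would parse each object with a Parquet reader instead of reading lines, but the streaming part stays the same: batch entries through `addData` and let the streamer buffer and route them to the right nodes.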
If write-back is needed, then a 3rd-party store (CacheStore) is the best way to go. GridGain Web Console (a free tool) [2] comes with a model-import feature [3] that should be able to read Drill's schema via JDBC and produce an Ignite configuration.

[1] https://apacheignite.readme.io/docs/data-loading#ignitedatastreamer
[2] https://www.gridgain.com/docs/web-console/latest/web-console-getting-started
[3] https://apacheignite-tools.readme.io/docs/automatic-rdbms-integration

-
Denis

On Thu, Oct 17, 2019 at 6:45 AM viktor <[email protected]> wrote:
> Hi,
>
> I'm currently working on an R&D project where we would like to retrieve
> data files [Parquet, JSON, ...] from S3 buckets and load the data into
> Apache Ignite for machine-learning purposes with TensorFlow.
>
> With the removal of IGFS in the next release, I'm having trouble finding
> a solution.
> What would be an optimal way to make the data available to Apache Ignite?
>
> I'm currently looking into using the 3rd-party store features of Ignite
> to integrate with Apache Drill, as it is able to query these S3 bucket
> data files.
> However, at a glance it doesn't look like a great solution, since every
> table structure has to be defined manually in the Ignite configuration,
> or semi-automatically with the agent.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
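To make the CacheStore route concrete, here is a hedged skeleton of a store backed by Drill's JDBC driver. Ignite calls `load` on cache misses (read-through) and `write`/`delete` on updates (write-through). The JDBC URL, table path, and column names are illustrative assumptions, and note that Drill's JDBC interface does not support per-row INSERT/UPDATE, so the write side would have to target S3 directly rather than go through Drill:

```java
// Skeleton sketch of a 3rd-party store over Apache Drill's JDBC driver.
// Assumes ignite-core and the Drill JDBC driver on the classpath; the URL,
// query, and key/value types (Long -> String) are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.cache.Cache;
import javax.cache.integration.CacheLoaderException;

import org.apache.ignite.cache.store.CacheStoreAdapter;

public class DrillCacheStore extends CacheStoreAdapter<Long, String> {
    // Illustrative Drill JDBC URL; point it at your own drillbit.
    private static final String URL = "jdbc:drill:drillbit=localhost";

    /** Read-through: invoked by Ignite on a cache miss. */
    @Override public String load(Long key) {
        try (Connection conn = DriverManager.getConnection(URL);
             PreparedStatement st = conn.prepareStatement(
                 // Hypothetical query against a JSON file in an S3-backed
                 // Drill storage plugin named "s3".
                 "SELECT val FROM s3.`data.json` WHERE id = ?")) {
            st.setLong(1, key);
            try (ResultSet rs = st.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
        catch (Exception e) {
            throw new CacheLoaderException("Failed to load key: " + key, e);
        }
    }

    /** Write-through: Drill can't update rows over JDBC, so a real
     *  implementation would write to S3 directly (e.g. via the AWS SDK). */
    @Override public void write(Cache.Entry<? extends Long, ? extends String> e) {
        throw new UnsupportedOperationException("Write-back not implemented");
    }

    @Override public void delete(Object key) {
        throw new UnsupportedOperationException("Delete not implemented");
    }
}
```

The store is then plugged into the cache configuration via `CacheConfiguration.setCacheStoreFactory(...)` with read-through/write-through enabled, which is exactly the part the Web Console's model import can generate for you.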