Hi Susmit,

For an example of what David Li is proposing, you can take a look at this project (https://github.com/voltrondata/sqlflite). It's a Flight SQL server (in C++ though) that can forward queries to either SQLite or DuckDB.
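To make the "forward the query" idea concrete, here is a minimal sketch, not code from that project. It only shows the server-side step of handing the client's SQL string to an embedded engine instead of parsing it yourself. Assumptions: Python's built-in sqlite3 stands in for the engine, and the `events` table and the `execute_forwarded_query` helper are hypothetical names. A real Flight SQL server would return Arrow record batches rather than Python tuples.

```python
import sqlite3


def execute_forwarded_query(conn, sql):
    """Hypothetical helper: pass the client's SQL text straight to the
    embedded engine and return (column names, rows). No custom parser."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    return columns, rows


# Stand-in data; in the real service the engine would read the stored data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (team TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 1), ("b", 2), ("a", 3)],
)

# The SQL string is what the client would send in its request.
columns, rows = execute_forwarded_query(
    conn,
    "SELECT team, SUM(n) AS total FROM events GROUP BY team ORDER BY team",
)
print(columns, rows)
```

The point of the design is that parsing, planning, and execution all stay inside the engine, so the server only has to move the query in and the results out.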
--
Felipe

On Wed, Oct 16, 2024 at 10:22 AM David Li <lidav...@apache.org> wrote:

> If your clients are sending full SQL queries to be executed, and you need
> to execute them against S3 on the server, why not consider something like
> Apache DataFusion or DuckDB to implement that part instead of building the
> query parser/engine yourself? (There are probably already examples of
> wrapping both these projects in Flight SQL floating around.)
>
> On Wed, Oct 16, 2024, at 21:38, Susmit Sarkar wrote:
> > Hi Community Members
> >
> > We are planning to build an Arrow flight server on top of data lying in
> > s3.
> >
> > *Detailed Use Case:*
> >
> > The requirement is we need to sync data from HDFS to a short term storage
> > S3 is our case. Basically a DataSync Service between cloud storages
> >
> > I have already built the service using Apache Pekko / Akka HDFS & S3
> > connectors, and data is in sync with HDFS & S3.
> >
> > Now comes the data reading part for end users. The data is stored in
> > Cloudian s3 (Cloudian managed S3 not AWS) short term storage in parquet.
> > We want to build a Data as a Service on top of the data lying in S3 and
> > expose API endpoints for clients to query. The data lying will be short
> > term, data may be of week or months (max 3 months) use-cases varies from
> > teams to teams.
> >
> > So we felt Apache Sql Flight Server will be the best suited for our use
> > case and the client should send a FlightDescriptor object wrapped with
> > the sql query.
> >
> > We parsed the query and query s3 using the aws s3 sdks, and return the
> > data, but the issue is we will end up building our own query parser,
> > which is a bigger task.
> >
> > Is there any other approach we can try out ?
> >
> > Thanks,
> >
> > Susmit