Hi Gavin,

> Just curious whether there is any interest/intention of possibly making a
> higher level API around the basic FlightSQL one?


IIUC, I don't think this is an issue with Flight itself, but rather one of
generic conversion of data into Arrow.  I don't think anyone is actively
working on something like this, but creating a new contrib module that maps
from Java objects (just like the existing JDBC and Avro ones) seems
worthwhile.  If you are interested in contributing something like this, I
think a short design doc would be worthwhile.

> VectorSchemaRoot root = df.toVectorSchemaRoot();
> listener.setVectorSchemaRoot(root);
> listener.sendVectorSchemaRootContents();


A small nit.  Generally, the preferred pattern is to create one
VectorSchemaRoot and reload it for each batch.  So an API like
"df.loadVectorSchemaRoot(root)" probably makes more sense, but we can iterate
on this.  This wasn't commonly understood when some of the other contrib
modules were developed.
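
The reload pattern described above can be sketched roughly as follows.  This
is plain Java with no Arrow dependency: ColumnBatch is a hypothetical stand-in
for VectorSchemaRoot, and RowFrame with its loadInto method is a hypothetical
stand-in for the proposed DataFrame and "df.loadVectorSchemaRoot(root)".
None of these names exist in Arrow.  The point is only that the batch
container is created once and refilled for each chunk of rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a VectorSchemaRoot: one growable list per field. */
final class ColumnBatch {
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount;

    ColumnBatch(List<String> fieldNames) {
        for (String name : fieldNames) {
            columns.put(name, new ArrayList<>());
        }
    }

    /** Drops the current values but keeps the column containers for reuse. */
    void clear() {
        for (List<Object> column : columns.values()) {
            column.clear();
        }
        rowCount = 0;
    }

    /** Appends one row, writing null for any field the row does not contain. */
    void appendRow(Map<String, Object> row) {
        for (Map.Entry<String, List<Object>> entry : columns.entrySet()) {
            entry.getValue().add(row.get(entry.getKey()));
        }
        rowCount++;
    }

    List<Object> column(String name) {
        return columns.get(name);
    }

    int rowCount() {
        return rowCount;
    }
}

/** Hypothetical row-based frame that loads its rows into an existing batch. */
final class RowFrame {
    private final List<Map<String, Object>> rows = new ArrayList<>();

    void addRow(Map<String, Object> row) {
        rows.add(row);
    }

    /** Refills the batch with up to maxRows rows from offset; returns rows loaded. */
    int loadInto(ColumnBatch batch, int offset, int maxRows) {
        batch.clear();
        int end = Math.min(rows.size(), offset + maxRows);
        for (int i = offset; i < end; i++) {
            batch.appendRow(rows.get(i));
        }
        return Math.max(0, end - offset);
    }
}

final class ReloadDemo {
    public static void main(String[] args) {
        RowFrame df = new RowFrame();
        df.addRow(Map.of("id", 1, "name", "Person 1"));
        df.addRow(Map.of("id", 2, "name", "Person 2"));
        df.addRow(Map.of("id", 3, "name", "Person 3"));

        // The batch is created once and reloaded for every chunk of rows.
        ColumnBatch batch = new ColumnBatch(List.of("id", "name"));
        int offset = 0;
        int loaded;
        while ((loaded = df.loadInto(batch, offset, 2)) > 0) {
            // A Flight producer would send the batch contents at this point.
            System.out.println(batch.column("id") + " " + batch.column("name"));
            offset += loaded;
        }
    }
}
```

With a real VectorSchemaRoot the same shape would apply: the caller owns the
root, and the frame only fills it, so a producer can stream many batches
without creating a new root for each one.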

Cheers,
Micah


On Sat, Mar 12, 2022 at 12:15 PM Gavin Ray <ray.gavi...@gmail.com> wrote:

> While trying to implement and introduce the idea of adopting FlightSQL, the
> largest challenge was the API itself.
>
> I know it's meant to be low-level. But I found that most of the development
> time went into code that converts between row-based data (i.e.
> Map<String, Object>) with Java types and columnar data with Arrow types.
>
> I'm likely in the minority position here -- I know that Arrow and FlightSQL
> users are largely looking at transferring large volumes of data and
> servicing OLAP-type workloads.
> But the thing that excites me most about FlightSQL isn't its performance
> (always nice to have), but that it's a language-agnostic standard for data
> access.
>
> That has broad implications -- for all kinds of data-access workloads and
> business use cases.
>
> The challenge is that in trying to advocate for it, when presenting a
> proof-of-concept,
> rather than what a developer might expect to see, something like:
>
> // FlightSQL handler code
> List<Map<String, Object>> results = ....;
> results.add(Map.of("id", 1, "name", "Person 1"));
> return results;
>
> A significant portion of the code is in Arrow-specific implementation
> details:
> creating a VectorSchemaRoot, FieldVector, de-serializing the results on the
> client, etc.
>
> Just curious whether there is any interest/intention of possibly making a
> higher level API around the basic FlightSQL one?
> Maybe something closer to the traditional notion of a row-based "DataFrame"
> or "Table", like:
>
> DataFrame df = new DataFrame();
> df.addColumn("id", ArrowTypes.Int);
> df.addColumn("name", ArrowTypes.VarChar);
> df.addRow(Map.of("id", 1, "name", "Person 1"));
> VectorSchemaRoot root = df.toVectorSchemaRoot();
> listener.setVectorSchemaRoot(root);
> listener.sendVectorSchemaRootContents();
>
