Yeah, I agree, it should be an interface defined as part of Arrow. Not driver-specific.
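To make the idea concrete, here is a minimal sketch of what the consumer side could look like. All names here are made up for illustration (ArrowBatchIterator is not an actual Arrow or Avatica API); a real interface would hand back Arrow VectorSchemaRoot batches. A dynamic proxy stands in for an Arrow-native driver's ResultSet, since the point is just the unwrap/isWrapperFor contract:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical Arrow-defined unwrap target. In practice nextBatch() would
// return an Arrow VectorSchemaRoot rather than Object.
interface ArrowBatchIterator {
    Object nextBatch();
}

public class UnwrapSketch {
    // Build a stand-in ResultSet (via a dynamic proxy) whose unwrap() hands
    // back an ArrowBatchIterator, mimicking what an Arrow-native JDBC driver
    // could do to let callers bypass row-by-row JDBC access.
    public static ResultSet arrowCapableResultSet(ArrowBatchIterator batches) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "isWrapperFor":
                    return ArrowBatchIterator.class.equals(args[0]);
                case "unwrap":
                    if (ArrowBatchIterator.class.equals(args[0])) {
                        return batches;
                    }
                    throw new SQLException("not a wrapper for " + args[0]);
                default:
                    // Only the Wrapper contract matters for this sketch.
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                ResultSet.class.getClassLoader(),
                new Class<?>[] {ResultSet.class}, handler);
    }

    public static void main(String[] args) throws Exception {
        ResultSet rs = arrowCapableResultSet(() -> "batch-0");
        // Consumers probe for the Arrow fast path and fall back to plain
        // JDBC row access when the driver does not support it.
        if (rs.isWrapperFor(ArrowBatchIterator.class)) {
            System.out.println(rs.unwrap(ArrowBatchIterator.class).nextBatch());
        }
    }
}
```

The interface itself would live in the Arrow project, with each Arrow-based product implementing it in its own JDBC driver, which is exactly the split Laurent proposes below.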
> On Oct 31, 2017, at 1:37 PM, Laurent Goujon <laur...@dremio.com> wrote:
>
> I really like Julian's idea of unwrapping Arrow objects out of the JDBC
> ResultSet, but I wonder whether the unwrap class has to be specific to the
> driver, and whether an interface can be designed to be used by multiple
> drivers: for drivers based on Arrow, it means you could skip the
> serialization/deserialization from/to JDBC records entirely.
> If such an interface exists, I would propose adding it to the Arrow
> project, with Arrow products/projects in charge of adding support for it
> in their own JDBC drivers.
>
> Laurent
>
> On Tue, Oct 31, 2017 at 1:18 PM, Atul Dambalkar <atul.dambal...@xoriant.com>
> wrote:
>
>> Thanks for your thoughts, Julian. I think adding support for Arrow objects
>> to the Avatica remote driver (AvaticaToArrowConverter) can certainly be
>> taken up as a separate activity. And you are right, we will have to look
>> at each specific JDBC driver to really optimize it individually.
>>
>> I would be curious whether there are any further inputs/comments from
>> other dev folks on the JDBC adapter aspect.
>>
>> -Atul
>>
>> -----Original Message-----
>> From: Julian Hyde [mailto:jh...@apache.org]
>> Sent: Tuesday, October 31, 2017 11:12 AM
>> To: dev@arrow.apache.org
>> Subject: Re: JDBC Adapter for Apache-Arrow
>>
>> Sorry, I didn't read your email thoroughly enough. I was talking about the
>> inverse (JDBC reading from Arrow), whereas you are talking about Arrow
>> reading from JDBC. Your proposal makes perfect sense.
>>
>> JDBC is quite a chatty interface (a call for every column of every row,
>> plus an occasional call to find out whether values are null, and objects
>> such as strings and timestamps become Java heap objects), so for specific
>> JDBC drivers it may be possible to optimize. For example, the Avatica
>> remote driver receives row sets in an RPC response in protobuf format. It
>> may be useful if the JDBC driver were able to expose a direct path from
>> protobuf to Arrow.
>> "ResultSet.unwrap(AvaticaToArrowConverter.class)" might be one way to
>> achieve this.
>>
>> Julian
>>
>>> On Oct 31, 2017, at 10:41 AM, Atul Dambalkar <atul.dambal...@xoriant.com>
>>> wrote:
>>>
>>> Hi Julian,
>>>
>>> Thanks for your response. If I understand correctly (looking at other
>>> adapters), a Calcite-Arrow adapter would provide a SQL front end for
>>> in-memory Arrow data objects/structures. So from that perspective, are
>>> you suggesting building the Calcite-Arrow adapter?
>>>
>>> In this case, what we are proposing is a mechanism for upstream apps to
>>> get/create Arrow objects/structures from a relational database. This
>>> would also mean converting row-wise data from a SQL database to columnar
>>> Arrow data structures. The utility can make use of JDBC's metadata
>>> features to figure out the underlying DB schema and define the Arrow
>>> columnar schema. Also, the underlying database in this case would be any
>>> relational DB and hence persisted to disk, but the Arrow objects, being
>>> in-memory, can be ephemeral.
>>>
>>> Please correct me if I am missing anything.
>>>
>>> -Atul
>>>
>>> -----Original Message-----
>>> From: Julian Hyde [mailto:jhyde.apa...@gmail.com]
>>> Sent: Monday, October 30, 2017 7:50 PM
>>> To: dev@arrow.apache.org
>>> Subject: Re: JDBC Adapter for Apache-Arrow
>>>
>>> How about writing an Arrow adapter for Calcite? I think it amounts to
>>> the same thing - you would inherit Calcite's SQL parser and Avatica JDBC
>>> stack.
>>>
>>> Would this database be ephemeral (i.e. would the data go away when you
>>> close the connection)? If not, how would you know where to load the data
>>> from?
>>>
>>> Julian
>>>
>>>> On Oct 30, 2017, at 6:17 PM, Atul Dambalkar <atul.dambal...@xoriant.com>
>>>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I wanted to open up a conversation here regarding developing a
>>>> Java-based JDBC adapter for Apache Arrow.
>>>> I had a preliminary discussion with Wes McKinney and Siddharth Teotia
>>>> on this a couple of weeks ago.
>>>>
>>>> Basically, at a high level (over-simplified), this adapter/API will
>>>> allow upstream apps to query RDBMS data over JDBC and get the JDBC
>>>> objects converted to Arrow in-memory (JVM) objects/structures. The
>>>> upstream utility can then work with Arrow objects/structures with the
>>>> usual performance benefits. The utility will be very similar to the C++
>>>> implementation of "Convert a vector of row-wise data into an Arrow
>>>> table" as described here -
>>>> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html.
>>>>
>>>> How useful would this adapter be, and which other Apache projects would
>>>> benefit from it? Based on the usability, we can open a JIRA for this
>>>> activity and start looking into the implementation details.
>>>>
>>>> Regards,
>>>> -Atul Dambalkar
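To illustrate the core transformation Atul's proposal describes, below is a minimal Java sketch of the row-to-column pivot. It is deliberately simplified: a real adapter would derive the schema via JDBC's ResultSetMetaData and populate typed Arrow vectors (e.g. IntVector, VarCharVector) instead of plain Java lists.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows just the row-to-column pivot at the heart of a
// JDBC-to-Arrow adapter, using Java lists in place of Arrow vectors.
public class RowToColumn {
    public static List<List<Object>> transpose(List<Object[]> rows, int columnCount) {
        List<List<Object>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(new ArrayList<>());
        }
        // JDBC hands us one row at a time; appending each cell to its column
        // produces the columnar layout Arrow vectors want.
        for (Object[] row : rows) {
            for (int c = 0; c < columnCount; c++) {
                columns.get(c).add(row[c]); // null cells would map to Arrow null slots
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
                new Object[] {1, "alice"},
                new Object[] {2, "bob"});
        System.out.println(transpose(rows, 2)); // prints [[1, 2], [alice, bob]]
    }
}
```

In the real adapter the `rows` input would come from iterating a ResultSet (`rs.next()` / `rs.getObject(i)`), which is where the per-cell chattiness Julian mentions above comes from; batching into Arrow structures amortizes that cost for downstream consumers.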