I am wondering if this could be an opportunity to add support in Arrow for building high-performance in-memory row stores for low-latency, high-throughput key-based queries. In other words, we could design the in-memory record format with efficient RDMA reads as one of the goals. Consider two data structures in memory: a hash table and a row store comprising records in the Arrow row format. The hash table points into the row store, and a remote client can read both structures without interrupting the CPU on the server. This client-server code path could also be incorporated into Arrow Flight.
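To make the idea concrete, here is a rough sketch in C++ (nothing below is an existing Arrow API; the type and function names are made up for illustration). It assumes an open-addressing hash table whose slots hold byte offsets, rather than pointers, into a contiguous buffer of fixed-width records, so both structures are position independent and could in principle be fetched by a client with one-sided RDMA reads:

#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-width 16-byte record, similar to the "foo" example in the quoted
// thread below (fields reordered so natural alignment packs to 16 bytes).
struct FooRecord {
  double  b;
  int32_t a;
  uint8_t c;
  uint8_t pad[3];  // explicit padding to keep the width fixed
};
static_assert(sizeof(FooRecord) == 16, "record must stay fixed width");

// Slots store offsets, not pointers, so the layout is position independent.
struct HashSlot {
  uint64_t key;         // 0 is reserved as the "empty slot" marker
  uint64_t row_offset;  // byte offset of the record in the row-store buffer
};

class RowStore {
 public:
  explicit RowStore(std::size_t capacity) : slots_(capacity) {}

  // Insert a record; assumes key != 0 and that the table never fills up.
  void Put(uint64_t key, const FooRecord& rec) {
    uint64_t offset = rows_.size() * sizeof(FooRecord);
    rows_.push_back(rec);
    std::size_t i = key % slots_.size();
    while (slots_[i].key != 0) i = (i + 1) % slots_.size();  // linear probing
    slots_[i] = HashSlot{key, offset};
  }

  // Local lookup. An RDMA client knowing the base addresses of both buffers
  // could instead issue two one-sided reads (slot array, then row buffer)
  // without involving the server CPU.
  const FooRecord* Get(uint64_t key) const {
    std::size_t i = key % slots_.size();
    while (slots_[i].key != 0) {
      if (slots_[i].key == key) {
        const auto* base = reinterpret_cast<const uint8_t*>(rows_.data());
        return reinterpret_cast<const FooRecord*>(base + slots_[i].row_offset);
      }
      i = (i + 1) % slots_.size();
    }
    return nullptr;  // not found
  }

 private:
  std::vector<HashSlot> slots_;   // offset-based, pointer-free hash table
  std::vector<FooRecord> rows_;   // contiguous fixed-width row store
};

With a layout like this, resolving a key remotely takes two RDMA reads: one against the slot array and one against the row buffer. A real implementation would of course place both buffers in registered memory regions and handle collisions and resizing more carefully.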
On Tue, Jun 26, 2018 at 7:49 PM, Jacques Nadeau <jacq...@apache.org> wrote:
> I'm not sure this makes sense as an external stable api. I definitely think
> it is useful as an internal representation for use within a particular
> algorithm. I also think that can be informed by the particular algorithm
> that you're working on.
>
> We definitely had this requirement in Dremio and came up with an internal
> representation that we are happy with for the use in hash tables. I'll try
> to dig up the design docs we had around this but the actual
> pivoting/unpivoting code that we developed can be seen here: [1], [2].
>
> Our main model is two blocks: a fixed width block and a variable width
> block (with the fixed width block also carrying address & length of the
> variable data). Fixed width is randomly accessible and variable width is
> randomly accessible through fixed width.
>
> [1] https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java
> [2] https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Unpivots.java
>
> On Tue, Jun 26, 2018 at 10:20 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Antoine,
> >
> > On Sun, Jun 24, 2018 at 1:06 PM, Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > Hi Wes,
> > >
> > > On 24/06/2018 at 08:24, Wes McKinney wrote:
> > >>
> > >> If this sounds interesting to the community, I could help to kickstart
> > >> a design process which would likely take a significant amount of time.
> > >> The requirements could be complex (i.e. we might want to support
> > >> variable-size record fields while also providing random access
> > >> guarantees).
> > >
> > > What do you call "variable-sized" here? A scheme where the length of a
> > > record's field is determined by the value of another field in the same
> > > record?
> >
> > As an example, here is a fixed size record
> >
> > record foo {
> >   a: int32;
> >   b: float64;
> >   c: uint8;
> > }
> >
> > With padding suppose this is 16 bytes per record; so if we have a
> > column of these, then randomly accessing any value in any record is
> > simple.
> >
> > Here's a variable-length record:
> >
> > record bar {
> >   a: string;
> >   b: list<int32>;
> > }
> >
> > What I've seen done to represent this in memory is to have a fixed
> > size record followed by a sidecar containing the variable-length data,
> > so the fixed size portion might look something like
> >
> > a_offset: int32;
> > a_length: int32;
> > b_offset: int32;
> > b_length: int32;
> >
> > So from this, you can do random access into the record. If you wanted
> > to do random access on a _column_ of such records, it is similar to
> > our current variable-length Binary type. So it might be that the
> > underlying Arrow memory layout would be FixedSizeBinary for fixed-size
> > records and variable Binary for variable-size records.
> >
> > - Wes
> >
> > >
> > > Regards
> > >
> > > Antoine.
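To illustrate the layout Wes describes above, here is a small sketch along the same lines (BarFixed and BarColumn are made-up names for illustration, not part of Arrow): a fixed-width block holding offset/length pairs per field, plus a variable-width sidecar holding the payload bytes, so record i is reachable at i * sizeof(BarFixed) and each variable field needs only one more indirection:

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Fixed-width portion of the "bar" record from the thread:
//   record bar { a: string; b: list<int32>; }
struct BarFixed {
  int32_t a_offset;   // byte offset of the string payload in the sidecar
  int32_t a_length;   // string length in bytes
  int32_t b_offset;   // byte offset of the int32 list payload in the sidecar
  int32_t b_length;   // number of int32 elements in the list
};

// A column of variable-length records: a fixed block plus a variable block,
// in the spirit of the two-block pivot model mentioned above.
struct BarColumn {
  std::vector<BarFixed> fixed;    // randomly accessible, 16 bytes per record
  std::vector<uint8_t>  sidecar;  // append-only variable-length payloads

  void Append(const std::string& a, const std::vector<int32_t>& b) {
    BarFixed rec;
    rec.a_offset = static_cast<int32_t>(sidecar.size());
    rec.a_length = static_cast<int32_t>(a.size());
    sidecar.insert(sidecar.end(), a.begin(), a.end());
    rec.b_offset = static_cast<int32_t>(sidecar.size());
    rec.b_length = static_cast<int32_t>(b.size());
    const auto* bytes = reinterpret_cast<const uint8_t*>(b.data());
    sidecar.insert(sidecar.end(), bytes, bytes + b.size() * sizeof(int32_t));
    fixed.push_back(rec);
  }

  // Random access to field "a" of record i without scanning earlier records.
  std::string GetA(std::size_t i) const {
    const BarFixed& rec = fixed[i];
    return std::string(
        reinterpret_cast<const char*>(sidecar.data()) + rec.a_offset,
        static_cast<std::size_t>(rec.a_length));
  }
};

Random access over a column of such records then looks much like the FixedSizeBinary-plus-variable-Binary combination Wes mentions, and the fixed/variable split mirrors the two-block model Jacques describes.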