At GTC San Jose last month, NVIDIA's Joe Eaton (cc'd) presented the nvGraph <https://developer.nvidia.com/nvgraph> team's goals for accelerating in-memory graph processing and analytics. A major component of that work is advancing and standardizing a common, efficient representation for graphs that can support a broad range of use cases, from small graphs to large ones.
To that end, I'd like to kick off a discussion about native graph representations in Arrow. Joe's team has prepared a preliminary FlatBuffers schema for efficient columnar representations of the four most common graph formats. It includes embedded edge and vertex property tables, and is designed to be compatible with the existing Arrow column types.
My initial thought is that we could add an optional fifth Message type for graphs, similar to how Tensor Messages are presently implemented. I've pushed Joe's initial GraphSchema.fbs to this branch on my Arrow fork <https://github.com/trxcllnt/arrow/blob/78f6b6c6a5b9e4e7bf96f5bbc4dfed7528b1cca7/format/GraphSchema_Triples_Quads.fbs>.
From what I understand, the tables have been expanded into separate definitions for readability, and the final forms will be collapsed into one definition per distinct Graph type, parameterized by the sizes defined at the top of the schema. I also understand that the nvGraph team supports these layouts natively, which would let the community take advantage of high-performance GPU kernels very early on, and possibly align with libraries like Hornet <https://github.com/hornet-gt/hornetsnest> (previously cuStinger).
Cheers,
Paul