Hi All,

I'm investigating the possibility of using Arrow Flight with graph
databases, and exploring how to enable an Arrow Flight endpoint in Apache
TinkerPop Gremlin Server.

Today, graph databases use several incompatible protocols, which makes the
technology difficult to use and to adopt.
Common features of graph databases are:
1. Lack of a schema. Each vertex of the graph can have its own set of
properties, including properties with the same name but different types.
Metadata such as type and size is also passed with each value, which
increases the amount of data transferred. Some data types are not supported
by all languages.
2. The internal representation of data differs between implementations.
For data exchange we have used formats such as customized JSON and custom
binary, but we would like to get a performance gain from using Arrow Flight.
3. Differences in concepts such as transactions and sessions.
Conceptually these may differ from their implementation in SQL.
Gremlin Server does not natively support transactions, so we use the Neo4j
plugin.
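
To make point 1 concrete, here is a rough sketch of one way the missing
schema might be handled. It is an assumption for illustration only, not
what our prototype does: each property is split into one column per
(name, type) pair, so that every column is homogeneous and could later map
to a single typed Arrow field. The sample vertices and the
to_typed_columns helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: two vertices whose "age" property has different types.
vertices = [
    {"id": 1, "properties": {"name": "alice", "age": 30}},
    {"id": 2, "properties": {"name": "bob", "age": "unknown", "score": 1.5}},
]


def to_typed_columns(vertices):
    """Split properties into one column per (name, type) pair.

    Missing values are stored as None, so every column has exactly one
    slot per vertex and stays aligned with the vertex order.
    """
    n = len(vertices)
    columns = defaultdict(lambda: [None] * n)
    for row, vertex in enumerate(vertices):
        for name, value in vertex["properties"].items():
            # The Python type name stands in for a real type tag here.
            columns[(name, type(value).__name__)][row] = value
    return dict(columns)


columns = to_typed_columns(vertices)
```

With this layout the type is carried once per column rather than once per
value. Arrow's union types would be another option for mixed-type
properties; which one works better for graph data is an open question.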

We are currently working on a prototype in which we are trying to use Arrow
Flight as a transport for transmitting requests and data to Gremlin Server.
Serialization is still based on an internal format due to schema creation
complexity.

Ideas are welcome.

Regards, Valentyn
