Hello everyone,

Some background: my name is Michael and I work at FactSet, which you may have heard of if you use Arrow, because one of our architects gave a talk on using Arrow and Dremio:
https://hello.dremio.com/eliminate-data-transfer-bottlenecks-with-apache-arrow-flight.html?utm_medium=social-free&utm_source=linkedin&utm_term=na&utm_content=na&utm_campaign=eliminate-data-transfer-bottlenecks-with-apache-arrow-flight

His team has decided to use Arrow as a tabular data interchange format. Other teams are doing other things, and we are now working on standardizing a single tabular data interchange format across the company. We have our own open-source, column-based schema defined in protobuf: https://github.com/factset/stachschema

We looked into Apache Arrow a few years ago but decided not to use it, because it was not mature enough at the time and we had two specific requirements:

1) We need this data not just for analytics but for rendering as well, and rendering requires much richer information, such as the type of the data and the relationships between data, i.e. grouping.
2) We need SDKs that support TypeScript/JavaScript, in both the browser and Node, and that support both creating and consuming Arrow data.

Now that Apache Arrow is more mature and stable (i.e. the schema and SDKs are post-1.x), we are looking into it again:

1. We are thinking of defining specific metadata, similar to what we do for STACH, that lets us describe rendering-specific information, e.g. adding a metadata entry called isHidden to a Field schema to denote whether the data column should be rendered.
2. It seems there is a well-developed JavaScript SDK that we can use. I am still reading the source code and the Observable articles to truly understand how it works.
   * I read in one of the issues that the JS library might be out of sync, so does anyone know how actively that repo is maintained?
   * If there is work that needs to be done, I think we would be able to help, given some help getting started with understanding that repo.

If possible, we would like to continue chatting about the above ideas, learn more about whether Apache Arrow is right for the job, and find out whether other people are already discussing or using Arrow for rendering in addition to analytics.

To clarify what I mean regarding existing rendering technologies: I know things like Falcon and Perspective exist, but those seem to be for basic rendering of simple tables.
I mean to create a superset of Arrow by defining metadata that allows for complex nested headers and nested rows, something like the image below. You can then imagine even more data attached, such as descriptions of the data and its relationships to other data on the page. For example, imagine the dataset contains a `personId` column that is set not to be rendered. This personId could then be used to fetch more information in another API call, e.g. to render a tooltip with some bio information.

In short, rendered tables require a lot more information than just the data. Does it make sense to build this on top of Arrow?

[cid:image001.png@01D70C15.94EDD4E0]

-Thanks
Michael
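
P.S. To make the isHidden idea a bit more concrete, here is a rough TypeScript sketch of how a renderer might use it. It is only an illustration under assumptions: FieldLike is a hypothetical stand-in type (it does not use the Arrow JS library), and the isHidden key and its "true"/"false" encoding are made up for this example and are not part of the Arrow spec. The only thing it relies on is that Arrow field metadata is a set of string key/value pairs.

```typescript
// Hypothetical stand-in for an Arrow field: a name plus string-to-string
// metadata. This is NOT the Arrow JS Field class, just a minimal model of it.
interface FieldLike {
  name: string;
  metadata: Map<string, string>;
}

// Hypothetical metadata key for rendering; not defined by the Arrow spec.
const IS_HIDDEN_KEY = "isHidden";

// Metadata values are strings, so the flag is encoded as "true" / "false".
// A field without the key is treated as visible.
function isHidden(field: FieldLike): boolean {
  return field.metadata.get(IS_HIDDEN_KEY) === "true";
}

// Names of the columns a renderer should actually draw, in schema order.
function visibleColumns(fields: FieldLike[]): string[] {
  return fields.filter((f) => !isHidden(f)).map((f) => f.name);
}

const fields: FieldLike[] = [
  // Kept in the data for follow-up API calls (e.g. tooltips), never rendered.
  { name: "personId", metadata: new Map<string, string>([[IS_HIDDEN_KEY, "true"]]) },
  { name: "name", metadata: new Map<string, string>() },
  { name: "price", metadata: new Map<string, string>([[IS_HIDDEN_KEY, "false"]]) },
];

// Logs the visible column names: name and price.
console.log(visibleColumns(fields));
```

The hidden personId column stays in the dataset, so the page can still read it to make the extra API call; only the renderer skips it.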