This topic came up again, and I have started a PR [1] to see if we can
record / build a larger consensus.

Andrew

[1] https://github.com/apache/arrow-datafusion/pull/1104

On Tue, Jun 22, 2021 at 1:25 PM Andrew Lamb <al...@influxdata.com> wrote:

> Thank you for bringing this topic up.
>
> Expanding on what you suggested, here is another take on a vision:
>
> DataFusion's vision is to become *the de facto query engine* of choice for
> new analytic applications, by leveraging the unique features of Rust and
> Apache Arrow to provide:
> 1. Best-in-class query performance for a single node
> 2. A feature-complete declarative query interface via (most of)
> PostgreSQL
> 3. A feature-rich procedural interface for creating and running execution
> plans
> 4. High-performance extensibility at every layer
>
> The current [2] readme describes *what* DataFusion is, but does not really
> give a vision going forward. A few months ago we tried a "what is everyone
> thinking of working on" type approach [1] to create a roadmap. While that
> was insightful, I agree having a single unified (even if vague) goal would
> be very helpful.
>
> I would welcome other thoughts as well: if there appears to be some
> consensus then we can make a PR to add the proposal to the DataFusion readme.
>
> @Andy Grove <andygrov...@gmail.com> do you have any thoughts?
>
> Andrew
>
>
> [1]
> https://docs.google.com/document/d/1qspsOM_dknOxJKdGvKbC1aoVoO0M3i6x1CIo58mmN2Y/edit?userstoinvite=jonas.hansen%40airbus.com&ts=604a2a22&actionButton=1
> [2] https://github.com/apache/arrow-datafusion#readme
>
> On Tue, Jun 22, 2021 at 3:18 AM Jiayu Liu <ji...@hey.com.invalid> wrote:
>
>> Hi,
>>
>> This is regarding my question about DataFusion's vision and roadmap.
>>
>> As a new contributor, I wonder what vision and roadmap most of the
>> contributors can align on, or have already aligned on.
>>
>> Maybe due to my lack of prior context I might have missed such a
>> discussion, or maybe this is intentionally left open so that
>> different contributors and companies can have their own features to be
>> compatible. But I still believe in the value of having one, and it can
>> somehow be shown in the README.md or contributing guidelines, so that
>> users and the community can see what to expect from the project and what
>> to contribute to.
>>
>> By "vision" I mean something that is necessarily vague and serves as an
>> overarching goal, e.g. "leveraging Rust and Arrow to become the most
>> performant SQL-compatible query engine on a single node", or "fully
>> compatible with (most of) PostgreSQL syntax and pluggable in most of the
>> web-scale analytical engines".
>>
>> I believe having this in place can help push the project forward,
>> esp. in cases of trade-offs, e.g. sticking to the newest Rust release vs.
>> providing LTS, or incorporating as many features as possible (e.g.
>> recursive CTE? BSON support? query materializations?) vs. keeping
>> binary size small and moving everything else into plugins.
>>