alamb commented on issue #11404: URL: https://github.com/apache/datafusion/issues/11404#issuecomment-2225183104
> was to add general-purpose functionality upstream and keep stream processing focused features downstream

> I think the consensus reached at the time of https://github.com/apache/datafusion/issues/4285 proved to be a quite reasonable one.

Yes, I entirely agree. The features that were added, substantially by @ozankabak @metesynnada @berkaysynnada and @mustafasrepo, have proven to be applicable to many different systems (not just stream processing).

I think this follows the basic philosophy we take at InfluxData as well (general purpose things in DataFusion, timeseries specific stuff downstream), though of course where exactly to draw the line always takes some judgement.

In general I think the basic rule of thumb should be "if more than one downstream system will use a feature, then consider putting it in DataFusion". If it is a feature that realistically only one will use, I think it is best left downstream.

Given the interest in streaming systems, I think extending / adding more support in DataFusion makes a lot of sense to me.

Here are some other potential related things I think would help:
* https://github.com/apache/datafusion/issues/9016 (making a motivating example might be really helpful)
* https://github.com/apache/datafusion/issues/8583
