NNhanptnk commented on code in PR #21105:
URL: https://github.com/apache/datafusion/pull/21105#discussion_r2982839584
##########
dev/wiki/apache-datafusion.wikitext:
##########
@@ -0,0 +1,113 @@
+<!--
+Draft Wikipedia article.
+-->
+
+{{Short description|Open-source query engine}}
+{{Draft topics|technology|software}}
+{{Infobox software
+| name = Apache DataFusion
+| developer = [[Apache Software Foundation]]
+| programming language = [[Rust (programming language)|Rust]]
+| genre = Query engine
+| license = [[Apache License]]
+| website = {{URL|https://datafusion.apache.org/}}
+}}
+
+'''Apache DataFusion''' is an [[open-source software|open-source]], embeddable
analytical query engine written in [[Rust (programming language)|Rust]], built
on [[Apache Arrow]]'s columnar memory format.<ref name="sigmod-paper">{{cite
journal |last1=Lamb |first1=Andrew |last2=Shen |first2=Yijie |last3=Heres
|first3=Daniel |last4=Chakraborty |first4=Jayjeet |last5=Kabak |first5=Mehmet
Ozan |last6=Hsieh |first6=Liang-Chi |last7=Sun |first7=Chao |title=Apache Arrow
DataFusion: A Fast, Embeddable, Modular Analytic Query Engine
|journal=Proceedings of the 2024 International Conference on Management of Data
|year=2024 |doi=10.1145/3626246.3653368}}</ref><ref name="intro-docs">{{cite
web |title=Introduction
|url=https://datafusion.apache.org/user-guide/introduction.html |website=Apache
DataFusion |publisher=Apache Software Foundation
|access-date=2026-03-22}}</ref> It provides [[SQL]] and DataFrame interfaces
for analytical query execution and is designed to be used as a library by
developers building databases, query engines, and analytical tools, rather than as a
standalone database server.<ref name="sigmod-paper" /><ref name="intro-docs" />
The project originated in 2017, was donated to the [[Apache Arrow]] project in
2019, and became a top-level project of the [[Apache Software Foundation]] in
2024.<ref name="donation-post">{{cite web |title=DataFusion: A Rust-native
Query Engine for Apache Arrow
|url=https://datafusion.apache.org/blog/2019/02/04/datafusion-donation/
|website=Apache DataFusion Blog |publisher=Apache Software Foundation
|date=2019-02-04 |access-date=2026-03-22}}</ref><ref name="asf-tlp">{{cite web
|title=Apache Software Foundation Announces New Top-Level Project Apache
DataFusion
|url=https://news.apache.org/foundation/entry/apache-software-foundation-announces-new-top-level-project-apache-datafusion
|website=The ASF Blog |publisher=Apache Software Foundation |date=2024-06-11
|access-date=2026-03-22}}</ref>
Review Comment:
> It provides [[SQL]] and DataFrame interfaces for analytical query
execution and is designed to be used as a library by developers building
databases, query engines, and analytical tools, rather than as a standalone
database server.<ref name="sigmod-paper" /><ref name="intro-docs" />
I think we can improve this sentence so that it better introduces DataFusion
and what makes it unique. Here is my suggestion:
Often described as the "LLVM for Databases," [Source 1] Apache DataFusion is
a modular, Arrow-native query engine library designed for embedding into custom
systems rather than operating as a monolithic standalone server [Source 2 and
3]. This high-performance Rust framework provides a composable foundation,
allowing developers to precisely extend query planning and vectorized execution
to meet unique architectural requirements. [Source 2 and 3]
Source 1: https://midas.bu.edu/assets/slides/andrew_lamb_slides.pdf (cc
@alamb)
Sources 2 and 3 (the first two references in the draft): <ref
name="sigmod-paper">{{cite journal |last1=Lamb |first1=Andrew |last2=Shen
|first2=Yijie |last3=Heres |first3=Daniel |last4=Chakraborty |first4=Jayjeet
|last5=Kabak |first5=Mehmet Ozan |last6=Hsieh |first6=Liang-Chi |last7=Sun
|first7=Chao |title=Apache Arrow DataFusion: A Fast, Embeddable, Modular
Analytic Query Engine |journal=Proceedings of the 2024 International Conference
on Management of Data |year=2024 |doi=10.1145/3626246.3653368}}</ref><ref
name="intro-docs">{{cite web |title=Introduction
|url=https://datafusion.apache.org/user-guide/introduction.html |website=Apache
DataFusion |publisher=Apache Software Foundation |access-date=2026-03-22}}</ref>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]