NNhanptnk commented on code in PR #21105:
URL: https://github.com/apache/datafusion/pull/21105#discussion_r2982839584


##########
dev/wiki/apache-datafusion.wikitext:
##########
@@ -0,0 +1,113 @@
+<!--
+Draft Wikipedia article.
+-->
+
+{{Short description|Open-source query engine}}
+{{Draft topics|technology|software}}
+{{Infobox software
+| name = Apache DataFusion
+| developer = [[Apache Software Foundation]]
+| programming language = [[Rust (programming language)|Rust]]
+| genre = Query engine
+| license = [[Apache License]]
+| website = {{URL|https://datafusion.apache.org/}}
+}}
+
+'''Apache DataFusion''' is an [[open-source software|open-source]], embeddable 
analytical query engine written in [[Rust (programming language)|Rust]], built 
on [[Apache Arrow]]'s columnar memory format.<ref name="sigmod-paper">{{cite 
journal |last1=Lamb |first1=Andrew |last2=Shen |first2=Yijie |last3=Heres 
|first3=Daniel |last4=Chakraborty |first4=Jayjeet |last5=Kabak |first5=Mehmet 
Ozan |last6=Hsieh |first6=Liang-Chi |last7=Sun |first7=Chao |title=Apache Arrow 
DataFusion: A Fast, Embeddable, Modular Analytic Query Engine 
|journal=Proceedings of the 2024 International Conference on Management of Data 
|year=2024 |doi=10.1145/3626246.3653368}}</ref><ref name="intro-docs">{{cite 
web |title=Introduction 
|url=https://datafusion.apache.org/user-guide/introduction.html |website=Apache 
DataFusion |publisher=Apache Software Foundation 
|access-date=2026-03-22}}</ref> It provides [[SQL]] and DataFrame interfaces 
for analytical query execution and is designed to be used as a library by 
developers building databases, query engines, and analytical tools, rather than as a 
standalone database server.<ref name="sigmod-paper" /><ref name="intro-docs" /> 
The project originated in 2017, was donated to the [[Apache Arrow]] project in 
2019, and became a top-level project of the [[Apache Software Foundation]] in 
2024.<ref name="donation-post">{{cite web |title=DataFusion: A Rust-native 
Query Engine for Apache Arrow 
|url=https://datafusion.apache.org/blog/2019/02/04/datafusion-donation/ 
|website=Apache DataFusion Blog |publisher=Apache Software Foundation 
|date=2019-02-04 |access-date=2026-03-22}}</ref><ref name="asf-tlp">{{cite web 
|title=Apache Software Foundation Announces New Top-Level Project Apache 
DataFusion 
|url=https://news.apache.org/foundation/entry/apache-software-foundation-announces-new-top-level-project-apache-datafusion
 |website=The ASF Blog |publisher=Apache Software Foundation |date=2024-06-11 
|access-date=2026-03-22}}</ref>

Review Comment:
   > It provides [[SQL]] and DataFrame interfaces for analytical query 
execution and is designed to be used as a library by developers building 
databases, query engines, and analytical tools, rather than as a standalone 
database server.<ref name="sigmod-paper" /><ref name="intro-docs" />
   
   I think this sentence could do a better job of introducing DataFusion and 
what makes it distinctive. Here is a suggested rewrite:
   
   Often described as the "LLVM for Databases," [Source 1] Apache DataFusion is 
a modular, Arrow-native query engine library designed for embedding into custom 
systems rather than operating as a monolithic standalone server [Source 2 and 
3]. This high-performance Rust framework provides a composable foundation, 
allowing developers to precisely extend query planning and vectorized execution 
to meet unique architectural requirements. [Source 2 and 3]
   
   Source 1: https://midas.bu.edu/assets/slides/andrew_lamb_slides.pdf (cc @alamb)
   
   Source 2 and 3 (the first two references in the draft): <ref 
name="sigmod-paper">{{cite journal |last1=Lamb |first1=Andrew |last2=Shen 
|first2=Yijie |last3=Heres |first3=Daniel |last4=Chakraborty |first4=Jayjeet 
|last5=Kabak |first5=Mehmet Ozan |last6=Hsieh |first6=Liang-Chi |last7=Sun 
|first7=Chao |title=Apache Arrow DataFusion: A Fast, Embeddable, Modular 
Analytic Query Engine |journal=Proceedings of the 2024 International Conference 
on Management of Data |year=2024 |doi=10.1145/3626246.3653368}}</ref><ref 
name="intro-docs">{{cite web |title=Introduction 
|url=https://datafusion.apache.org/user-guide/introduction.html |website=Apache 
DataFusion |publisher=Apache Software Foundation |access-date=2026-03-22}}</ref>



