What might be the biggest factor affecting running time here is that
Drill's query execution is not fault tolerant, while Spark's is. The
philosophies differ: Drill's says, "when you're doing interactive
analytics and a node dies, killing your query as it goes, just run the
query again."
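
To make the "just run the query again" idea concrete, here is a minimal client-side sketch. It is only an illustration, not how Drill or Spark handle failures internally. The names run_with_retry, run_query, max_attempts and delay_seconds are hypothetical, and run_query stands in for whatever call submits the query in your client.

```python
import time


def run_with_retry(run_query, max_attempts=3, delay_seconds=0.0):
    """Call run_query(); if it raises, run the whole query again.

    run_query is a hypothetical zero-argument callable that submits the
    query and returns its result. No partial work is recovered: each
    attempt starts from scratch.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception as exc:  # e.g. the query was killed by a node failure
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    # Every attempt failed, so surface the last error to the caller.
    raise last_error
```

With this approach a query that normally succeeds pays no overhead for fault tolerance, and one that hits a failure costs a full rerun. That may be part of why a short interactive query can look fast on an engine without mid-query recovery.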
Hi Wes,
Thanks for the report! I like it (mostly because it's short and concise).
Thank you.
I know nothing about Drill and am curious about the similar execution times
and this sentence in the report: "Spark is the second fastest, that should
be reasonable, since both Spark and Drill have almost
I ran a simple test to measure query time for several SQL engines,
including MySQL, Hive, Drill and Spark. The report:
https://cloudcache.net/data/query-time-mysql-hive-drill-spark.pdf
It may have no special meaning; it's just for fun. :)
regards.