Thanks, Steve, for answering in detail. I had the same impression as Chandan
from that line as well: it went against what I knew, since the rename operation
itself is atomic in HDFS, and I didn't imagine it was there to deal with object
stores.
I learned a lot about object stores from your answer. Thanks again.
Jungtaek
Thanks a lot Steve and Jungtaek for your answers.
Steve,
You explained it really well and in depth.
I understand that the old implementation was not correct for
object stores like S3, and that the new implementation will address that. And for
better performance we should choose a Direct Write base
I'd say that it was important to be compatible with Hive in the past, but
that's becoming less important over time. Spark is well established with
Hadoop users and I think the focus moving forward should be to make Spark
more predictable as a SQL engine for people coming from more traditional
datab
I think it has been an important “selling point” that Spark is “mostly
compatible” with Hive DDL.
I have seen a lot of teams suffer from switching between the Presto and Hive
dialects.
So one question I have is: are we at the point of switching from Hive
compatibility to, say, ANSI SQL?
Perhaps a more c
On 2 Oct 2018, at 04:44, tigerquoll <tigerqu...@outlook.com> wrote:
Hi Steve,
I think that passing a Kerberos keytab around is one of those bad ideas that
it is entirely appropriate to re-question every single time you come across it.
It has already been used in Spark when interacting with
I agree with Ryan: a "standard" and more widely adopted syntax is usually a
good idea, possibly with some slight improvements like "bulk deletion" of
columns (especially because both the syntax and the semantics are clear),
rather than staying with Hive syntax at any cost.
I am personally following t