I also share the concerns about "writing twice", which hurts performance a
lot. What's worse, the final write may not be scalable, such as writing the
staging table into the final table.
If the sink itself doesn't support global transactions, but only local
transactions (e.g. Kafka), using staging tables se
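A minimal sketch of the staging-table pattern under discussion, using a hypothetical helper (not Spark's actual sink API), to make the "write twice" cost concrete:

import org.apache.spark.sql.DataFrame

def writeWithStaging(df: DataFrame, stagingTable: String, finalTable: String): Unit = {
  // First write: parallel and safe to retry, because the staging table is throwaway.
  df.write.mode("overwrite").saveAsTable(stagingTable)
  // Second write: runs once after all tasks succeed. This is the extra pass over
  // the full data set that may not scale, e.g. INSERT ... SELECT into the final table.
  df.sparkSession.sql(s"INSERT INTO $finalTable SELECT * FROM $stagingTable")
  df.sparkSession.sql(s"DROP TABLE $stagingTable")
}

For a sink such as Kafka that only offers local transactions, there is no final step like the INSERT ... SELECT above that can be made atomic for the whole job, so the staging approach does not carry over cleanly.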
Thanks Deepak! I'll try it.
On Thu, Dec 5, 2019 at 4:13 PM Deepak Vohra wrote:
> The Guava issue could be fixed in one of two ways:
>
> - Use Hadoop v3
> - Create an uber jar; refer to
>
> https://gite.lirmm.fr/yagoubi/spark/commit/c9f743957fa963bc1dbed7a44a346ffce1a45cf2
> Managing Java dependencies ...
Hi Deepak,
For Spark, I am using the master branch and I just updated the code yesterday.
For Guava, I actually deleted my old versions from the local Maven repo.
The build process of Spark automatically downloaded a few versions. The
oldest version is 14.0.1.
But even in 14.0.1 (
https://guava.dev/
Hi Sean,
Oh, sorry. I just went back to the Spark home directory. However, the same
error came up.
D:\apache\spark\bin>cd ..
D:\apache\spark>bin\spark-shell
Exception in thread "main" java.lang.NoSuchMethodError:
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
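One way to narrow this down is to check which Guava jar is actually on the classpath. A diagnostic sketch (not a fix) that can be run in any Scala REPL or small program launched with the same jars:

// Print the jar that Preconditions was loaded from. The failing descriptor
// checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V is
// checkArgument(boolean, String, Object); older Guava releases such as 14.0.1
// ship only the varargs form, so classes compiled against a newer Guava fail
// at runtime with NoSuchMethodError when an old Guava is picked up first.
val guavaJar = classOf[com.google.common.base.Preconditions]
  .getProtectionDomain
  .getCodeSource
  .getLocation
println(guavaJar)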
Hi Deepak,
Yes, I did use Maven. I even had the build pass successfully when setting the
Hadoop version to 3.2. Please see my response to Sean's email.
Unfortunately, I only have Docker Toolbox, as my Windows machine doesn't have
Microsoft Hyper-V. So I want to avoid using Docker for major work if
possible.
No, the build works fine, at least certainly on test machines. As I
say, try running from the actual Spark home, not bin/. You are still
running spark-shell there.
On Thu, Dec 5, 2019 at 4:37 PM Ping Liu wrote:
>
> Hi Sean,
>
> Thanks for your response!
>
> Sorry, I didn't mention that "build/mvn
Hi Sean,
Thanks for your response!
Sorry, I didn't mention that "build/mvn ..." doesn't work. So I did go to the
Spark home directory and ran mvn from there. Following are my build and run
results. The source code was just updated yesterday. I guess the POM should
specify a newer Guava library so
> Anyway, there were a *lot* of people on the call today and we didn't get
a chance to dig into the nitty-gritty details of these points. I would like
to know what others think of these (not-fleshed-out) proposals, how they do
(or do not) work with disaggregated shuffle implementations in the wild,
What was the build error? You didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and they appear to pass, etc.
On Thu, Dec 5, 2019 at 3:28 PM Ping Liu wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux. But
Hello,
I understand Spark is preferably built on Linux, but I have a Windows machine
with a slow VirtualBox VM for Linux. So I wish to be able to build and run
Spark code in a Windows environment.
Unfortunately,
# Apache Hadoop 2.6.X
./build/mvn -Pyarn -DskipTests clean package
# Apache Hadoop 2.7
It’s that topic again. 😄
We have almost 500 open PRs. A good chunk of them are more than a year old.
The oldest open PR dates to summer 2015.
https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Acreated-asc
GitHub has an Action for closing stale PRs.
https://github.com/marketplace/a
+1 for the proposal. The current behavior is confusing.
We also came up with another case that we should consider while
implementing a ViewCatalog: an unresolved relation in a permanent view
(from a view catalog) should never resolve to a temporary table. If I have a
view `pview` defined as `select *
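A small sketch of the scenario with hypothetical names, illustrating the intended resolution rule (run in spark-shell, where `spark` is the predefined SparkSession):

// Permanent view defined against a catalog table named t.
spark.sql("CREATE TABLE t (id INT) USING parquet")
spark.sql("CREATE VIEW pview AS SELECT * FROM t")

// Later, a session registers a temporary view with the same name.
spark.range(5).createOrReplaceTempView("t")

// Under the proposed rule, pview keeps resolving t to the catalog table it was
// defined against, not to the session-local temporary view created above.
spark.sql("SELECT * FROM pview").show()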
Hi all,
For better SQL standard support, I recently opened a pull request,
https://github.com/apache/spark/pull/26537, to support the real type as float and
the numeric type as decimal. We have researched this a bit and discussed it among
several contributors/committers. I am sending this email to the dev list to welcome ...
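As a rough illustration of what the proposal described above would mean for DDL (hypothetical table name, run in spark-shell; see the linked PR for the actual mapping and edge cases):

// Under the proposal, REAL would map to Spark's float type and NUMERIC to decimal.
spark.sql("CREATE TABLE type_demo (a REAL, b NUMERIC(10, 2)) USING parquet")
spark.table("type_demo").printSchema()
// Expected schema under the proposal (assumption, not verified against the PR):
//  |-- a: float (nullable = true)
//  |-- b: decimal(10,2) (nullable = true)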