Hi all,
I am running into a problem where, once in a while, my job gives me the
following exception(s):
java.net.SocketTimeoutException: Accept timed out
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPl
Hi,
I use the HCatalog Streaming Mutation API to write data to a Hive transactional
table, and then I use SparkSQL to read data from the Hive transactional
table. I get the right result.
However, SparkSQL takes more time to read the Hive ORC bucketed transactional
table, because SparkSQL rea
Best Regards,
Vamshi T
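For reference, a minimal sketch of the read side under the setup described
above, assuming a Hive-enabled SparkSession; the database and table names are
placeholders:

import org.apache.spark.sql.SparkSession

// Hive-enabled session; assumes the cluster's hive-site.xml is on the classpath.
val spark = SparkSession.builder()
  .appName("read-hive-acid-table")
  .enableHiveSupport()
  .getOrCreate()

// "mydb.txn_table" is a placeholder for the transactional (ACID) table.
val df = spark.sql("SELECT * FROM mydb.txn_table")
df.show(10)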
In your program, persist the smaller table and use count to force it to
materialize. Then, in the Spark UI, go to the Storage tab. The size of your
table as Spark sees it should be displayed there. Out of curiosity, what
version/language of Spark are you using?
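A minimal sketch of that suggestion, assuming a SparkSession named spark and a
placeholder name for the smaller table:

import org.apache.spark.storage.StorageLevel

// "dim_small" is a placeholder for your smaller table.
val smallDf = spark.table("dim_small")

// Persist and force materialization with count(); the cached size then
// appears under the Storage tab in the Spark UI.
val cached = smallDf.persist(StorageLevel.MEMORY_AND_DISK)
cached.count()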
On Mon, Oct 15, 2018 at 11:53 AM Venka
We have a case where we interact with a Kerberized service and found a simple
workaround to distribute and make use of the driver’s Kerberos credential cache
file in the executors. Maybe some of the ideas there can be of help for this
case too? Our case is on Linux, though. Details:
https://git
I am trying to do a broadcast join on two tables. The size of the
smaller table will vary based on the parameters, but the size of the
larger table is close to 2 TB. What I have noticed is that if I don't
set spark.sql.autoBroadcastJoinThreshold to 10G, some of these
operations do a SortMergeJoi
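For context, a hedged sketch of the two knobs involved, assuming a SparkSession
named spark; the table and column names are placeholders:

import org.apache.spark.sql.functions.broadcast

val largeDf = spark.table("fact_large")  // ~2 TB table, placeholder name
val smallDf = spark.table("dim_small")   // smaller table, placeholder name

// Raise the auto-broadcast threshold (here 10 GB, as in the post above).
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 10L * 1024 * 1024 * 1024)

// Or hint the planner explicitly so the smaller side is broadcast
// regardless of its estimated size.
val joined = largeDf.join(broadcast(smallDf), Seq("join_key"))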
Spark only does Kerberos authentication on the driver. For executors, it
currently only supports Hadoop's delegation tokens for Kerberos.
To use something that does not support delegation tokens, you have to
manually manage the Kerberos login in your code that runs in executors,
which might be trick
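A rough sketch of what "manually manage the Kerberos login" in executor code
can look like, using Hadoop's UserGroupInformation; the principal, keytab path,
and input table are assumptions, and the keytab has to be shipped to executors
separately (e.g. via --files):

import org.apache.hadoop.security.UserGroupInformation

val df = spark.table("events")  // placeholder input

df.rdd.foreachPartition { rows =>
  // Log in inside the executor before touching the Kerberized service.
  UserGroupInformation.loginUserFromKeytab(
    "svc_account@EXAMPLE.COM", "svc_account.keytab")
  rows.foreach { row =>
    // call the Kerberized service with `row` here
  }
}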
Currently (in Spark 2.3.1) we cannot bucket DataFrames by nested columns, e.g.
df.write.bucketBy(10, "key.a").saveAsTable("junk")
will result in the following exception:
org.apache.spark.sql.AnalysisException: bucket column key.a is not defined in
table junk, defined table columns are: key, val
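One possible workaround (not from the original thread, just a sketch): promote
the nested field to a top-level column and bucket by that instead; the new
column and table names below are made up:

import org.apache.spark.sql.functions.col

df.withColumn("key_a", col("key.a"))
  .write
  .bucketBy(10, "key_a")
  .saveAsTable("junk_bucketed")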
How about
select unix_timestamp(timestamp2) - unix_timestamp(timestamp1)?
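For example, in the DataFrame API (column names assumed), the subtraction gives
the full difference in seconds, which can then be broken into days, hours, and
minutes:

import org.apache.spark.sql.functions.{col, unix_timestamp}

// unix_timestamp() returns seconds since the epoch, so the difference is in
// seconds and covers the full date and time, not just whole days.
val withDiff = df.withColumn("diff_seconds",
  unix_timestamp(col("timestamp2")) - unix_timestamp(col("timestamp1")))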
From: Paras Agarwal
Date: Monday, October 15, 2018 at 2:41 AM
To: John Zhuge
Cc: user, dev
Subject: Re: Timestamp Difference/operations
Thanks John,
Actually I need the full date and time difference, not just d
Hi Jungtaek,
Thanks, we thought that might be the issue but haven't tested it yet, as
building against an unreleased version of Spark is tough for us due to
network restrictions. We will try, though. I will report back if we find
anything.
Best regards,
Patrick
On Fri, Oct 12, 2018, 2:57 PM Jungtaek
Has anyone gotten Spark to write to SQL Server using Kerberos
authentication with Microsoft's JDBC driver? I'm having limited success,
though in theory it should work.
I'm using a YARN-mode 4-node Spark 2.3.0 cluster and trying to write a
simple table to SQL Server 2016. I can get it to work if I
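For reference, a hedged sketch of the kind of JDBC write being attempted, using
the Microsoft driver's JavaKerberos authentication scheme; the host, database,
and table names are placeholders, and the executors still need valid Kerberos
credentials (e.g. a keytab login, as discussed earlier in this digest):

// df is the DataFrame to write (placeholder).
val jdbcUrl = "jdbc:sqlserver://sqlhost.example.com:1433;databaseName=testdb;" +
  "integratedSecurity=true;authenticationScheme=JavaKerberos"

df.write
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "dbo.simple_table")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .mode("append")
  .save()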