Thanks Manu for your response.
I already checked the logs and didn't see anything that could help me
understand the issue.
The weirder thing is that I have a small CI cluster which runs on a single
NameNode, and there I do see the Spark2 job in the UI. I'm still not sure
whether it may be related to the NameNode HA, i t
I accidentally got it working, though I don't thoroughly understand why (so
far as I know, the point is to configure things so that the executor refers to
the conf file after it is copied to the executors' working dir). Basically it's
a combination of the parameters --conf, --files, and --driver-class-path,
instead of any s
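A sketch of what that combination can look like (the paths, file name, and
main class here are hypothetical, and the extraClassPath value is my
assumption about how the executors pick up the copied file):

spark-submit \
  --class com.example.Main \
  --files /local/path/app.conf \
  --conf spark.executor.extraClassPath=./ \
  --driver-class-path /local/path \
  app.jar

--files ships app.conf into each executor's working directory, the
extraClassPath entry puts that directory on the executor classpath, and
--driver-class-path does the same for the driver side.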
Hi
I am getting the below exception when I run spark-submit on a Linux machine.
Can someone suggest a quick solution, with commands?
Driver stacktrace:
- Job 0 failed: count at DailyGainersAndLosersPublisher.scala:145, took
5.749450 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in
Hi,
I've added table level security using spark extensions based on the ongoing
work proposed for ranger in RANGER-2128. Following the same logic, you
could mask columns and work on the logical plan, but not filter or
skip rows, as those are not present in these hooks.
The only difficulty I
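For context, here is a minimal sketch of how such an extension gets wired up
via SparkSessionExtensions (ColumnMaskingRule and MaskingExtensions are
hypothetical names, and the actual masking logic is elided):

import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule: rewrites the resolved logical plan, e.g. wrapping
// sensitive columns in a masking expression (the logic is omitted here).
case class ColumnMaskingRule(spark: SparkSession) extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

class MaskingExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit =
    extensions.injectResolutionRule(ColumnMaskingRule)
}

It is enabled with --conf spark.sql.extensions=MaskingExtensions (use the
fully qualified class name).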
Hi Venkata,
On a quick glance, it looks like a file-related issue more so than an
executor issue. If the logs are not that important, I would clear the
/tmp/spark-events/ directory, assign a suitable permission (e.g., chmod
755) to it, and rerun the application.
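For example (assuming the existing event logs are disposable):
rm -rf /tmp/spark-events/*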
chmod 755 /tmp/spark-events/
Than
Hi,
Any help on this?
Thanks,
Hi,
I am executing an insert into a Hive table using SparkSession in Java. When I
execute a select via beeline, I don't see the inserted data. And when I
insert data using beeline, I don't see it via my program using SparkSession.
It looks like there are different Hive instances running.
How can I po
You probably need to take a look at your hive-site.xml and see what the
location is for the Hive Metastore. As for beeline, you can explicitly point
it at a HiveServer2 instance by passing in the JDBC URL when you launch the
client; e.g. beeline -u "jdbc:hive2://example.com:10000" (10000 is the
default HiveServer2 port).
Try ta
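In hive-site.xml, the property to check is usually hive.metastore.uris; if
the Spark job and HiveServer2 point at different metastores (or Spark
silently falls back to a local embedded one), each side only sees its own
tables. It is also worth confirming that the session is created with Hive
support at all; a minimal sketch (Scala shown, the Java builder chain is the
same, and the app name is hypothetical):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("HiveInsert")
  .enableHiveSupport() // without this, Spark uses its built-in catalog rather than the Hive metastore
  .getOrCreate()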
Hi all,
I'm trying to create a dataframe enforcing a schema so that I can write it
to a parquet file. The schema has timestamps and I get an error with
pyspark. The following is a snippet of code that exhibits the problem:

from pyspark.sql.types import StructType, StructField, TimestampType

df = sqlctx.range(1000)
schema = StructType([StructField('a', TimestampType())])
Resurfacing the question to get more attention.
Hello,
>
> I'm running a Spark 2.3 job on a Kubernetes cluster.
>>
>> kubectl version
>>
>> Client Version: version.Info{Major:"1", Minor:"9",
>> GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b",
>> GitTreeState:"clean", BuildDa