Do you have the winutils binary relevant for your system?
This SO post has related information:
https://stackoverflow.com/questions/34196302/the-root-scratch-dir-tmp-hive-on-hdfs-should-be-writable-current-permissions
On 21 October 2017 at 03:16, Marco Mistroni wrote:
> Did u build spark or
You can use the following options:
* spark-submit from a shell script (a minimal programmatic sketch follows below)
* some kind of job server; see spark-jobserver for details
* some notebook environment; see Zeppelin for example
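A minimal sketch of the first option done programmatically via SparkLauncher (which wraps spark-submit); the jar path, class name, and master are placeholders:

import org.apache.spark.launcher.SparkLauncher

// Launches the application the same way spark-submit would; waitFor blocks
// until the launched spark-submit process exits.
val process = new SparkLauncher()
  .setAppResource("/path/to/your-app.jar")   // hypothetical fat jar
  .setMainClass("com.example.YourJob")       // hypothetical main class
  .setMaster("yarn-client")
  .launch()
process.waitFor()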
On 18 July 2016 at 17:13, manish jaiswal wrote:
> Hi,
>
>
> What is the best approach to trigger spark job in production cl
Hi,
Please see the property spark.sql.autoBroadcastJoinThreshold here
http://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options
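For example, a minimal sketch of raising the limit (the 50 MB value is only illustrative; sqlContext is your existing SQLContext):

// Maximum size, in bytes, of a table that will be broadcast to all worker
// nodes when performing a join; setting it to -1 disables broadcast joins.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", (50 * 1024 * 1024).toString)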
Thanks,
Jagat Singh
On Sat, Jul 9, 2016 at 9:50 AM, Lalitha MV wrote:
> Hi,
>
> 1. What implementation is used for the
Hi,
I am doing this by telling Spark which Hive version we are using. This is
done by setting the following properties:
spark.sql.hive.version
spark.sql.hive.metastore.jars
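For example, a minimal sketch (the version string and jar paths are illustrative; the documented key for the version is spark.sql.hive.metastore.version):

// Point Spark SQL at a specific Hive metastore version and the jars that
// implement it (these can also go in spark-defaults.conf).
val conf = new org.apache.spark.SparkConf()
  .set("spark.sql.hive.metastore.version", "1.2.1")
  .set("spark.sql.hive.metastore.jars", "/opt/hive/lib/*:/opt/hadoop/lib/*")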
Thanks
On Wed, Feb 10, 2016 at 7:39 AM, Koert Kuipers wrote:
> hey thanks. hive-site is on classpath in conf directory
>
> i curr
Hi,
What is the correct way to fully stop a Spark job which is running in
yarn-client mode via spark-submit?
We are using sc.stop() in the code but can still see the job running (in the
YARN resource manager) after the final Hive insert is complete.
The code flow is:
start context
do some work
insert to hive
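A minimal sketch of that flow (table names and the work step are hypothetical):

val sc = new org.apache.spark.SparkContext(
  new org.apache.spark.SparkConf().setAppName("etl-job"))
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
// ... do some work ...
sqlContext.sql("INSERT INTO TABLE target SELECT * FROM staging")
sc.stop()  // stop the context; the YARN application still shows as running afterwards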
This is not a direct answer to your question,
but it might be useful for you to check the Spring XD Spark integration:
https://github.com/spring-projects/spring-xd-samples/tree/master/spark-streaming-wordcount-java-processor
On Mon, Nov 16, 2015 at 6:14 AM, Muthu Jayakumar wrote:
> I have only written Akk
Hello Steve,
Thanks for the confirmation.
Is there any work planned on this?
Thanks,
Jagat Singh
On Wed, Sep 30, 2015 at 9:37 PM, Vinay Shukla wrote:
> Steve is right,
> The Spark thrift server does not propagate end user identity downstream
> yet.
>
>
>
> On We
Hello Nicolas,
The Hive solution just concatenates the files; it does not alter or
change records.
On 3 Oct 2015 6:42 pm, wrote:
> Hello,
> Finally Hive is not a solution as I cannot update the data.
> And for archive file I think it would be the same issue.
> Any other solutions ?
>
> Nicolas
trying to read as the spark user, with which we started the thrift
server.
Since the spark user does not have actual read access, we get the error.
However, beeline is used by the end user, not the spark user, and it still throws the error.
Thanks,
Jagat Singh
On Wed, Sep 30, 2015 at 11:24 AM, Mohammed Guller
wrote:
> D
Hi,
I have started the Spark thrift service as the spark user.
Does each user need to start their own thrift server to use it?
Using beeline I am able to connect to the server and execute show tables;
however, when we try to execute a real query it runs as the spark user and
HDFS permissions do not allow
Sorry, to answer your question fully:
The job starts tasks; a few of them fail and some are successful. The
failed ones have that PermGen error in the logs.
But ultimately the full job is marked as failed and the session quits.
On Sun, Sep 13, 2015 at 10:48 AM, Jagat Singh wrote:
> Hi Davies,
>
>
queries?
>
> Is this in local mode or cluster mode?
>
> On Fri, Sep 11, 2015 at 3:00 AM, Jagat Singh wrote:
> > Hi,
> >
> > We have queries which were running fine on 1.4.1 system.
> >
> > We are testing upgrade and even simple query like
> >
> >
Hi,
We have queries which were running fine on a 1.4.1 system.
We are testing an upgrade, and even a simple query like
val t1 = sqlContext.sql("select count(*) from table")
t1.show
works perfectly fine on 1.4.1 but throws an OOM error in 1.5.0.
Are there any changes in default memory settings from 1.
Will this recognize the Hive partitions as well?
For example, inserting into a specific partition of a Hive table?
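A minimal sketch of what I mean (table names, the partition column, and its value are hypothetical; sc is an existing SparkContext):

val hc = new org.apache.spark.sql.hive.HiveContext(sc)
val df = hc.table("source_table")        // hypothetical source DataFrame
df.registerTempTable("staging")
// Write into one specific partition of a partitioned Hive table
hc.sql("INSERT INTO TABLE target PARTITION (dt = '2015-03-03') SELECT * FROM staging")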
On Tue, Mar 3, 2015 at 11:42 PM, Cheng, Hao wrote:
> Using the SchemaRDD / DataFrame API via HiveContext
>
> Assume you're using the latest code, something probably like:
>
> val hc = new HiveCont
Hi,
I want to work on a use case something like the one below.
I just want to know if something similar has already been done which can be
reused.
The idea is to use Spark for an ETL / Data Science / Streaming pipeline.
So when data comes inside the cluster front door we will do the following steps:
1) Upload
What setting are you using for
persist() or cache()?
http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence
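For reference, a minimal sketch (the input path is a placeholder, and MEMORY_AND_DISK is just one choice of storage level):

import org.apache.spark.storage.StorageLevel

val rdd = sc.textFile("hdfs:///data/input")   // hypothetical input
// MEMORY_ONLY keeps deserialized partitions in memory; MEMORY_AND_DISK spills
// to disk when memory is short, which changes how much executor memory the cache uses.
rdd.persist(StorageLevel.MEMORY_AND_DISK)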
On Tue, Oct 28, 2014 at 6:18 PM, shahab wrote:
> Hi,
>
> I have a standalone spark , where the executor is set to have 6.3 G memory
> , as I am using two workers so in
pass in a single
one. For example, you might start your SparkContext pointing to
spark://host1:port1,host2:port2. This would cause your SparkContext to try
registering with both Masters - if host1 goes down, this configuration
would still be correct as we'd find the new leader, host2.
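A minimal sketch of that setup (host names and the port are placeholders):

val conf = new org.apache.spark.SparkConf()
  .setAppName("ha-example")
  .setMaster("spark://host1:7077,host2:7077")  // both masters listed; the active leader is used
val sc = new org.apache.spark.SparkContext(conf)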
Thanks,
Hi Jenny,
How are you packaging your jar?
Can you please confirm that you have included the MLlib jar inside the fat jar
you have created for your code:
libraryDependencies += "org.apache.spark" % "spark-mllib_2.9.3" %
"0.8.1-incubating"
Thanks,
Jagat Singh
O