(Sorry, the first one was sent to the incubator mailing list, which probably
doesn't come here)
Hi, I have been stuck at this for a week.
I have a relatively simple dataframe like this:
+---------+--------+------+------+
|     item| item_id|target| start|
+---------+--------+------+------+
|sensor123|sensor1
Hi,
when I try to perform CrossValidator I get the StackOverflowError.
I have already performed all the necessary transformations (StringIndexer, vector)
and saved the data frame to HDFS as parquet.
After that I load it all into a new data frame and
split it into train and test.
When I try fit(train_set) I get
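For reference, a minimal sketch of that flow, with made-up paths, columns, and
estimator (not the poster's actual pipeline):

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Load the already-indexed/assembled features back from parquet (placeholder path)
val df = spark.read.parquet("/data/features.parquet")

// Split into train and test sets
val Array(train_set, test_set) = df.randomSplit(Array(0.8, 0.2), seed = 42)

// Any estimator works here; logistic regression is just an example
val lr = new LogisticRegression().setLabelCol("label").setFeaturesCol("features")
val grid = new ParamGridBuilder().addGrid(lr.regParam, Array(0.01, 0.1)).build()

val cv = new CrossValidator()
  .setEstimator(lr)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(grid)
  .setNumFolds(3)

// This is the call that reportedly throws the StackOverflowError
val cvModel = cv.fit(train_set)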
LongWritable.class, BytesWritable.class, config);
However, when I run the job I get the following error:
com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion
(StackOverflowError) (through reference chain:
scala.collection.convert.IterableWrapper[0]->org.apache.spark.rdd.RDDOp
It would be great if you could try with the 2.0.2 RC. Thanks for creating
an issue.
On Wed, Nov 9, 2016 at 1:22 PM, Raviteja Lokineni <
raviteja.lokin...@gmail.com> wrote:
> Well I've tried with 1.5.2, 1.6.2 and 2.0.1
>
> FYI, I have created https://issues.apache.org/jira/browse/SPARK-18388
>
>
Well I've tried with 1.5.2, 1.6.2 and 2.0.1
FYI, I have created https://issues.apache.org/jira/browse/SPARK-18388
On Wed, Nov 9, 2016 at 3:08 PM, Michael Armbrust
wrote:
> Which version of Spark? Does seem like a bug.
>
> On Wed, Nov 9, 2016 at 10:06 AM, Raviteja Lokineni <
> raviteja.lokin...
Which version of Spark? Does seem like a bug.
On Wed, Nov 9, 2016 at 10:06 AM, Raviteja Lokineni <
raviteja.lokin...@gmail.com> wrote:
> Does this stacktrace look like a bug guys? Definitely seems like one to me.
>
> Caused by: java.lang.StackOverflowError
> at org.apache.spark.sql.executi
Does this stacktrace look like a bug guys? Definitely seems like one to me.
Caused by: java.lang.StackOverflowError
at org.apache.spark.sql.execution.SparkPlan.prepare(SparkPlan.scala:195)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$prepare$1.apply(SparkPlan.scala:195)
fn).distinct();

allMetrics.add(distinctFileMetrics);
}

JavaRDD finalOutput =
jsc.union(allMetrics.toArray(...)).coalesce(10);
finalOutput.saveAsTextFile(...);

There are posts suggesting
<https://stackoverflow.com/questions/30522564/spark-when-union-a-lot-of-rdd-throws-stack-overflow-error>
that using JavaRDD union(JavaRDD other) many times creates a long
lineage that results in a StackOverflowError.
However, I'm seeing the StackOverflowError even with JavaSparkContext
union(JavaRDD... rdds).
Should this s
I’ve seen this when I specified “too many” where clauses in the SQL query. I
was able to adjust my query to use a single ‘in’ clause rather than many ‘=’
clauses but I realize that may not be an option in all cases.
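A hedged sketch of that rewrite (table and column names are made up, and this
assumes an `events` table is already registered with a 1.6-era sqlContext):

// Many OR'ed equality predicates build a very deep expression tree,
// which is one way to blow the stack during analysis/planning:
val deep = sqlContext.sql(
  "SELECT * FROM events WHERE id = 1 OR id = 2 OR id = 3")  // imagine hundreds of these

// A single IN keeps the predicate flat:
val flat = sqlContext.sql("SELECT * FROM events WHERE id IN (1, 2, 3)")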
Jeff
On 5/4/16, 2:04 PM, "BenD" wrote:
>I am getting a java.lang.StackOverflo
I’m running Spark 1.6.0 in a standalone cluster. Periodically I’ve seen
StackOverflowErrors when running queries. An example below.
In the past I’ve been able to avoid such situations by ensuring we don’t have
too many arguments in ‘in’ clauses or too many unioned queries, both of which
seem to t
I am getting a java.lang.StackOverflowError somewhere in my program. I am not
able to pinpoint which part causes it because the stack trace seems to be
incomplete (see end of message). The error doesn't happen all the time, and
I think it is based on the number of files that I load. I am running on
Can you give us some more info about the dataframe and caching? Ideally a
set of steps to reproduce the issue
On 9 December 2015 at 14:59, apu mishra . rr wrote:
> The command
>
> mydataframe.write.saveAsTable(name="tablename")
>
> sometimes results in java.lang.StackOverflowError (see below fo
The command
mydataframe.write.saveAsTable(name="tablename")
sometimes results in java.lang.StackOverflowError (see below for fuller
error message).
This is after I am able to successfully run cache() and show() methods on
mydataframe.
The issue is not deterministic, i.e. the same code sometimes
>>>>>> s I am still exploring the data. Let me know if there is an alternate way
>>>>>> of constructing the RowMatrix.
t is going on, but I think it's some strange
>>>>> >> interaction between how you're building up the list and what the
>>>>> >> resulting representation happens to be like, and how the closure
>>>>> >> cleaner works, which ca
ableList[(String,String,Double)]()
>>>> >> (0 to 1).foreach(i => lst :+ ("10", "10", i.toDouble))
>>>> >>
>>>> >> or just
>>>> >>
>>>> >> val lst = (0
saying that something's default
>>> >> Java serialization graph is very deep, so it's like the code you wrote
>>> >> plus the closure cleaner ends up pulling in some huge linked list and
>>> >> serializing it the direct
ing that's nice to just work but isn't
>> >> to do with Spark per se. Or, have a look at others related to the
>> >> closure and shell and you may find this is related to other known
>> >> behavior.
>> >>
>> >>
>>
Sun, Aug 30, 2015 at 8:08 PM, Ashish Shrowty
> >> wrote:
> >> > Sean .. does the code below work for you in the Spark shell? Ted got
> the
> >> > same error -
> >> >
> >> > val a=10
> >> > val lst = MutableL
ean .. does the code below work for you in the Spark shell? Ted got the
>> > same error -
>> >
>> > val a=10
>> > val lst = MutableList[(String,String,Double)]()
>> > Range(0,1).foreach(i=>lst+=(("10","10",i:Double)))
>> >
"10","10",i:Double)))
> sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>
> -Ashish
>
>
> On Sun, Aug 30, 2015 at 2:52 PM Sean Owen wrote:
>>
>> I'm not sure how to reproduce it? this code does not produce an error in
>> master.
>>
I'm not sure how to reproduce it? this code does not produce an error in master.
On Sun, Aug 30, 2015 at 7:26 PM, Ashish Shrowty
wrote:
> Do you think I should create a JIRA?
>
>
> On Sun, Aug 30, 2015 at 12:56 PM Ted Yu wrote:
>>
>> I got StackOverFlowError as
Do you think I should create a JIRA?
On Sun, Aug 30, 2015 at 12:56 PM Ted Yu wrote:
> I got StackOverFlowError as well :-(
>
> On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty
> wrote:
>
>> Yep .. I tried that too earlier. Doesn't make a difference. Are you able
guide.html#broadcast-variables
>
> Cheers
>
> On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty
> wrote:
>
>> @Sean - Agree that there is no action, but I still get the
>> stackoverflowerror, its very weird
>>
>> @Ted - Variable a is just an int - val a = 10 .
I see.
What about using the following in place of variable a ?
http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
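A minimal sketch of that change, reusing the toy snippet from earlier in the
thread (not tested against this exact case):

import scala.collection.mutable.MutableList

val lst = MutableList[(String, String, Double)]()
Range(0, 1).foreach(i => lst += (("10", "10", i: Double)))

// Wrap the value in a broadcast variable instead of closing over the local `a`
val aBc = sc.broadcast(10)
sc.makeRDD(lst).map(t => if (aBc.value == 10) 1 else 0)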
Cheers
On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty
wrote:
> @Sean - Agree that there is no action, but I still get the
> stackoverflowerro
ug 30, 2015 at 4:21 AM, ashrowty
> wrote:
> > I am running the Spark shell (1.2.1) in local mode and I have a simple
> > RDD[(String,String,Double)] with about 10,000 objects in it. I get a
> > StackOverFlowError each time I try to run the following code (the code
> > itself i
and I have a simple
> RDD[(String,String,Double)] with about 10,000 objects in it. I get a
> StackOverFlowError each time I try to run the following code (the code
> itself is just representative of other logic where I need to pass in a
> variable). I tried broadcasting the variable to
I am running the Spark shell (1.2.1) in local mode and I have a simple
RDD[(String,String,Double)] with about 10,000 objects in it. I get a
StackOverFlowError each time I try to run the following code (the code
itself is just representative of other logic where I need to pass in a
variable). I
I'm using spark-1.4.1, compiled against CDH 5.3.2. When I use
ALS.trainImplicit to build a model, I got this error when rank=40 and
iterations=30.
It worked for (rank=10, iterations=10) and (rank=20, iterations=20).
What was wrong with (rank=40, iterations=30)?
15/08/13 01:16:40 INFO sched
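One mitigation often suggested for high ALS iteration counts (a hedged sketch,
not confirmed as the fix for this particular error) is to set a checkpoint
directory so the iteration lineage gets truncated periodically:

import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Placeholder checkpoint location; with one set, ALS periodically checkpoints its
// intermediate RDDs, which keeps the lineage from growing with the iteration count.
sc.setCheckpointDir("hdfs:///tmp/als-checkpoints")

// Toy ratings just to make the sketch self-contained
val ratings = sc.parallelize(Seq(Rating(1, 1, 1.0), Rating(1, 2, 2.0), Rating(2, 1, 3.0)))

val model = ALS.trainImplicit(ratings, 40, 30)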
Yeah, this really shouldn't be recursive. It can't be optimized since
it's not a final/private method. I think you're welcome to try a PR
to un-recursivize it.
On Thu, Apr 16, 2015 at 7:31 PM, Jeff Nadler wrote:
>
> I've got a Kafka topic on which lots of data has built up, and a streaming
> app
I've got a Kafka topic on which lots of data has built up, and a streaming
app with a rate limit.
During maintenance, for example, records will build up on Kafka and we'll
burn them off on restart. The rate limit keeps the job stable while
burning off the backlog.
Sometimes on the first or second i
Use SparkContext#union[T](rdds: Seq[RDD[T]])
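A quick sketch of the difference (toy RDDs, Scala):

import org.apache.spark.rdd.RDD

val rdds: Seq[RDD[Int]] = (1 to 1000).map(i => sc.parallelize(Seq(i)))

// Chaining RDD.union nests each result inside the next, so the lineage gets
// deeper with every call and can eventually overflow the stack:
val deep = rdds.reduce(_ union _)

// SparkContext.union builds a single UnionRDD over all inputs, keeping it flat:
val flat = sc.union(rdds)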
On Tue, Feb 3, 2015 at 7:43 PM, Thomas Kwan wrote:
> I am trying to combine multiple RDDs into 1 RDD, and I am using the union
> function. I wonder if anyone has seen StackOverflowError as follows:
>
> Exception in
I am trying to combine multiple RDDs into 1 RDD, and I am using the union
function. I wonder if anyone has seen StackOverflowError as follows:
Exception in thread "main" java.lang.StackOverflowError
at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
Hi,
I am getting a stack overflow error when querying a schemardd comprised of
parquet files. This is (part of) the stack trace:
Caused by: java.lang.StackOverflowError
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$
Hi all,
I'm developing a spark application where I need to iteratively update an
RDD over a large number of iterations (1000+). From reading online,
I've found that I should use .checkpoint() to keep the graph from
growing too large. Even when doing this, I keep getting
StackOverflowErrors
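For context, a hedged sketch of the pattern being described (the update step,
interval, and path are placeholders):

sc.setCheckpointDir("hdfs:///tmp/checkpoints")

var rdd = sc.parallelize(1 to 1000000)
for (i <- 1 to 1000) {
  rdd = rdd.map(_ + 1)        // stand-in for the real per-iteration update
  if (i % 10 == 0) {
    rdd.cache()               // avoid recomputing the chain when materializing
    rdd.checkpoint()          // mark for checkpointing to truncate the lineage
    rdd.count()               // force a job so the checkpoint is actually written
  }
}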
At 2014-10-28 16:27:20 +0300, Zuhair Khayyat wrote:
> I am using connected components function of GraphX (on Spark 1.0.2) on some
> graph. However, for some reason it fails with StackOverflowError. The graph
> is not too big; it contains 1 vertices and 50 edges.
>
> [...]
Dear All,
I am using the connected components function of GraphX (on Spark 1.0.2) on some
graph. However, for some reason it fails with StackOverflowError. The graph
is not too big; it contains 1 vertices and 50 edges. Can anyone
help me to avoid this error? Below is the output of Spark:
14
enlarge the JVM’s
stack.
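A hedged sketch of one way to do that (the -Xss value is an arbitrary example,
not a recommendation):

import org.apache.spark.{SparkConf, SparkContext}

// Raise the executor thread stack size via extra JVM options. The driver's own
// -Xss has to be set on the launching JVM instead (e.g. spark.driver.extraJavaOptions
// passed to spark-submit), since the driver JVM is already running by this point.
val conf = new SparkConf()
  .setAppName("example")
  .set("spark.executor.extraJavaOptions", "-Xss4m")
val sc = new SparkContext(conf)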
Thanks
Jerry
From: gm yu [mailto:husty...@gmail.com]
Sent: Thursday, September 18, 2014 6:08 PM
To: user@spark.apache.org
Subject: StackOverflowError
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to
stage failure: Task 736952.0:
What were you trying to do?
Thanks
Best Regards
On Thu, Sep 18, 2014 at 3:37 PM, gm yu wrote:
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 736952.0:2 failed 1 times, most recent failure:
> Exception failure in TID 21006 on host localhost
Exception in thread "main" org.apache.spark.SparkException: Job aborted due
to stage failure: Task 736952.0:2 failed 1 times, most recent failure:
Exception failure in TID 21006 on host localhost:
java.lang.StackOverflowError
java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
java.util.zi
unning on yarn-standalone.
>>
>> the last three lines of the code as below,
>>
>> val result = model.predict(prdctpairs)
>> result.map(x =>
>> x.user+","+x.product+","+x.rating).saveAsTextFile(output)
>> sc.stop()
>>
mes be able to run successfully and could give out
> the right result, while from time to time, it throws StackOverflowError and
> fail.
>
> and I don`t have a clue how I should debug.
>
> below is the error, (the start and end portion to be exact):
>
>
> 14-05-09 17:55:51
model.predict(prdctpairs)
> result.map(x =>
> x.user+","+x.product+","+x.rating).saveAsTextFile(output)
> sc.stop()
>
> the same code, sometimes be able to run successfully and could give out the
> right result, while from time to time, it throws
successfully and could give out the
right result, while from time to time it throws StackOverflowError and
fails.
And I don't have a clue how I should debug it.
Below is the error (the start and end portion, to be exact):
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
MapOutputTr