What does the jstack thread dump show?
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Feb 6, 2015, at 10:20 AM, Michael Albert wrote:
>
> My apologies for following up my own post, but I thought this might be of
> interest.
>
> I terminated the java process corresponding to executor whic
Could you find the shuffle files? Or were the files deleted by other processes?
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Feb 5, 2015, at 11:14 PM, Yifan LI wrote:
>
> Hi,
>
> I am running a heavy memory/cpu overhead graphx application, I think the
> memory is sufficient and set RDDs’
It looks like it is because of a different Snappy version; if you disable compression or switch to LZ4, the sizes are no different.
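If the size difference really does come from the codec, switching to LZ4 is a one-line config change. A sketch for spark-defaults.conf (property names are the standard Spark ones; check your Spark version's defaults):

```
# spark-defaults.conf — switch block compression from snappy to lz4
spark.io.compression.codec   lz4
# or rule the codec out entirely by disabling shuffle compression:
# spark.shuffle.compress     false
```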
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Feb 10, 2015, at 6:13 PM, chris wrote:
>
> Hello,
>
> as the original message from Kevin Jung never got accepted to the
> mailinglis
There is a Docker script for Spark 0.9 in the Spark Git repository.
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Aug 10, 2014, at 8:27 PM, 诺铁 wrote:
>
> hi, all,
>
> I am playing with docker, trying to create a spark cluster with docker
> containers.
>
> since spark master, worker, driver all nee
4 at 5:39 PM, Xuefeng Wu wrote:
>
>> scala> import scala.collection.GenSeq
>> scala> val seq = GenSeq("This", "is", "an", "example")
>>
>> scala> seq.aggregate("0")(_ + _, _ + _)
>> res0: String = 0Th
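For context on why the zero value can show up more than once: `aggregate(z)(seqop, combop)` folds each chunk of the collection with `seqop` starting from its own copy of `z`, then merges the partial results with `combop`. A hedged Python sketch of those semantics (the explicit chunking is illustrative; it is not how Scala actually splits a parallel collection):

```python
from functools import reduce

def aggregate(chunks, zero, seqop, combop):
    # Each chunk is folded from its own copy of the zero value...
    partials = [reduce(seqop, chunk, zero) for chunk in chunks]
    # ...and the partial results are then merged, again starting from zero.
    return reduce(combop, partials, zero)

aggregate([["This", "is"], ["an", "example"]], "0",
          lambda acc, s: acc + s, lambda a, b: a + b)
# → "00Thisis0anexample" (the zero is repeated once per chunk)
```

With a single chunk the result is the familiar sequential fold, which is why the REPL output above starts with a single leading "0".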
val topScores = for {
  (_, ageScores) <- takeTop(scores, _.age)
  (_, numScores) <- takeTop(ageScores, _.num)
} yield {
  numScores
}
topScores.size
--
~Respectfully yours, Xuefeng Wu (吴雪峰)
Hi Debasish,
I found this test code in the map step; would it collect all the products too?
+ val sortedProducts = products.toArray.sorted(ord.reverse)
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Dec 2, 2014, at 1:33 AM, Debasish Das wrote:
>
> rdd.top collects it on master...
>
> If you want top
I have a similar requirement: take the top N by key. Right now I use groupByKey, but one key can group more than half the data in some datasets.
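One way to avoid that skew is to keep only a bounded min-heap per key instead of materializing whole groups, which is the idea behind doing top-N with aggregateByKey rather than groupByKey. A hedged local Python sketch of the per-partition logic (`top_n_by_key` is a made-up helper name, not a Spark API):

```python
import heapq
from collections import defaultdict

def top_n_by_key(pairs, n):
    """Keep at most the n largest values per key; never builds a full group."""
    heaps = defaultdict(list)          # key -> min-heap of the current top n
    for k, v in pairs:
        h = heaps[k]
        if len(h) < n:
            heapq.heappush(h, v)
        elif v > h[0]:                 # beats the smallest retained value
            heapq.heapreplace(h, v)
    return {k: sorted(h, reverse=True) for k, h in heaps.items()}

top_n_by_key([("a", 1), ("a", 5), ("a", 3), ("b", 2)], 2)
# → {"a": [5, 3], "b": [2]}
```

Because each key holds at most n values, the hot key that dominates the dataset costs O(n) memory instead of half the data.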
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Dec 4, 2014, at 7:26 AM, Nathan Kronenfeld wrote:
>
> I think it would depend on the type and amount of inform
Looks good.
My concern is about foldLeftByKey, which looks like it breaks consistency with foldLeft on RDD and aggregateByKey on PairRDD.
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Dec 4, 2014, at 7:47 AM, Koert Kuipers wrote:
>
> fold
How about saving it as an object file?
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Dec 30, 2014, at 9:27 PM, Jason Hong wrote:
>
> Dear all:)
>
> We're trying to make a graph using large input data and get a subgraph
> applied some filter.
>
> Now, we wanna save this graph to HDFS so that
Hi Aureliaono,
First, Docker is not ready for production unless you know what you are doing and are prepared for some risk.
Second, in my opinion there is a lot of hard-coded configuration in the Spark Docker script; you will have to modify it for your goal.
Respectfully yours, Xuefeng Wu (吴雪峰)
> On Mar 10, 2014, at 12 AM
atMap and what
> is a good use case for each?
>
> --
> Eran | CTO
>
--
~Respectfully yours, Xuefeng Wu (吴雪峰)