Re: running the Terasort example

2014-12-16 Thread Ewan Higgs
Hi Tim, > On 16 Dec 2014, at 19:27, Tim Harsch wrote: > > Hi Ewan, > Thanks, I think I was just a bit confused at the time, I was looking at > the spark-perf repo when there was the problem (uh.. ok)… > The PR that I am working on is indeed for spark-perf. > …snip... > > > I can get past th

[ANNOUNCE] Requiring JIRA for inclusion in release credits

2014-12-16 Thread Patrick Wendell
Hey All, Due to the very high volume of contributions, we're switching to an automated process for generating release credits. This process relies on JIRA for categorizing contributions, so it's not possible for us to provide credits in the case where users submit pull requests with no associated

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell
I'm closing this vote now, will send results in a new thread. On Sat, Dec 13, 2014 at 12:47 PM, Sean McNamara wrote: > +1 tested on OS X and deployed+tested our apps via YARN into our staging > cluster. > > Sean > > >> On Dec 11, 2014, at 10:40 AM, Reynold Xin wrote: >> >> +1 >> >> Tested on OS

[RESULT] [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell
This vote has PASSED with 12 +1 votes (8 binding) and no 0 or -1 votes: +1: Matei Zaharia* Madhu Siddalingaiah Reynold Xin* Sandy Ryza Josh Rozen* Mark Hamstra* Denny Lee Tom Graves* GuiQiang Li Nick Pentreath* Sean McNamara* Patrick Wendell* 0: -1: I'll finalize and package this release in the

Re: RDD data flow

2014-12-16 Thread Patrick Wendell
> Why is that? Shouldn't all Partitions be Iterators? Clearly I'm missing > something. The Partition itself doesn't need to be an iterator - the iterator comes from the result of compute(partition). The Partition is just an identifier for that partition, not the data itself. Take a look at the sig
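A rough illustration of the point above, written as a toy Python sketch rather than Spark's actual Scala classes: the partition object only identifies a slice, and the iterator is produced by calling compute on it. `RangeRDD` and its fields are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Partition:
    """Only identifies a slice of the dataset; it holds no data."""
    index: int


class RangeRDD:
    """Hypothetical toy dataset over the integers [0, n), split into slices."""

    def __init__(self, n: int, num_partitions: int) -> None:
        self.n = n
        self.num_partitions = num_partitions

    def partitions(self) -> List[Partition]:
        # Cheap identifiers, created without touching any data.
        return [Partition(i) for i in range(self.num_partitions)]

    def compute(self, split: Partition) -> Iterator[int]:
        # The iterator is produced here, on demand, from the identifier.
        start = split.index * self.n // self.num_partitions
        end = (split.index + 1) * self.n // self.num_partitions
        return iter(range(start, end))


rdd = RangeRDD(n=10, num_partitions=3)
collected = [list(rdd.compute(p)) for p in rdd.partitions()]
```

Here `Partition` is not iterable at all; only the value returned by `compute(partition)` is.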

Re: Interested in contributing to GraphX in Python

2014-12-16 Thread GregBowyer
I have been thinking about this for a little while and I wonder if it makes sense to look at forcing off heap mmap storage what can be shared with python. The idea would be that java makes a DirectByteBuffer (or similar) with python doing memoryview over that buffer. Then for all except for real
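A minimal sketch of the Python half of that idea, assuming the shared storage is a plain memory mapping. An anonymous `mmap` stands in for the buffer the JVM side would own; how a DirectByteBuffer would actually be handed to Python is not shown and is outside this example.

```python
import mmap
import struct

# Anonymous 16-byte mapping; a stand-in for off-heap storage owned elsewhere.
buf = mmap.mmap(-1, 16)

# "Producer" side: write four native-endian C ints into the mapping.
struct.pack_into("4i", buf, 0, 1, 2, 3, 4)

# "Consumer" side: a typed memoryview over the same bytes, with no copy made.
with memoryview(buf) as raw, raw.cast("i") as view:
    before = view.tolist()
    view[0] = 10  # writes through to the underlying mapping

# Reading the mapping directly shows the change made through the view.
after = struct.unpack_from("4i", buf, 0)
buf.close()
```

The views are released before `buf.close()` because a mapping cannot be closed while a memoryview still exports it.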

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
Shot down again. On Tue Dec 16 2014 at 9:41:39 PM Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > I see. That’s a separate

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
I see. That’s a separate discussion about closing PRs vs. just updating the CI status on individual commits. I’ll comment on INFRA-7367. Nick On Tue Dec 16 2014 at 9:38:04 PM Reynold Xin wrote: > This was the ticket: https://issues.apache.or

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Reynold Xin
This was the ticket: https://issues.apache.org/jira/browse/INFRA-7918 On Tue, Dec 16, 2014 at 6:23 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > > Actually, reading through the existing issue opened for this > back in February, I > d

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
Actually, reading through the existing issue opened for this back in February, I don’t see any explanation from ASF Infra as to why they won’t grant permission against the Status API. They just recommended transitioning to the Apache Jenkins instan

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Patrick Wendell
Yeah you can do it - just make sure they understand it is a new feature so we're asking them to revisit it. They looked at it in the past and they concluded they couldn't give us access without giving us push access. - Patrick On Tue, Dec 16, 2014 at 6:06 PM, Reynold Xin wrote: > It's worth tryi

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Reynold Xin
It's worth trying :) On Tue, Dec 16, 2014 at 6:02 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > > News flash! > > From the latest version of the GitHub API > : > > Note that the repo:status OAuth scope >

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
News flash! From the latest version of the GitHub API: Note that the repo:status OAuth scope grants targeted access to Statuses *without* also granting access to repository code, while the repo scop
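For context, a rough sketch of what a CI job holding such a token might send to GitHub's commit status endpoint. The request is only built, not sent, and the owner, repo, SHA, token, and context values below are placeholders invented for illustration, not anything from this thread.

```python
import json
import urllib.request


def build_status_request(owner, repo, sha, token, state, context, description=""):
    """Build (but do not send) a POST to GitHub's commit status endpoint."""
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    payload = {"state": state, "context": context, "description": description}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # A token limited to the repo:status scope would go here.
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders.
req = build_status_request(
    owner="example-org",
    repo="example-repo",
    sha="0123456789abcdef0123456789abcdef01234567",
    token="PLACEHOLDER_TOKEN",
    state="success",
    context="ci/example",
    description="Build passed",
)
```

Passing `req` to `urllib.request.urlopen` would perform the call; that step is left out so the sketch has no network dependency.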

Re: running the Terasort example

2014-12-16 Thread Tim Harsch
Hi Ewan, Thanks, I think I was just a bit confused at the time, I was looking at the spark-perf repo when there was the problem (uh.. ok)… I notice now with a pull down just minutes back that I still get a compile problem. [ERROR] /Users/tharsch/git/ehiggs/spark/examples/src/main/scala/org/apach

RDD data flow

2014-12-16 Thread Madhu
I was looking at some of the Partition implementations in core/rdd and getOrCompute(...) in CacheManager. It appears that getOrCompute(...) returns an InterruptibleIterator, which delegates to a wrapped Iterator. That would imply that Partitions should extend Iterator, but that is not always the ca

Data Loss - Spark streaming

2014-12-16 Thread Jeniba Johnson
Hi, I need a clarification, while running streaming examples, suppose the batch interval is set to 5 minutes, after collecting the data from the input source(FLUME) and processing till 5 minutes. What will happen to the data which is flowing continuously from the input source to spark streamin

Re: running the Terasort example

2014-12-16 Thread Ewan Higgs
Hi Tim, run-example is here: https://github.com/ehiggs/spark/blob/terasort/bin/run-example It should be in the repository that you cloned. So if you were at the top level of the checkout, run-example would be run as ./bin/run-example. Yours, Ewan Higgs On 12/12/14 01:06, Tim Harsch wrote: Hi