All,
I strongly suspect this is caused by a glitch in the communication with
Google Cloud Storage, which my job is writing to, as this NPE shows up
fairly randomly. Any ideas?
Exception in thread "Thread-126" java.lang.NullPointerException
at scala.collection.mutable.ArrayO
How do I build the scaladoc HTML files from the Spark source distribution?
Alex Baretta
I finally came to realize that there is a special Maven target to build the
scaladocs, although arguably a very unintuitive one: mvn verify. So now I
have scaladocs for each package, but not for the whole Spark project.
Specifically, build/docs/api/scala/index.html is missing. Indeed the whole
build
> have a separate thing with new dependencies in order to
> build the web docs, but that's how it is at the moment.
>
> Nick
>
> On Fri, Nov 7, 2014 at 3:39 PM, Alessandro Baretta
> wrote:
>
>> I finally came to realize that there is a special maven target to build
>
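For reference, the unified scaladoc (the missing api/scala/index.html) is
generated by the Jekyll-based documentation build under docs/, not by Maven;
presumably this is the "separate thing with new dependencies" Nick mentions
above. A hedged sketch of the usual invocation, assuming the docs/README of
that era and a local jekyll install:

    $ cd docs
    $ jekyll build    # leave SKIP_API unset so the scaladoc step runs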
All,
I'm using the Spark shell to interact with a small test deployment of
Spark, built from the current master branch. I'm processing a dataset
comprising a few thousand objects on Google Cloud Storage, split into a
half dozen directories. My code constructs an object--let me call it the
Dataset
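For context, a minimal sketch of this kind of job in the Spark shell; the
bucket and directory names are made up, and it assumes the GCS connector is
on the classpath so gs:// URIs resolve through the Hadoop FileSystem API:

    // Hypothetical input layout: half a dozen directories under one bucket.
    val dirs = (1 to 6).map(i => s"gs://my-bucket/input/part$i")

    // One RDD per directory, then a union over all of them.
    val dataset = sc.union(dirs.map(d => sc.textFile(d)))

    println(dataset.count())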
Denny Lee wrote:
>
> I'm curious if you're seeing the same thing when using bdutil against
> GCS? I'm wondering if this may be an issue concerning the transfer rate of
> Spark -> Hadoop -> GCS Connector -> GCS.
>
>
> On Wed Dec 17 2014 at 10:09:17 PM Alessa
> n just as fast
> scans?
>
>
> On Wed Dec 17 2014 at 10:44:45 PM Alessandro Baretta <
> alexbare...@gmail.com> wrote:
>
>> Denny,
>>
>> No, gsutil scans through the listing of the bucket quickly. See the
>> following.
>>
>> alex@had
On Wed, Dec 17, 2014 at 11:24 PM, Alessandro Baretta wrote:
>
> Well, what do you suggest I run to test this? But more importantly, what
> information would this give me?
>
> On Wed, Dec 17, 2014 at 10:46 PM, Denny Lee wrote:
>>
>> Oh, it makes sense if gsutil scans
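One way to answer "what should I run to test this": time the directory
listing through the same Hadoop GCS connector Spark uses, independently of
any Spark job, and compare it with the gsutil timing above. A hedged sketch
for the Spark shell; the bucket and path are made up:

    import java.net.URI
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Resolve the gs:// filesystem via the GCS connector.
    val fs = FileSystem.get(new URI("gs://my-bucket/"), sc.hadoopConfiguration)

    // Time a plain listing of the hypothetical input directory.
    val t0 = System.nanoTime()
    val statuses = fs.listStatus(new Path("gs://my-bucket/input"))
    val t1 = System.nanoTime()

    println(s"listStatus: ${statuses.length} entries in ${(t1 - t0) / 1e9} s")

If that runs as fast as gsutil, the connector's metadata path is not the
bottleneck and the slowdown is in the Spark job itself.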
Gents,
I'm building Spark from the current master branch and deploying it to
Google Compute Engine on top of Hadoop 2.4/YARN via bdutil, Google's Hadoop
cluster provisioning tool. bdutil configures Spark with
spark.local.dir=/hadoop/spark/tmp, but this option is ignored in
combination with YARN.
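That is expected under YARN: spark.local.dir is overridden there, and
executors use the directories YARN assigns to each container (from
yarn.nodemanager.local-dirs, exposed through the LOCAL_DIRS environment
variable). A hedged spark-shell sketch, not from the original thread, that
checks what the executors actually received, running one task per default
partition:

    // Read LOCAL_DIRS on the executors and report the distinct values.
    sc.parallelize(1 to sc.defaultParallelism, sc.defaultParallelism)
      .map(_ => sys.env.getOrElse("LOCAL_DIRS", "<unset>"))
      .distinct()
      .collect()
      .foreach(println)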
v, +user
>
> http://spark.apache.org/docs/latest/job-scheduling.html
>
>
> On Sat, Jan 10, 2015 at 4:40 PM, Alessandro Baretta wrote:
>
>> Is it possible to specify a priority level for a job, such that the active
>> jobs might be scheduled in order of priority?
>>
>> Alex
>>
>
>
> http://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties
>
> "Setting a high weight such as 1000 also makes it possible to implement
> *priority* between pools—in essence, the weight-1000 pool will always get
> to launch tasks first whenever it has jobs active."
>
> On Sat,
>
> On Sunday, January 11, 2015, Alessandro Baretta
> wrote:
>
>> Cody,
>>
>> Maybe I'm not getting this, but it doesn't look like this page is
>> describing a priority queue scheduling policy. What this section discusses
>> is how resources
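For concreteness, this is how the pool mechanism from the docs quoted above
is driven from code; it approximates priorities with pool weights rather
than implementing a strict priority queue. A hedged sketch: the pool names
are made up and would need matching entries in the fair scheduler XML file
referenced by spark.scheduler.allocation.file, with spark.scheduler.mode
set to FAIR:

    // Jobs submitted from this thread go to a hypothetical high-weight pool.
    sc.setLocalProperty("spark.scheduler.pool", "high-priority")
    sc.textFile("gs://my-bucket/urgent").count()

    // Background work goes to a hypothetical low-weight pool.
    sc.setLocalProperty("spark.scheduler.pool", "background")
    sc.textFile("gs://my-bucket/bulk").count()

    // Clearing the property falls back to the default pool.
    sc.setLocalProperty("spark.scheduler.pool", null)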