+1000.
Especially if the UI can help correlate exceptions, and we can reduce
some exceptions.
There are some exceptions which are in practice very common, such as
the nasty ClassNotFoundException, that most folks end up spending tons
of time debugging.
On Mon, Apr 18, 2016 at 12:16 PM, Reynold
y database multiple times.
On Sun, Apr 17, 2016 at 9:51 AM, Jon Maurer wrote:
> Take a look at spark testing base.
> https://github.com/holdenk/spark-testing-base/blob/master/README.md
>
> On Apr 17, 2016 10:28 AM, "Evan Chan" wrote:
>>
>> What I want to fi
er` mode by
> yourself like
> 'https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/ShuffleSuite.scala#L55'?
>
> // maropu
>
> On Sun, Apr 17, 2016 at 9:47 AM, Evan Chan wrote:
>>
>> Hey folks,
>>
>> I'd like to
Hey folks,
I'd like to use local-cluster mode in my Spark-related projects to
test Spark functionality in an automated way in a simulated local
cluster. The idea is to test multi-process things in a much easier
fashion than setting up a real cluster. However, getting this up and
running in a
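For reference, the local-cluster mode being described is requested through the master URL; a minimal sketch, assuming Spark is on the classpath and a Spark distribution is available to launch the worker JVMs (the app name is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// local-cluster[numWorkers, coresPerWorker, memoryPerWorkerMB]
// launches real separate worker processes, unlike plain local[*]
val conf = new SparkConf()
  .setMaster("local-cluster[2, 1, 1024]")
  .setAppName("multi-process-test") // illustrative name
val sc = new SparkContext(conf)
try {
  // work is distributed across the two worker JVMs
  assert(sc.parallelize(1 to 100, 4).sum() == 5050)
} finally {
  sc.stop()
}
```

Note that this mode needs the Spark assembly available to spawn workers, which is part of the setup difficulty the email alludes to.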
Hi folks,
Sorry to join the discussion late. I had a look at the design doc
earlier in this thread, and it was not mentioned what types of
projects are the targets of this new "spark extras" ASF umbrella
Is the desire to have a maintained set of spark-related projects that
keep pace with the
Hey guys,
Is there any guidance on what the different tracks for Spark Summit
West mean? There are some new ones, like "Third Party Apps", which
seems like it would be similar to the "Use Cases". Any further
guidance would be great.
thanks,
Evan
---
Why not just use SLF4J?
On Tue, Feb 3, 2015 at 2:22 PM, Reynold Xin wrote:
> We can use ScalaTest's privateMethodTester also instead of exposing that.
>
> On Tue, Feb 3, 2015 at 2:18 PM, Marcelo Vanzin wrote:
>
>> Hi Jay,
>>
>> On Tue, Feb 3, 2015 at 6:28 AM, jayhutfles wrote:
>> > // Expos
Congrats everyone!!!
On Tue, Feb 3, 2015 at 3:17 PM, Timothy Chen wrote:
> Congrats all!
>
> Tim
>
>
>> On Feb 4, 2015, at 7:10 AM, Pritish Nawlakhe
>> wrote:
>>
>> Congrats and welcome back!!
>>
>>
>>
>> Thank you!!
>>
>> Regards
>> Pritish
>> Nirvana International Inc.
>>
>> Big Data, Hadoop,
nar. does spark SQL already use something
>> like
>> that? Evan mentioned "Spark SQL columnar compression", which sounds like
>> it. where can i find that?
>>
>> thanks
>>
>> On Thu, Jan 29, 2015 at 2:32 PM, Evan Chan
>> wrote:
>>
>
"null".
>
> See, e.g. http://www.r-bloggers.com/r-na-vs-null/
>
>
>
> On Wed, Jan 28, 2015 at 4:42 PM, Reynold Xin wrote:
>>
>> Isn't that just "null" in SQL?
>>
>> On Wed, Jan 28, 2015 at 4:41 PM, Evan Chan
>> wrote:
>>
wrote:
> Isn't that just "null" in SQL?
>
> On Wed, Jan 28, 2015 at 4:41 PM, Evan Chan wrote:
>>
>> I believe that most DataFrame implementations out there, like Pandas,
>> support the idea of missing values / NA, and some support the idea of
>> No
sql.types. After 1.3, sql.catalyst is hidden from users, and all public APIs
> have first class classes/objects defined in sql directly.
>
>
>
> On Wed, Jan 28, 2015 at 4:20 PM, Evan Chan wrote:
>>
>> Hey guys,
>>
>> How does this impact the data sources API? I
Hey guys,
How does this impact the data sources API? I was planning on using
this for a project.
+1 that many things from spark-sql / DataFrame is universally
desirable and useful.
By the way, one thing that prevents the columnar compression stuff in
Spark SQL from being more useful is, at leas
Ashwin,
I would say the strategies in general are:
1) Have each user submit separate Spark app (each its own Spark
Context), with its own resource settings, and share data through HDFS
or something like Tachyon for speed.
2) Share a single spark context amongst multiple users, using fair
schedul
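The second strategy can be sketched as a config fragment, assuming a single long-running context with the fair scheduler enabled (pool name and file path are illustrative):

```
# conf/spark-defaults.conf
spark.scheduler.mode             FAIR
spark.scheduler.allocation.file  /path/to/fairscheduler.xml
```

Each user's jobs would then opt into a pool via a local property on the shared context, e.g. `sc.setLocalProperty("spark.scheduler.pool", "userA")`.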
James,
Michael at the meetup last night said there was some development
activity around ORCFiles.
I'm curious though, what are the pros and cons of ORCFiles vs Parquet?
On Wed, Oct 8, 2014 at 10:03 AM, James Yu wrote:
> Didn't see anyone ask the question before, but I was wondering if anyone
d would be to read data from Cassandra/Vertica/etc. and
write back into Parquet, but this would take a long time and incur
huge I/O overhead.
>
> I'm sorry it just sounds like it's worth clearly defining what your key
> requirement/goal is.
>
>
> On Thu, Aug 28, 2014 at
>
>> The reason I'm asking about the columnar compressed format is that
>> there are some problems for which Parquet is not practical.
>
>
> Can you elaborate?
Sure.
- Organization or co has no Hadoop, but significant investment in some
other NoSQL store.
- Need to efficiently add a new column to
What would be the timeline for the parquet caching work?
The reason I'm asking about the columnar compressed format is that
there are some problems for which Parquet is not practical.
On Mon, Aug 25, 2014 at 1:13 PM, Michael Armbrust
wrote:
>> What is the plan for getting Tachyon/off-heap suppor
Hey guys,
What is the plan for getting Tachyon/off-heap support for the columnar
compressed store? It's not in 1.1 is it?
In particular:
- being able to set TACHYON as the caching mode
- loading of hot columns or all columns
- write-through of columnar store data to HDFS or backing store
- b
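The pieces that did exist at the time can be sketched as follows; this is a sketch only, assuming `sc` and `sqlContext` are in scope (as in the shell), a Tachyon-backed off-heap store is configured, and the table name is illustrative:

```scala
import org.apache.spark.storage.StorageLevel

// Off-heap RDD caching (Tachyon-backed in the 1.x line)
val rdd = sc.parallelize(1 to 1000)
rdd.persist(StorageLevel.OFF_HEAP)

// Spark SQL's in-memory columnar cache -- on-heap only at this point,
// which is exactly the gap the email is asking about
sqlContext.cacheTable("events") // illustrative table name
```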
I'm hoping to get in some doc enhancements and small bug fixes for Spark SQL.
Also possibly a small new API to list the tables in sqlContext.
Oh, and to get the doc page I had talked about before, a list of
community Spark projects.
thanks,
Evan
-
Dear community,
Wow, I remember when we first open sourced the job server, at the
first Spark Summit in December. Since then, more and more of you have
started using it and contributing to it. It is awesome to see!
If you are not familiar with the spark job server, it is a REST API
for managin
> > My typical use case is a large scale distributed graph traversal in real
> > time, with billions of nodes.
> >
> > Thanks,
> > Love.
> >
> >
> >
> > --
> > View this message in context:
> >
> http://apache-spark-developers-list.1001
https://github.com/apache/spark/pull/288
It's for fixing SPARK-1154, which would help Spark be a better citizen for
most deploys, and should be really small and easy to review.
thanks,
Evan
--
--
Evan Chan
Staff Engineer
e...@ooyala.com |
<http://www.ooyala.com/>
<http://ww
> > a single lib/ folder, so in some ways it's even easier to manage than the
> > assembly.
> >
>
> You might also check out the
> sbt-native-packager<https://github.com/sbt/sbt-native-packager>.
>
>
> Cheers,
> Lee
>
can already be created from the Maven build: mvn
> >> -Pdeb ...
> >>
> >>
> >> On Tue, Apr 1, 2014 at 11:24 AM, Evan Chan wrote:
> >>
> >> > Also, I understand this is the last week / merge window for 1.0, so if
> >>
Also, I understand this is the last week / merge window for 1.0, so if
folks are interested I'd like to get in a PR quickly.
thanks,
Evan
On Tue, Apr 1, 2014 at 11:24 AM, Evan Chan wrote:
> Hey folks,
>
> We are in the middle of creating a Chef recipe for Spark. As part of
/ folder, so in some ways it's even easier to manage than the
assembly.
Also I'm not sure if there's an equivalent plugin for Maven.
thanks,
Evan
done by a single voice, preventing contradicting comments
> > etc... Knowing that other projects actually demand the patch-submitter to
> ask
> > for shepherding, I figured why not do the same.
> >
> > For that ExternalContainerizer baby, I would kindly like to call ou
anning code is not
> considered a public API and so is likely to change quite a bit as we improve
> the optimizer. It's not currently something that we plan to expose for
> external components to modify.
>
> Michael
>
>
> On Sun, Mar 23, 2014 at 11:49 PM, Evan Chan wrote:
>
it out. We have backported several bug fixes into the 0.9 and updated JIRA
>>>
>>> accordingly<https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)>.
>>>
>>> Please let me know if there are fixes that were not backported but you
>>> would like to see them in 0.9.1.
>>>
>>> Thanks!
>>>
>>> TD
>>>
>>
Suhas,
You're welcome. We are planning to speak about the job server at the
Spark Summit by the way.
-Evan
On Mon, Mar 24, 2014 at 9:38 AM, Suhas Satish wrote:
> Thanks a lot for this update Evan , really appreciate the effort.
>
> On Monday, March 24, 2014, Evan Chan wro
Modifying* Spark's dependency graph...
>>
spark-contrib.
On Sat, Mar 22, 2014 at 6:15 PM, Suhas Satish wrote:
> Any plans of integrating SPARK-818 into spark trunk ? The pull request is
> open.
> It offers spark as a service with spark jobserver running as a separate
> process.
>
>
> Thanks,
> Suhas.
elease! You can submit the PR and we can merge
> it branch-0.9. If we have to cut another release, then we can include it.
>
>
>
> On Sun, Mar 23, 2014 at 11:42 PM, Evan Chan wrote:
>
>> I also have a really minor fix for SPARK-1057 (upgrading fastutil),
>> could that a
ver (and years of testing). Once SparkSQL graduates from Alpha
> status, it'll likely become the new backend for Shark.
rtant
>> > bug
>> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
>> are
>> > > going to cut a release candidate soon and we would love it if people
>> test
>> > > it out. We have backported several bug fixes into the 0.9 and updated
>> > JIRA
>> > > accordingly<
>> > >
>> >
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> > > >.
>> > > Please let me know if there are fixes that were not backported but you
>> > > would like to see them in 0.9.1.
>> > >
>> > > Thanks!
>> > >
>> > > TD
>> > >
>> >
>>
>
> For sure, we'll try to share it when we'll reach this point to deploy using
> marathon (should be planned for April)
>
> greetz and again, Nice Work Evan!
>
> Ndi
>
> On Wed, Mar 19, 2014 at 7:27 AM, Evan Chan wrote:
>
>> Andy,
>>
>> Yeah, w
o rebuild and deploy
>> spark manually.
>>
>> --
>> Nathan Kronenfeld
>> Senior Visualization Developer
>> Oculus Info Inc
>> 2 Berkeley Street, Suite 600,
>> Toronto, Ontario M5A 4J5
>> Phone: +1-416-203-3003 x 238
>> Email: nkronenf...@oculusinfo.com
> repo set-up for the 1.0 release.
>>
>> On Tue, Mar 18, 2014 at 11:28 PM, Evan Chan wrote:
>> > Matei,
>> >
>> > Maybe it's time to explore the spark-contrib idea again? Should I
>> > start a JIRA ticket?
>> >
>> > -Evan
>
Powered+By+Spark.
>
> Matei
>
> On Mar 18, 2014, at 1:51 PM, Evan Chan wrote:
>
>> Dear Spark developers,
>>
>> Ooyala is happy to announce that we have pushed our official, Spark
>> 0.9.0 / Scala 2.10-compatible, job server as a github repo:
>>
>> https
ews, Evan + Ooyala team: Great Job again.
>
> andy
>
> On Tue, Mar 18, 2014 at 11:39 PM, Henry Saputra
> wrote:
>
>> W00t!
>>
>> Thanks for releasing this, Evan.
>>
>> - Henry
>>
>> On Tue, Mar 18, 2014 at 1:51 PM, Evan Chan wrote:
>>
now closed.
Please have a look; pull requests are very welcome.
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Necessity-of-Maven-and-SBT-Build-in-Spark-tp2315p5682.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
t; more values inside then it cannot be also a value itself, i think. so this
>>> would work fine:
>>> spark.speculation.enabled=true
>>> spark.speculation.interval=0.5
>>>
>>> just a heads up. i would probably suggest we avoid this situation.
>>>
>>
>>
Evan
ame allocation for
> second RDD? (all 'a's from rdd2 going to the same machine where 'a's from
> first RDD went to).
>
> Is there a way to achieve this?
>
> Manoj
back with a
help email.
cron job to clean up old folders.
thanks,
-Evan
.org/jira/browse/SPARK
>>
>> Best,
>>
>> --
>> Nan Zhu
>>
>>
>> On Friday, February 28, 2014 at 2:29 PM, Evan Chan wrote:
>>
>> > Hey guys,
>> >
>> > There is no plan to move the Spark JIRA from the current
>> >
Hey guys,
There is no plan to move the Spark JIRA from the current
https://spark-project.atlassian.net/
right?
tely
> satisfactory SBT build from a Maven build would be quite challenging.)
>
>
> On Wed, Feb 26, 2014 at 11:34 AM, Evan Chan wrote:
>
>> Mark,
>>
>> No, I haven't tried this myself yet :-p Also I would expect that
>> sbt-pom-reader does not do assemblies at
etely.
>
> It's not completely obvious to me how to proceed with what sbt-pom-reader
> produces in order build the assemblies, run the test suites, etc., so I'm
> wondering if you have already worked out what that requires?
>
>
> On Wed, Feb 26, 2014 at 9:31 AM, Evan
any objections to using sbt or maven !
> Too many exclude versions, pinned versions, etc would just make things
> unmanageable in future.
>
>
> Regards,
> Mridul
>
>
>
>
> On Wed, Feb 26, 2014 at 8:56 AM, Evan chan wrote:
>> Actually you can control exactly h
>
>>
>> I was wondering actually, do you know if it's possible to added shaded
>> artifacts to the *spark jar* using this plug-in (e.g. not an uber
>> jar)? That's something I could see being really handy in the future.
>>
>> - Patrick
>>
ject that would allow this kind of thing?
>
> -Sandy
>
>
> On Tue, Feb 25, 2014 at 4:23 PM, Evan Chan wrote:
>
>> Hi Patrick,
>>
>> If you include shaded dependencies inside of the main Spark jar, such
>> that it would have combined classes from all depende
park-core_2.10/0.9.0-incubating/spark-core_2.10-0.9.0-incubating.jar
>
> On Tue, Feb 25, 2014 at 4:04 PM, Evan Chan wrote:
>> Patrick -- not sure I understand your request, do you mean
>> - somehow creating a shaded jar (eg with maven shader plugin)
>> - then including
ifacts to the *spark jar* using this plug-in (e.g. not an uber
> jar)? That's something I could see being really handy in the future.
>
> - Patrick
>
> On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan wrote:
>> The problem is that plugins are not equivalent. There is AFAIK no
>
rk clients. But I do agree to only
> keep one if there is a promising way to generate correct configuration from
> the other.
>
> -Shengzhe
>
>
> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan wrote:
>
>> The correct way to exclude dependencies in SBT is actually to d
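The mechanism presumably being referred to is SBT's exclusion rules; a minimal sketch (the coordinates and the excluded organization are illustrative, not a recommendation):

```scala
// build.sbt -- exclude a transitive dependency once, for all configurations,
// rather than hand-pinning versions everywhere
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating" excludeAll (
  ExclusionRule(organization = "org.eclipse.jetty") // illustrative exclusion
)
```

This keeps the exclusion declared next to the dependency that drags it in, which is easier to audit than a global override.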
other from different transitive dependencies.
>> >
>> > AFIAK we are only using the shade plug-in to deal with conflict
>> > resolution in the assembly jar. These are dealt with in sbt via the
>> > sbt assembly plug-in in an identical way. Is there a difference?
>>
>> I am bringing up the Sharder, because it is an awful hack which can't be
>> used in a real controlled deployment.
>>
>> Cos
>>
>> > [1]
>> https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
>>