Hello,
I wish to write a custom logical plan rule that modifies the output schema and
grows the logical plan. The purpose of the rule is roughly to apply a
projection on top of DatasourceV2Relation depending on some condition:
case class MyRule extends Rule[LogicalPlan] {
override def apply(
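A hedged sketch of what such a rule could look like (the condition and the kept columns are hypothetical; assumes Catalyst's `Rule`, `Project`, and `DataSourceV2Relation` classes):

```scala
import org.apache.spark.sql.catalyst.expressions.NamedExpression
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation

case class MyRule() extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Wrap each V2 relation in a Project when some condition holds,
    // narrowing the relation's output schema to the columns we keep.
    case r: DataSourceV2Relation if shouldProject(r) =>
      val kept: Seq[NamedExpression] = r.output.filterNot(_.name.startsWith("_"))
      Project(kept, r)
  }

  // Hypothetical condition; replace with the real predicate.
  private def shouldProject(r: DataSourceV2Relation): Boolean =
    r.output.exists(_.name.startsWith("_"))
}
```

The rule would then be registered with the session's extensions (e.g. via `injectOptimizerRule` or `injectResolutionRule`, depending on which phase should see the modified schema).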
Congratulation to both.
Holden, we need to catch up.
Chester Chen
Senior Manager – Data Science & Engineering
3000 Clearview Way
San Mateo, CA 94402
From: Felix Cheung
Date: Tuesday, January 24, 2017 at 1:20 PM
To: Reynold Xin , "dev@spark.a
vote for Option 1.
1) Since 2.0 is a major release, we are expecting some API changes.
2) It helps long-term code base maintenance, with short-term pain on the Java
side.
3) Not quite sure how large the code base using the Java DataFrame APIs is.
On Thu, Feb 25, 2016 at 3:23 PM, Reynold Xin wrote:
>
Jerry
I thought you should not create more than one SparkContext within one JVM,
...
Chester
Sent from my iPhone
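For context, a minimal sketch of the constraint mentioned above (assumes Spark 1.x; the escape-hatch flag shown existed then but was discouraged and unsupported):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// First SparkContext in this JVM: fine.
val sc = new SparkContext(
  new SparkConf().setMaster("local[*]").setAppName("first"))

// A second active SparkContext in the same JVM throws by default;
// in Spark 1.x it could be permitted (but remained unsupported) via
// spark.driver.allowMultipleContexts=true.
// val sc2 = new SparkContext(...)  // SparkException unless that flag is set

sc.stop()
```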
> On Dec 20, 2015, at 2:59 PM, Jerry Lam wrote:
>
> Hi Spark developers,
>
> I found that SQLContext.getOrCreate(sc: SparkContext) does not behave
>
For the 2nd use case, can you save the result for the first 29 days, then just get
the last day's result and add it yourself? This can be done outside of Spark. Does
that work for you?
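The incremental idea above can be sketched as follows (all numbers hypothetical):

```scala
// Rolling 30-day total, computed incrementally outside Spark:
// reuse the saved 29-day aggregate and add only the newest day,
// instead of recomputing the full 30-day window.
val first29DayTotal = 1234.0 // loaded from the previously saved result
val lastDayTotal    = 56.0   // computed from the latest day only
val rolling30DayTotal = first29DayTotal + lastDayTotal
```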
Sent from my iPad
> On Nov 25, 2015, at 9:46 PM, Sachith Withana wrote:
>
> Hi folks!
>
> I'm wondering if Sparks
-cluster spark 1.3.1, but could not get spark
1.5.1 started. We upgraded the client to CDH 5.4, and then everything worked.
There are API changes between Apache Hadoop 2.4 and 2.6; not sure you can mix
and match them.
Chester
On Fri, Nov 20, 2015 at 1:59 PM, Sandy Ryza wrote:
> To answer your fourth quest
. Company will have enough
time to upgrade cluster.
+1 for me as well
Chester
Sent from my iPad
> On Nov 19, 2015, at 2:14 PM, Reynold Xin wrote:
>
> I proposed dropping support for Hadoop 1.x in the Spark 2.0 email, and I
> think everybody is for that.
>
> https://issues.apac
+1
Tested against CDH 5.4.2 with Hadoop 2.6.0 using yesterday's code,
built locally.
Regression tests were run in YARN cluster mode against a few internal ML
algorithms (logistic regression, linear regression, random forest, and statistics
summary) as well as MLlib KMeans. All seem to work fine.
Chester
O
Thanks for the ticket.
Chester
On Thu, Oct 22, 2015 at 1:15 PM, Steve Loughran
wrote:
>
> On 22 Oct 2015, at 19:32, Chester Chen wrote:
>
> Steven
> You summarized it mostly correctly. But there are a couple of points I want
> to emphasize.
>
> Not eve
's hive-exec
and org.apache.hadoop.hive hive-exec behave differently for the same
method.
Chester
On Thu, Oct 22, 2015 at 10:18 AM, Charmee Patel wrote:
> A similar issue occurs when interacting with Hive secured by Sentry.
> https://issues.apache.org/jira/browse/SPARK-904
back and estimation by doing this.
This is a bit off the original topic.
I still think there is a bug related to the spark yarn client in case of
Kerberos + spark hive-exec dependency.
Chester
Sent from my iPad
> On Oct 22, 2015, at 12:05 AM, Doug Balog wrote:
>
>
>
hadoop cluster). The job submission actually failed in the client side.
Currently we get around this by replacing Spark's hive-exec with the
Apache hive-exec.
Chester
On Wed, Oct 21, 2015 at 5:27 PM, Doug Balog wrote:
> See comments below.
>
> > On Oct 21, 2015, at 5:33 PM, Ches
found " + e); return }
case e: Exception => { logError("Unexpected Exception " + e)
throw new RuntimeException("Unexpected exception", e)
}
}
}
thanks
Chester
, 2015 at 2:44 PM, Ted Yu wrote:
> Have you set MAVEN_OPTS with the following ?
> -Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m
>
> Cheers
>
> On Sat, Oct 17, 2015 at 2:35 PM, Chester Chen
> wrote:
>
>> I was using jdk 1.7 and maven version is the same
skip, with mvn build,
it fails with
[ERROR] PermGen space -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging
I am giving up on this. Just using 1.5.2-SNAPSHOT for now.
Che
ncies path:
[warn] org.apache.spark:spark-network-common_2.10:1.5.1
((com.typesafe.sbt.pom.MavenHelper) MavenHelper.scala#L76)
[warn] +- org.apache.spark:spark-network-shuffle_2.10:1.5.1
[info] Packaging
/Users/chester/projects/alpine/apache/spark/launcher/target/scala-2.10/spark-launcher_2.10
release would build from
> 1.5.0 before moving to 1.5.1. Are you saying the 1.5.0 rc3 could build from
> 1.5.1 snapshot during release ? Or 1.5.0 rc3 would build from the last
> commit of 1.5.0 (before changing to 1.5.1 snapshot) ?
> >>>
> >>>
> >>>
>
s correct for the 1.5 branch, right? this doesn't mean that the
>>> next RC would have this value. You choose the release version during
>>> the release process.
>>>
>>>> On Tue, Sep 1, 2015 at 2:40 AM, Chester Chen wrote:
>>>> Seems that Githu
On Sep 1, 2015, at 1:52 AM, Sean Owen wrote:
>
> That's correct for the 1.5 branch, right? this doesn't mean that the
> next RC would have this value. You choose the release version during
> the release process.
>
>> On Tue, Sep 1, 2015 at 2:40 AM, Chester Chen
It seems that the GitHub branch-1.5 has already changed the version to
1.5.1-SNAPSHOT.
I am a bit confused: are we still on 1.5.0 RC3, or are we on 1.5.1?
Chester
On Mon, Aug 31, 2015 at 3:52 PM, Reynold Xin wrote:
> I'm going to -1 the release myself since the issue @yhuai identified is
Ashish and Steve
I am also working on long-running YARN Spark jobs, and am just starting to
focus on failure recovery. This thread of discussion is really helpful.
Chester
On Fri, Aug 28, 2015 at 12:53 AM, Ashish Rawat
wrote:
> Thanks Steve. I had not spent many brain cycles on analysing
Congratulations to All.
DB and Sandy, great works !
On Wed, Jun 17, 2015 at 3:12 PM, Matei Zaharia
wrote:
> Hey all,
>
> Over the past 1.5 months we added a number of new committers to the
> project, and I wanted to welcome them now that all of their respective
> forms, accounts, etc are in. J
I put the design requirements and description in the commit comment, so I
will close the PR. Please refer to the following commit:
https://github.com/AlpineNow/spark/commit/5b336bbfe92eabca7f4c20e5d49e51bb3721da4d
On Mon, May 25, 2015 at 3:21 PM, Chester Chen wrote:
> All,
> I have c
helps the discussion
Chester
On Fri, May 22, 2015 at 10:55 AM, Kevin Markey
wrote:
> Thanks. We'll look at it.
> I've sent another reply addressing some of your other comments.
> Kevin
>
>
> On 05/22/2015 10:27 AM, Marcelo Vanzin wrote:
>
> Hi Kevin,
>
&g
uster
or error messages directly in the application log.
I will put a design doc and actual code in my pull request later, as
Andrew requested. This PR is unlikely to get merged in, but it will show the
idea I am talking about here.
Thanks for listening and responding
Che
.
Thanks
Chester
Sent from my iPhone
> On May 13, 2015, at 7:22 PM, Patrick Wendell wrote:
>
> Hey Chester,
>
> Thanks for sending this. It's very helpful to have this list.
>
> The reason we made the Client API private was that it was never
> intende
nning job with additional spark
commands and interactions via this channel.
Chester
Sent from my iPad
On May 12, 2015, at 20:54, Patrick Wendell wrote:
> Hey Kevin and Ron,
>
> So is the main shortcoming of the launcher library the inability to
> get an app
Sounds like you are in Yarn-Cluster mode.
I created a JIRA SPARK-3913
<https://issues.apache.org/jira/browse/SPARK-3913> and PR
https://github.com/apache/spark/pull/2786
Is this what you are looking for?
Chester
On Sat, May 2, 2015 at 10:32 PM, Yijie Shen
wrote:
> Hi,
>
> I
these Yarn
Client related classes private? Is there any possibility of making these Client
classes non-private?
thanks
Chester
can you just replace "Duration.Inf" with a shorter duration ? how about
import scala.concurrent.duration._
val timeout = new Timeout(10 seconds)
Await.result(result.future, timeout.duration)
or
val timeout = new FiniteDuration(10, TimeUnit.SECONDS)
Await.result(result.future, timeout)
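A self-contained version of the suggestion, using plain scala.concurrent (no Akka `Timeout` needed):

```scala
import java.util.concurrent.TimeUnit
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val result = Future { 21 * 2 }

// Bounded wait instead of Duration.Inf; throws a TimeoutException
// if the future does not complete within the window.
val a = Await.result(result, 10.seconds)

// Equivalent with an explicit FiniteDuration:
val b = Await.result(result, FiniteDuration(10, TimeUnit.SECONDS))
```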
Reynold,
Prof Canny gave me the slides yesterday. I will post the link to the
slides on both the SF Big Analytics and SF Machine Learning meetups.
Chester
Sent from my iPad
On Mar 12, 2015, at 22:53, Reynold Xin wrote:
> Thanks for chiming in, John. I missed your meetup last night -
Just in case you are in San Francisco, we are having a meetup by Prof John
Canny
http://www.meetup.com/SF-Big-Analytics/events/220427049/
Chester
Maybe you can ask Prof John Canny himself :-) as I invited him to give a talk
at Alpine Data Labs at March's meetup (SF Big Analytics & SF Machine Learning
joint meetup), 3/11. To be announced in the next day or so.
Chester
Sent from my iPhone
> On Feb 9, 2015, at 4:48 PM, "
gen-idea should work. I use it all the time. But use the approach that works
for you.
Sent from my iPad
On Nov 18, 2014, at 11:12 PM, "Yiming \(John\) Zhang" wrote:
> Hi Chester, thank you for your reply. But I tried this approach and it
> failed. It seems that there are
For sbt
You can simply run
sbt/sbt gen-idea
to generate the IntelliJ IDEA project modules for you. You can then just open the
generated project, which includes all the needed dependencies.
Sent from my iPhone
> On Nov 18, 2014, at 8:26 PM, Chen He wrote:
>
> Thank you Yiming. It is helpful.
ActorSystem; if it returns with the
correct identity message, then you can act on it; otherwise, ...
Hope this helps.
Chester
On Wed, Oct 15, 2014 at 1:38 PM, Matthew Cheah
wrote:
> What's happening when I do this is that the Worker tries to get the Master
> actor by calling context.act
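A sketch of the identify-then-act pattern being suggested (assumes classic Akka actors; the path and messages are hypothetical):

```scala
import akka.actor.{Actor, ActorIdentity, ActorRef, Identify}

// Resolve a remote peer by path, then act only once the
// ActorIdentity reply confirms it actually exists.
class Worker(masterPath: String) extends Actor {
  override def preStart(): Unit =
    context.actorSelection(masterPath) ! Identify("master")

  def receive: Receive = {
    case ActorIdentity("master", Some(ref: ActorRef)) =>
      ref ! "register" // peer found: act on it
    case ActorIdentity("master", None) =>
      // peer not found: retry, back off, or give up
      context.stop(self)
  }
}
```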
We were using it until recently; we are talking to our customers to see if
we can get off it.
Chester
Alpine Data Labs
On Tue, Sep 9, 2014 at 10:59 AM, Sean Owen wrote:
> FWIW consensus from Cloudera folk seems to be that there's no need or
> demand on this end for YARN alpha.
t. So
far it does what we want.
Hope this helps
Chester
Sent from my iPhone
> On Aug 29, 2014, at 2:36 AM, Archit Thakur wrote:
>
> including u...@spark.apache.org.
>
>
>> On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur
>> wrote:
>> Hi,
>>
>&
Mridul,
Thanks for the suggestion.
I just updated the build today and changed the yarn/alpha/pom.xml to
1.1.1-SNAPSHOT
then the command worked.
I will create a JIRA and PR for it.
Chester
On Thu, Aug 21, 2014 at 8:03 AM, Chester @work
wrote:
> Do we have Jenkins te
been updated properly in
> the 1.1 branch.
>
> Just change version to '1.1.1-SNAPSHOT' for yarn/alpha/pom.xml (to
> make it same as any other pom).
>
>
> Regards,
> Mridul
>
>
>> On Thu, Aug 21, 2014 at 5:09 AM, Chester Chen wrote:
>> I just upda
Just tried on master branch, and the master branch works fine for yarn-alpha
On Wed, Aug 20, 2014 at 4:39 PM, Chester Chen wrote:
> I just updated today's build and tried branch-1.1 for both yarn and
> yarn-alpha.
>
> For yarn build, this command seem to work fine.
&
ideas
Chester
|branch-1.1|$ sbt/sbt -Pyarn-alpha -Dhadoop.version=2.0.5-alpha projects
Using /Library/Java/JavaVirtualMachines/1.6.0_51-b11-457.jdk/Contents/Home
as default JAVA_HOME.
Note, this will be overridden by -java-home if it is set.
[info] Loading project definition from
/Users/ch
Works for me as well:
git branch
branch-0.9
branch-1.0
* master
Chesters-MacBook-Pro:spark chester$ git pull --rebase
remote: Counting objects: 578, done.
remote: Compressing objects: 100% (369/369), done.
remote: Total 578 (delta 122), reused 418 (delta 71)
Receiving objects: 100
citly. In fact
> I think you can just call to ClientBase for this? PR it, I say.
>
> On Thu, Jul 17, 2014 at 3:24 PM, Chester Chen
> wrote:
> > val knownDefMRAppCP: Seq[String] =
> > getFieldValue[String, Seq[String]](classOf[MRJobConfig],
> >
> &
the compile error, and are you setting yarn.version? the
> > default is to use hadoop.version, but that defaults to 1.0.4 and there
> > is no such YARN.
> >
> > Unless I missed it, I only see compile errors in yarn-stable, and you
> > are trying to compile vs YARN
ers/chester/projects/spark/
[info] assembly
[info] bagel
[info] catalyst
[info] core
[info] examples
[info] graphx
[info] hive
[info] mllib
[info] oldDeps
[info] repl
[info] spark
[info] sql
[info] streaming
[info] streaming-flume
[i
Hmm
looks like a Build script issue:
I run the command with :
sbt/sbt clean yarn/test:compile
but errors came from
[error] 40 errors found
[error] (yarn-stable/compile:compile) Compilation failed
Chester
On Wed, Jul 16, 2014 at 5:18 PM, Chester Chen wrote:
> Hi, Sandy
>
>
OME.
Note, this will be overridden by -java-home if it is set.
[info] Loading project definition from
/Users/chester/projects/spark/project/project
[info] Loading project definition from
/Users/chester/.sbt/0.13/staging/ec3aa8f39111944cc5f2/sbt-pom-reader/project
[warn] Multiple resolvers havin
Sung Chung from Alpine Data Labs presented the random forest implementation at
Spark Summit 2014. The work will be open sourced and contributed back to MLlib.
Stay tuned.
Sent from my iPad
On Jul 11, 2014, at 6:02 AM, Egor Pahomov wrote:
> Hi, I have intern, who wants to implement some ML
ctive query jobs.
This gives me something to start with. I will try with Akka first. Will
let the community know once we get somewhere.
thanks
Chester
On Sun, Jun 29, 2014 at 11:07 PM, Reynold Xin wrote:
> This isn't exactly about Spark itself, more about how an application on
>
a approach ? Alternatives ?
* Is there a way to get Spark's Akka host and port from the YARN Resource
Manager to the YARN client?
Any suggestions welcome
Thanks
Chester
Based on the Typesafe Config maintainer's response, with the latest version of
Typesafe Config, the double quotes are no longer needed for a key like
spark.speculation, so you don't need code to strip the quotes.
Chester
Alpine data labs
Sent from my iPhone
On Mar 12, 2014, at 2:50 PM, Aaron David
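The dotted-key behavior described above can be checked with a short snippet (assumes the `com.typesafe:config` library on the classpath):

```scala
import com.typesafe.config.ConfigFactory

// In Typesafe Config, dots form a path: "spark.speculation" parses as
// spark -> speculation, and can be read back by the same dotted path
// without quoting the whole key.
val conf = ConfigFactory.parseString("spark.speculation = true")
println(conf.getBoolean("spark.speculation")) // true
```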
with the same Build.scala.
We have been using this setup for the last 6 months. The build includes different
versions of Hadoop as well as Spark. Hope this helps.
Chester
Sent from my iPhone
On Feb 25, 2014, at 4:36 PM, Sandy Ryza wrote:
> To perhaps restate what some have said, Maven is by