I noticed today that our data types APIs (org.apache.spark.sql.types) are
actually DeveloperApis, which means they can be changed from one feature
release to another. In reality these APIs have been there since the
original introduction of the DataFrame API in Spark 1.3, and have not seen
any breaki
Hi All,
I am running some spark scala code on zeppelin on CDH 5.5.1 (Spark version
1.5.0). I customized the Spark interpreter to use org.apache.spark.
serializer.KryoSerializer as spark.serializer. And in the dependency I
added Kryo-3.0.3 as follows:
com.esotericsoftware:kryo:3.0.3
When I wro
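For reference, a minimal sketch of how Kryo is typically wired up in Spark 1.x; the app name and the registered class here are hypothetical placeholders, not from the original message:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// MyRecord is a hypothetical application class standing in for
// whatever is actually being serialized.
case class MyRecord(id: Long, name: String)

val conf = new SparkConf()
  .setAppName("kryo-example")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registering classes up front avoids Kryo writing full class names.
  .registerKryoClasses(Array(classOf[MyRecord]))
val sc = new SparkContext(conf)
```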
Hi, Nico,
It sounds like you hit a bug in Phoenix Connector. Our general JDBC
connector already fixed it, I think.
Thanks,
Xiao
2016-10-10 15:29 GMT-07:00 Nico Pappagianis :
> Hi Xiao, when I try that it gets past spark's sql parser then errors out
> at the phoenix sql parser.
>
> org.apache.p
I think it is really important to ensure that someone with a good
understanding of Kafka is empowered with a formal voice around this
component - but I don't have much dev experience with our Kafka
connectors, so I can't speak to the specifics personally.
More generally, I also fee
If someone wants to tell me that it's OK and "The Apache Way" for
Kafka and Flink to have a proposal process that ends in a lazy
majority, but it's not OK for Spark to have a proposal process that
ends in a non-lazy consensus...
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+P
There is a larger issue to keep in mind, and that is that what you are
proposing is a procedure that, as far as I am aware, hasn't previously been
adopted in an Apache project, and thus is not an easy or exact fit with
established practices that have been blessed as "The Apache Way". As such,
we n
I'm not a fan of the SEP acronym. Besides its prior established meaning of
"Somebody else's problem", there are other inappropriate or offensive
connotations such as this Australian slang that often gets shortened to
just "sep": http://www.urbandictionary.com/define.php?term=Seppo
On Sun, Oct 9, 20
If I'm correctly understanding the kind of voting that you are talking
about, then to be accurate, it is only the PMC members that have a vote,
not all committers:
https://www.apache.org/foundation/how-it-works.html#pmc-members
On Mon, Oct 10, 2016 at 12:02 PM, Cody Koeninger wrote:
> I think th
Hi, Nico,
We use back ticks to quote it. For example,
CUSTOM_ENTITY.`z02`
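Spelled out in a Spark 1.x call, that would look roughly like this (the query itself is illustrative, only the back-tick quoting is the point):

```scala
// Back ticks preserve the case-sensitive identifier that Phoenix
// writes with double quotes; sqlContext is the usual Spark 1.x entry point.
val df = sqlContext.sql("""SELECT * FROM CUSTOM_ENTITY.`z02`""")
```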
Thanks,
Xiao Li
2016-10-10 12:49 GMT-07:00 Nico Pappagianis :
> Hello,
>
> *Some context:*
> I have a Phoenix tenant-specific view named CUSTOM_ENTITY."z02" (Phoenix
> tables can have quotes to specify case-sensitivity)
Hello,
*Some context:*
I have a Phoenix tenant-specific view named CUSTOM_ENTITY."z02" (Phoenix
tables can have quotes to specify case-sensitivity). I am attempting to
write to this table using Spark via a scala script. I am performing the
following read successfully:
val table = """CUSTOM_ENTITY
Updated on github,
https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
I believe I've touched on all feedback with the exception of naming,
and API vs Strategy.
Do we want a straw poll on naming?
Matei, are your concerns about api vs strategy addressed if we add a
This is an interesting process proposal; I think it could work well.
-It's got the flavour of the ASF incubator; maybe some of the processes there:
mentor, regular reporting in could help, in particular, help stop the -1 at the
end of the work
-it may also aid collaboration to have a medium live
Agreed with this. As I said before regarding who submits: it's not a normal ASF
process to require contributions to only come from committers. Committers are
of course the only people who can *commit* stuff. But the whole point of an
open source project is that anyone can *contribute* -- indeed,
That seems reasonable to me.
I do not want to see lazy consensus used on one of these proposals
though, I want a clear outcome, i.e. call for a vote, wait at least 72
hours, get three +1s and no vetoes.
On Mon, Oct 10, 2016 at 2:15 PM, Ryan Blue wrote:
> Proposal submission: I think we should k
Proposal submission: I think we should keep this as open as possible. If
there is a problem with too many open proposals, then we should tackle that
as a fix rather than excluding participation. Perhaps it will end up that
way, but I think it's worth trying a more open model first.
Majority vs con
Funny, someone from my team talked to me about that idea yesterday.
We use SparkLauncher, but it just calls spark-submit, which calls other
scripts that start a new Java program that tries to submit (in our case in
cluster mode - the driver is started in the Spark cluster) and exit.
That makes it a chall
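For anyone landing on this thread, a minimal SparkLauncher sketch; the jar path, main class, and master URL below are hypothetical placeholders:

```scala
import org.apache.spark.launcher.SparkLauncher

// SparkLauncher drives the same code path as spark-submit,
// returning a handle to the child process.
val process = new SparkLauncher()
  .setAppResource("/path/to/app.jar")      // placeholder jar
  .setMainClass("com.example.MyApp")       // placeholder class
  .setMaster("spark://host:7077")          // placeholder master
  .setDeployMode("cluster")
  .launch()
process.waitFor()
```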
I think this is closer to a procedural issue than a code modification
issue, hence why majority. If everyone thinks consensus is better, I
don't care. Again, I don't feel strongly about the way we achieve
clarity, just that we achieve clarity.
On Mon, Oct 10, 2016 at 2:02 PM, Ryan Blue wrote:
>
Sorry, I missed that the proposal includes majority approval. Why majority
instead of consensus? I think we want to build consensus around these
proposals and it makes sense to discuss until no one would veto.
rb
On Mon, Oct 10, 2016 at 11:54 AM, Ryan Blue wrote:
> +1 to votes to approve propos
I think the main value is in being honest about what's going on. No
one other than committers can cast a meaningful vote, that's the
reality. Beyond that, if people think it's more open to allow formal
proposals from anyone, I'm not necessarily against it, but my main
question would be this:
If
+1 to votes to approve proposals. I agree that proposals should have an
official mechanism to be accepted, and a vote is an established means of
doing that well. I like that it includes a period to review the proposal
and I think proposals should have been discussed enough ahead of a vote to
surviv
Just folks who don't want to use spark-submit, no real use-cases I've seen
yet.
I didn't know about SparkLauncher myself and I don't think there are any
official docs on that or launching spark as an embedded library for tests.
On Mon, Oct 10, 2016 at 11:09 AM Matei Zaharia
wrote:
> What are th
What are the main use cases you've seen for this? Maybe we can add a page to
the docs about how to launch Spark as an embedded library.
Matei
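Until such a docs page exists, a common pattern for embedding Spark locally (e.g. in tests) looks roughly like this; the app name is arbitrary:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Run Spark entirely inside the current JVM with a local master;
// no spark-submit involved. Suitable for tests, not for cluster mode.
val conf = new SparkConf()
  .setMaster("local[2]")
  .setAppName("embedded-test")
val sc = new SparkContext(conf)
try {
  val sum = sc.parallelize(1 to 10).reduce(_ + _)
  println(sum) // 55
} finally {
  sc.stop() // always release the context
}
```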
> On Oct 10, 2016, at 10:21 AM, Russell Spitzer
> wrote:
>
> I actually had not seen SparkLauncher before, that looks pretty great :)
>
> On Mon, Oct
I actually had not seen SparkLauncher before, that looks pretty great :)
On Mon, Oct 10, 2016 at 10:17 AM Russell Spitzer
wrote:
> I'm definitely only talking about non-embedded uses here as I also use
> embedded Spark (cassandra, and kafka) to run tests. This is almost always
> safe since every
I'm definitely only talking about non-embedded uses here as I also use
embedded Spark (cassandra, and kafka) to run tests. This is almost always
safe since everything is in the same JVM. It's only once we get to
launching against a real distributed env do we end up with issues.
Since Pyspark uses
I have also 'embedded' a Spark driver without much trouble. It isn't that
it can't work.
The Launcher API is probably the recommended way to do that though.
spark-submit is the way to go for non programmatic access.
If you're not doing one of those things and it is not working, yeah I think
peopl
I've done this for some pyspark stuff. I didn't find it especially
problematic.
On Mon, Oct 10, 2016 at 12:58 PM, Reynold Xin wrote:
> How are they using it? Calling some main function directly?
>
>
> On Monday, October 10, 2016, Russell Spitzer
> wrote:
>
>> I've seen a variety of users attemp
How are they using it? Calling some main function directly?
On Monday, October 10, 2016, Russell Spitzer
wrote:
> I've seen a variety of users attempting to work around using Spark Submit
> with at best middling levels of success. I think it would be helpful if the
> project had a clear statemen
I've seen a variety of users attempting to work around using Spark Submit
with at best middling levels of success. I think it would be helpful if the
project had a clear statement that submitting an application without using
Spark Submit is truly for experts only or is unsupported entirely.
I know
Hi all,
I have a Spark job that takes about an hour to run. In the end it completes
all the tasks, then the job just hangs and does nothing (it writes to S3 as
the last step, which also gets completed; all files appear on S3).
any ideas how to debug this?
see the thread dump below:
"Attach Lis
Hi All
Is there any way to schedule the ever-running Spark job in such a way that
it comes back up on its own after the cluster maintenance?
--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net
Yes, users suggesting SIPs is a good thing and is explicitly called
out in the linked document under the Who? section. Formally proposing
them, not so much, because of the political realities.
Yes, implementation strategy definitely affects goals. There are all
kinds of examples of this, I'll pi
Hi
I use gradle and I don't think it really has "provided", but I was able to
google and create the following file; the same error still persists.
group 'com.company'
version '1.0-SNAPSHOT'
apply plugin: 'java'
apply plugin: 'idea'
repositories {
    mavenCentral()
    mavenLocal()
}
configurations {
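For what it's worth, Gradle 2.12+ has a built-in `compileOnly` configuration that behaves like Maven's `provided` scope, so a custom configuration may not be needed. A sketch, where the Spark artifact coordinates are an assumption about what is being depended on:

```groovy
// build.gradle sketch: compileOnly keeps the dependency off the
// runtime classpath, mirroring Maven's "provided" scope.
apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    // Example coordinates only; substitute your actual Spark version.
    compileOnly 'org.apache.spark:spark-core_2.10:1.5.0'
}
```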
Yes I agree. I'm not sure how important this is anyway. It's a little
annoying but easy to work around.
On Mon, 10 Oct 2016 at 09:01 Reynold Xin wrote:
> I just took a quick look and set a target version on the JIRA. But Pete I
> think the primary problem with the JIRA and pull request is that i
I just took a quick look and set a target version on the JIRA. But Pete I
think the primary problem with the JIRA and pull request is that it really
just argues for (or implements) opening up a private API, which is a valid
point, but there is a lot more that needs to be done before making some
private