Is there anything similar to the s3 connector for Google Cloud Storage?
Since Google Cloud Storage is also an object store rather than a file
system, I imagine the same problem that the s3 connector is trying to solve
arises with Google Cloud Storage as well.
Thanks,
rishi
Found another bug with case preservation of column names in persistent
views. This regression was introduced in 2.2.
https://issues.apache.org/jira/browse/SPARK-21150
Thanks,
Xiao
2017-06-19 8:03 GMT-07:00 Liang-Chi Hsieh :
>
> I mean it is not a bug that was fixed before this feature was added.
Hi all,
The Column class has members named isPartition and isBucket.
What is the meaning of these, please?
And when should they be set to true?
Thank you in advance.
Fei Shao
Hi ,
I have submitted a JIRA for this issue.
The link is
https://issues.apache.org/jira/browse/SPARK-21147
thanks
Fei Shao
---Original---
From: "Michael Armbrust"
Date: 2017/6/20 03:06:49
To: "??"<1427357...@qq.com>;
Cc: "user";"dev";
Subject: Re: the scheme in stream reader
The socket source can't know how to parse your data.
i've updated the two ubuntu workers (amp-jenkins-staging-01 and -02),
and am still twiddling my thumbs and waiting for centos packages to be
released.
i'm guessing we'll have those some time today, and will update everyone then.
On Mon, Jun 19, 2017 at 11:02 AM, shane knapp wrote:
> ok, we're in
There is a little bit of weirdness to how we override the default query
planner to replace it with an incrementalizing planner. As such, calling
any operation that changes the query plan (such as a LIMIT) would cause it
to revert to the batch planner and return the wrong answer. We should fix
this.
The socket source can't know how to parse your data. I think the right
thing would be for it to throw an exception saying that you can't set the
schema here. Would you mind opening a JIRA ticket?
If you are trying to parse data from something like JSON then you should
use `from_json` on the value.
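To illustrate the suggestion above, here is a minimal sketch of parsing JSON carried in a `value` column with `from_json`. The schema and sample record are made up for the example, and a batch DataFrame stands in for the socket stream:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").appName("from_json-demo").getOrCreate()
import spark.implicits._

// Hypothetical schema for the JSON payload in the `value` column.
val schema = new StructType().add("name", StringType).add("age", IntegerType)

// The socket source yields a single string column named `value`;
// instead of setting a schema on the reader, parse the string per row.
val lines = Seq("""{"name":"alice","age":30}""").toDF("value")
val parsed = lines.select(from_json($"value", schema).as("data")).select("data.*")
parsed.show()
```

The same `select(from_json(...))` applies unchanged to the streaming DataFrame returned by `spark.readStream.format("socket")`.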
ok, we're in a holding pattern as the centos packages haven't been released yet.
once they're out i'll update this thread and start rebooting.
On Mon, Jun 19, 2017 at 10:52 AM, shane knapp wrote:
> jenkins is affected:
>
> https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
> https://access.redhat.com/security/vulnerabilities/stackguard
jenkins is affected:
https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
https://access.redhat.com/security/vulnerabilities/stackguard
i'm shutting down jenkins, applying patches and rebooting immediately.
ETA unknown. hopefully quick. i'll update here when i find out.
I responded on the ticket.
On Mon, Jun 19, 2017 at 2:36 AM, Sean Owen wrote:
> Just wanted to call attention to this question, mostly because I'm curious:
> https://github.com/apache/spark/pull/18343#issuecomment-309388668
>
> Why is Externalizable (+ KryoSerializable) used instead of Serializable?
Hi all,
I am playing around with structured streaming and looked at the code for
ConsoleSink.
I see the code has:
data.sparkSession.createDataFrame(
  data.sparkSession.sparkContext.parallelize(data.collect()), data.schema)
  .show(numRowsToShow, isTruncated)
}
I was wondering why it does
I agree, the problem is that Spark is trying to be safe and avoid the
direct committer. We also modify Spark to avoid its logic. We added a
property that causes Spark to always use the output committer if the
destination is in S3.
Our committers are also slightly different and will get an AmazonS3
I mean it is not a bug that was fixed before this feature was added. Of
course the kryo serializer with 2000+ partitions was working before this
feature.
Koert Kuipers wrote
> If a feature added recently breaks using kryo serializer with 2000+
> partitions then how can it not be a regression? I mean I u
If a feature added recently breaks using kryo serializer with 2000+
partitions then how can it not be a regression? I mean I use kryo with more
than 2000 partitions all the time, and it worked before. Or was I simply
not hitting this bug because there are other conditions that also need to
be satisfied?
Just wanted to call attention to this question, mostly because I'm curious:
https://github.com/apache/spark/pull/18343#issuecomment-309388668
Why is Externalizable (+ KryoSerializable) used instead of Serializable?
and should the first two always go together?
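For readers unfamiliar with the distinction the question raises: `Serializable` is a marker interface where the JVM serializes fields by reflection, while `Externalizable` lets the class write and read its own bytes (and requires a public no-arg constructor). A minimal round-trip sketch, with toy classes invented for the example:

```scala
import java.io._

// Default Java serialization: fields written by reflection.
case class Point(x: Int, y: Int) extends Serializable

// Externalizable: the class controls exactly what is written,
// avoiding reflection at the cost of a required no-arg constructor.
class ExtPoint(var x: Int, var y: Int) extends Externalizable {
  def this() = this(0, 0) // required by Externalizable deserialization
  override def writeExternal(out: ObjectOutput): Unit = { out.writeInt(x); out.writeInt(y) }
  override def readExternal(in: ObjectInput): Unit = { x = in.readInt(); y = in.readInt() }
}

// Serialize to bytes and back, for either flavor.
def roundTrip[T](obj: T): T = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(obj)
  oos.close()
  new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray)).readObject().asInstanceOf[T]
}

val p = roundTrip(Point(1, 2))
val e = roundTrip(new ExtPoint(3, 4))
println(p)                 // Point(1,2)
println(s"${e.x},${e.y}")  // 3,4
```

`KryoSerializable` plays the analogous hand-written-bytes role for Kryo, which is presumably why the PR pairs it with `Externalizable`.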