Is there anything similar to the s3 connector for Google Cloud Storage?
Since Google Cloud Storage is also an object store rather than a file
system, I imagine the same problem that the s3 connector is trying to solve
arises with Google Cloud Storage as well.
Thanks,
rishi
Found another bug with case preservation of column names in persistent
views. This regression was introduced in 2.2.
https://issues.apache.org/jira/browse/SPARK-21150
Thanks,
Xiao
2017-06-19 8:03 GMT-07:00 Liang-Chi Hsieh :
>
> I mean it is not a bug that was fixed before this feature was added.
Hi all,
The Column class has members named isPartition and isBucket.
What is the meaning of these, please?
And when should they be set to true?
Thank you in advance.
Fei Shao
Hi ,
I have submitted a JIRA for this issue.
The link is
https://issues.apache.org/jira/browse/SPARK-21147
thanks
Fei Shao
---Original---
From: "Michael Armbrust"
Date: 2017/6/20 03:06:49
To: "??"<1427357...@qq.com>;
Cc: "user";"dev";
Subject: Re: the scheme in stream reader
The socket source can't know how to parse your data.
i've updated the two ubuntu workers (amp-jenkins-staging-01 and -02),
and am still twiddling my thumbs and waiting for centos packages to be
released.
i'm guessing we'll have those some time today, and will update everyone then.
On Mon, Jun 19, 2017 at 11:02 AM, shane knapp wrote:
> ok, we're in
There is a little bit of weirdness to how we override the default query
planner to replace it with an incrementalizing planner. As such, calling
any operation that changes the query plan (such as a LIMIT) would cause it
to revert to the batch planner and return the wrong answer. We should fix
this.
The socket source can't know how to parse your data. I think the right
thing would be for it to throw an exception saying that you can't set the
schema here. Would you mind opening a JIRA ticket?
If you are trying to parse data from something like JSON then you should
use `from_json` on the value.
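To illustrate the suggestion above, here is a minimal sketch of parsing JSON carried in a `value` column with `from_json`. The schema and sample record are made up for the example, and a batch DataFrame stands in for the socket stream:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").appName("from_json-demo").getOrCreate()
import spark.implicits._

// Hypothetical schema for the JSON payload in the `value` column.
val schema = new StructType().add("name", StringType).add("age", IntegerType)

// The socket source yields a single string column named `value`;
// instead of setting a schema on the reader, parse the string per row.
val lines = Seq("""{"name":"alice","age":30}""").toDF("value")
val parsed = lines.select(from_json($"value", schema).as("data")).select("data.*")
parsed.show()
```

The same `select(from_json(...))` applies unchanged to the streaming DataFrame returned by `spark.readStream.format("socket")`.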
ok, we're in a holding pattern as the centos packages haven't been released yet.
once they're out i'll update this thread and start rebooting.
On Mon, Jun 19, 2017 at 10:52 AM, shane knapp wrote:
> jenkins is affected:
>
> https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
> https://access.redhat.com/security/vulnerabilities/stackguard
jenkins is affected:
https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
https://access.redhat.com/security/vulnerabilities/stackguard
i'm shutting down jenkins, applying patches and rebooting immediately.
ETA unknown. hopefully quick. i'll update here when i find out.
I responded on the ticket.
On Mon, Jun 19, 2017 at 2:36 AM, Sean Owen wrote:
> Just wanted to call attention to this question, mostly because I'm curious:
> https://github.com/apache/spark/pull/18343#issuecomment-309388668
>
> Why is Externalizable (+ KryoSerializable) used instead of Serializable?
Hi all,
I am playing around with structured streaming and looked at the code for
ConsoleSink.
I see the code has:
data.sparkSession.createDataFrame(
  data.sparkSession.sparkContext.parallelize(data.collect()), data.schema)
  .show(numRowsToShow, isTruncated)
}
I was wondering why it does
I agree, the problem is that Spark is trying to be safe and avoid the
direct committer. We also modify Spark to avoid its logic. We added a
property that causes Spark to always use the output committer if the
destination is in S3.
Our committers are also slightly different and will get an AmazonS3
I mean it is not a bug that was fixed before this feature was added. Of
course the kryo serializer with 2000+ partitions was working before this
feature.
Koert Kuipers wrote
> If a feature added recently breaks using kryo serializer with 2000+
> partitions then how can it not be a regression? I mean I u
If a feature added recently breaks using kryo serializer with 2000+
partitions then how can it not be a regression? I mean I use kryo with more
than 2000 partitions all the time, and it worked before. Or was I simply
not hitting this bug because there are other conditions that also need to
be satisfied?
Just wanted to call attention to this question, mostly because I'm curious:
https://github.com/apache/spark/pull/18343#issuecomment-309388668
Why is Externalizable (+ KryoSerializable) used instead of Serializable?
and should the first two always go together?
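For readers unfamiliar with the distinction the question raises: `Serializable` is a marker interface where the JVM serializes fields by reflection, while `Externalizable` lets the class write and read its own bytes (and requires a public no-arg constructor). A minimal round-trip sketch, with toy classes invented for the example:

```scala
import java.io._

// Default Java serialization: fields written by reflection.
case class Point(x: Int, y: Int) extends Serializable

// Externalizable: the class controls exactly what is written,
// avoiding reflection at the cost of a required no-arg constructor.
class ExtPoint(var x: Int, var y: Int) extends Externalizable {
  def this() = this(0, 0) // required by Externalizable deserialization
  override def writeExternal(out: ObjectOutput): Unit = { out.writeInt(x); out.writeInt(y) }
  override def readExternal(in: ObjectInput): Unit = { x = in.readInt(); y = in.readInt() }
}

// Serialize to bytes and back, for either flavor.
def roundTrip[T](obj: T): T = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(obj)
  oos.close()
  new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray)).readObject().asInstanceOf[T]
}

val p = roundTrip(Point(1, 2))
val e = roundTrip(new ExtPoint(3, 4))
println(p)                 // Point(1,2)
println(s"${e.x},${e.y}")  // 3,4
```

`KryoSerializable` plays the analogous hand-written-bytes role for Kryo, which is presumably why the PR pairs it with `Externalizable`.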