Hi Ryan and Steve,
Thanks very much for your reply.
I was finally able to get Ryan's repo working for me by changing the output
committer to FileOutputCommitter instead of ParquetOutputCommitter in Spark,
as Steve suggested.
However, it is not working in append mode when saving the DataFrame.
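For anyone following along, the committer override described above is just a Spark SQL configuration change. A minimal sketch, assuming Spark 2.x and the stock Hadoop FileOutputCommitter (check the option name against the version you are running):

```scala
// Sketch only: route Parquet writes through the plain Hadoop
// FileOutputCommitter instead of Spark's default ParquetOutputCommitter.
// spark.sql.parquet.output.committer.class is a documented Spark SQL
// option; spark.sql.sources.outputCommitterClass covers non-Parquet
// file sources.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("committer-override-sketch")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter")
  .getOrCreate()
```

Note that swapping the committer does not by itself change append-mode semantics; whether a given committer behaves correctly under append depends on the committer implementation.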
Seems like a great idea to do?
On Fri, Jun 16, 2017 at 12:03 PM, Russell Spitzer wrote:
I considered adding this to the DataSource API V2 ticket, but I didn't want to
be first :P Do you think there will be any issues with opening up the
partitioning as well?
On Fri, Jun 16, 2017 at 11:58 AM Reynold Xin wrote:
Perhaps we should extend the data source API to support that.
On Fri, Jun 16, 2017 at 11:37 AM, Russell Spitzer wrote:
I've been trying to make Catalyst aware of Cassandra partitioning. There seem
to be two major blockers.
The first is that DataSourceScanExec is unable to learn what the underlying
partitioning should be from the BaseRelation it comes from. I'm currently
able to get around this by us
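For concreteness, one possible shape for "opening up the partitioning" is a mix-in that lets a relation report its native partitioning, so DataSourceScanExec could surface it as outputPartitioning instead of defaulting to unknown. This is only a sketch; the trait and method names below are illustrative assumptions, not a committed API:

```scala
// Illustrative sketch, not an existing Spark API in this thread's
// timeframe: a relation that can report its native partitioning to
// Catalyst, so the planner can skip a shuffle when the data is already
// partitioned the way an exchange would have partitioned it (e.g. by
// Cassandra partition key).
import org.apache.spark.sql.catalyst.plans.physical.Partitioning

trait SupportsReportPartitioning {
  /** The partitioning the source's output already satisfies. */
  def outputPartitioning: Partitioning
}
```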
I created https://issues.apache.org/jira/browse/SPARK-21123. PR is welcome.
On Thu, Jun 15, 2017 at 10:55 AM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
> Good catch. These are file source options. Could you submit a PR to fix
> the doc? Thanks!
>
> On Thu, Jun 15, 2017 at 10:46 AM, Men
Hello,
I have encountered a situation just like the one described above. I am
running a Spark Streaming application with 2 executors, 16 cores and 10G of
memory per executor, and the input Kafka topic has 64 partitions.
My code is like this:
Kafka