how to get partition column info in Data Source V2 writer

2019-12-17 Thread aakash aakash
Hi Spark dev folks,

First of all, kudos on the new Data Source V2; the API looks simple and
makes it easy to develop and use a new data source.

For my current work, I am implementing a new Data Source V2 writer
with Spark 2.3, and I was wondering how I can get the partition-by
column info. I see that DataFrameWriter passes it to Data Source V1
but not to V2.


Thanks,
Aakash
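One workaround some sources use under Spark 2.3 (an assumption on my part, not an official API): since DataFrameWriter.partitionBy() is not forwarded to a DSv2 writer, have callers pass the columns through a writer option and parse them yourself. The option name "partitionColumns" below is hypothetical; a minimal sketch of the parsing side:

```scala
object PartitionOptions {
  // Parse a comma-separated "partitionColumns" writer option.
  // `options` stands in for the DataSourceOptions map a DSv2 writer
  // receives; the option name is a convention you would invent, not
  // anything Spark 2.3 defines.
  def parsePartitionColumns(options: Map[String, String]): Seq[String] =
    options.getOrElse("partitionColumns", "")
      .split(",")
      .map(_.trim)
      .filter(_.nonEmpty)
      .toSeq
}
```

Callers would then write something like df.write.format("mysource").option("partitionColumns", "date,region"), and the writer recovers the list on its side.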


Re: how to get partition column info in Data Source V2 writer

2019-12-17 Thread Andrew Melo
Hi Aakash

On Tue, Dec 17, 2019 at 12:42 PM aakash aakash 
wrote:

> Hi Spark dev folks,
>
> First of all, kudos on the new Data Source V2; the API looks simple and
> makes it easy to develop and use a new data source.
>
> For my current work, I am implementing a new Data Source V2 writer
> with Spark 2.3, and I was wondering how I can get the partition-by
> column info. I see that DataFrameWriter passes it to Data Source V1
> but not to V2.
>

Not directly related to your Q, but just so you're aware, the DSv2 API
evolved from 2.3->2.4 and then again from 2.4->3.0.

Cheers
Andrew


>
>
> Thanks,
> Aakash
>


Re: how to get partition column info in Data Source V2 writer

2019-12-17 Thread aakash aakash
Thanks Andrew!

It seems there are drastic changes in 3.0; I am going through them now.

-Aakash
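For readers landing here later: in the Spark 3.0 DSv2, the table implementation reports its own partitioning (Table.partitioning() returning Array[Transform]) rather than receiving it from DataFrameWriter. A rough stand-in sketch of that shape, with plain column names in place of Transform; the names mirror the Scala interfaces but this is not real Spark code:

```scala
// Hypothetical stand-in mirroring the shape of Spark 3.0's DSv2 Table
// interface, where the table itself reports its partitioning.
// Not real Spark code: the real partitioning() returns Array[Transform].
final case class MyTable(name: String, partitionColumns: Seq[String]) {
  def partitioning(): Seq[String] = partitionColumns
}
```

The design point is the inversion: the writer no longer needs the info pushed in from DataFrameWriter, because the table declares it.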

On Tue, Dec 17, 2019 at 11:01 AM Andrew Melo  wrote:

> Hi Aakash
>
> On Tue, Dec 17, 2019 at 12:42 PM aakash aakash 
> wrote:
>
>> Hi Spark dev folks,
>>
>> First of all, kudos on the new Data Source V2; the API looks simple and
>> makes it easy to develop and use a new data source.
>>
>> For my current work, I am implementing a new Data Source V2 writer
>> with Spark 2.3, and I was wondering how I can get the partition-by
>> column info. I see that DataFrameWriter passes it to Data Source V1
>> but not to V2.
>>
>
> Not directly related to your Q, but just so you're aware, the DSv2 API
> evolved from 2.3->2.4 and then again from 2.4->3.0.
>
> Cheers
> Andrew
>
>
>>
>>
>> Thanks,
>> Aakash
>>
>


Re: [VOTE] SPARK 3.0.0-preview2 (RC2)

2019-12-17 Thread Sean Owen
Same result as last time. It all looks good and tests pass for me on
Ubuntu with all profiles enabled (Hadoop 3.2 + Hive 2.3), building
from source.
'pyspark-3.0.0.dev2.tar.gz' appears to be the desired python artifact name, yes.
+1

On Tue, Dec 17, 2019 at 12:36 AM Yuming Wang  wrote:
>
> Please vote on releasing the following candidate as Apache Spark version 
> 3.0.0-preview2.
>
> The vote is open until December 20 PST and passes if a majority +1 PMC votes 
> are cast, with
> a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.0.0-preview2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.0.0-preview2-rc2 (commit 
> bcadd5c3096109878fe26fb0d57a9b7d6fdaa257):
> https://github.com/apache/spark/tree/v3.0.0-preview2-rc2
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-preview2-rc2-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1338/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-preview2-rc2-docs/
>
> The list of bug fixes going into 3.0.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12339177
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running it on this release candidate,
> then reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and
> test with the RC (make sure to clean up the artifact cache
> before/after so you don't end up building with an out-of-date RC
> going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.0.0?
> ===
>
> The current list of open tickets targeted at 3.0.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target 
> Version/s" = 3.0.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
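For the Java/Scala testing path mentioned above, a sketch of what adding the staging repository to an sbt build might look like (the resolver name is arbitrary, the URL is the staging repository from this email, and the "3.0.0-preview2" version string is assumed to match the staged artifacts):

```scala
// sbt build fragment (sketch): point the build at the RC staging repo.
resolvers += "Spark 3.0.0-preview2 RC2 staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1338/"

// Then depend on the RC version, e.g.:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.0-preview2"
```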

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org