Hi, I am wondering if it's a bug or not.
I have a lot of JSON files where some columns contain only null values.
I start spark with
from pyspark import pandas as ps
import re
import numpy as np
import os
import pandas as pd
from pyspark import SparkContext, SparkConf
from pyspark.sq
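A minimal sketch of the kind of round trip I mean (the paths and the column name below are placeholders, not my real data):

from pyspark import pandas as ps

# A directory of JSON Lines files; in this sketch the column "maybe_set"
# only ever contains null.
psdf = ps.read_json("/data/in")
print(psdf.columns)   # "maybe_set" is listed here

# Write the same data back out as JSON and read it again.
psdf.to_json("/data/out")
psdf2 = ps.read_json("/data/out")
print(psdf2.columns)  # "maybe_set" is no longer there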
(Are you suggesting this is a regression, or is it a general question? Here
we're trying to figure out whether there are critical bugs introduced in
3.2.1 vs 3.2.0.)
On Fri, Jan 21, 2022 at 1:58 PM Bjørn Jørgensen wrote:
> Hi, I am wondering if it's a bug or not.
>
> I have a lot of JSON files
[x] -1 Do not release this package because it deletes all my columns that
contain only null values.
I have opened https://issues.apache.org/jira/browse/SPARK-37981 for this
bug.
On Fri, Jan 21, 2022 at 9:45 PM Sean Owen wrote:
> (Are you suggesting this is a regression, or is it a general question?
> Here
(Bjorn - unless this is a regression, it would not block a release, even if
it's a bug)
On Fri, Jan 21, 2022 at 5:09 PM Bjørn Jørgensen wrote:
> [x] -1 Do not release this package because it deletes all my columns that
> contain only null values.
>
> I have opened https://issues.apache.org/jira/browse/SPARK-37981
OK, but deleting users' data without their knowledge is never a good idea.
That's why I am giving this RC a -1.
On Sat, Jan 22, 2022 at 12:16 AM Sean Owen wrote:
> (Bjorn - unless this is a regression, it would not block a release, even
> if it's a bug)
>
> On Fri, Jan 21, 2022 at 5:09 PM Bjørn Jørgensen
Continue on the ticket - I am not sure this is established. We would block
a release for critical problems that are not regressions. This is not a
data loss / 'deleting data' issue even if valid.
You're welcome to provide feedback but votes are for the PMC.
On Fri, Jan 21, 2022 at 5:24 PM Bjørn Jø
I closed the ticket as a duplicate of SPARK-29444.
This behavior is neither a bug nor a regression, and there is already a
documented writer (or global) option that can be used to modify it.
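For reference, a minimal sketch of how the option can be applied (the data and paths below are illustrative only):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A column that holds nothing but nulls (explicit schema, illustrative data).
df = spark.createDataFrame([(1, None), (2, None)], "id INT, always_null STRING")

# Default behaviour: the JSON generator skips null fields, so "always_null"
# never appears in the output files and is gone after a read back.
df.write.mode("overwrite").json("/tmp/json_default")

# Writer option documented for the JSON data source:
df.write.mode("overwrite").option("ignoreNullFields", "false").json("/tmp/json_keep_nulls")

# Global alternative, the SQL config backing that option:
spark.conf.set("spark.sql.jsonGenerator.ignoreNullFields", "false")

With either setting, the all-null column is written as explicit nulls and survives a read back.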
On 1/22/22 00:47, Sean Owen wrote:
> Continue on the ticket - I am not sure this is established. We would
On Fri, Jan 21, 2022 at 6:48 PM Sean Owen wrote:
> Continue on the ticket - I am not sure this is established. We would block
> a release for critical problems that are not regressions. This is not a
> data loss / 'deleting data' issue even if valid.
> You're welcome to provide feedback but votes
+1, with the same result as last time.
On Thu, Jan 20, 2022 at 9:59 PM huaxin gao wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 3.2.1. The vote is open until 8:00pm Pacific time January 25 and passes if
> a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.