>>> ... situations that could produce invalid results. My intuition is yes, because
>>> different users have different levels of tolerance for different kinds of
>>> errors. I’d expect these sorts of configurations to be set up at an
>>> infrastructure level, e.g. to maintain consistent standards throughout
>>> the whole organization.
From: Gengliang Wang
Date: Thursday, August 1, 2019 at 3:07 AM
To: Marco Gaido
Cc: Wenchen Fan, Hyukjin Kwon, Russell Spitzer, Ryan Blue, Reynold Xin,
Matt Cheah, Takeshi Yamamuro, Spark dev list <dev@spark.apache.org>
Subject: Re: [Discuss] Follow ANSI SQL on table insertion
Hi all,
Let me explain a little bit on the proposal.
By default, we follow the store assignment ...
>> ... warranted to do so.
>>
>> -Matt Cheah
>>
From: Reynold Xin
Date: Wednesday, July 31, 2019 at 9:58 AM
To: Matt Cheah
Cc: Russell Spitzer, Takeshi Yamamuro, Gengliang Wang, Ryan Blue,
Spark dev list <dev@spark.apache.org>, Hyukjin Kwon, Wenchen Fan
Subject: Re: [Discuss] Follow ANSI SQL on table insertion
Matt what do you mean by maximizing 3, while allowing not ...
... perhaps the behavior can be flagged
by the destination writer at write time.
-Matt Cheah
From: Hyukjin Kwon
Date: Monday, July 29, 2019 at 11:33 PM
To: Wenchen Fan
Cc: Russell Spitzer, Takeshi Yamamuro, Gengliang Wang, Ryan Blue, Spark dev list
Subject: Re: [Discuss] Follow ANSI SQL on table insertion
From my look, +1 on the proposal, considering ANSI and other DBMSes in
general.
On Tue, Jul 30, 2019 at 3:21 PM, Wenchen Fan wrote:
We can add a config for a certain behavior if it makes sense, but the most
important thing we want to reach an agreement here is: what should be the
default behavior?
Let's explore the solution space of table insertion behavior first:
At compile time,
1. always add cast
2. add cast following the ANSI SQL store assignment rule ...
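For concreteness, here is a minimal sketch of how the compile-time options above can be exercised, assuming the spark.sql.storeAssignmentPolicy config (values LEGACY / ANSI / STRICT) that later shipped in Spark 3.x; the config name and the table name are assumptions for illustration, not part of the proposal itself:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch; assumes Spark 3.x, where the compile-time choice discussed
    // above is controlled by spark.sql.storeAssignmentPolicy.
    val spark = SparkSession.builder()
      .appName("store-assignment-sketch")
      .master("local[*]")
      .getOrCreate()
    spark.sql("CREATE TABLE t (i INT) USING parquet")

    // Option 2 above (follow the ANSI store assignment rule): numeric widening or
    // narrowing still gets a cast added at analysis time...
    spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")
    spark.sql("INSERT INTO t VALUES (CAST(1 AS BIGINT))")

    // ...but an unreasonable conversion such as string -> int is rejected at
    // analysis time instead of being silently cast.
    try {
      spark.sql("INSERT INTO t VALUES ('not a number')")
    } catch {
      case e: Exception => println(s"Rejected at analysis time: ${e.getMessage}")
    }

    // Option 1 above ("always add cast") roughly corresponds to the LEGACY policy,
    // which accepts the same statement and stores NULL for the unparsable string.
    spark.stop()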
I understand Spark is making the decisions; I'm saying the actual final effect
of the null decision would be different depending on the insertion target,
if the target has different behaviors for null.
On Mon, Jul 29, 2019 at 5:26 AM Wenchen Fan wrote:
> I'm a big -1 on null values for invalid casts.
This is why we want to introduce the ANSI mode, so that an invalid cast fails
at runtime. But we have to keep the null behavior for a while, to keep
backward compatibility. Spark has returned null for invalid casts since the
first day of Spark SQL; we can't ...
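A minimal sketch of the two runtime behaviors being contrasted here, assuming the spark.sql.ansi.enabled flag that later shipped in Spark 3.x (the flag name was still being discussed at the time of this thread):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("ansi-cast-sketch")
      .master("local[*]")
      .getOrCreate()

    // Legacy behavior: an invalid cast silently becomes NULL.
    spark.conf.set("spark.sql.ansi.enabled", "false")
    spark.sql("SELECT CAST('abc' AS INT) AS v").show()   // shows a null value

    // ANSI mode: the same cast fails at runtime instead of hiding the problem.
    spark.conf.set("spark.sql.ansi.enabled", "true")
    try {
      spark.sql("SELECT CAST('abc' AS INT) AS v").show()
    } catch {
      case e: Exception => println(s"Failed at runtime under ANSI mode: ${e.getMessage}")
    }
    spark.stop()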
I'm a big -1 on null values for invalid casts. This can lead to a lot of
even more unexpected errors and runtime behavior since null is
1. Not allowed in all schemas (Leading to a runtime error anyway)
2. Is the same as delete in some systems (leading to data loss)
And this would be dependent on ...
Hi, all
+1 for implementing this new store cast mode.
From a viewpoint of DBMS users, this cast is pretty common for INSERTs, and
I think this functionality could promote migrations from existing DBMSs to
Spark. The most important thing for DBMS users is that they could optionally
choose this mode ...
Hi Ryan,
Thanks for the suggestions on the proposal and doc.
Currently, there is no data type validation in V1 table insertion. We are on
the same page that we should improve it. But using UpCast goes from one
extreme to the other. It is possible that many queries would be broken after
upgrading to Spark ...
I don't agree with handling literal values specially. Although Postgres
does it, I can't find anything about it in the SQL standard. And it
introduces inconsistent behaviors which may be strange to users:
* What about something like "INSERT INTO t SELECT float_col + 1.1"?
* The same insert with a d
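To make the inconsistency concrete, here is the pair of statements being contrasted, sketched as runnable Spark SQL; the tables t (INT column) and s (FLOAT column) are hypothetical:

    import org.apache.spark.sql.SparkSession

    // Hypothetical tables, only to spell out the two inserts being contrasted.
    val spark = SparkSession.builder()
      .appName("literal-rule-sketch")
      .master("local[*]")
      .getOrCreate()
    spark.sql("CREATE TABLE t (i INT) USING parquet")
    spark.sql("CREATE TABLE s (float_col FLOAT) USING parquet")
    spark.sql("INSERT INTO s VALUES (CAST(2.5 AS FLOAT))")

    // A rule that special-cases literals can inspect 1.1 at analysis time and
    // decide whether it fits the INT column...
    spark.sql("INSERT INTO t VALUES (1.1)")

    // ...but here the same 1.1 is folded into an expression whose value is only
    // known at runtime, so a literal-only rule cannot apply and the two
    // statements end up being treated differently.
    spark.sql("INSERT INTO t SELECT float_col + 1.1 FROM s")
    spark.stop()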
I don’t think this is a good idea. Following the ANSI standard is usually
fine, but here it would *silently corrupt data*.
From your proposal doc, ANSI allows implicitly casting from long to int
(any numeric type to any other numeric type) and inserts NULL when a value
overflows. That would drop ...
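A minimal sketch of the overflow scenario being described (table and column names are made up); whether the mismatched value is silently changed or the statement fails depends on which cast semantics end up as the default:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("overflow-sketch")
      .master("local[*]")
      .getOrCreate()
    spark.sql("CREATE TABLE src (v BIGINT) USING parquet")
    spark.sql("CREATE TABLE dst (v INT) USING parquet")
    spark.sql("INSERT INTO src VALUES (3000000000)")  // larger than Int.MaxValue (2147483647)

    try {
      // With a lenient implicit cast this succeeds, but the stored value is no
      // longer 3000000000: it has been silently nulled or wrapped, which is the
      // "silently corrupt data" objection above.
      spark.sql("INSERT INTO dst SELECT v FROM src")
      spark.sql("SELECT * FROM dst").show()
    } catch {
      // With a runtime-checked (ANSI) cast, the overflow fails loudly instead.
      case e: Exception => println(s"Overflow rejected: ${e.getMessage}")
    }
    spark.stop()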
I have heard about many complaints about the old table insertion behavior.
Blindly casting everything will leak the user mistake to a late stage of
the data pipeline, and make it very hard to debug. When a user writes
string values to an int column, it's probably a mistake and the columns are
misordered ...
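A minimal sketch of the column-order mistake being described; the table names and layout are invented for illustration:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("misordered-columns-sketch")
      .master("local[*]")
      .getOrCreate()
    // The destination and the staging table declare the same columns in a
    // different order; the user forgets this when copying data across.
    spark.sql("CREATE TABLE events (id INT, label STRING) USING parquet")
    spark.sql("CREATE TABLE staging (label STRING, id INT) USING parquet")
    spark.sql("INSERT INTO staging VALUES ('click', 42)")

    try {
      // Old behavior: accepted; the string 'click' is blindly cast to INT and
      // becomes NULL, so the mistake only surfaces much later in the pipeline.
      spark.sql("INSERT INTO events SELECT * FROM staging")
      spark.sql("SELECT * FROM events").show()
    } catch {
      // A stricter analysis-time rule (what this thread argues for) rejects the
      // insert immediately, pointing at the type mismatch.
      case e: Exception => println(s"Rejected: ${e.getMessage}")
    }
    spark.stop()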