> On Oct. 20, 2018, 8:57 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/join45.q.out
> > Line 717 (original), 717 (patched)
> > <https://reviews.apache.org/r/69019/diff/2/?file=2101649#file2101649line717>
> >
> >     As discussed this should have been
> >     (struct(cast (_col0 as double), cast(_col2 as double))) IN (const 
> > struct(100.0D,100.0D), const struct(101.0D,101.0D), const 
> > struct(102.0D,102.0D))
> 
> Zoltan Haindrich wrote:
>     * column has type string.
>     * the IN statement may have the same type for all element in this case 
> it's int...
>     * in this case to change the left hand side to double: it may work
>     
>     
>     but what should happen in case the IN operands contain 1 int, 1 decimal 
> and 1 string ?
>     
>     note: during UDF evaluations there is a Set with all the values; and 
> "contains" is used - so if the inferred type is not a numeric: 
>     
>     I've just checked that even the standard IN doesn't work like this:
>     
>     ```
>     create table t (a string);
>     
>     insert into t values ('1'),('x'),('2.0');
>     
>     select * from t where a in (1.0,'x',2);
>     1
>     2.0
>     -- it doesn't return 'x' because it's casted to double
>     
>     ```
>     
>     I'm starting to think that it would be better to do this with the IN 
> unwinded into ORs: so that we could do the one-on-one constant checks and 
> then pointlookupoptimizer might collapse them if the types are the same - in 
> this case I think we would not loose 'x' in the above case - and it would 
> also make this whole recursive typecheck unneccessary.
> 
> Ashutosh Chauhan wrote:
>     We shall strive to match semantics which is already there. In Hive, for 
> expr str_col = 12, we get cast of double on both sides. So, by extensions IN 
> clause should do the same IFF all types are same. If types aren't same I 
> agree what you are proposing ie. keep it as unwinded ORs.

Added logic to unwind ors earlier; it fixes the double casts by relys on the 
existing logic for equals; and makes the wierd type `(1.0,'x',2)` issue go away 
as well as `struct(null,1)` cases


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69019/#review209827
-----------------------------------------------------------


On Oct. 20, 2018, 7:51 a.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69019/
> -----------------------------------------------------------
> 
> (Updated Oct. 20, 2018, 7:51 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Vineet Garg.
> 
> 
> Bugs: HIVE-20617
>     https://issues.apache.org/jira/browse/HIVE-20617
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For IN expressions the types were never corrected; and pointlookupoptimizer 
> was probably leaving behind fields already which were uncomparable; 
> HIVE-20296 exposed it further by changing the minimal number from  32 to 2.
> 
> This change generalizes the retyping of constants to also run it for the IN 
> operator ; and also for struct-s.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/type/HiveChar.java 29dc06dca1 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java e7d71595c7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 4968d16876 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 
> c274fd7cc9 
>   ql/src/test/queries/clientpositive/in_typecheck_char.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_pointlook.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_varchar.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out f6c3c5642e 
>   ql/src/test/results/clientpositive/in_typecheck2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_char.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_pointlook.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_varchar.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/infer_const_type.q.out e1d7de5422 
>   ql/src/test/results/clientpositive/join45.q.out 47aaf7d0ab 
>   ql/src/test/results/clientpositive/join47.q.out 4d9e937815 
>   ql/src/test/results/clientpositive/llap/dec_str.q.out 554031e952 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out f240468558 
>   ql/src/test/results/clientpositive/llap/lineage3.q.out cf38816127 
>   ql/src/test/results/clientpositive/llap/vectorization_13.q.out 4ce654f960 
>   ql/src/test/results/clientpositive/llap/vectorization_6.q.out a2f730beca 
>   ql/src/test/results/clientpositive/llap/vectorization_8.q.out 21ce7b8ebd 
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 
> 7f1c6a295e 
>   ql/src/test/results/clientpositive/mapjoin47.q.out 294dd69de5 
>   ql/src/test/results/clientpositive/parquet_vectorization_13.q.out 
> 0efce98b55 
>   ql/src/test/results/clientpositive/parquet_vectorization_6.q.out 0bb6888364 
>   ql/src/test/results/clientpositive/parquet_vectorization_8.q.out 957bd7b264 
>   ql/src/test/results/clientpositive/ppd_udf_col.q.out 814fb5afcf 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out acf91bf178 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out 
> 3812239343 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out 
> 6108457aad 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out 
> 3352dedc58 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 
> f5a4c9ad86 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out b3252f5415 
>   ql/src/test/results/clientpositive/spark/vectorization_13.q.out 34ec9c42dd 
>   ql/src/test/results/clientpositive/spark/vectorization_6.q.out 5679bb8cfa 
>   ql/src/test/results/clientpositive/spark/vectorization_short_regress.q.out 
> 231dea6de3 
>   ql/src/test/results/clientpositive/vectorization_13.q.out 8897f8427f 
>   ql/src/test/results/clientpositive/vectorization_6.q.out 8dedb63e7d 
>   ql/src/test/results/clientpositive/vectorization_8.q.out d81df76a2f 
> 
> 
> Diff: https://reviews.apache.org/r/69019/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>

Reply via email to