> On Oct. 19, 2018, 9:54 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
> > Line 1172 (original), 1181 (patched)
> > <https://reviews.apache.org/r/69019/diff/1/?file=2097832#file2097832line1195>
> >
> >     This TODO will be good to resolve.
> >     We have type already so we can return null constant of appropriate type 
> > here, no?

sure; seemed unrelated - and I'm was not sure if it would havock in the q.out-s 
or not :)
I hope not - fixed


> On Oct. 19, 2018, 9:54 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
> > Lines 1292-1294 (original), 1416-1418 (patched)
> > <https://reviews.apache.org/r/69019/diff/1/?file=2097832#file2097832line1434>
> >
> >     Not sure this logic is incorrect. We shall not coerce the length of 
> > constant to be same as length of type. 
> >     
> >     Comparisons for 2 char types happens on stripped values. So, if 
> > constant is of smaller length then this probably won't be a problem but is 
> > unnecessary. However, if constant is longer, looks like HiveChar will 
> > truncate it and then comparison likely will be wrong. Better is to create 
> > constant char of same length as of string.

I've added a method which returns null if the value is not represntable as the 
target char(n) type. I feel that it also solves this problem.


> On Oct. 19, 2018, 9:54 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/in_typecheck1.q
> > Lines 7 (patched)
> > <https://reviews.apache.org/r/69019/diff/1/?file=2097833#file2097833line7>
> >
> >     Add: 
> >     select * from ax where t = 'a         ';
> >     select * from ax where t = 'a          ';
> >     select * from ax where t = 'a          d';
> >     
> >     RHS constant is of length 10,11,12. I expect first and second to return 
> > 2 rows, while third to return 0 rows.
> >     
> >     When t is varchar all 3 should return 0 rows.

there was an issue with the last statement:

`select * from ax where t = 'a          d'` was returning 2 rows even without 
the current patch - fixed.


> On Oct. 19, 2018, 9:54 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/in_typecheck1.q.out
> > Lines 42 (patched)
> > <https://reviews.apache.org/r/69019/diff/1/?file=2097838#file2097838line42>
> >
> >     This doesn't look correct. For char/varchar comparisons doesn't 
> > preserve trailing spaces. e.g., 
> >     create table t1 (a char(3), b varchar(4));
> >     insert into t1 values ('a','b');
> >     select * from t1 where b = 'b';
> >     a       b
> >     select * from t1 where a = 'a';
> >     a   b
> >     
> >     Got this one on both postgres and oracle.
> >     whereas this change would return empty results in Hive i believe

Yeah these are somewhat odd at first...I think the same codepath is in action 
which is during printout.

The comparision is done without spaces; all the char/varchar test now seem to 
be doiung good - so I think we don't have any incorrect results.

Actually psql also displays `char(3)` with 2 trailing whitespaces if the value 
is for example 'a'.


> On Oct. 19, 2018, 9:54 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/join45.q.out
> > Line 717 (original), 717 (patched)
> > <https://reviews.apache.org/r/69019/diff/1/?file=2097841#file2097841line717>
> >
> >     comparison of str and integer is done in double. See e.g, 
> > infer_const_type.q.out where there is (UDFToDouble(str) = 1234.0D
> >     However, inside IN we are now casting constant to string. 
> >     Granted casting to double in this scenario is debatable but Hive had 
> > this behavior since beginning and it need to be consistent inside IN or 
> > with direct comparison.

restored original behaviour; interestingly - this was the sideeffect that I've 
set a new arraylist with the same contet to a desc...it might probably worth 
investigating later


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69019/#review209778
-----------------------------------------------------------


On Oct. 15, 2018, 8:05 a.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69019/
> -----------------------------------------------------------
> 
> (Updated Oct. 15, 2018, 8:05 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-20617
>     https://issues.apache.org/jira/browse/HIVE-20617
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For IN expressions the types were never corrected; and pointlookupoptimizer 
> was probably leaving behind fields already which were uncomparable; 
> HIVE-20296 exposed it further by changing the minimal number from  32 to 2.
> 
> This change generalizes the retyping of constants to also run it for the IN 
> operator ; and also for struct-s.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 4968d16876c5c9cc36ec9a3ec48c2740c2c67dcd 
>   ql/src/test/queries/clientpositive/in_typecheck1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 
> 5727f0a65c6e4736f41017e4e962d932dedbd6bd 
>   ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 
> 43cb5ab89fdebde8be168d7837d8e54a38f4d10b 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out 
> 2073c6b802a1ae0ff4228a86f18ec366ff92ab02 
>   ql/src/test/results/clientpositive/in_typecheck1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/infer_const_type.q.out 
> 4129bd0c715635eb83c0c0d248eb43a5779c7be9 
>   ql/src/test/results/clientpositive/join45.q.out 
> 77dbaa2cd8b5be7158545c696b30dc1068238f91 
>   ql/src/test/results/clientpositive/join47.q.out 
> 2536f7f4b6e9295d1177632b7f32f0b66974e3a4 
>   ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out 
> e61300b5c853eb733d4443c047344e3fc6fe0ff3 
>   ql/src/test/results/clientpositive/llap/dec_str.q.out 
> 3b7f92d735db79f7c4c0d96debe7fff8e3c05f11 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 
> bc1f97dd49ccc905d5e32d5a02a62bb692d444a6 
>   ql/src/test/results/clientpositive/llap/lineage3.q.out 
> cf388161272002ad6097839ceaea2bcfbcf9b7ef 
>   ql/src/test/results/clientpositive/llap/mapjoin_hint.q.out 
> 3c6270a05240097e9645e664ed30e0568052d98e 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 
> feadbcd874818a78e2cc30b86cdebc1e4cb6a04f 
>   ql/src/test/results/clientpositive/llap/vectorization_13.q.out 
> 398cb56915f1b24b7c4dc325b60cb114d7ff2b8c 
>   ql/src/test/results/clientpositive/llap/vectorization_6.q.out 
> 70542ac7bd69e46098cf8158cae347e6c896c5b2 
>   ql/src/test/results/clientpositive/llap/vectorization_8.q.out 
> 662409d4f148cc3da5c4f788ddf59c6f40ede572 
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 
> a59a586144fc9dc14ac2fa87177c189feae47402 
>   ql/src/test/results/clientpositive/mapjoin47.q.out 
> c42094d7858fa70626a5184485a13fdacd45be7c 
>   ql/src/test/results/clientpositive/parquet_vectorization_13.q.out 
> e60548cb779826c0e50b5086cabe0b1408f4f182 
>   ql/src/test/results/clientpositive/parquet_vectorization_6.q.out 
> 85b075666f9cdcd15b2adbab07d665b8def863c0 
>   ql/src/test/results/clientpositive/parquet_vectorization_8.q.out 
> c089aab12554caa6ea38d1a0d90abf0c050b8ba0 
>   ql/src/test/results/clientpositive/ppd_udf_col.q.out 
> dfc2d0465d49f7c77452113d3791104a05aa42ef 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out 
> e61300b5c853eb733d4443c047344e3fc6fe0ff3 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out 
> 78a2428cfbd508c768100896955b8853a9bd2a50 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out 
> 362d19c39b60ec2193fc73098787723042f53aab 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out 
> b10b550009d383f84f19ac72207a9c94b12c497d 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out 
> af325200ba7ae7aa5ab0aa42ecf714eaa4afb122 
>   ql/src/test/results/clientpositive/spark/vectorization_13.q.out 
> a49738ecaac6324d529954ca14b952b4898e33dd 
>   ql/src/test/results/clientpositive/spark/vectorization_6.q.out 
> 8f7d3f2530da2afc10e8dfb23339568279d8c454 
>   ql/src/test/results/clientpositive/spark/vectorization_short_regress.q.out 
> 3844c79e1a3e1d37b8b93e4f39bc9b52992888f3 
>   ql/src/test/results/clientpositive/vectorization_13.q.out 
> d2f34481371122b2d6a33578a68310b739eb023f 
>   ql/src/test/results/clientpositive/vectorization_6.q.out 
> 99d917fdfa101cba42a1f91b5ac39d9401584a7d 
>   ql/src/test/results/clientpositive/vectorization_8.q.out 
> b9b0c8af17b13cebb2301012fadf3395786b4f62 
> 
> 
> Diff: https://reviews.apache.org/r/69019/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>

Reply via email to