> On April 16, 2015, 6:51 p.m., Jimmy Xiang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java, line 
> > 1235
> > <https://reviews.apache.org/r/33171/diff/2/?file=930355#file930355line1235>
> >
> >     Why do we need this checking? Does this mean we can't use the default 
> > partition name for none dynamic partition?
> 
> Chaoyu Tang wrote:
>     No, that is basically to allow the default partition (from dynamic 
> partition) automatically skip the type check. Currently the default partition 
> value from dynamic partition insert is actually from HiveConf 
> hive.exec.default.partition.name in type of string with default value 
> "__HIVE_DEFAULT_PARTITION__", but it is used for any type of partition column.
> 
> Jimmy Xiang wrote:
>     The node is also not put into the map, any side effect? The default name 
> can change, right? For example, before rename and after rename. How big will 
> be the performance gain?

1. The astExprNodeMap is an empty Map which is passed to getPartExprNodeDesc to 
get all the partitions whose value needs to be validated. For any partition 
with defaultPartitionName, it should not be put into this map, so it won't get 
type checked in validatePartColumnType method. 
In validatePartColumnType, there are actually two maps, partSpec which contains 
all the partVals including defaultName, and astExprNodeMap which only contains 
the partVal nodes to be typechecked. Use astExprNodeMap to convert any partVal 
in partSpec which needs typecheck/conversion.

The code might be quite hard to be read since it passes a collection as a 
parameter to a method, and modifiy its element in the method.

2. yes, default name can be change (or set) through seting HiveConf property 
hive.exec.default.partition.name. The defaultPartitonName is the result from 
dynamic partition insert. For example, during dynamic partiton insert, if there 
is no value(null) found to be a partition column value, the system will replace 
it with this defaultPartitionName. User usually won't rename a partition to the 
defaultPartitonName.

3. Skipping type check for the defaultPartitionName is not mainly for the 
performance gain. Since we know one defaultPartitionName is used in the system 
for differnt types of partition column, it is meanless to check its type. For 
example, if the partition column type is int, while the defaultParitionName is 
a string but not in the number format say HIVE_DEFAULT_PARTITION. Typecheking 
this defaultPartitionName for type int will fail unless every time you turn the 
validation off.


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33171/#review80275
-----------------------------------------------------------


On April 16, 2015, 9:38 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33171/
> -----------------------------------------------------------
> 
> (Updated April 16, 2015, 9:38 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Szehon Ho, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10307
>     https://issues.apache.org/jira/browse/HIVE-10307
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as 
> literals with postfix like Y, S, L, or BD appended to the number. These 
> literals work in most Hive queries, but do not when they are used as 
> partition column value. This patch is to address the issue of number literals 
> used in partition specification.
> Highlights of the changes:
> 1. Validate, convert and normalize the partVal in partSpec to match its 
> column type when hive.partition.check.column.type is set to true (default). 
> It not only applies to opertion insert which used to be controlled by 
> "hive.typecheck.on.insert", but also for other partition operations (e.g. 
> alter table .. partition, partition statistics etc). The 
> hive.typecheck.on.insert is now removed.
> 2. Convert and normalize legacy partition column data by using alter table 
> partition .. rename with hive.partition.check.old.column.type.in.rename set 
> to true. this property only allows the partVal in old PartSpec to skip the 
> type check, conversion in partition rename.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e138800 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
> 19234b5 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 
> e8066be 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 8302067 
>   ql/src/test/queries/clientpositive/alter_partition_coltype.q 8c9945c 
>   ql/src/test/queries/clientpositive/partition_coltype_literals.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/partition_type_check.q c9bca99 
>   ql/src/test/results/clientnegative/archive_partspec1.q.out da4817c 
>   ql/src/test/results/clientnegative/archive_partspec5.q.out c18de52 
>   ql/src/test/results/clientpositive/partition_coltype_literals.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/partition_timestamp.q.out bc6ab10 
>   ql/src/test/results/clientpositive/partition_timestamp2.q.out 365df69 
> 
> Diff: https://reviews.apache.org/r/33171/diff/
> 
> 
> Testing
> -------
> 
> 1. Manaully tests covering various number literals (Y, S, L, BD)
> 2. new qfile test (partition_coltype_literals.q)
> 3. Precommit build
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>

Reply via email to