[ https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar updated HIVE-15396:
--------------------------------
    Summary: Basic Stats are not collected when for managed tables with LOCATION specified  (was: Basic Stats are not collected when running INSERT INTO commands on s3a)

> Basic Stats are not collected when for managed tables with LOCATION specified
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-15396
>                 URL: https://issues.apache.org/jira/browse/HIVE-15396
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15396.1.patch
>
>
> {{numRows}} is not collected when running {{INSERT ... INTO ...}} commands
> against tables backed by S3 (and maybe even other blobstores).
> The COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"} entry is missing from the
> {{describe extended}} output.
> Repro steps:
> {code}
> hive> drop table s3_table;
> OK
> Time taken: 1.87 seconds
> hive> create table s3_table (col int) location
> 's3a://[bucket-name]/stats-test/';
> OK
> Time taken: 3.069 seconds
> hive> insert into s3_table values (1), (2), (3);
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the
> future versions. Consider using a different execution engine (i.e. spark,
> tez) or using Hive 1.X releases.
> Query ID = stakiar_20161208160105_fb3df340-d5fb-4ad6-8776-4f3cae02216d
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Job running in-process (local Hadoop)
> 2016-12-08 16:01:12,741 Stage-1 map = 0%, reduce = 0%
> 2016-12-08 16:01:16,759 Stage-1 map = 100%, reduce = 0%
> Ended Job = job_local688636529_0004
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Loading data to table default.s3_table
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 23.0 seconds
> hive> select * from s3_table;
> OK
> 1
> 2
> 3
> Time taken: 0.096 seconds, Fetched: 3 row(s)
> hive> describe extended s3_table;
> OK
> col int
> Detailed Table Information Table(tableName:s3_table, dbName:default,
> owner:stakiar, createTime:1481241657, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:col, type:int, comment:null)],
> location:s3a://[bucket-name]/stats-test,
> inputFormat:org.apache.hadoop.mapred.TextInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false),
> partitionKeys:[], parameters:{transient_lastDdlTime=1481241687, totalSize=6,
> numFiles=1}, viewOriginalText:null, viewExpandedText:null,
> tableType:MANAGED_TABLE)
> Time taken: 0.037 seconds, Fetched: 3 row(s)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
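
The symptom in the report (no {{numRows}} and no COLUMN_STATS_ACCURATE entry in the table-level parameters) can be checked mechanically. The following is a minimal Python sketch, not part of Hive or of the attached patch. The helper names `table_parameters` and `missing_basic_stats` are hypothetical, the `REPORTED` string is trimmed from the output quoted in the report, and the `HEALTHY` string is an assumed illustration of what the map might look like once basic stats are collected. It also assumes the mail quoting markers have already been stripped from the pasted output.

```python
import re

# Keys the report expects to see once basic stats have been collected.
EXPECTED_KEYS = ("numRows", "COLUMN_STATS_ACCURATE")


def table_parameters(describe_output):
    """Return the raw text of the table-level parameters map.

    `describe extended` prints several parameters:{...} maps (SerDe,
    storage descriptor, table). The table-level one sits between
    partitionKeys:[...] and viewOriginalText.
    """
    match = re.search(
        r"partitionKeys:\[.*?\],\s*parameters:\{(.*?)\},\s*viewOriginalText",
        describe_output,
        re.DOTALL,
    )
    if match is None:
        raise ValueError("no table-level parameters map found")
    return match.group(1)


def missing_basic_stats(describe_output):
    """List the expected stats keys that are absent from the table parameters."""
    params = table_parameters(describe_output)
    return [key for key in EXPECTED_KEYS if not re.search(rf"\b{key}=", params)]


# Trimmed from the output quoted in the report.
REPORTED = (
    "partitionKeys:[], parameters:{transient_lastDdlTime=1481241687, "
    "totalSize=6, numFiles=1}, viewOriginalText:null, "
    "viewExpandedText:null, tableType:MANAGED_TABLE)"
)

# Assumed shape of a table whose basic stats were collected (illustrative only).
HEALTHY = (
    'partitionKeys:[], parameters:{COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, '
    "numFiles=1, numRows=3, totalSize=6, transient_lastDdlTime=1481241687}, "
    "viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)"
)

print(missing_basic_stats(REPORTED))  # ['numRows', 'COLUMN_STATS_ACCURATE']
print(missing_basic_stats(HEALTHY))   # []
```

The check looks only at the table-level map, since the SerDe and storage-descriptor maps that appear earlier in the output never carry these keys.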