Re: Hive orc use case

2016-09-26 Thread Amey Barve
Thanks Alan, your comment answers my question :) I will start looking into the HiveEndPoint APIs. Regards, Amey Barve On 26 September 2016 at 23:50, Alan Gates wrote: > As long as there is a spare worker thread this should be picked up within > a few seconds. It’s true you can’t force it to happ

Re: HDFS small files to Sequence file using Hive

2016-09-26 Thread Arun Patel
Thanks Dudu and Gopal. I tried HAR files and it works. I want to use Sequence file because I want to expose data using a table (filename and content columns). *Can this be done for HAR files?* This is what I am doing to create a sequencefile: create external table raw_files (raw_data string) l

Re: Hive orc use case

2016-09-26 Thread Alan Gates
As long as there is a spare worker thread this should be picked up within a few seconds. It’s true you can’t force it to happen immediately if other compactions are happening, but that’s by design so that compaction work doesn’t take too many resources. Alan. > On Sep 26, 2016, at 11:07,

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
alter table payees compact 'minor'; Compaction enqueued. OK It queues the compaction, but is there no way I can force it to run immediately? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Hive orc use case

2016-09-26 Thread Alan Gates
alter table compact forces a compaction. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact Alan. > On Sep 26, 2016, at 10:41, Mich Talebzadeh wrote: > > Can the temporary table be a solution to the original thread owner issue

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
Can the temporary table be a solution to the original thread owner's issue? Hive streaming, for example from Flume to Hive, is interesting, but the issue is that one ends up with a fair number of delta files due to the transactional nature of the ORC table, and I know that Spark will not be able to open the table

Re: Hive orc use case

2016-09-26 Thread Alan Gates
ORC does not store data row by row. It decomposes the rows into columns, and then stores pointers to those columns, as well as a number of indices and statistics, in a footer of the file. Due to the footer, in the simple case you cannot read the file before you close it or append to it. We did
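The footer dependency described in this message can be illustrated with a toy columnar layout. This is only a sketch of the general idea (column data first, then a footer of pointers written at close time); the byte layout, the JSON encoding, and the function names `write_columnar` and `read_column` are assumptions made for illustration and are not the real ORC format or API.

```python
import io
import json
import struct


def write_columnar(rows, columns):
    """Write rows column by column, then a footer holding a pointer to each column.

    Toy layout (NOT the real ORC layout): column blobs, footer JSON, 4-byte footer length.
    """
    buf = io.BytesIO()
    footer = {}
    for name in columns:
        data = json.dumps([row[name] for row in rows]).encode("utf-8")
        footer[name] = [buf.tell(), len(data)]  # pointer: offset and length
        buf.write(data)
    # The footer can only be written once all columns are complete,
    # which is why a reader needs the writer to have finished.
    footer_bytes = json.dumps(footer).encode("utf-8")
    buf.write(footer_bytes)
    buf.write(struct.pack("<I", len(footer_bytes)))
    return buf.getvalue()


def read_column(blob, name):
    """Locate the footer from the end of the file, then follow its pointer."""
    (footer_len,) = struct.unpack("<I", blob[-4:])
    footer = json.loads(blob[-4 - footer_len:-4].decode("utf-8"))
    offset, length = footer[name]
    return json.loads(blob[offset:offset + length].decode("utf-8"))


rows = [{"id": 1, "city": "a"}, {"id": 2, "city": "b"}]
blob = write_columnar(rows, ["id", "city"])
print(read_column(blob, "city"))
```

In this toy layout, a reader that opens the bytes before the footer has been written has no way to find where each column starts, and appending more column data after the footer would invalidate the pointers.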

Configure hiveserver2 logs

2016-09-26 Thread kishore kumar
Hi Hive Users, I am using Hive 1.2 and connecting to HiveServer2 via JDBC. Could anyone suggest how to configure the log file using this link https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients I set these 3 parameter values - hive.server2.logging.operation.enabl

Re: odd behavior of ntile

2016-09-26 Thread Rex X
Is this a bug in Hive? On Sun, Sep 25, 2016 at 11:29 PM, Rex X wrote: > Hi All, > > I run the following hive > > create table2 as > select > id, > ntile(6) over (partition by city order by price) as price_tile, > ntile(3) over (partition by city order by discount) as discount_tile, > ntile(6) over
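For readers following the query in this message, here is a minimal sketch of how NTILE is conventionally defined in standard SQL: the ordered rows of a partition are split into n buckets, and when the row count does not divide evenly, the earlier buckets each receive one extra row. Whether this accounts for the behavior reported in the thread is not known from the snippet, and the helper name `ntile` is hypothetical.

```python
def ntile(n, row_count):
    """Return the bucket number assigned to each of row_count ordered rows.

    Conventional SQL NTILE semantics (assumed here, not taken from the thread):
    buckets are numbered 1..n, and the first (row_count % n) buckets get one extra row.
    """
    base, extra = divmod(row_count, n)
    tiles = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        tiles.extend([bucket] * size)
    return tiles


print(ntile(3, 7))  # seven rows split into three buckets
print(ntile(6, 4))  # fewer rows than buckets
```

Because each NTILE call in the query uses its own ORDER BY, the bucket a row lands in is computed independently per column, and small partitions can leave the higher-numbered buckets empty.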

Re: populating Hive table periodically from files on HDFS

2016-09-26 Thread Eugene Koifman
You are correct, delta files will be generated. From: Mich Talebzadeh <mich.talebza...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Monday, September 26, 2016 at 1:01 AM To: user <user@hive.apache.org> Subject: Re: popu

Re: MSCK not adding the missing partitions to Hive Metastore when the partition names are not in lowercase

2016-09-26 Thread Sushil Ks
There was a typo when I shared the example: it is not *mypartitiondate*; instead it is *mypartition* when it got added. On Mon, Sep 26, 2016 at 5:32 PM, Sushil Ks wrote: > Hi, > >I feel there is a bug while running MSCK REPAIR TABLE > EXTERNAL_TABLE_NAME on Hive 1.2.1, all the partitions that are n

MSCK not adding the missing partitions to Hive Metastore when the partition names are not in lowercase

2016-09-26 Thread Sushil Ks
Hi, I feel there is a bug while running MSCK REPAIR TABLE EXTERNAL_TABLE_NAME on Hive 1.2.1: all the partitions that are not present in the metastore are being listed but not added if the partition names are not in lowercase, in other words if an external path has a camel case based name and val
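One way to check whether a table location is affected by the condition described in this report is to scan the partition directories for key names that are not lowercase. The sketch below assumes the usual `key=value` directory naming for Hive partitions; the example paths and the helper name `non_lowercase_partition_keys` are hypothetical and do not come from the thread.

```python
def non_lowercase_partition_keys(path):
    """Return partition key names in a path that contain uppercase characters.

    Assumes partition directories are named key=value; other path segments are ignored.
    """
    keys = []
    for segment in path.strip("/").split("/"):
        if "=" in segment:
            key, _, _value = segment.partition("=")
            if key != key.lower():
                keys.append(key)
    return keys


# Hypothetical example paths for illustration only.
print(non_lowercase_partition_keys("/data/ext_table/myPartition=2016-09-26"))
print(non_lowercase_partition_keys("/data/ext_table/mypartition=2016-09-26"))
```

Paths for which the function returns a non-empty list would be the ones to compare against the partitions that MSCK lists without adding.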

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
I have not encountered this case before. However, you can create a temporary table in Hive, put all writes into it, read the rows as needed, and finally append data from the temporary table to ORC once reads are done. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=

Hive orc use case

2016-09-26 Thread Amey Barve
Hi All, I have a use case where I need to append either 1 or many rows to an orcFile as well as read 1 or many rows from it. I observed that I cannot read rows from the OrcFile unless I close the OrcFile's writer. Is this correct? Why doesn't write actually flush the rows to the orcFile, is there any

Re: populating Hive table periodically from files on HDFS

2016-09-26 Thread Mich Talebzadeh
Thanks Eugene, My table in Hive happens to be ORC so making it bucketed and transactional is trivial. However, there is an underlying concern of mine. A Hive transactional table will generate a lot of delta files (if periodically appended). At least that is my understanding of what is going to happen.