Re: Hive orc use case

2016-09-26 Thread Amey Barve
Thanks Alan, your comment answers my question :) I will start looking into the HiveEndPoint APIs. Regards, Amey Barve On 26 September 2016 at 23:50, Alan Gates wrote: > As long as there is a spare worker thread this should be picked up within > a few seconds. It’s true you can’t force it to happ

Re: Hive orc use case

2016-09-26 Thread Alan Gates
As long as there is a spare worker thread this should be picked up within a few seconds. It’s true you can’t force it to happen immediately if other compactions are happening, but that’s by design so that compaction work doesn’t take too many resources. Alan. > On Sep 26, 2016, at 11:07,

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
alter table payees compact 'minor'; Compaction enqueued. OK It queues the compaction, but is there no way I can force it to run immediately? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Hive orc use case

2016-09-26 Thread Alan Gates
alter table compact forces a compaction. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact Alan. > On Sep 26, 2016, at 10:41, Mich Talebzadeh wrote: > > Can the temporary table be a solution to the original thread owner issue

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
Can the temporary table be a solution to the original thread owner's issue? Hive streaming, for example from Flume to Hive, is interesting, but the issue is that one ends up with a fair number of delta files due to the transactional nature of the ORC table, and I know that Spark will not be able to open the table

Re: Hive orc use case

2016-09-26 Thread Alan Gates
ORC does not store data row by row. It decomposes the rows into columns, and then stores pointers to those columns, as well as a number of indices and statistics, in a footer of the file. Due to the footer, in the simple case you cannot read the file before you close it or append to it. We did
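
To illustrate why a footer-based layout cannot be read before the writer is closed, here is a minimal toy sketch in Python. It is not ORC's actual on-disk format or any ORC API: the class ToyColumnarWriter, the function read_rows, the MAGIC marker and the JSON encoding are all hypothetical names and choices made up for illustration. The only idea it shares with the description above is that column data is written first and the pointers to it live in a footer written on close.

```python
import io
import json
import struct

MAGIC = b"TOY1"  # hypothetical end-of-file marker, not ORC's real one


class ToyColumnarWriter:
    """Toy writer: column chunks first, footer with pointers only on close()."""

    def __init__(self):
        self.buf = io.BytesIO()
        self.batches = []  # per batch: list of [offset, length], one per column

    def add_rows(self, rows):
        # Decompose the rows into columns and write each column as one chunk.
        offsets = []
        for col in zip(*rows):
            data = json.dumps(list(col)).encode("utf-8")
            offsets.append([self.buf.tell(), len(data)])
            self.buf.write(data)
        self.batches.append(offsets)

    def close(self):
        # The footer holds the pointers; without it the chunks are opaque bytes.
        footer = json.dumps({"batches": self.batches}).encode("utf-8")
        self.buf.write(footer)
        self.buf.write(struct.pack("<I", len(footer)))
        self.buf.write(MAGIC)

    def getvalue(self):
        return self.buf.getvalue()


def read_rows(raw):
    """Read all rows back; fails if the footer was never written."""
    if len(raw) < 8 or raw[-4:] != MAGIC:
        raise ValueError("no footer found: the writer was not closed")
    (footer_len,) = struct.unpack("<I", raw[-8:-4])
    footer = json.loads(raw[-8 - footer_len:-8].decode("utf-8"))
    rows = []
    for offsets in footer["batches"]:
        columns = [json.loads(raw[off:off + length].decode("utf-8"))
                   for off, length in offsets]
        rows.extend(zip(*columns))  # reassemble rows from the columns
    return rows


if __name__ == "__main__":
    w = ToyColumnarWriter()
    w.add_rows([(1, "a"), (2, "b")])
    try:
        read_rows(w.getvalue())
    except ValueError as exc:
        print("before close:", exc)
    w.close()
    print("after close:", read_rows(w.getvalue()))
```

In this toy, appending after close() would also require rewriting the footer, which mirrors the "cannot append" point made above for the simple case.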

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
I have not encountered this case before. However, you can create a temporary table in Hive, put all writes into it, read the rows as needed, and finally append data from the temporary table to ORC once reads are done. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=

Hive orc use case

2016-09-26 Thread Amey Barve
Hi All, I have a use case where I need to append one or many rows to an OrcFile as well as read one or many rows from it. I observed that I cannot read rows from the OrcFile unless I close the OrcFile's writer. Is this correct? Why doesn't write actually flush the rows to the OrcFile, is there any