Re: [Thriftserver2] Controlling number of tasks

2016-08-03 Thread Takeshi Yamamuro
Hi, HiveThriftserver2 itself has no such functionality. Have you tried adaptive execution in Spark? https://issues.apache.org/jira/browse/SPARK-9850 I have not used it yet, but this experimental feature appears to tune the number of tasks based on partition size. // maropu On Thu, Aug 4, 2016 a

Re: [Thriftserver2] Controlling number of tasks

2016-08-03 Thread Chanh Le
I believe there is no way to reduce the number of tasks on the Hive side using coalesce, because Hive just reads the files, so the task count depends on the number of files you put there. The way I did it was to coalesce at the ETL layer and write as few files as possible, which reduces the IO time for reading them. > On Aug 3, 201
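As a rough illustration of the "write fewer files" idea in the message above, the sketch below only computes how many output files to aim for from the total data size. The function name `target_partitions` and the 128 MiB default are assumptions made for this example, not something stated in the thread.

```python
import math


def target_partitions(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Return how many output files to write so each is roughly target_file_bytes.

    Both the function name and the 128 MiB default are illustrative assumptions.
    """
    if total_bytes <= 0:
        return 1
    # Round up so no file exceeds the target size; always write at least one file.
    return max(1, math.ceil(total_bytes / target_file_bytes))


# In a PySpark job, the result could be passed to DataFrame.coalesce(n)
# before writing. coalesce only reduces the partition count, never raises it.
```

For example, `target_partitions(10 * 1024**3)` gives 80 files for 10 GiB of data at the assumed target size.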

Re: [Thriftserver2] Controlling number of tasks

2016-08-03 Thread ayan guha
As I understand it, you have a source location where files are dropped and never removed? If that is the case, you may want to keep track of which files your program has already processed and read only the "new" files. On 3 Aug 2016 22:03, "Yana Kadiyska" wrote: > Hi folks, I have an ETL pi
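One minimal way to sketch the "track processed files" suggestion above is a small manifest file listing the names already handled. The helper names `find_new_files` and `mark_processed`, and the JSON manifest format, are hypothetical choices for this example. The sketch assumes the manifest lives outside the source directory and that file names are unique.

```python
import json
from pathlib import Path


def _load_manifest(manifest_path):
    """Read the set of already-processed file names (empty if no manifest yet)."""
    manifest = Path(manifest_path)
    if not manifest.exists():
        return set()
    return set(json.loads(manifest.read_text()))


def find_new_files(source_dir, manifest_path):
    """Return sorted names of files in source_dir not yet recorded in the manifest."""
    processed = _load_manifest(manifest_path)
    current = sorted(p.name for p in Path(source_dir).iterdir() if p.is_file())
    return [name for name in current if name not in processed]


def mark_processed(manifest_path, names):
    """Add the given file names to the manifest after they were handled successfully."""
    processed = _load_manifest(manifest_path)
    processed.update(names)
    Path(manifest_path).write_text(json.dumps(sorted(processed)))
```

A job would call `find_new_files` at the start of each run, read only those files, and call `mark_processed` once the run succeeds, so a failed run is retried on the same files.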