Re: query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Eric Chu
(Adding hue-users back since this issue affects only Hue, not the CLI.) The problem is that most users (analysts) wouldn't see this problem until after they have run the query once. Often these queries take considerable time. To ask them to then run the query again with "create table as" wastes tim

Re: query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Tim
Or setting reducers to 1 and doing a GROUP BY on all columns forces a single file too. Tim, Sent from my iPhone (which makes terrible auto-correct spelling mistakes) > On 21 Nov 2013, at 18:27, Eric Chu wrote: > > Hi, > > We often have map-only queries that result in a large number of small outp
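
The single-reducer suggestion above could be sketched roughly as follows. This is a minimal illustration, not a tested fix: the helper name `build_single_reducer_query` and the column names `a` and `b` are hypothetical, and `mapred.reduce.tasks` is assumed to be the reducer-count property on the Hadoop version in use. Note that grouping by every column also collapses duplicate rows, so the output matches a plain SELECT only when the rows are already distinct.

```python
import re
from typing import List, Sequence

# Conservative pattern for unquoted identifiers (an assumption made for this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_single_reducer_query(table: str, columns: Sequence[str]) -> List[str]:
    """Return the statements for the 'one reducer + GROUP BY all columns' workaround."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in (table, *columns):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    column_list = ", ".join(columns)
    return [
        # Assumed property name; forces the reduce stage to run as a single task.
        "SET mapred.reduce.tasks=1",
        # GROUP BY adds a reduce stage to an otherwise map-only query.
        # Caveat: duplicate rows are merged into one.
        f"SELECT {column_list} FROM {table} GROUP BY {column_list}",
    ]


if __name__ == "__main__":
    for statement in build_single_reducer_query("orig", ["a", "b"]):
        print(statement + ";")
```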

Re: query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Tim
Hey Eric, I know this isn't the fix you're looking for, but in the spirit of pragmatic workarounds... What happens if you CREATE TABLE copy AS SELECT * FROM orig? I used to use that with very early Hue versions. Cheers, Tim, Sent from my iPhone (which makes terrible auto-correct spelling mistakes)
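
For anyone scripting the workaround above, a small sketch of generating the statement might look like this. The helper name `build_ctas_copy` is hypothetical, and the table names `orig` and `copy` are just the placeholders from the message. Whether the copy actually ends up with fewer files depends on the cluster's settings, so treat it as something to try rather than a guaranteed fix.

```python
import re

# Conservative pattern for unquoted table names (an assumption made for this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_ctas_copy(source_table: str, copy_table: str) -> str:
    """Return a CREATE TABLE ... AS SELECT statement that copies source_table."""
    for name in (source_table, copy_table):
        # Reject anything that is not a plain identifier to avoid building odd SQL.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported table name: {name!r}")
    return f"CREATE TABLE {copy_table} AS SELECT * FROM {source_table}"


if __name__ == "__main__":
    print(build_ctas_copy("orig", "copy"))
```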

query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Eric Chu
Hi, We often have map-only queries that result in a large number of small output files (in the thousands). Although this doesn't affect the CLI, when users try to view or download the query result in Hue, Hue times out trying to read all these small files. We tried to set the following properties