On 7/31/12 9:23 PM, "Owen O'Malley" wrote:
>On Mon, Jul 30, 2012 at 11:38 PM, Namit Jain wrote:
>
>> That would be difficult. The % done can be estimated from the data already
>> read.
>>
>
>I'm confused. Wouldn't the maximum size of the data remaining over the
>maximum size of the original
On Mon, Jul 30, 2012 at 11:38 PM, Namit Jain wrote:
> That would be difficult. The % done can be estimated from the data already
> read.
>
I'm confused. Wouldn't the maximum size of the data remaining over the
maximum size of the original query give a reasonable approximation of the
amount of wo
On 7/31/12 12:01 PM, "Owen O'Malley" wrote:
>On Mon, Jul 30, 2012 at 9:12 PM, Namit Jain wrote:
>
>> The total number of bytes of the input will be used to determine whether
>> or not to launch a map-reduce job for this query. That was in my original
>> mail.
>>
>> However, given any complex whe
On Mon, Jul 30, 2012 at 9:12 PM, Namit Jain wrote:
> The total number of bytes of the input will be used to determine whether
> or not to launch a map-reduce job for this query. That was in my original
> mail.
>
> However, given any complex where condition and the lack of column
> statistics in hive
The total number of bytes of the input will be used to determine whether
or not to launch a map-reduce job for this query. That was in my original
mail.
However, given any complex where condition and the lack of column
statistics in hive, we cannot determine the
number of bytes that would be needed t
It supports table sampling also.
select * from src TABLESAMPLE (BUCKET 1 OUT OF 40 ON key);
select * from src TABLESAMPLE (0.25 PERCENT);
But there is no sampling option that specifies a number of bytes. That can
be done in another issue.
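
For readers unfamiliar with bucket sampling, here is a rough sketch of what
"BUCKET 1 OUT OF 40 ON key" means: rows are hashed on the key into 40
buckets and only the rows in bucket 1 are returned. This is an illustration
only. The CRC32 hash, the helper names and the synthetic rows are
assumptions, not Hive's actual hash function or code.

```python
import zlib

def in_bucket(key: str, bucket: int, total: int) -> bool:
    """Return True if `key` hashes into the 1-based `bucket` out of `total`."""
    # CRC32 is a stand-in hash for this sketch; Hive uses its own hashing.
    h = zlib.crc32(key.encode("utf-8"))
    return h % total == bucket - 1

def bucket_sample(rows, bucket, total, key_fn):
    """Keep only the rows whose key falls into the requested bucket."""
    return [r for r in rows if in_bucket(key_fn(r), bucket, total)]

# Synthetic stand-in for the `src` table (hypothetical data).
rows = [{"key": str(i), "value": f"val_{i}"} for i in range(1000)]

# Rough equivalent of: select * from src TABLESAMPLE (BUCKET 1 OUT OF 40 ON key);
picked = bucket_sample(rows, 1, 40, lambda r: r["key"])
print(len(picked), "of", len(rows), "rows sampled")
```

Every row lands in exactly one bucket, so the 40 buckets together cover the
whole table. The sample size depends on the key distribution, not on a byte
count, which is why a byte-based limit would need a separate option.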
2012/7/31 Owen O'Malley
> On Sat, Jul 28, 2012 at 6:17 PM, Na
On Sat, Jul 28, 2012 at 6:17 PM, Navis류승우 wrote:
> I was thinking of timeout for fetching, 2000msec for example. How about
> that?
>
Instead of time, which requires launching the query and letting it time out,
how about determining the number of bytes that would need to be fetched to
the local bo
This can be a follow-up to HIVE-2925.
Navis, if you want, I can work on it.
On 7/29/12 7:58 PM, "Namit Jain" wrote:
>I like Navis's idea. The timeout can be configurable.
>
>
>On 7/29/12 6:47 AM, "Navis류승우" wrote:
>
>>I was thinking of timeout for fetching, 2000msec for example. How about
>>th
I like Navis's idea. The timeout can be configurable.
On 7/29/12 6:47 AM, "Navis류승우" wrote:
>I was thinking of timeout for fetching, 2000msec for example. How about
>that?
>
>On Sunday, July 29, 2012, Edward Capriolo wrote:
>> If where condition is too complex, selecting specific columns seems
>> simple
I was thinking of a timeout for fetching, 2000 msec for example. How about
that?
On Sunday, July 29, 2012, Edward Capriolo wrote:
> If where condition is too complex, selecting specific columns seems simple
> enough and useful.
>
> On Saturday, July 28, 2012, Namit Jain wrote:
>> Currently, hive does not la
If the where condition is too complex, selecting specific columns seems
simple enough and useful.
On Saturday, July 28, 2012, Namit Jain wrote:
> Currently, hive does not launch map-reduce jobs for the following queries:
>
> select * from where (limit )?
>
> This behavior is not configurable, and
Currently, hive does not launch map-reduce jobs for the following queries:
select * from where (limit )?
This behavior is not configurable, and cannot be altered.
HIVE-2925 wants to extend this behavior. The goal is not to spawn map-reduce
jobs for the following queries:
Select from where