Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Mich Talebzadeh
> --
> *From:* Joseph
> *Sent:* Wednesday, March 16, 2016 9:46:25 AM
> *To:* user
> *Cc:* user; user
> *Subject:* Re: Re: The build-in indexes in ORC file does not work.
>
> terminal_type =0, 260,000,000 rows, almost cov

Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Jörn Franke
How much data are you querying? What is the query? How selective is it supposed to be? What is the block size?

> On 16 Mar 2016, at 11:23, Joseph wrote:
>
> Hi all,
>
> I understand that ORC provides three levels of indexes within each file: file
> level, stripe level, and row level.
> The fi

RE: The build-in indexes in ORC file does not work.

2016-03-19 Thread Wietsma, Tristan A.
Hadoop 2.7.2, Hive 1.2.1

Joseph

From: Jörn Franke
Date: 2016-03-16 20:27
To: Joseph
CC: user; user
Subject: Re: The build-

Re: Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Joseph
le number is 800, each of them is about 51M.

My query statements are:

select count(*) from gprs where terminal_type = 25080;
select * from gprs where terminal_type = 25080;

In the gprs table, the values of the "terminal_type" column lie in [0, 25066].

Joseph

From: Jörn Franke
Date: 2016-0
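
For a filter like the one in the message above, an ORC reader can only skip stripes and row groups if the engine pushes the predicate down to it. The following is a minimal sketch, assuming the query runs in Hive 1.2.x or Spark SQL 1.x; both settings are assumed to be off by default in those versions, and the preview text here does not establish that either one was the cause of the behaviour reported in this thread.

```sql
-- Hive: allow the ORC reader to use its built-in min/max indexes for filters.
-- Assumption: this setting defaults to false in Hive 1.2.x.
SET hive.optimize.index.filter=true;

-- Spark SQL 1.x counterpart, if the query is run through Spark instead.
-- Assumption: this also defaults to false in that release line.
-- SET spark.sql.orc.filterPushdown=true;

-- The query from the thread, unchanged.
SELECT count(*) FROM gprs WHERE terminal_type = 25080;
```

Since 25080 lies outside the reported column range [0, 25066], min/max statistics alone should be enough to rule out every stripe once pushdown is active.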

Re: The build-in indexes in ORC file does not work.

2016-03-18 Thread Jörn Franke
minal_type = 25080;
> select * from gprs where terminal_type = 25080;
>
> In the gprs table, the "terminal_type" column's value is in [0, 25066]
>
> Joseph
>
> From: Jörn Franke
> Date: 2016-03-16 19:26
> To: Joseph
> CC: user; user
> Subject: Re

Re: The build-in indexes in ORC file does not work.

2016-03-16 Thread Mich Talebzadeh
Hi,

The parameters that control the stripe and row group are configurable via the ORC table creation script:

CREATE TABLE dummy (
     ID INT
   , CLUSTERED INT
   , SCATTERED INT
   , RANDOMISED INT
   , RANDOM_STRING VARCHAR(50)
   , SMALL_VC VARCHAR(10)
   , PADDING VARCHAR(10)
)
CLUSTERED BY (ID) INT
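
The script above is cut off in this preview, so the properties it actually set are not visible. As a rough illustration only, the stripe and row-group parameters are usually passed as ORC table properties. In the sketch below, the table name `gprs_orc`, the `payload` column, and all property values are hypothetical and are not taken from the thread.

```sql
-- Hypothetical table; only the TBLPROPERTIES keys are the point of this sketch.
CREATE TABLE gprs_orc (
    terminal_type INT,
    payload       STRING
)
STORED AS ORC
TBLPROPERTIES (
    "orc.compress"         = "SNAPPY",    -- compression codec (illustrative choice)
    "orc.create.index"     = "true",      -- write the row-level indexes
    "orc.stripe.size"      = "67108864",  -- stripe size in bytes (64 MB here, illustrative)
    "orc.row.index.stride" = "10000"      -- rows per index entry (row group size)
);
```

A smaller `orc.row.index.stride` gives finer-grained min/max statistics at the cost of a larger index, so the values here are a starting point to tune, not a recommendation.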