Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Mich Talebzadeh
> *From:* Joseph
> *Sent:* Wednesday, March 16, 2016 9:46:25 AM
> *To:* user
> *Cc:* user; user
> *Subject:* Re: Re: The build-in indexes in ORC file does not work.
>
> terminal_type =0, 260,000,000 rows, almost cov

Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Jörn Franke
How much data are you querying? What is the query? How selective is it supposed to be? What is the block size?
> On 16 Mar 2016, at 11:23, Joseph wrote:
>
> Hi all,
>
> I understand that ORC provides three levels of indexes within each file: file level, stripe level, and row level.
> The fi

RE: The build-in indexes in ORC file does not work.

2016-03-19 Thread Wietsma, Tristan A.
Hadoop 2.7.2, Hive 1.2.1
Joseph
From: Jörn Franke <jornfra...@gmail.com>
Date: 2016-03-16 20:27
To: Joseph <wxy81...@sina.com>
CC: user <user@spark.apache.org>; user <u...@hive.apache.org>
Subject: Re: The build-

Re: Re: The build-in indexes in ORC file does not work.

2016-03-19 Thread Joseph
le number is 800, each of them is about 51M.
My query statements are:
    select count(*) from gprs where terminal_type = 25080;
    select * from gprs where terminal_type = 25080;
In the gprs table, the "terminal_type" column's values are in [0, 25066].
Joseph
From: Jörn Franke
Date: 2016-0

Re: The build-in indexes in ORC file does not work.

2016-03-18 Thread Jörn Franke
minal_type = 25080;
> select * from gprs where terminal_type = 25080;
>
> In the gprs table, the "terminal_type" column's value is in [0, 25066]
>
> Joseph
>
> From: Jörn Franke
> Date: 2016-03-16 19:26
> To: Joseph
> CC: user; user
> Subject: Re

Re: The build-in indexes in ORC file does not work.

2016-03-16 Thread Mich Talebzadeh
Hi,
The parameters that control the stripe and the row group are configurable via the ORC table creation script:
CREATE TABLE dummy (
    ID INT,
    CLUSTERED INT,
    SCATTERED INT,
    RANDOMISED INT,
    RANDOM_STRING VARCHAR(50),
    SMALL_VC VARCHAR(10),
    PADDING VARCHAR(10)
)
CLUSTERED BY (ID) INT

The build-in indexes in ORC file does not work.

2016-03-16 Thread Joseph
Hi all, I understand that ORC provides three levels of indexes within each file: file level, stripe level, and row level. The file-level and stripe-level statistics are stored in the file footer, so they are easy to access to determine whether the rest of the file needs to be read at all. Row level indexes i
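
The way min/max statistics let a reader skip data, as the message above describes, can be pictured with a small sketch. This is not ORC's actual reader code: `ColumnStats`, `may_contain`, and `stripes_to_read` are hypothetical names, and the per-stripe ranges are invented for illustration. Only the overall range [0, 25066] and the predicate value 25080 come from the thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnStats:
    """Min/max statistics for one column over one stripe (hypothetical type)."""
    minimum: int
    maximum: int


def may_contain(stats: ColumnStats, value: int) -> bool:
    """Return True if an equality predicate could match inside this range."""
    return stats.minimum <= value <= stats.maximum


def stripes_to_read(stripes: List[ColumnStats], value: int) -> List[int]:
    """Indexes of the stripes that the statistics cannot rule out."""
    return [i for i, s in enumerate(stripes) if may_contain(s, value)]


# Illustrative stripe ranges; only the overall range [0, 25066] is from the thread.
stripes = [ColumnStats(0, 9000), ColumnStats(9001, 18000), ColumnStats(18001, 25066)]

print(stripes_to_read(stripes, 25080))  # [] : the value lies outside every range
print(stripes_to_read(stripes, 12000))  # [1] : only the middle stripe can match
```

Under these assumptions, a predicate such as terminal_type = 25080 could be answered without reading any stripe. If the column is not sorted or clustered, each stripe's range tends to cover most of [0, 25066], so an in-range value cannot be ruled out by min/max checks alone.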