Thanks Dudu. So if I need actual date I will use view. Regarding partition column: I can create 2 external tables based on the same data with integer or string column partition and see which one is more convenient for our use.
On Sat, Jun 4, 2016 at 2:20 PM, Markovitz, Dudu <dmarkov...@paypal.com> wrote: > I’m not aware of an option to do what you request in the external table > definition but you might want to that using a view. > > > > P.s. > > I seems to me that defining the partition column as a string would be more > user friendly than integer, e.g. – > > > > select * from threads_test where yyyymmdd like ‘2016%’ – year 2016; > > select * from threads_test where yyyymmdd like ‘201603%’ –- March 2016; > > select * from threads_test where yyyymmdd like ‘______01’ -- first of > every month; > > > > > > > > > > > > $ hdfs dfs -ls -R /tmp/threads_test > > drwxr-xr-x - cloudera supergroup 0 2016-06-04 10:45 > /tmp/threads_test/20160604 > > -rw-r--r-- 1 cloudera supergroup 136 2016-06-04 10:45 > /tmp/threads_test/20160604/data.txt > > > > $ hdfs dfs -cat /tmp/threads_test/20160604/data.txt > > {"url":"www.blablabla.com > ","pageType":"pg1","addDate":"2016-05-17T02:10:44.527","postDate":"2016-05-16T02:08:55","postText":"YadaYada"} > > > > > ---------------------------------------------------------------------------------------------------- > > > > > > hive> add jar /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar; > > > > hive> > > create external table threads_test > > ( > > url string > > ,pagetype string > > ,adddate string > > ,postdate string > > ,posttext string > > ) > > partitioned by (yyyymmdd string) > > row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' > > location '/tmp/threads_test' > > ; > > > > hive> alter table threads_test add partition (yyyymmdd=20160604) location > '/tmp/threads_test/20160604'; > > > > hive> select * from threads_test; > > > > www.blablabla.com pg1 2016-05-17T02:10:44.527 > 2016-05-16T02:08:55 YadaYada 20160604 > > > > hive> > > create view threads_test_v > > as > > select url > > ,pagetype > > ,cast (concat_ws(' ',substr (adddate ,1,10),substr (adddate > ,12)) as timestamp) as adddate > > ,cast (concat_ws(' ',substr (postdate,1,10),substr > (postdate,12)) as timestamp) as postdate > > ,posttext > > > > from threads_test > > ; > > > > hive> select * from threads_test_v; > > > > www.blablabla.com pg1 2016-05-17 02:10:44.527 2016-05-16 > 02:08:55 YadaYada > > > > > > *From:* Igor Kravzov [mailto:igork.ine...@gmail.com] > *Sent:* Saturday, June 04, 2016 8:13 PM > *To:* user@hive.apache.org > *Subject:* Convert date in string format to timestamp in table definition > > > > Hi, > > > > I have 2 dates in Json file defined like this > > "addDate": "2016-05-17T02:10:44.527", > > "postDate": "2016-05-16T02:08:55", > > > > Right now I define external table based on this file like this: > > CREATE external TABLE threads_test > > (url string, > > pagetype string, > > adddate string, > > postdate string, > > posttext string) > > partitioned by (yyyymmdd int) > > ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' > > location 'my location'; > > > > is it possible to define these 2 dates as timestamp? > > Do I need to change date format in the file? is it possible to specify > date format in table definition? > > Or I better off with string? > > > > Thanks in advance. >