What's your table definition? http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
See ROW FORMAT

Thanks
Yongqiang

On Fri, Mar 18, 2011 at 3:33 PM, Severance, Steve <ssevera...@ebay.com> wrote:
> One more question. I have everything working except a Map<String,String>.
>
> I understand that the whole Map will be physically stored as a single Text
> object in the RCFile.
>
> I have had considerable trouble setting up the delimiters for this Map.
>
> I want to have
> MAP KEYS TERMINATED BY '='
> COLLECTION ITEMS TERMINATED BY '&'
>
> Hive doesn't seem to want to take that. I have also tried using the ASCII
> octal codes.
>
> What do I need to set up to make this Map work?
>
> Thanks.
>
> Steve
>
> -----Original Message-----
> From: yongqiang he [mailto:heyongqiang...@gmail.com]
> Sent: Thursday, March 17, 2011 5:09 PM
> To: user@hive.apache.org
> Subject: Re: Building Custom RCFiles
>
> Yes. It is the same as with normal Hive tables.
>
> thanks
> yongqiang
> On Thu, Mar 17, 2011 at 4:54 PM, Severance, Steve <ssevera...@ebay.com> wrote:
>> Thanks Yongqiang.
>>
>> So for more complex types like map do I just set up a
>>
>> ROW FORMAT DELIMITED KEYS TERMINATED BY '|' etc...
>>
>> Thanks.
>>
>> Steve
>>
>> -----Original Message-----
>> From: yongqiang he [mailto:heyongqiang...@gmail.com]
>> Sent: Thursday, March 17, 2011 4:35 PM
>> To: user@hive.apache.org
>> Subject: Re: Building Custom RCFiles
>>
>> A side note: in Hive, all columns are saved as Text internally
>> (even if the column's type is int or double, etc.). In some
>> experiments, strings were friendlier to compression, but it takes CPU
>> to decode them back to the original type.
>>
>> Thanks
>> Yongqiang
>> On Thu, Mar 17, 2011 at 4:04 PM, yongqiang he <heyongqiang...@gmail.com>
>> wrote:
>>> You need to customize the serialize and deserialize functions of Hive's
>>> ColumnarSerde (maybe functions in LazySerde), depending on whether you
>>> want to read or write. The main thing is that you need to use your own
>>> type def (not LazyInt/LazyLong).
>>>
>>> If your type is int or long (not double/float), casting it to string
>>> only wastes some CPU, and it can save you space.
>>>
>>> Thanks
>>> Yongqiang
>>> On Thu, Mar 17, 2011 at 3:48 PM, Severance, Steve <ssevera...@ebay.com>
>>> wrote:
>>>> Hi,
>>>>
>>>> I am working on building a MR job that generates RCFiles that will become
>>>> partitions of a Hive table. I have most of it working; however, only
>>>> strings (Text) are being deserialized inside of Hive. The Hive table is
>>>> specified to use a columnarserde, which I thought should allow the
>>>> writable types stored in the RCFile to be deserialized properly.
>>>>
>>>> Currently all numeric types (IntWritable and LongWritable) come back as
>>>> null.
>>>>
>>>> Has anyone else seen anything like this or have any ideas? I would rather
>>>> not convert all my data to strings to use RCFile.
>>>>
>>>> Thanks.
>>>>
>>>> Steve
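
For anyone reading this thread later, here is a minimal sketch of the text encoding discussed above: numeric columns written as their decimal string, and a Map<String,String> flattened into one text value using '=' between key and value and '&' between entries. This is not Hive's ColumnarSerDe code; the function names (encode_map, decode_map, encode_column) and the sample row are hypothetical and only illustrate what a MapReduce job would presumably need to put in each column if the table reads everything back as Text.

```python
# Illustrative only: mimics the text encoding described in the thread.
# It is NOT Hive's SerDe implementation; all names below are hypothetical.

MAP_KEY_DELIM = "="  # corresponds to MAP KEYS TERMINATED BY '='
ITEM_DELIM = "&"     # corresponds to COLLECTION ITEMS TERMINATED BY '&'


def encode_map(mapping):
    """Flatten a dict of strings into a single delimited text value."""
    return ITEM_DELIM.join(
        f"{key}{MAP_KEY_DELIM}{value}" for key, value in mapping.items()
    )


def decode_map(text):
    """Split a delimited text value back into a dict of strings."""
    result = {}
    if not text:
        return result
    for item in text.split(ITEM_DELIM):
        # partition() tolerates an entry with no key delimiter (empty value)
        key, _, value = item.partition(MAP_KEY_DELIM)
        result[key] = value
    return result


def encode_column(value):
    """Render one column value as UTF-8 text bytes."""
    if isinstance(value, dict):
        return encode_map(value).encode("utf-8")
    # ints and longs become their decimal string, not a binary writable
    return str(value).encode("utf-8")


# Hypothetical row: an int, a long, and a map column
row = [42, 1300487580000, {"site": "us", "page": "home"}]
print([encode_column(value) for value in row])
```

Note that this sketch does no escaping, so keys or values that themselves contain '=' or '&' would not round-trip; whether and how the real table handles that depends on its ROW FORMAT settings.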