You can use your own custom MapReduce code for this. Check each record's type and, if it is XML, preprocess it to remove the newlines.
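
As a rough illustration of the preprocessing idea, here is a minimal sketch in plain Python (not an actual MapReduce job). It assumes every record has exactly three '|'-separated fields (id, data, date) and that the XML itself never contains a '|'. The function name and constants are hypothetical. Lines are merged until a full record has been seen, so only the few multi-line XML rows are changed and all other rows pass through untouched.

```python
EXPECTED_FIELDS = 3  # assumption: id | data | date
DELIMITER = "|"      # assumption: the delimiter never occurs inside the XML


def join_multiline_records(lines, expected_fields=EXPECTED_FIELDS,
                           delimiter=DELIMITER):
    """Yield one logical record per item, merging physical lines until
    the expected number of delimiters has been seen."""
    buffer = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Join a continuation line to the pending record with a space,
        # which replaces the embedded newline.
        buffer = f"{buffer} {line}" if buffer else line
        if buffer.count(delimiter) >= expected_fields - 1:
            yield buffer
            buffer = ""
    if buffer:
        yield buffer  # trailing incomplete record, emitted as-is


if __name__ == "__main__":
    sample = [
        "1|apple|24-nov-2011\n",
        "2|mango|26-nov-2011\n",
        '3|<?xml version="1.0" encoding="utf-8"?>\n',
        "<a>fruits</a>|28-nov-2011\n",
        "4|papaya|30-nov-2011\n",
    ]
    for record in join_multiline_records(sample):
        print(record)
```

The cleaned output has one record per line, so it could then be loaded into a table that uses '|' as the field delimiter. If the XML can contain '|', this delimiter-counting approach would not be enough and a different record-boundary rule would be needed.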

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: iwannaplay games <funnlearnfork...@gmail.com>
Date: Tue, 20 Nov 2012 14:29:18
To: <user@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: populating xml data in hive

How do I preprocess the data when there are millions of records, of which only a few thousand contain XML data?

On 11/20/12, Nitin Pawar <nitinpawar...@gmail.com> wrote:
> Hive currently supports only newline as the record separator. If you have
> newlines in column values, you will need to preprocess your data and
> remove the newlines from the column values.
> On Nov 20, 2012 1:30 PM, "iwannaplay games" <funnlearnfork...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a csv file (separated by |) where the data looks like
>>
>> id data date
>> 1 apple 24-nov-2011
>> 2 mango 26-nov-2011
>> 3 <?xml version="1.0" encoding="utf-8"?>
>> <a>fruits</a> 28-nov-2011
>> 4 papaya 30-nov-2011
>>
>>
>> Since id=3 has a newline in the data field, Hive takes only the first
>> line and treats the second line as a different row. I want the full XML
>> field to be loaded into the data column of the Hive table.
>>
>> It seems Hive doesn't support lines terminated by '|'.
>>
>> How should XML data be handled in Hive?
>>
>> Thanks & Regards
>> Prabhjot
>>
>