Re: issue regarding importing hive tables from one cluster to another.

2012-09-08 Thread Jagat Singh
Hive's table structure information is stored in the metastore, which by default is a Derby database (which I doubt you are using), or else in MySQL or a similar database. Point your Hive at MySQL and try. --- Sent from Mobile, short and crisp. On 09-Sep-2012 5:29 AM, "yogesh dhari" wrote: > Hi all, > > I ha

issue regarding importing hive tables from one cluster to another.

2012-09-08 Thread yogesh dhari
Hi all, I have switched to a new HDFS cluster from an old cluster (none of the machines from the old cluster are connected to the new cluster in any manner). I brought the edits and fsimage files (including dfs.name.dir and dfs.data.dir) from the old cluster and put them on the new cluster, and all files and data are showing

Re: How to load csv data into HIVE

2012-09-08 Thread praveenesh kumar
Yup, Bejoy is correct :-) Just use Hadoop streaming for what it does best: cleaning, transformations, and validations, in just a few simple steps. Regards, Praveenesh On Sat, Sep 8, 2012 at 6:03 PM, Bejoy KS wrote: > Hi Chuck > > I believe Praveenesh was adding his thought to the discussion o

Re: How to load csv data into HIVE

2012-09-08 Thread Bejoy KS
Hi Chuck, I believe Praveenesh was adding his thoughts to the discussion on preprocessing the data using MapReduce itself. If you go with Hadoop streaming, you can use the Python script in the mapper, and that will do the preprocessing in parallel on large volumes of data. Then this preprocessed data can
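
A minimal sketch of the kind of Python mapper described above, assuming a three-column CSV input and tab-delimited output. The column count, the delimiter, and the names clean_line and clean_stream are illustrative assumptions, not taken from the thread. Hadoop streaming feeds input lines to the mapper on standard input and collects whatever it writes to standard output.

```python
import csv
import sys


def clean_line(line, expected_fields=3):
    """Parse one CSV line and return a tab-delimited row, or None if malformed.

    expected_fields=3 is an assumed schema width, chosen only for illustration.
    """
    try:
        row = next(csv.reader([line]))
    except (csv.Error, StopIteration):
        return None
    if len(row) != expected_fields:
        # Validation step: drop rows that do not match the assumed schema.
        return None
    # Cleaning step: trim whitespace and keep embedded tabs out of the output.
    return "\t".join(field.strip().replace("\t", " ") for field in row)


def clean_stream(lines):
    """Yield cleaned rows for every well-formed input line."""
    for line in lines:
        cleaned = clean_line(line.rstrip("\r\n"))
        if cleaned is not None:
            yield cleaned


# When used as a streaming mapper, the script would finish with:
#     for out_row in clean_stream(sys.stdin):
#         sys.stdout.write(out_row + "\n")
```

Each mapper instance processes its own input split, so the cleaning runs in parallel across the cluster, and the cleaned files stay in HDFS.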

RE: Handling arrays returned by json_tuple ??

2012-09-08 Thread Connell, Chuck
Something else... If json_tuple cannot select elements in an array, that means that JSON objects within an array are essentially "frozen" within their array. So if I had {"text1" : "smith", "array1" : [{json-object},{json-object}]} {"text1" : "jones", "array1" : [{json-object},{json-object}]} I

RE: How to load csv data into HIVE

2012-09-08 Thread Connell, Chuck
I would like to hear more about this "hadoop streaming to Hive" idea. I have used streaming jobs as mappers, with a python script as map.py. Are you saying that such a streaming mapper can load its output into Hive? Can you send some example code? Hive wants to load "files" not individual lines/

Re: How to load csv data into HIVE

2012-09-08 Thread praveenesh kumar
You can use Hadoop streaming; that would be much faster... Just run your cleaning shell script logic in the map phase and it will be done in just a few minutes. That will keep the data in HDFS. Regards, Praveenesh On Fri, Sep 7, 2012 at 8:37 PM, Sandeep Reddy P wrote: > Hi, > Thank you all for your he