Thanks. I would like to use the Cloudera demo (VMware) VM to test the actual performance of this: https://ccp.cloudera.com/display/SUPPORT/Cloudera%27s+Hadoop+Demo+VM
It only has 2 vcores, it seems. What setup would get the best performance on such a Hive query, possibly with a more complicated select, maybe with a join in the select? Should I set up multiple such demo VMs and connect them, or can I somehow increase the number of cores for that VM and perhaps change other Hadoop settings? I'd prefer the second, so the parallel processes can communicate through shared memory on my 16-core machine rather than through likely slower vNICs.

On Wed, Feb 15, 2012 at 11:19 AM, <bejoy...@yahoo.com> wrote:

> Hi John
>
> Yes, insert is parallel by default in Hive. Hive QL gets transformed to
> MapReduce jobs, and hence it is definitely parallel. The only case where it
> is not parallel is when you have just 1 reducer. It just reads and
> processes the input files from the source table data dir in parallel using
> MapReduce jobs and writes the desired output files to the destination
> table dir.
>
> Hive is just an abstraction over MapReduce and can't be compared against
> a DB in terms of features. Almost every data processing operation is just
> some MapReduce jobs.
>
> Regards
> Bejoy K S
>
> From handheld, Please excuse typos.
> ------------------------------
> *From:* John B <johnb4...@gmail.com>
> *Date:* Wed, 15 Feb 2012 10:59:09 -0500
> *To:* <user@hive.apache.org>
> *ReplyTo:* user@hive.apache.org
> *Subject:* parallel inserts ?
>
> Other SQL databases typically can parallelize selects but are unable to
> automatically parallelize inserts.
>
> With the most recent stable HiveQL, will the following statement have the
> --insert-- automatically parallelized?
>
> INSERT OVERWRITE TABLE pv_gender
> SELECT pv_users.gender
> FROM pv_users
>
> I understand there is now 'insert into ..select from' syntax. Is the insert
> part of that statement automatically parallelized?
>
> What is the highest insert speed anybody has seen? I am not talking
> about imports, I mean inserts from one table to another.
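
One way to check the point in the quoted reply, that the insert runs as MapReduce jobs, is to ask Hive for the query plan before running the statement. The following is only a minimal sketch: the reducer count of 8 is an arbitrary illustrative value, not a tuned recommendation, and it assumes the classic `mapred.reduce.tasks` and `hive.exec.parallel` session properties are available in the installed Hive/Hadoop versions.

```sql
-- Illustrative session settings (assumed values, not recommendations).
SET mapred.reduce.tasks=8;    -- requested reducer count; only matters for jobs that have a reduce phase
SET hive.exec.parallel=true;  -- allow independent stages of a single query to run concurrently

-- EXPLAIN prints the stages Hive plans for the statement without executing it.
EXPLAIN
INSERT OVERWRITE TABLE pv_gender
SELECT pv_users.gender
FROM pv_users;
```

A plain projection like this one is likely to be planned without a reduce phase, in which case its parallelism comes from the number of map tasks (driven by the input splits) rather than from the reducer setting. A select with a join or GROUP BY would normally add a reduce phase, where the reducer count starts to matter.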