Good to hear that.
On Thu, Aug 22, 2013 at 9:02 AM, 闫昆 <yankunhad...@gmail.com> wrote:

> Thanks, all. I moved the LZO index into the Hive directory and it works fine now.
>
> 2013/8/22 Rajesh Balamohan <rajesh.balamo...@gmail.com>
>
>> Create the LZO index after moving the file to the Hive directory (i.e. after
>> executing your LOAD DATA statement). The index file is needed only during job
>> execution, and if it is not present in the same directory as the data file,
>> the large file will not be split.
>>
>> On Thu, Aug 22, 2013 at 7:11 AM, 闫昆 <yankunhad...@gmail.com> wrote:
>>
>>> In Hive I used SET mapreduce.input.fileinputformat.split.maxsize=134217728;
>>> but it had no effect. I found that when I run
>>>
>>> LOAD DATA INPATH '/data_split/data_rowkey.lzo'
>>> OVERWRITE INTO TABLE data_zh
>>>
>>> the HDFS data is moved into the Hive directory of the table I made with
>>> CREATE EXTERNAL TABLE, but data_rowkey.lzo.index stays behind in the HDFS
>>> /data_split/ directory. So the data file is in the Hive directory and the
>>> index file is in the original HDFS directory; they are not in the same
>>> directory.
>>>
>>> 2013/8/22 Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
>>>
>>>> Hi,
>>>>
>>>> Try this setting in your Hive query:
>>>>
>>>> SET mapreduce.input.fileinputformat.split.maxsize=<some bytes>;
>>>>
>>>> If you set this value "low", the MR job will use this size to split the
>>>> input LZO files and you will get multiple mappers (and make sure the
>>>> input LZO files are indexed, i.e. .lzo.index files are created).
>>>>
>>>> sanjay
>>>>
>>>> From: Edward Capriolo <edlinuxg...@gmail.com>
>>>> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
>>>> Date: Wednesday, August 21, 2013 10:43 AM
>>>> To: "user@hive.apache.org" <user@hive.apache.org>
>>>> Subject: Re: only one mapper
>>>>
>>>> LZO files are only splittable if you index them. Sequence files
>>>> compressed with LZO are splittable without being indexed.
>>>>
>>>> Snappy + SequenceFile is a better option than LZO.
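[Editor's note: Sanjay's max-split suggestion can be sanity-checked with a little arithmetic. Using the numbers from the job log later in this thread (a 2,304,560,827-byte file and a 134,217,728-byte, i.e. 128 MB, max split size), a splittable input should come out to about 18 mappers, matching the 18 HDFS blocks the poster mentions. A quick sketch in plain Python, only illustrating the split math, not Hive code:]

```python
import math

# Values taken from the job log in this thread.
input_size = 2304560827   # bytes, the ~2.1 GB LZO file
max_split = 134217728     # mapreduce.input.fileinputformat.split.maxsize (128 MB)

# If the LZO file is indexed (and therefore splittable), the number of
# map tasks is roughly the input size divided by the max split size,
# rounded up.
expected_mappers = math.ceil(input_size / max_split)
print(expected_mappers)  # 18 -- matching the 18 blocks in the question

# Without an index, the whole LZO file is one unsplittable unit,
# which is exactly why the job below launched a single mapper.
```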
>>>>
>>>> On Wed, Aug 21, 2013 at 1:39 PM, Igor Tatarinov <i...@decide.com> wrote:
>>>>
>>>>> LZO files are combinable, so check your max split setting.
>>>>>
>>>>> http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%3c4e328964.7000...@gmail.com%3E
>>>>>
>>>>> igor
>>>>> decide.com
>>>>>
>>>>> On Wed, Aug 21, 2013 at 2:17 AM, 闫昆 <yankunhad...@gmail.com> wrote:
>>>>>
>>>>>> Hi all. When I run a Hive query, the job launches only one mapper,
>>>>>> although my file is split into 18 blocks (block size 128 MB, data
>>>>>> size 2 GB). I use LZO compression: I created file.lzo and built the
>>>>>> index file.lzo.index. I am using Hive 0.10.0.
>>>>>>
>>>>>> Total MapReduce jobs = 1
>>>>>> Launching Job 1 out of 1
>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>> Cannot run job locally: Input Size (= 2304560827) is larger than
>>>>>> hive.exec.mode.local.auto.inputbytes.max (= 134217728)
>>>>>> Starting Job = job_1377071515613_0003, Tracking URL =
>>>>>> http://hydra0001:8088/proxy/application_1377071515613_0003/
>>>>>> Kill Command = /opt/module/hadoop-2.0.0-cdh4.3.0/bin/hadoop job
>>>>>> -kill job_1377071515613_0003
>>>>>> Hadoop job information for Stage-1: number of mappers: 1; number of
>>>>>> reducers: 0
>>>>>> 2013-08-21 16:44:30,237 Stage-1 map = 0%, reduce = 0%
>>>>>> 2013-08-21 16:44:40,495 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 6.81 sec
>>>>>> 2013-08-21 16:44:41,710 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 6.81 sec
>>>>>> 2013-08-21 16:44:42,919 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 6.81 sec
>>>>>> 2013-08-21 16:44:44,117 Stage-1 map = 3%, reduce = 0%, Cumulative CPU 9.95 sec
>>>>>> 2013-08-21 16:44:45,333 Stage-1 map = 3%, reduce = 0%, Cumulative CPU 9.95 sec
>>>>>> 2013-08-21 16:44:46,530 Stage-1 map = 5%, reduce = 0%, Cumulative CPU 13.0 sec
>>>>>>
>>>>>> --
>>>>>> In the Hadoop world I am just a novice exploring the entire Hadoop
>>>>>> ecosystem; I hope one day I can contribute code of my own.
>>>>>>
>>>>>> YanBit
>>>>>> yankunhad...@gmail.com
>>
>> --
>> ~Rajesh.B
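[Editor's note: the root cause resolved in this thread is that LOAD DATA moved data_rowkey.lzo into the warehouse directory but left data_rowkey.lzo.index behind, and hadoop-lzo only splits an LZO file whose index sits in the same directory. That mismatch can be caught with a simple pairing check. A minimal sketch in plain Python over an ordinary filesystem; on a real cluster you would list the table's HDFS warehouse directory instead, and any path you pass in is your own:]

```python
import os

def unindexed_lzo_files(directory):
    """Return the .lzo files in `directory` that lack a sibling .lzo.index.

    Hive/hadoop-lzo only splits an LZO file when its index lives in the
    SAME directory -- the exact situation debugged in this thread, where
    LOAD DATA moved data_rowkey.lzo but left its index in /data_split/.
    """
    names = set(os.listdir(directory))
    return sorted(
        n for n in names
        if n.endswith(".lzo") and (n + ".index") not in names
    )
```

Any file this returns will be read by a single mapper. The fix, as Rajesh notes above, is to (re)create the index after the LOAD DATA, so that data and index end up side by side in the table's directory.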