LZO files are only splittable if you index them. Sequence files compressed with LZO are splittable without being indexed.
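
To make the difference concrete, here is a rough sketch in Python of how the mapper count changes with splittability. It uses the input size and block size from the job output quoted below. The function name and the "one mapper per split" model are assumptions for illustration only; real split planning also depends on the input format, the LZO block boundaries recorded in the index, and the max split setting.

```python
# Simplified model (an assumption, not Hadoop's actual split planner):
# a splittable input gets one mapper per split-sized chunk,
# a non-splittable input is read end to end by a single mapper.

BLOCK_SIZE = 128 * 1024 * 1024   # 134217728 bytes, the 128MB block size from the question
INPUT_SIZE = 2304560827          # "Input Size" reported in the quoted Hive job output


def expected_map_tasks(input_bytes, split_bytes, splittable):
    """Return the approximate number of map tasks for a single input file."""
    if not splittable:
        return 1
    # Integer ceiling division: a partial last chunk still needs its own split.
    return (input_bytes + split_bytes - 1) // split_bytes


if __name__ == "__main__":
    # File treated as splittable (index is picked up, or SequenceFile is used)
    print(expected_map_tasks(INPUT_SIZE, BLOCK_SIZE, splittable=True))   # 18
    # File treated as one opaque compressed stream
    print(expected_map_tasks(INPUT_SIZE, BLOCK_SIZE, splittable=False))  # 1
```

The 18 matches the block count reported in the question, and the 1 matches the "number of mappers: 1" line in the job output, which suggests the job is treating the file as non-splittable.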
Snappy + SequenceFile is a better option than LZO.

On Wed, Aug 21, 2013 at 1:39 PM, Igor Tatarinov <[email protected]> wrote:

> LZO files are combinable so check your max split setting.
>
> http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%[email protected]%3E
>
> igor
> decide.com
>
> On Wed, Aug 21, 2013 at 2:17 AM, 闫昆 <[email protected]> wrote:
>
>> hi all when i use hive
>> hive job make only one mapper actually my file split 18 block my block
>> size is 128MB and data size 2GB
>> i use lzo compression and create file.lzo and make index file.lzo.index
>> i use hive 0.10.0
>>
>> Total MapReduce jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Cannot run job locally: Input Size (= 2304560827) is larger than
>> hive.exec.mode.local.auto.inputbytes.max (= 134217728)
>> Starting Job = job_1377071515613_0003, Tracking URL =
>> http://hydra0001:8088/proxy/application_1377071515613_0003/
>> Kill Command = /opt/module/hadoop-2.0.0-cdh4.3.0/bin/hadoop job -kill
>> job_1377071515613_0003
>> Hadoop job information for Stage-1: number of mappers: 1; number of
>> reducers: 0
>> 2013-08-21 16:44:30,237 Stage-1 map = 0%, reduce = 0%
>> 2013-08-21 16:44:40,495 Stage-1 map = 2%, reduce = 0%, Cumulative CPU
>> 6.81 sec
>> 2013-08-21 16:44:41,710 Stage-1 map = 2%, reduce = 0%, Cumulative CPU
>> 6.81 sec
>> 2013-08-21 16:44:42,919 Stage-1 map = 2%, reduce = 0%, Cumulative CPU
>> 6.81 sec
>> 2013-08-21 16:44:44,117 Stage-1 map = 3%, reduce = 0%, Cumulative CPU
>> 9.95 sec
>> 2013-08-21 16:44:45,333 Stage-1 map = 3%, reduce = 0%, Cumulative CPU
>> 9.95 sec
>> 2013-08-21 16:44:46,530 Stage-1 map = 5%, reduce = 0%, Cumulative CPU
>> 13.0 sec
>>
>> --
>>
>> In the Hadoop world, I am just a novice, explore the entire Hadoop
>> ecosystem, I hope one day I can contribute their own code
>>
>> YanBit
>> [email protected]
