Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
>> ring weekend when the write load is negligible. After reading the documentation, it's not clear how many HFiles are created once bulk-load finishes - is it one HFile per reducer? My question is, is it recommended to run major compaction after bulk-load

Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
> are 10 region servers & I can schedule compaction during the weekend when the write load is negligible. After reading the documentation, it's not clear how many HFiles are created once bulk-load finishes - is it one HFile per reducer? My question is, is it recommended to run major c

Re: Compaction after bulk-load

2015-07-30 Thread Krishna
There are 10 region servers & I can schedule compaction during the weekend, when the write load is negligible. After reading the documentation, it's not clear how many HFiles are created once bulk-load finishes - is it one HFile per reducer? My question is, is it recommended to run major compaction a

Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
Major compaction is a very heavy operation. We use bulk loading and have scheduled one major compaction at 2 A.M., and it works great! -- Laurent HATIER - Consultant Big Data & Business Intelligence at CapGemini fr.linkedin.com/pub/laurent-hatier/25/36b/a86/ 2015-0

Re: Compaction after bulk-load

2015-07-30 Thread Ted Yu
How many region servers do you have in the cluster? Would there be concurrent write load on the cluster if you choose to run major compaction? I ask because concurrent writes would be slowed down by the major compaction, and compacting 10 TB of data would take some time. Cheers On Wed,

Compaction after bulk-load

2015-07-29 Thread Krishna
Hi, I am planning to bulk-load about 10 TB of data to a table pre-split with 30 regions with max region file size configured to 10 GB. Is it recommended that I run a major compaction when bulk-loading finishes? How many HFiles does the reducer create?
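
For a rough sense of scale, the figures in the question can be put through some quick arithmetic. This is only a back-of-the-envelope sketch: it assumes binary units (1 TB = 1024 GB) and an even key distribution across regions, and it ignores compression and per-cell overhead. The function names are made up for illustration and are not part of any HBase API.

```python
# Back-of-the-envelope sizing for the bulk load described in the question.
# Inputs come from the post; binary units (1 TB = 1024 GB) are an assumption.
import math

TOTAL_DATA_GB = 10 * 1024      # about 10 TB to bulk-load
PRESPLIT_REGIONS = 30          # table pre-split into 30 regions
MAX_REGION_SIZE_GB = 10        # configured max region file size


def data_per_region_gb(total_gb: float, regions: int) -> float:
    """Average data per region, assuming an even key distribution."""
    return total_gb / regions


def regions_needed(total_gb: float, max_region_gb: float) -> int:
    """Smallest region count at which no region exceeds the max size."""
    return math.ceil(total_gb / max_region_gb)


per_region = data_per_region_gb(TOTAL_DATA_GB, PRESPLIT_REGIONS)
needed = regions_needed(TOTAL_DATA_GB, MAX_REGION_SIZE_GB)
print(f"{per_region:.1f} GB per region on average")
print(f"at least {needed} regions needed at {MAX_REGION_SIZE_GB} GB each")
```

Under these assumptions each of the 30 regions would receive roughly 341 GB, far above the 10 GB limit, so the regions would be expected to split after the load. That is worth keeping in mind when deciding when to schedule a major compaction.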