Unfortunately, at the moment I have to run everything in standalone
mode on an 8-core machine with 16 GB of RAM.
The good news is that the "mapreduce" side manages to finish thanks to
its Apache Flink implementation, which manages memory efficiently (and
spills data to disk when there's not enough memory).
My HBase version is 0.98.6.1-hadoop2 with default settings except for:

export HBASE_OPTS="-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70
-XX:+CMSParallelRemarkEnabled"
export HBASE_HEAPSIZE=2000

The load on my PC is quite high because I use Flink (similar to Spark) to
write data into HBase via TableOutputFormat using all 8 cores, but the data
I'm trying to add is not that big (about 6 GB).
The problem is that at some point the HBase server stops responding
without logging anything.
Do you think there's any way to avoid bulk loading in my case?
I was thinking of using it because, once the HFiles are produced, only the
HBase import process should be running, and I wouldn't have to burden the
server with WAL management and GC overhead.

Thanks for the support,
Flavio


On Wed, Apr 8, 2015 at 5:00 PM, Ted Yu <[email protected]> wrote:

> You may have read http://hbase.apache.org/book.html#arch.bulk.load
>
> bq. using TableOutputFormat from client makes my HBase stop working
>
> Can you give us more information (hbase release, load on your cluster, log
> snippet for crashed server, etc) ?
>
> Thanks
>
> On Wed, Apr 8, 2015 at 7:52 AM, Flavio Pompermaier <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I have a non-mapreduce process that produces a lot of data that I want to
> > import into HBase through programmatic bulk loading, because using
> > TableOutputFormat from the client makes my HBase stop working (too many
> > parallel writes, I think).
> >
> > How can I create the necessary HFiles from my data (their name,
> > size, etc.) and then bulk load them into HBase programmatically? Should I
> > use ProtobufUtil.bulkLoadHFile or SecureBulkLoadClient?
> >
> > Best,
> > Flavio
> >
>
