After using the CsvBulkLoader successfully for a few days, I’m getting some strange behavior this morning.
I ran the job on a fairly small ingest of data (around 1/2 billion rows). It seemed to complete successfully. I see this in the logs:

Phoenix MapReduce Import
    Upserts Done=208011725
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=10
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=19384594049
File Output Format Counters
    Bytes Written=2752760990
15/07/01 06:47:51 INFO mapreduce.CsvBulkLoadTool: Loading HFiles from /tmp/0551154c-f430-4c18-9023-99529d409b20/FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID
15/07/01 06:47:51 WARN hbase.HBaseConfiguration: Config option "hbase.regionserver.lease.period" is deprecated. Instead, use "hbase.client.scanner.timeout.period"
15/07/01 06:47:51 WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs://<servername>:8020/tmp/0551154c-f430-4c18-9023-99529d409b20/FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID/_SUCCESS

But then I see a whole bunch of logging like this:

15/07/01 06:47:51 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://<servername>:8020/tmp/0551154c-f430-4c18-9023-99529d409b20/FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID/0/04c6a2933a0140cc82af44c5fd363fbe first=MSNJ_B72390360_19040998\x00\xD5\x8B\xDF0 last=MSNJ_B76287792_19041270\x00\xD5\x90\xB4\xB0

And then I see a whole bunch of logging like this:

15/07/01 07:15:22 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1650771 ms ago, cancelled=false, msg=row 'UWNJ_B7^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,UWNJ_B7\x00\x00\x00\x00\x00,1435243688700.16034ad1ace3d63826f78a6ee59739e8., hostname=<hostname>,60020,1435614730978, seqNum=36
15/07/01 07:15:22 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1650771 ms ago, cancelled=false, msg=row 'LKLND_357333_47965154^@�!�' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,LKLND_357333_47965154\x00\xD5!\xF6\xB0,1435587831557.f484029371ebf7d4964c50c1c26057db., hostname=<hostname>,60020,1435614730978, seqNum=49
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1650915 ms ago, cancelled=false, msg=row 'BLDW_B4^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,BLDW_B4\x00\x00\x00\x00\x00,1435243688700.a682cdfe9ea3a66a3d7618b1bcff44da., hostname=<hostname>,60020,1435614730978, seqNum=31
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1650942 ms ago, cancelled=false, msg=row 'HWD_B1^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,HWD_B1\x00\x00\x00\x00\x00,1435243688700.3a7f62b30ec32ba08418963f6d61cf97., hostname=<hostname>,60020,1435614730978, seqNum=36
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1650965 ms ago, cancelled=false, msg=row 'PWSA_B0096026210_80692026^@��'0' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,PWSA_B0096026210_80692026\x00\xD4\xEA'0,1435583570527.b81a2f33040c6976ad97db8cb3869989., hostname=<hostname>,60020,1435614709440, seqNum=39
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1651002 ms ago, cancelled=false, msg=row 'BPORT_B5^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,BPORT_B5\x00\x00\x00\x00\x00,1435243688700.31868ec82e8e1baf543bc4667941e382., hostname=<hostname>,60020,1435614730978, seqNum=36
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1651004 ms ago, cancelled=false, msg=row 'LCWS_B3^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,LCWS_B3\x00\x00\x00\x00\x00,1435243688700.83dd717bb5ca6b5cbd5cdb7207d64350., hostname=<hostname>,60020,1435614710462, seqNum=33
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1651079 ms ago, cancelled=false, msg=row 'KUB_B9^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,KUB_B9\x00\x00\x00\x00\x00,1435243688700.11e5db7c638be936f0ebfb00cc3b8a4a., hostname=<hostname>,60020,1435614710462, seqNum=36
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1651108 ms ago, cancelled=false, msg=row 'BYTN_B3^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,BYTN_B3\x00\x00\x00\x00\x00,1435243688700.8282bf5111eafa15ca62a52759cf42fe., hostname=<hostname>,60020,1435614721035, seqNum=36
15/07/01 07:15:23 INFO client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=1651539 ms ago, cancelled=false, msg=row 'NHOK_B7^@^@^@^@^@' on table 'FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID' at region=FMA.ER_KEYED_GZ_METERKEY_SPLIT_CUSTID,NHOK_B7\x00\x00\x00\x00\x00,1435243688700.3f40de795f5234f4681987f26275f565., hostname=<hostname>,60020,1435614721035, seqNum=39

Is this indicating bad regions? Can someone help me understand what might be going on? Thanks!