Hi Gwen,
The root cause of all the IO-related problems seems to be the file rename
that Camus does on the underlying Hadoop MapR FS. We are copying files
from a user volume to a day volume (the rename does a copy) when the
mapper commits files to the FS. Please refer to
http://answers.mapr.com/questions/162562/volume-issue
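To illustrate, this is roughly what the commit-time move looks like at the
FileSystem API level. The paths and volume layout below are made up for
illustration only; the point is that fs.rename() across MapR volume
boundaries degrades to a copy plus delete, so a reader can observe a
partially written file until it finishes.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CommitRenameSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical paths: task output lives on the "user" volume,
        // the final destination on the "day" volume.
        Path attemptFile = new Path(
            "maprfs:///user/etl/camus/_temporary/attempt_0/part-m-00000");
        Path finalFile = new Path(
            "maprfs:///data/day=2015-03-02/part-m-00000");

        Configuration conf = new Configuration();
        FileSystem fs = finalFile.getFileSystem(conf);

        // Cheap metadata operation within one volume; effectively a
        // copy + delete when source and destination are on different
        // MapR volumes.
        boolean renamed = fs.rename(attemptFile, finalFile);
        if (!renamed) {
            throw new java.io.IOException(
                "rename failed: " + attemptFile + " -> " + finalFile);
        }
    }
}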
I suspect the Camus job has the issue because another process (a separate
Map/Reduce job) also writes to the same "time" bucket (folders) and it
does not have this issue at all (so far) when its output is read by the
dependent Hive job. The dependent Hive job only has issues with files
created via the Camus job.
Actually, the error you sent shows that it's trying to read a TEXT file
as if it were a SequenceFile. That's why I suspected a misconfiguration
of some sort.
Why do you suspect a race condition?
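If it helps, a quick way to confirm what those files actually are is to
check the SequenceFile magic header on one of the files the Hive job
fails on. This is just a standalone sketch, not Camus code; the path
comes from the command line.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SeqHeaderCheck {
    public static void main(String[] args) throws IOException {
        Path p = new Path(args[0]);   // file the Hive job choked on
        FileSystem fs = p.getFileSystem(new Configuration());
        byte[] header = new byte[3];
        try (FSDataInputStream in = fs.open(p)) {
            // Every SequenceFile starts with the magic bytes 'S','E','Q';
            // a plain text file written by another job fails this check.
            in.readFully(0, header);
        }
        boolean isSeq = header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
        System.out.println(p + (isSeq
            ? " looks like a SequenceFile"
            : " is NOT a SequenceFile"));
    }
}

If the files under the Camus output folders fail that check while the
ones written by the other job pass, that points at configuration rather
than a race.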
On Mon, Mar 2, 2015 at 5:19 PM, Bhavesh Mistry
wrote:
Hi Gwen,
We are using MapR (Sorry no Cloudera) distribution.
I suspect it is a code issue. I am in the process of reviewing the code of
the EtlMultiOutputFormat class:
https://github.com/linkedin/camus/blob/master/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/mapred/EtlMultiOutputFormat.j
Do you have the command you used to run Camus, and the config files?
Also, I noticed your file is on maprfs - you may want to check with
your vendor... I doubt Camus was extensively tested on that particular
FS.
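For reference, the invocation we usually see is something along these
lines (jar name and properties path are placeholders for whatever your
build produces):

hadoop jar camus-example-0.1.0-SNAPSHOT-shaded.jar \
    com.linkedin.camus.etl.kafka.CamusJob \
    -P /path/to/camus.properties

Whatever your actual command and camus.properties look like would help.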
On Mon, Mar 2, 2015 at 3:59 PM, Bhavesh Mistry
wrote:
> Hi Kafka User Team,
>
> I ha