Hi all,

I am trying to do a COGROUP with five relations that I previously loaded
from Cassandra.

In a single-node, local Cassandra test environment the script works fine,
but when I run it on a cluster of AWS instances, with only one slave in the
Hadoop cluster and one Cassandra seed node, I get a timeout on a Thrift
socket.

Is there a parameter to increase this timeout, or how else can I fix this
issue?
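
In case it helps narrow things down, here is a minimal sketch of a check that could be run from the Hadoop node to see whether the Cassandra Thrift port accepts TCP connections at all. The function name `can_connect`, the host name, and port 9160 (the usual Thrift default for `rpc_port`) are assumptions for illustration, not values taken from my setup.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical host; use the seed node address and the
# rpc_port from cassandra.yaml):
#   print(can_connect("cassandra-seed.example", 9160))
```

This only shows that the port is reachable. It does not speak Thrift, so a "Connection reset" that happens after the connection is opened (as in the log) could still occur even when this check passes.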


Thanks in advance


=================

This is the log:
==================

2014-02-27 16:17:13,653 [Thread-9] ERROR
org.apache.hadoop.security.UserGroupInformation -
PriviledgedActionException as:ec2-user
cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

2014-02-27 16:17:13,654 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job null has failed! Stop running all dependent jobs

2014-02-27 16:17:13,654 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job null has failed! Stop running all dependent jobs

2014-02-27 16:17:13,658 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete

2014-02-27 16:17:13,668 [main] ERROR
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Could not
get input splits

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)

at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)

at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)

at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)

at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)

at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)

at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)

at java.lang.Thread.run(Thread.java:744)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)

Caused by: java.io.IOException: Could not get input splits

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:197)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)

... 15 more

Caused by: java.util.concurrent.ExecutionException:
java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at java.util.concurrent.FutureTask.report(FutureTask.java:122)

at java.util.concurrent.FutureTask.get(FutureTask.java:188)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:193)

... 16 more

Caused by: java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:308)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:230)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:215)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)

at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)

at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)

at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)

at
org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:599)

at
org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:586)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:281)

... 7 more

Caused by: java.net.SocketException: Connection reset

at java.net.SocketInputStream.read(SocketInputStream.java:196)

at java.net.SocketInputStream.read(SocketInputStream.java:122)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

... 18 more


2014-02-27 16:17:13,672 [main] ERROR
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)

at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)

at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)

at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)

at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)

at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)

at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)

at java.lang.Thread.run(Thread.java:744)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)

Caused by: java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)

... 15 more

Caused by: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)

at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)

at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)

at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)

at
org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277)

at
org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329)

... 17 more

Caused by: java.net.SocketException: Connection reset

at java.net.SocketInputStream.read(SocketInputStream.java:196)

at java.net.SocketInputStream.read(SocketInputStream.java:122)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

... 28 more


2014-02-27 16:17:13,672 [main] ERROR
org.apache.pig.tools.pigstats.PigStatsUtil - 2 map reduce job(s) failed!

2014-02-27 16:17:13,672 [main] INFO
org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats
reported below may be incomplete

2014-02-27 16:17:13,675 [main] INFO
org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:


HadoopVersion PigVersion UserId StartedAt FinishedAt Features

1.2.1 0.10.0 ec2-user 2014-02-27 16:15:38 2014-02-27 16:17:13
HASH_JOIN,COGROUP,FILTER


Some jobs have failed! Stop running all dependent jobs


Job Stats (time in seconds):

JobId Alias Feature Outputs

job_local464993512_0001
action_counter,daily_action_counter,data_action_counter MAP_ONLY


Failed Jobs:

JobId Alias Feature Message Outputs

N/A
addedToCart_counter,addedToCart_money_counter,daily_addedToCart_counter,daily_addedToCart_money_counter,data_addedToCart_counter,data_addedToCart_money_counter,data_metrics1
COGROUP Message: org.apache.pig.backend.executionengine.ExecException:
ERROR 2118: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)

at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)

at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)

at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)

at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)

at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)

at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)

at java.lang.Thread.run(Thread.java:744)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)

Caused by: java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)

... 15 more

Caused by: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)

at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)

at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)

at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)

at
org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277)

at
org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329)

... 17 more

Caused by: java.net.SocketException: Connection reset

at java.net.SocketInputStream.read(SocketInputStream.java:196)

at java.net.SocketInputStream.read(SocketInputStream.java:122)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

... 28 more

 N/A
conversion_counter,conversion_money_counter,daily_conversion_counter,daily_conversion_money_counter,daily_sold_counter,daily_sold_money_counter,data_conversion_counter,data_conversion_money_counter,data_metrics2,data_sold_counter,data_sold_money_counter,sold_counter,sold_money_counter
COGROUP Message: org.apache.pig.backend.executionengine.ExecException:
ERROR 2118: Could not get input splits

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)

at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)

at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)

at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)

at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)

at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)

at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)

at java.lang.Thread.run(Thread.java:744)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)

Caused by: java.io.IOException: Could not get input splits

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:197)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)

... 15 more

Caused by: java.util.concurrent.ExecutionException:
java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at java.util.concurrent.FutureTask.report(FutureTask.java:122)

at java.util.concurrent.FutureTask.get(FutureTask.java:188)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:193)

... 16 more

Caused by: java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:308)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:230)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:215)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)

at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)

at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)

at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)

at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)

at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)

at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)

at
org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:599)

at
org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:586)

at
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:281)

... 7 more

Caused by: java.net.SocketException: Connection reset

at java.net.SocketInputStream.read(SocketInputStream.java:196)

at java.net.SocketInputStream.read(SocketInputStream.java:122)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

... 18 more


Input(s):

Successfully read records from: "cassandra://metrics/action_counter"

Failed to read data from "cassandra://metrics/addedToCart_counter"

Failed to read data from "cassandra://metrics/addedToCart_money_counter"

Failed to read data from "cassandra://metrics/conversion_money_counter"

Failed to read data from "cassandra://metrics/sold_counter"

Failed to read data from "cassandra://metrics/conversion_counter"

Failed to read data from "cassandra://metrics/sold_money_counter"


Output(s):


Job DAG:

job_local464993512_0001 -> null,null,

null -> null,

null -> null,

null



2014-02-27 16:17:13,677 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Some jobs have failed! Stop running all dependent jobs

2014-02-27 16:17:13,678 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1066: Unable to open iterator for alias data_metrics

2014-02-27 16:17:13,679 [main] ERROR org.apache.pig.tools.grunt.Grunt -
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
open iterator for alias data_metrics

at org.apache.pig.PigServer.openIterator(PigServer.java:857)

at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682)

at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)

at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)

at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)

at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)

at org.apache.pig.Main.run(Main.java:555)

at org.apache.pig.Main.main(Main.java:111)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Caused by: java.io.IOException: Couldn't retrieve job.

at org.apache.pig.PigServer.store(PigServer.java:921)

at org.apache.pig.PigServer.openIterator(PigServer.java:832)

... 12 more
