HI all, I trying to do a cogroup with five relations that I load from cassandra previously.
In single node and local casandra testing environment the script works fine but when I try to execute in a cluster over AWS instances with only one slave in hadoop cluster and One seed cassandra node I have a timeout with a thirf socket. Are there a param in to increase this time or how can I fix this issue? Thanks in advance ================= this is the log: ================== 2014-02-27 16:17:13,653 [Thread-9] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:ec2-user cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset 2014-02-27 16:17:13,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs 2014-02-27 16:17:13,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs 2014-02-27 16:17:13,658 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 2014-02-27 16:17:13,668 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Could not get input splits at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071) at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910) at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378) at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247) at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279) at java.lang.Thread.run(Thread.java:744) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260) Caused by: java.io.IOException: Could not get input splits at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:193) ... 16 more Caused by: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:308) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:230) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:215) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:599) at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:586) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:281) ... 7 more Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 18 more 2014-02-27 16:17:13,672 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071) at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910) at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378) at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247) at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279) at java.lang.Thread.run(Thread.java:744) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260) Caused by: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273) ... 15 more Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277) at org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329) ... 17 more Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 28 more 2014-02-27 16:17:13,672 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 2 map reduce job(s) failed! 2014-02-27 16:17:13,672 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete 2014-02-27 16:17:13,675 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 1.2.1 0.10.0 ec2-user 2014-02-27 16:15:38 2014-02-27 16:17:13 HASH_JOIN,COGROUP,FILTER Some jobs have failed! Stop running all dependent jobs Job Stats (time in seconds): JobId Alias Feature Outputs job_local464993512_0001 action_counter,daily_action_counter,data_action_counter MAP_ONLY Failed Jobs: JobId Alias Feature Message Outputs N/A addedToCart_counter,addedToCart_money_counter,daily_addedToCart_counter,daily_addedToCart_money_counter,data_addedToCart_counter,data_addedToCart_money_counter,data_metrics1 COGROUP Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071) at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910) at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378) at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247) at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279) at java.lang.Thread.run(Thread.java:744) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260) Caused by: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273) ... 15 more Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277) at org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329) ... 17 more Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 28 more N/A conversion_counter,conversion_money_counter,daily_conversion_counter,daily_conversion_money_counter,daily_sold_counter,daily_sold_money_counter,data_conversion_counter,data_conversion_money_counter,data_metrics2,data_sold_counter,data_sold_money_counter,sold_counter,sold_money_counter COGROUP Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Could not get input splits at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071) at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910) at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378) at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247) at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279) at java.lang.Thread.run(Thread.java:744) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260) Caused by: java.io.IOException: Could not get input splits at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:193) ... 16 more Caused by: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:308) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:230) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:215) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:599) at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:586) at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:281) ... 7 more Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 18 more Input(s): Successfully read records from: "cassandra://metrics/action_counter" Failed to read data from "cassandra://metrics/addedToCart_counter" Failed to read data from "cassandra://metrics/addedToCart_money_counter" Failed to read data from "cassandra://metrics/conversion_money_counter" Failed to read data from "cassandra://metrics/sold_counter" Failed to read data from "cassandra://metrics/conversion_counter" Failed to read data from "cassandra://metrics/sold_money_counter" Output(s): Job DAG: job_local464993512_0001 -> null,null, null -> null, null -> null, null 2014-02-27 16:17:13,677 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs 2014-02-27 16:17:13,678 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias data_metrics 2014-02-27 16:17:13,679 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias data_metrics at org.apache.pig.PigServer.openIterator(PigServer.java:857) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:555) at org.apache.pig.Main.main(Main.java:111) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) Caused by: java.io.IOException: Couldn't retrieve job. at org.apache.pig.PigServer.store(PigServer.java:921) at org.apache.pig.PigServer.openIterator(PigServer.java:832) ... 12 more