Hello All,

I'm a newbie to Spark and Cassandra. I'm trying to run the Spark Portfolio
demo that ships with DSE Cassandra in a cluster environment, but I cannot
get it to work.

This issue may not actually come from Spark, but I'm not sure how to
investigate it further. Please help me.

There are 5 CentOS servers in my cluster (all with DSE installed via yum):

   - server82, server80, and server106 act as Cassandra nodes.
   - server134 and server136 act as Analytics nodes with Spark enabled.

Here is the status reported by nodetool:

Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.82   148.85 KB  256     27.9%  ee04410c-eea1-4016-a6d3-7b65dd599689  rack1
UN  xxx.xxx.xxx.80   109.33 KB  256     36.5%  06ee3d5c-2e85-4231-89e2-3789f37bfce5  rack1
UN  xxx.xxx.xxx.106  132.78 KB  256     35.3%  33c7c212-c528-4afe-abe1-197aa86dfc01  rack1
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.136  150.22 KB  1       0.0%   d33a8c3a-1d15-4d8e-8d2b-b0324c4aafe5  rack1
UN  xxx.xxx.xxx.134  160.02 KB  1       0.3%   89c53d92-f54f-4bea-aa1e-e93777983b4d  rack1


I executed the data-simulation steps by running the pricer program on one
of my Cassandra nodes (server80).
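
For reference, these are roughly the commands I ran, following the demo
documentation (the exact path and pricer options may differ depending on
the DSE version and install method, so please treat this as approximate):

$ cd /usr/share/dse-demos/portfolio_manager
$ bin/pricer -o INSERT_PRICES
$ bin/pricer -o UPDATE_PORTFOLIOS
$ bin/pricer -o INSERT_HISTORICAL_PRICES -n 100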

Then I logged in to server136 (an Analytics node) and ran the Spark program
"10-day-loss-java.sh", but the following error message appeared:

14/12/25 16:18:51 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 2, 135.252.169.134): java.io.IOException: Exception during execution of SELECT "key", "column1", "value" FROM "PortfolioDemo"."Portfolios" WHERE token("key") > ? ALLOW FILTERING: Not enough replica available for query at consistency LOCAL_ONE (1 required but only 0 alive)
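
(As far as I understand, the current replication settings of a keyspace
can be inspected in cqlsh, for example:

cqlsh> DESCRIBE KEYSPACE "PortfolioDemo";

which prints the CREATE KEYSPACE statement including the replication
options.)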


I then suspected that the replication strategy of the keyspace
"PortfolioDemo" might be incorrect, so I changed it in cqlsh:

ALTER KEYSPACE "PortfolioDemo" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'Cassandra' : 1 };


Then I ran nodetool repair on all the nodes, although nodetool told me
there was nothing to repair for keyspace "PortfolioDemo":

$ nodetool -h xxx.xxx.xxx.80 repair "PortfolioDemo"
[2014-12-26 09:35:59,995] Nothing to repair for keyspace 'PortfolioDemo'

$ nodetool -h xxx.xxx.xxx.82 repair "PortfolioDemo"
[2014-12-26 09:36:10,983] Nothing to repair for keyspace 'PortfolioDemo'

$ nodetool -h xxx.xxx.xxx.106 repair "PortfolioDemo"
[2014-12-26 09:36:18,917] Nothing to repair for keyspace 'PortfolioDemo'

$ nodetool -h xxx.xxx.xxx.136 repair "PortfolioDemo"
[2014-12-26 09:36:26,155] Nothing to repair for keyspace 'PortfolioDemo'

$ nodetool -h xxx.xxx.xxx.134 repair "PortfolioDemo"
[2014-12-26 09:36:32,519] Nothing to repair for keyspace 'PortfolioDemo'
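
(In case it is relevant: I believe nodetool getendpoints can show which
nodes hold the replicas for a given row, e.g.

$ nodetool -h xxx.xxx.xxx.80 getendpoints PortfolioDemo Portfolios some_key

where some_key is just a placeholder for an actual row key.)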


Anyway, I can now read records from "PortfolioDemo"."Portfolios" with a
SELECT statement on server136.
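
(The query I used was something like the following, run in cqlsh on that
node, and it returned rows as expected:

SELECT * FROM "PortfolioDemo"."Portfolios" LIMIT 3;

)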

I ran "10-day-loss-java.sh" again on server136, and this time the
following error message appeared instead:

Exception in thread "main" scala.collection.parallel.CompositeThrowable: Multiple exceptions thrown during a parallel computation: java.io.IOException: Failed to fetch splits of TokenRange(9190631453255400980,9149489230329032117,Set(),None) because there are no replicas for the keyspace in the current datacenter.
    com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$split$2.apply(ServerSideTokenRangeSplitter.scala:53)
    com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$split$2.apply(ServerSideTokenRangeSplitter.scala:49)
    scala.Option.getOrElse(Option.scala:120)
    com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter.split(ServerSideTokenRangeSplitter.scala:49)
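
One thing I notice: this error says there are no replicas for the keyspace
in the current datacenter, and my ALTER statement above only assigned
replicas to the Cassandra datacenter, not to Analytics, where the Spark job
runs. Should I also add the Analytics datacenter, something like the
statement below? I have not tried it yet, and I am not sure what
replication factor would be appropriate for this demo:

ALTER KEYSPACE "PortfolioDemo" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'Cassandra' : 1, 'Analytics' : 1 };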


Beyond that, I am not sure how to investigate further. Could you please
help me with this?

Merry Christmas to everyone.

Best regards
Zhang JiaQiang
