Doh - minutes after my question I saw the same question from a couple of days ago.
Indeed, using the C* driver 3.0.0-rc1 seems to solve the issue.
Jan
> On 22 Feb 2016, at 12:13, Jan Algermissen wrote:
>
> Hi,
>
> I am using
>
> Cassandra 2.1.5
> Spark 1.5.2
> Cassandra java-driver 3.0.0
Hi,
I am using
Cassandra 2.1.5
Spark 1.5.2
Cassandra java-driver 3.0.0
Cassandra-Connector 1.5.0-RC1
All with Scala 2.11.7
Nevertheless, I get the following error from my Spark job:
java.lang.NoSuchMethodError:
com.datastax.driver.core.TableMetadata.getIndexes()Ljava/util/List;
at com.datastax
Hi,
we are running a streaming job that processes about 500 events per 20s batch
and uses updateStateByKey to accumulate web sessions (with a 30-minute
lifetime).
The checkpoint interval is set to 20 x the batch interval, i.e. 400s.
Cluster size is 8 nodes.
We are having trouble with the amount
Hi,
I am using Spark and the Cassandra-connector to save customer events for later
batch analysis.
The primary access pattern later on will be by time slice.
One way to save the events would be to create a C* row per day, for example,
and within that row store the events in decreasing time order.
Finally, I found the solution:
on the Spark context you can set spark.executorEnv.[EnvironmentVariableName],
and these variables will then be available in the environment of the executors.
This is in fact documented, but somehow I missed it:
https://spark.apache.org/docs/latest/configuration.html#runtime-environment
Hi,
I am using Spark 1.4 M1 with the Cassandra Connector and run into a strange
error when using the Spark shell.
This works:
sc.cassandraTable("events",
"bid_events").select("bid","type").take(10).foreach(println)
But as soon as I put a map() in there (or filter):
sc.cassandraTable("events
Hi,
I am starting a spark streaming job in standalone mode with spark-submit.
Is there a way to make the UNIX environment variables with which spark-submit
is started available to the processes started on the worker nodes?
Jan
Hi,
I am planning to process an event stream in the following way:
- write the raw stream through Spark Streaming to Cassandra for later analytics
use cases
- 'fork off' the stream, do some stream analysis, and make that information
available to build dashboards
Since I am having ElasticSearch