Hi,
Thanks for your recommendation.
I also opened a ticket to keep track of this:
https://issues.apache.org/jira/browse/CASSANDRA-11748
Hope this brings it to someone's attention to take a look. Thanks.
Sincerely,
Michael Fong
-----Original Message-----
From: Michael Kjellman [mailto:mkjell...@inte
Yes, it is very simple to access Cassandra data using the Spark shell.
Step 1: Launch the spark-shell with the spark-cassandra-connector package
$SPARK_HOME/bin/spark-shell --packages
com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
Step 2: Create a DataFrame pointing to your Cassandra table
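A minimal sketch of that step (the keyspace/table names "ks" and "kv" here are placeholders):

// inside spark-shell, sqlContext is already provided
val df = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "ks", "table" -> "kv"))
  .load()
df.show()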
For COPY TO you can try increasing the page timeout or decreasing the page
size:
PAGETIMEOUT=10 - the page timeout in seconds for fetching results
PAGESIZE='1000' - the page size for fetching results
You can pass these options to the COPY command by appending them, e.g.
"WITH PAGETIMEOUT=1000;".
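Putting that together, a sketch (the keyspace/table and output path are placeholders):

COPY ks.mytable TO '/tmp/mytable.csv' WITH PAGETIMEOUT=1000 AND PAGESIZE=1000;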
On Mon, May 9, 2016 at 2:48 PM, Drew Kutcharian wrote:
>
>
> What’s the 3.0.6 release date? Seems like the code has been frozen for a
> few days now. I ask because I want to install Cassandra on Ubuntu 16.04 and
> CASSANDRA-10853 is blocking it.
>
We've been holding it up to sync it with the 3.6 release.
The most immediate workaround would be to run nodetool disablehints across
the cluster before you load data. That would at least stop the snowballing
from hints.
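Roughly (run on every node, and remember to re-enable once the load finishes):

nodetool disablehints   # before the bulk load
# ... load data ...
nodetool enablehints    # afterwards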
On Tue, May 10, 2016 at 7:49 AM, Erik Forsberg wrote:
> I have this situation where a few (like, 3-4 out of 84) nodes misbehave.
> Very l
No - repair does not change token ownership. The up/down state of a node is
not related to token ownership.
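You can verify this yourself by comparing ownership before and after (the
keyspace name is a placeholder):

nodetool status my_keyspace   # note the Owns (%) column
nodetool repair
nodetool status my_keyspace   # ownership is unchanged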
On Tue, May 10, 2016 at 3:26 PM, Anubhav Kale
wrote:
> Hello,
>
>
>
> Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a
> repair. Will repair move the token ranges fr
Hello,
Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a repair.
Will repair move the token ranges from the down node to the other nodes? In
other words, does a repair operation ever change token ownership in any
situation?
Thanks !
I didn't read the whole thread last time around; please disregard my
comment about the Java driver JIRA.
One other thought (hopefully relevant this time). Once we have
https://issues.apache.org/jira/browse/CASSANDRA-10783, you could write a
(*start*, *rows*) style paging UDF which would al
I understand that Spark supports HDFS and standalone modes.
The recommendation for Cassandra is that Spark should be installed in
standalone mode in the SMACK framework.
On 10 May 2016 at 16:24, Sruti S wrote:
> Not sure what is meant.. Spark can access HDFS. Why is it in standalone
> mode? Please
Not sure what is meant. Spark can access HDFS. Why is it in standalone
mode? Please clarify.
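To illustrate the point (the master URL and path are made up): "standalone"
is just Spark's cluster manager, and a job on a standalone cluster can still
read from HDFS:

$SPARK_HOME/bin/spark-shell --master spark://master-host:7077
val lake = sc.textFile("hdfs://namenode:8020/datalake/events")
lake.count()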
On Tue, May 10, 2016 at 11:08 AM, Srini Sydney
wrote:
> I have a clarification based on your answer -
>
> spark is installed as standalone mode (not hdfs) in SMACK framework. Our
> data lake is in hdfs
I have a clarification based on your answer:
Spark is installed in standalone mode (not HDFS) in the SMACK framework. Our
data lake is in HDFS. How do we overcome this?
- cheers sreeni
> On 10 May 2016, at 08:16, vincent gromakowski
> wrote:
>
> Maybe a SMACK stack would be a better optio
I think this request belongs in the Java driver JIRA, not the Cassandra JIRA.
https://datastax-oss.atlassian.net/projects/JAVA/
all the best,
Sebastián
On May 10, 2016 1:09 AM, "Lu, Boying" wrote:
> I filed a JIRA https://issues.apache.org/jira/browse/CASSANDRA-11741 to
> track this.
>
>
>
> *F
I have this situation where a few (like, 3-4 out of 84) nodes misbehave:
very long GC pauses, dropping out of the cluster, etc.
This happens while loading data (via CQL), and from the metrics it looks
like on these few nodes a lot of hints are being generated close to the
time when they start to
I have a concern about using a secondary index on a field with low
cardinality. Let's say I have a few billion rows and each row falls into
one of 1,000 categories, and we have a 50-node cluster.
Now we want to fetch the data for a single category using the secondary
index over category, and the query is paginated
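For concreteness, the kind of schema being described (all names invented):

CREATE TABLE events (
    id uuid PRIMARY KEY,
    category int,
    payload text
);
CREATE INDEX ON events (category);

-- ~1,000 distinct values over billions of rows means millions of rows per
-- indexed value, and this query fans out across many of the 50 nodes:
SELECT * FROM events WHERE category = 42;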
Hi,
already that COPY TO might not be the best way to do this. I'll write a
small Spark job.
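Something like this, probably (a rough sketch with the
spark-cassandra-connector; keyspace, table and path are placeholders):

import com.datastax.spark.connector._
val rows = sc.cassandraTable("ks", "mytable")   // token-range-aware full scan
rows.map(_.toString).saveAsTextFile("/tmp/mytable_export")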
Thanks
2016-05-10 10:36 GMT+02:00 Carlos Rolo :
> Hello,
>
> That is a lot of data to do a "COPY TO".
>
> If you want a fast way to export, and you're fine with Java, you can use
> Cassandra SSTableRead
Hi all,
Sorry, I tested with an old index jar. The cassandra-3.0.3 and
dsc-cassandra-3.0.3 packages are the same. The error happens in both; I
think we have fixed it and it will be included in the next release (maybe
3.0.5.1).
1.- Full repair is very intensive; that's why your cluster is unresponsive
Hello,
That is a lot of data to do a "COPY TO".
If you want a fast way to export, and you're fine with Java, you can use
Cassandra SSTableReader classes to read the sstables directly. Spark also
works.
Regards,
Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra
Sorry, sent early.
more errors:
/export.cql:9:Error for (4549395184516451179, 4560441269902768904):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {: ConnectionException('Host has
been marked down or removed',)}) (will try again later attempt 1 of 5)
/export.cql:9:Error f
Hi,
I am trying to export the data of a table (~15 GB) using cqlsh COPY TO. It
fails with "no host available". If I try it with a smaller table everything
works fine.
The statistics of the big table:
SSTable count: 81
Space used (live): 14102945336
Space
Maybe a SMACK stack would be a better option for using Spark with
Cassandra...
On 10 May 2016 8:45 AM, "Srini Sydney" wrote:
> Thanks a lot, Denise
>
> On 10 May 2016 at 02:42, Denise Rogers wrote:
>
>> It really depends how close you want to stay to the most current versions
>> of open sourc