Re: SSTable structure

2015-04-01 Thread Serega Sheypak
Hi bharat, you are talking about Cassandra 1.2.5. Does it fit Cassandra 2.1? Were there any significant changes to SSTable format and layout? Thank you, article is interesting. Hi jacob, HBase does it for example. http://hbase.apache.org/book.html#_hfile_format_2 It would be great to give general

Re: Multinode Cassandra and sstableloader

2015-04-01 Thread Alain RODRIGUEZ
From Michael Laing - posted on the wrong thread: "We use Alain's solution as well to make major operational revisions. We have a "red team" and a "blue team" in each AWS region, so we just add and drop datacenters to get where we want to be. Pretty simple." 2015-03-31 15:50 GMT+02:00 Alain ROD

Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Hi, I have 2 Cassandra clusters. cluster1 is DataStax Community 2.1; cluster2 is DataStax DSE. I can run sstableloader from cluster1 (Community) and stream data to cluster2 (DSE), but I get an exception while streaming from cluster2 (DSE) to cluster1 (Community). The exception is: Could not retrieve e

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Sorry cluster1 community version is: ii cassandra 2.1.3 distributed storage system for structured data cluster2 DSE version is: ii dse-libcassandra4.6.2-1 The DataStax Enterprise package includes a production-certifie 2015-04-01 14:53 GMT+02:00 Serega Sheypak : >

[SECURITY ANNOUNCEMENT] CVE-2015-0225

2015-04-01 Thread Jake Luciani
CVE-2015-0225: Apache Cassandra remote execution of arbitrary code Severity: Important Vendor: The Apache Software Foundation Versions Affected: Cassandra 1.2.0 to 1.2.19 Cassandra 2.0.0 to 2.0.13 Cassandra 2.1.0 to 2.1.3 Description: Under its default configuration, Cassandra binds an unauthen
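
The affected ranges listed in this announcement can be checked mechanically. The following is a minimal sketch, assuming plain three-part version strings such as "2.1.3"; the names AFFECTED_RANGES, parse_version and is_affected are hypothetical helpers, not part of Cassandra or any advisory tooling.

```python
# Affected ranges as stated in the CVE-2015-0225 announcement (inclusive).
AFFECTED_RANGES = [
    ((1, 2, 0), (1, 2, 19)),
    ((2, 0, 0), (2, 0, 13)),
    ((2, 1, 0), (2, 1, 3)),
]


def parse_version(text):
    """Turn a version string like '2.0.13' into a comparable tuple (2, 0, 13)."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version):
    """Return True if the version falls inside any announced range."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    # Versions mentioned elsewhere in this day's threads.
    for candidate in ("2.1.3", "2.0.13", "2.1.2", "2.0.4"):
        print(candidate, is_affected(candidate))
```

Tuple comparison keeps the range test readable. Suffixed versions (for example release candidates) are outside the scope of this sketch and would make parse_version raise a ValueError.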

Frequent timeout issues

2015-04-01 Thread Amlan Roy
Hi, I am new to Cassandra. I have set up a cluster with Cassandra 2.0.13. I am writing the same data to HBase and Cassandra and find that the writes are extremely slow in Cassandra, and I frequently see the exception "Cassandra timeout during write query at consistency ONE". The cluster size for bot

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Amlan, Can you provide information on how much data is being written? Are any of the columns really large? Are any writes succeeding or are all timing out? Regards, Eric R Medley > On Apr 1, 2015, at 9:03 AM, Amlan Roy wrote: > > Hi, > > I am new to Cassandra. I have setup a cluster with Ca

Datastax driver object mapper and union field

2015-04-01 Thread Craig Ching
Hi! We need to implement a union field in our cassandra data model and we're using the datastax Mapper. Anyone have any recommendations for doing this? I'm thinking something like: public class Value { int dataType; String valueAsString; double valueAsDouble; } If the Value is a String,

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Also, can you provide the table details and the consistency level you are using? Regards, Eric R Medley > On Apr 1, 2015, at 9:13 AM, Eric R Medley wrote: > > Amlan, > > Can you provide information on how much data is being written? Are any of the > columns really large? Are any writes succe

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Hi Eric, Thanks for the reply. Some columns are big but I see the issue even when I stop storing the big columns. Some of the writes are timing out, not all. Where can I find the number of writes to Cassandra? Regards, Amlan On 01-Apr-2015, at 7:43 pm, Eric R Medley wrote: > Amlan, > > Can

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Write consistency level is ONE. This is the describe output for one of the tables. CREATE TABLE event_data ( event text, week text, bucket int, date timestamp, unique text, adt int, age list, arrival list, bank text, bf double, cabin text, card text, carrier list, cb d

Re: Frequent timeout issues

2015-04-01 Thread Brice Dutheil
And the keyspace? What is the replication factor. Also how are the inserts done? On Wednesday, April 1, 2015, Amlan Roy wrote: > Write consistency level is ONE. > > This is the describe output for one of the tables. > > CREATE TABLE event_data ( > event text, > week text, > bucket int, >

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Are you seeing any exceptions in the cassandra logs? What are the loads on your servers? Have you monitored the performance of those servers? How many writes are you performing at a time? How many writes per seconds? Regards, Eric R Medley > On Apr 1, 2015, at 9:40 AM, Amlan Roy wrote: > > W

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Are HBase and Cassandra running on the same servers? Are the writes to each of these databases happening at the same time? Regards, Eric R Medley > On Apr 1, 2015, at 10:12 AM, Brice Dutheil wrote: > > And the keyspace? What is the replication factor. > > Also how are the inserts done? > >

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Replication factor is 2. CREATE KEYSPACE ct_keyspace WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC1': '2' }; Inserts are happening from Storm using java driver. Using prepared statement without batch. On 01-Apr-2015, at 8:42 pm, Brice Dutheil wrote: > And the keyspace? What

Re: Frequent timeout issues

2015-04-01 Thread Brian O'Neill
Are you using the storm-cassandra-cql driver? (https://github.com/hmsonline/storm-cassandra-cql) If so, what version? Batching or no batching? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Using the datastax driver without batch. http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/whatsNew2.html On 01-Apr-2015, at 9:15 pm, Brian O'Neill wrote: > > Are you using the storm-cassandra-cql driver? > (https://github.com/hmsonline/storm-cassandra-cql) > > If s

Table design for historical data

2015-04-01 Thread Firdousi Farozan
Hi, My requirement is to design a table for historical state information (not exactly time-series). For ex: I have devices connecting and disconnecting to the management platform. I want to know the details such as (name, mac, os, image, etc.) for all devices connected to the management platform i

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Did not see any exception in cassandra.log and system.log. Monitored using JConsole. Did not see anything wrong. Do I need to see any specific info? Doing almost 1000 writes/sec. HBase and Cassandra are running on different clusters. For cassandra I have 6 nodes with 64GB RAM(Heap is at default

replace_address vs add+removenode

2015-04-01 Thread Ulrich Geilmann
Hi. The documentation suggests using the replace_address startup parameter for replacing a dead node. However, it doesn't explain why this is superior to adding a new node and retiring the dead one using nodetool removenode. I assume it can be more efficient since the new node can take over th

Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
Unfortunately I'm using 2.1.2. Is it possible to downgrade to 2.0.13 without wiping out the data? I'm worried that there may be a bug in 2.1.2. On Tue, Mar 31, 2015 at 4:37 AM, Paulo Ricardo Motta Gomes < paulo.mo...@chaordicsystems.com> wrote: > What version of Cassandra are you running?

Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
All servers are running ntpd. I guess the time should be synced across all servers. My dataset is too large to use sstable2json. It would take a long time. I will try a repair to see if the issue goes away. On Tue, Mar 31, 2015 at 7:49 AM, Ken Hancock wrote: > Have you checked time sync across al

Re: replace_address vs add+removenode

2015-04-01 Thread Robert Coli
On Wed, Apr 1, 2015 at 9:26 AM, Ulrich Geilmann < ulrich.geilm...@freiheit.com> wrote: > I assume it can be more efficient since the new node can take over the > exact tokens of the dead node. Are there any other differences? > That's the reason. You get one streaming operation ("bootstrap a new

Re: Frequent timeout issues

2015-04-01 Thread Robert Coli
On Wed, Apr 1, 2015 at 8:37 AM, Amlan Roy wrote: > Replication factor is 2. > It is relatively unusual for people to use a replication factor of 2, for what it's worth. =Rob
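
One likely reason a replication factor of 2 is unusual is the quorum arithmetic: a quorum is floor(RF / 2) + 1 replicas, so with RF = 2 a QUORUM operation needs both replicas. A minimal sketch of that arithmetic follows; the function names are hypothetical and nothing here talks to a cluster.

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor, required_replicas):
    """How many replicas may be down while the operation can still succeed."""
    return replication_factor - required_replicas


if __name__ == "__main__":
    for rf in (2, 3):
        needed = quorum(rf)
        print(
            f"RF={rf}: QUORUM needs {needed}, "
            f"tolerates {tolerated_failures(rf, needed)} down; "
            f"ONE tolerates {tolerated_failures(rf, 1)} down"
        )
```

With RF = 2 a QUORUM read or write tolerates no unavailable replica, while RF = 3 tolerates one. At consistency ONE, as used in this thread, a single live replica is enough in either case.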

Re: replace_address vs add+removenode

2015-04-01 Thread Anuj Wadehra
In both cases the node needs to bootstrap and get data from other nodes. Removenode has an additional cost, as it leads to additional redistribution of tokens such that all data resides on the remaining nodes as per the replication strategy. On removenode, remaining nodes will stream data amongst them
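
The difference described in this thread can be illustrated with a toy token ring. This is a deliberately simplified sketch, assuming one token per node, no vnodes and no replication; the node names, token values and helper functions are all hypothetical and do not model real Cassandra streaming.

```python
def primary_ranges(ring):
    """Map each node to the (start, end] token range it owns as primary.

    `ring` maps node name -> token. The first node wraps around to the last.
    """
    ordered = sorted(ring.items(), key=lambda item: item[1])
    ranges = {}
    for index, (node, token) in enumerate(ordered):
        previous_token = ordered[index - 1][1]  # index -1 wraps to the last node
        ranges[node] = (previous_token, token)
    return ranges


def changed_survivors(before, after):
    """Nodes present in both rings whose primary range differs."""
    old, new = primary_ranges(before), primary_ranges(after)
    return sorted(node for node in old if node in new and old[node] != new[node])


ring = {"n1": 0, "n2": 25, "n3": 50, "n4": 75}  # assume n3 is the dead node

# Replacement: the new node takes over n3's exact token.
replaced = {("n3_new" if node == "n3" else node): token for node, token in ring.items()}

# Add + remove: a new node joins at a different token, then n3 is removed.
added_then_removed = {node: token for node, token in ring.items() if node != "n3"}
added_then_removed["n5"] = 60

if __name__ == "__main__":
    print("replace:", changed_survivors(ring, replaced))
    print("add + remove:", changed_survivors(ring, added_then_removed))
```

In this toy model the replacement leaves every surviving node's range untouched, whereas adding a node at a new token and removing the dead one shifts a neighbour's range, which corresponds to the extra redistribution mentioned above.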

Re: Table design for historical data

2015-04-01 Thread Eric R Medley
Firdousi, What kind of events would be stored in the table? Will you be writing an event when a device connects and another when it disconnects or will you write a single event after the device finally disconnects? Also, for your queries, do you want ad-hoc start and end times or do you have a

Re: Frequent timeout issues

2015-04-01 Thread Anuj Wadehra
Are you writing to multiple column families at the same time? Please run nodetool tpstats to make sure that FlushWriter etc. doesn't have high "All time blocked" counts. A blocked memtable FlushWriter may block/drop writes. If that's the case, you may need to increase memtable flush writers. If you have many secondary ind

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Michael Shuler
On 04/01/2015 08:10 AM, Serega Sheypak wrote: Sorry cluster1 community version is: ii cassandra 2.1.3 distributed storage system for structured data cluster2 DSE version is: ii dse-libcassandra4.6.2-1 The DataStax Enterprise package includes a production-ce

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Got it. 2015-04-01 20:39 GMT+02:00 Michael Shuler : > On 04/01/2015 08:10 AM, Serega Sheypak wrote: > >> Sorry >> cluster1 community version is: ii cassandra 2.1.3 >>distributed storage system for structured data >> cluster2 DSE version is: ii dse-libcassandra4

Re: How to store unique visitors in cassandra

2015-04-01 Thread Jim Ancona
Very interesting. I had saved your email from three years ago in hopes of an elegant answer. Thanks for sharing! Jim On Tue, Mar 31, 2015 at 8:16 AM, Alain RODRIGUEZ wrote: > People keep asking me if we finally found a solution (even if this is 3+ > years old) so I will just update this thread

Re: SSTable structure

2015-04-01 Thread Bharatendra Boddu
Hi Serega, Most of the content in the blog article is still relevant. After 1.2.5 (ic), there are only three new versions (ja, jb, ka) for SSTable format. Following are the changes in these versions. // ja (2.0.0): super columns are serialized as composites (note that there is no real for
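
The format codes named in this reply (ic, ja, jb, ka) also show up in SSTable file names. Below is a minimal sketch for reading the code from a file name, assuming the pre-2.2 naming layout keyspace-table-version-generation-Component; the example file name is hypothetical and only borrows the keyspace and table names from another thread in this listing.

```python
# Format codes named in the thread, oldest first.
KNOWN_VERSIONS = ["ic", "ja", "jb", "ka"]


def sstable_version(filename):
    """Return the format code from an SSTable file name.

    Assumes the layout keyspace-table-version-generation-Component,
    e.g. 'ct_keyspace-event_data-ka-1-Data.db' (hypothetical name).
    """
    parts = filename.split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected SSTable file name: {filename!r}")
    # Counting from the right tolerates an extra marker before the version.
    return parts[-3]


def is_newer_than(version, baseline="ic"):
    """True if `version` comes after `baseline` in the order given in the thread."""
    return KNOWN_VERSIONS.index(version) > KNOWN_VERSIONS.index(baseline)


if __name__ == "__main__":
    name = "ct_keyspace-event_data-ka-1-Data.db"
    code = sstable_version(name)
    print(code, is_newer_than(code))
```

Listing the data directory and grouping files by this code is a quick way to see which files were written by which format version, for example after an upgrade.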

Re: Cross-datacenter requests taking a very long time.

2015-04-01 Thread Bharatendra Boddu
What type of snitch are you using for cassandra.yaml: endpoint_snitch ? PropertyFileSnitch can improve performance. - bharat On Tue, Mar 31, 2015 at 1:59 PM, daemeon reiydelle wrote: > What is your replication factor? > > Any idea how much data has to be processed under the query? > > With that

Re: Table design for historical data

2015-04-01 Thread Firdousi Farozan
I will be writing an event when device connects. Probably a device never disconnects till current time, and I want to return that device for that time range. Device disconnect is used to mark the end time; Any query beyond that time should not return that device. Queries can have adhoc start and e
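
The query semantics described here amount to an interval-overlap test in which a missing disconnect time means "still connected". A minimal sketch of that predicate, independent of any table design; the function and parameter names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def connected_during(
    connect: datetime,
    disconnect: Optional[datetime],
    query_start: datetime,
    query_end: datetime,
) -> bool:
    """True if the device was connected at any point in [query_start, query_end].

    A disconnect of None means the device has not disconnected yet.
    """
    started_in_time = connect <= query_end
    not_gone_before_start = disconnect is None or disconnect >= query_start
    return started_in_time and not_gone_before_start


if __name__ == "__main__":
    window = (datetime(2015, 3, 1), datetime(2015, 3, 31))
    # Connected in February and never disconnected: matches the March window.
    print(connected_during(datetime(2015, 2, 10), None, *window))
    # Disconnected before the window opens: does not match.
    print(connected_during(datetime(2015, 2, 10), datetime(2015, 2, 20), *window))
```

Whatever schema is chosen has to make both comparisons answerable, which is why filtering on the connect time alone is not sufficient for ad hoc ranges.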

Exception while running cassandra stress client

2015-04-01 Thread ankit tyagi
Hi All, while running the cassandra-stress tool shipped with Cassandra 2.0.4, I am getting the following error: *./bin/cassandra-stress user profile=./bin/test.yaml* *Application does not allow arbitrary arguments: user, profile=./bin/test.yaml* I am stuck on this and not able to find out why thi

log all the query statement

2015-04-01 Thread 鄢来琼
Hi all, Cassandra 2.1.2 is used in my project, but some nodes go down after executing some query statements. Could I configure Cassandra to log all executed statements? I hope the log file can be used to identify the problem. Thanks. Peter