Re: nodetool rebuild data size

2017-12-14 Thread Anshu Vajpayee
You will need to rebuild each node with the nodetool rebuild command; it would be 60TB. On Thu, Dec 14, 2017 at 11:35 AM, Peng Xiao <2535...@qq.com> wrote: > Hi there, > > if we have a Cassandra DC1 with data size 60T, RF=3, then we rebuild a new > DC2 (RF=3), how much data will
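In practice the rebuild runs once on every node of the new datacenter, naming the existing datacenter as the streaming source. A minimal sketch, assuming the source DC is named DC1 as in the thread:

    # run on each node in the new DC2; streams that node's replica ranges from DC1
    nodetool rebuild -- DC1

Since DC2 also uses RF=3, the streams across all DC2 nodes add up to the full 60TB, not just the 20TB of unique data.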

nodetool rebuild data size

2017-12-13 Thread Peng Xiao
Hi there, if we have a Cassandra DC1 with data size 60T, RF=3, then we rebuild a new DC2 (RF=3), how much data will stream to DC2? 20T or 60T? Thanks, Peng Xiao

Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
Keyspace has WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'us-east-productiondata': '3'} AND durable_writes = true; From: Jeff Jirsa Reply-To: "user@cassandra.apache.org" Date: Monday, August 7, 2

Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
Jeff Jirsa > *Reply-To: *"user@cassandra.apache.org" > *Date: *Monday, August 7, 2017 at 2:51 PM > *To: *cassandra > *Subject: *Re: Different data size between datacenters > > > > And when you say the data size is smaller, you mean per node? Or sum of > all nodes in

Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
Yes it’s the total size. Could it be that tombstones or data that nodes no longer own is not being copied/streamed to the data center in AWS? From: Jeff Jirsa Reply-To: "user@cassandra.apache.org" Date: Monday, August 7, 2017 at 2:51 PM To: cassandra Subject: Re: Different data si

Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
And when you say the data size is smaller, you mean per node? Or sum of all nodes in the datacenter? With 185 hosts in AWS vs 135 in your DC, I would expect your DC hosts to have 30% less data per host than AWS. If instead they have twice as much, it sounds like it's balancing by # of t

Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
So we have the default 256 in our datacenter and 128 in AWS. From: "ZAIDI, ASAD A" Reply-To: "user@cassandra.apache.org" Date: Monday, August 7, 2017 at 1:36 PM To: "user@cassandra.apache.org" Subject: RE: Different data size between datacenters Are you using
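A quick way to confirm a token-count mismatch like this is to compare num_tokens and per-node load in both datacenters; a rough sketch (the config path and keyspace name are placeholders):

    # vnode count is configured per node
    grep -E '^num_tokens' /etc/cassandra/cassandra.yaml
    # per-node load and effective ownership for one keyspace
    nodetool status my_keyspace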

Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
Yes to the NetworkTopologyStrategy. From: Jeff Jirsa Reply-To: "user@cassandra.apache.org" Date: Monday, August 7, 2017 at 2:39 PM To: cassandra Subject: Re: Different data size between datacenters You're using NetworkTopologyStrategy and not SimpleStrategy, correct? On Mon,

Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
You're using NetworkTopologyStrategy and not SimpleStrategy, correct? On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds wrote: > I have a cluster that spans two datacenters running Cassandra 2.1.12. 135 > nodes in my data center and about 185 in AWS. > > > > The size of the second data center (A

RE: Different data size between datacenters

2017-08-07 Thread ZAIDI, ASAD A
Are you using the same number of tokens/vnodes in both data centers? From: Chuck Reynolds [mailto:creyno...@ancestry.com] Sent: Monday, August 07, 2017 1:51 PM To: user@cassandra.apache.org Subject: Different data size between datacenters I have a cluster that spans two datacenters running Cassandra

Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
I have a cluster that spans two datacenters running Cassandra 2.1.12: 135 nodes in my data center and about 185 in AWS. The size of the second data center (AWS) is quite a bit smaller. Replication is the same in both datacenters. Is there a logical explanation for this? Thanks

Re: Replicated data size

2017-06-11 Thread Vladimir Yudovin
Hi Vasu, I'm not sure Cassandra can provide this directly, but all the data inserted in one DC should go to the other DC. You can check network traffic with any available tools. Best regards, Vladimir Yudovin, Winguzone - Cloud Cassandra Hosting On Sat, 10 Jun 2017 14:18:17 -0400 vasu
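One crude way to measure cross-DC replication volume, assuming inter-node traffic uses the default storage port 7000 and the remote DC sits on a known subnet (10.2.0.0/16 is a placeholder for your topology):

    # a target-less iptables rule acts as a pure byte counter
    iptables -I OUTPUT -p tcp --dport 7000 -d 10.2.0.0/16
    # read the counters once a day, then zero them
    iptables -L OUTPUT -v -n -x
    iptables -Z OUTPUT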

Replicated data size

2017-06-10 Thread vasu gunja
Hi All, I have a unique requirement from my management. Here are the details. *Quick idea about my environment:* we have a multi-DC (2 DCs) setup, 10 nodes each, keyspace RF of 3 each. We need to know how much data is replicated across data centers on a per-day basis. Is there any way to calcula

Re: How to find total data size of a keyspace.

2017-03-05 Thread anuja jain
nodetool status or nodetool ring still gives the load of all keyspaces on the cluster. On Tue, Feb 28, 2017 at 6:56 PM, Surbhi Gupta wrote: > nodetool status keyspace_name. > On Tue, Feb 28, 2017 at 4:53 AM anuja jain wrote: > >> Hi, >> Using nodetool cfstats gives me dat
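Summing the per-table cfstats figures is one way to get a single per-keyspace number; a minimal sketch (the keyspace name is a placeholder):

    # add up "Space used (live)" across all tables of one keyspace
    nodetool cfstats my_keyspace | awk -F': ' '/Space used \(live\)/ {sum += $2} END {print sum, "bytes"}'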

Re: How to find total data size of a keyspace.

2017-02-28 Thread Surbhi Gupta
nodetool status keyspace_name. On Tue, Feb 28, 2017 at 4:53 AM anuja jain wrote: > Hi, > Using nodetool cfstats gives me data size of each table/column family and > nodetool ring gives me load of all keyspaces in the cluster but I need total > data size of one keyspace in the cluste

How to find total data size of a keyspace.

2017-02-28 Thread anuja jain
Hi, Using nodetool cfstats gives me the data size of each table/column family and nodetool ring gives me the load of all keyspaces in the cluster, but I need the total data size of one keyspace in the cluster. How can I get that?

SSTableWriter error: incorrect row data size

2015-10-09 Thread Eiti Kimura
Exception in thread Thread[CompactionExecutor:6523,1,main] java.lang.AssertionError: incorrect row data size 568009715 written to /movile/cassandra-data/SBSPlatform/idx_config/SBSPlatform-idx_config-tmp-ic-715-Data.db; correct is 568010203 at org.apache.cassandra.io.sstable.SSTableWriter.append
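When compaction trips this assertion, the usual first step is to rewrite the affected SSTables with a scrub; a sketch using the keyspace and table from the path above (scrub behavior varies by version, so treat this as a starting point):

    # rebuild the sstables for the affected table, skipping unreadable rows
    nodetool scrub SBSPlatform idx_config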

Re: Data Size on each node

2015-09-07 Thread Ryan Svihla
operations like repair, bootstrap, >> decommission, ... faster) >> >> C*heers, >> >> Alain >> >> >> >> >> 2015-09-01 10:17 GMT+02:00 Sachin Nikam : >> >>> We currently have a Cassandra Cluster spread over 2 DC. The data size

Re: Data Size on each node

2015-09-04 Thread Alprema
maintenance operations like repair, bootstrap, > decommission, ... faster) > > C*heers, > > Alain > > > > > 2015-09-01 10:17 GMT+02:00 Sachin Nikam : > >> We currently have a Cassandra Cluster spread over 2 DC. The data size on >> each node of the cluster is 1.2TB with spinni

Re: Data Size on each node

2015-09-01 Thread Alain RODRIGUEZ
2015-09-01 10:17 GMT+02:00 Sachin Nikam : > We currently have a Cassandra Cluster spread over 2 DC. The data size on > each node of the cluster is 1.2TB with spinning disk. Minor and Major > compactions are slowing down our Read queries. It has been suggested that > replacing Spinning disks with SS

Data Size on each node

2015-09-01 Thread Sachin Nikam
We currently have a Cassandra Cluster spread over 2 DC. The data size on each node of the cluster is 1.2TB with spinning disk. Minor and Major compactions are slowing down our Read queries. It has been suggested that replacing Spinning disks with SSD might help. Has anybody done something similar
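Before replacing hardware, it may be worth checking whether compaction is simply starving reads of disk bandwidth; a sketch (the 16 MB/s cap is an arbitrary example value):

    # what is compacting right now, and how far along is it?
    nodetool compactionstats
    # throttle compaction so reads keep more of the disk
    nodetool setcompactionthroughput 16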

Meaning of "java.lang.AssertionError: incorrect row data size ..."

2013-09-08 Thread Jan Algermissen
ct') (estimated 3145728 bytes) ERROR [CompactionExecutor:2] 2013-09-07 13:46:27,163 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor :2,1,main] java.lang.AssertionError: incorrect row data size 132289 written to /var/lib/cassandra/data/products/product/product

Re: AssertionError: Incorrect row data size

2013-07-30 Thread Pavel Kirienko
read_repair_chance=0.00 AND >>> > gc_grace_seconds=864000 AND >>> > read_repair_chance=0.10 AND >>> > replicate_on_write='true' AND >>> > populate_io_cache_on_flush='false' AND >>> > compaction={'

Re: AssertionError: Incorrect row data size

2013-07-30 Thread Pavel Kirienko
'false' AND >> > compaction={'class': 'SizeTieredCompactionStrategy'} AND >> > compression={'sstable_compression': 'SnappyCompressor'}; >> > >> > Column DATA contains blobs of size about 1..50MB, averag

Re: AssertionError: Incorrect row data size

2013-07-28 Thread Paul Ingalls
} AND >> > compression={'sstable_compression': 'SnappyCompressor'}; >> > >> > Column DATA contains blobs of size about 1..50MB, average size should be >> > something of 5MB. >> > >> > Sometimes this table experiences huge w

Re: AssertionError: Incorrect row data size

2013-07-27 Thread Pavel Kirienko
compression={'sstable_compression': 'SnappyCompressor'}; > > > > Column DATA contains blobs of size about 1..50MB, average size should be > something of 5MB. > > > > Sometimes this table experiences huge write loads for a few hours, at such > times I

Re: AssertionError: Incorrect row data size

2013-07-26 Thread Paul Ingalls
blobs of size about 1..50MB, average size should be > something of 5MB. > > Sometimes this table experiences huge write loads for a few hours, at such > times I see suspicious things in the logs: > > ERROR [CompactionExecutor:357] 2013-07-24 12:32:10,293 CassandraDaemon.java

AssertionError: Incorrect row data size

2013-07-26 Thread Pavel Kirienko
Sometimes this table experiences huge write loads for a few hours; at such times I see suspicious things in the logs: ERROR [CompactionExecutor:357] 2013-07-24 12:32:10,293 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:357,1,main] java.lang.AssertionError: incorrect row data

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-21 Thread aaron morton
thrift_max_message_length_in_mb), by > default it is 64m if I'm not mistaken. This is your limit. > > > On Thu, Jul 18, 2013 at 2:03 PM, hajjat wrote: > Hi, > > Is there a recommended data size for Reads/Writes in Cassandra? I tried > inserting 10 MB objects an

Re: Incorrect row data size

2013-07-19 Thread Paul Ingalls
0-Data.db (2237487 bytes) for commitlog position ReplayPosition(segmentId=1374260151415, position=10223602) ERROR [CompactionExecutor:4] 2013-07-19 19:19:51,969 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:4,1,main] java.lang.AssertionError: incorrect row data

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-18 Thread Tyler Hobbs
multiple columns/rows if >> necessary. >> >> >> On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh wrote: >> >>> there is a limit of thrift message ( thrift_max_message_length_in_mb), >>> by default it is 64m if I'm not mistaken. This is your limit. >>

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-18 Thread Mohammad Hajjat
necessary. > > > On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh wrote: > >> there is a limit of thrift message ( thrift_max_message_length_in_mb), by >> default it is 64m if I'm not mistaken. This is your limit. >> >> >> On Thu, Jul 18, 2013 at 2:03 PM, hajj

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-18 Thread Tyler Hobbs
default it is 64m if I'm not mistaken. This is your limit. > > > On Thu, Jul 18, 2013 at 2:03 PM, hajjat wrote: > >> Hi, >> >> Is there a recommended data size for Reads/Writes in Cassandra? I tried >> inserting 10 MB objects and the latency I got was pretty h

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-18 Thread Andrey Ilinykh
there is a limit of thrift message ( thrift_max_message_length_in_mb), by default it is 64m if I'm not mistaken. This is your limit. On Thu, Jul 18, 2013 at 2:03 PM, hajjat wrote: > Hi, > > Is there a recommended data size for Reads/Writes in Cassandra? I tried > inserting

Recommended data size for Reads/Writes in Cassandra

2013-07-18 Thread hajjat
Hi, Is there a recommended data size for Reads/Writes in Cassandra? I tried inserting 10 MB objects and the latency I got was pretty high. Also, I was never able to insert larger objects (say 50 MB) since Cassandra kept crashing when I tried that. Here is my experiment setup: I used two Large
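The workaround the replies point at is splitting each large payload into fixed-size chunks under one row key; a hypothetical schema sketch (keyspace, table, and the ~1MB chunk size are all placeholders, not anything from the thread):

    cqlsh -e "CREATE TABLE mykeyspace.blobs (
        blob_id  uuid,
        chunk_no int,
        data     blob,           -- keep each chunk around 1MB
        PRIMARY KEY (blob_id, chunk_no));"

Readers then page through chunk_no to reassemble the object, which keeps every individual message well under the thrift limit.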

Incorrect row data size

2013-07-18 Thread Paul Ingalls
thread Thread[CompactionExecutor:4,1,main] java.lang.AssertionError: incorrect row data size 72128792 written to /mnt/datadrive/lib/cassandra/data/fanzo/tweets_by_affiliation/fanzo-tweets_by_affiliation-tmp-ic-918-Data.db; correct is 72148465 at org.apache.cassandra.io.sstable.SSTableWriter.a

Re: Cassandra performance decreases drastically with increase in data size.

2013-06-03 Thread srmore
n2.nabble.com/Is-it-safe-to-stop-a-read-repair-and-any-suggestion-on-speeding-up-repairs-td6607367.html > > Thanks > > On May 29, 2013, at 9:32 PM, srmore wrote: > > Hello, > I am observing that my performance is drastically decreasing when my data > size grows. I have a 3

Re: Cassandra performance decreases drastically with increase in data size.

2013-05-30 Thread Aiman Parvaiz
wrote: > Hello, > I am observing that my performance is drastically decreasing when my data > size grows. I have a 3 node cluster with 64 GB of ram and my data size is > around 400GB on all the nodes. I also see that when I re-start Cassandra the > performance goes back to normal

Re: Cassandra performance decreases drastically with increase in data size.

2013-05-30 Thread Bryan Talbot
>> >> On Wed, May 29, 2013 at 11:32 PM, srmore wrote: >> > Hello, >> > I am observing that my performance is drastically decreasing when my >> data >> > size grows. I have a 3 node cluster with 64 GB of ram and my data size >> is >> > arou

Re: Cassandra performance decreases drastically with increase in data size.

2013-05-30 Thread srmore
http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2 > > On Wed, May 29, 2013 at 11:32 PM, srmore wrote: > > Hello, > > I am observing that my performance is drastically decreasing when my data > > size grows. I have a 3 node cluster with 64 GB of ram and

Re: Cassandra performance decreases drastically with increase in data size.

2013-05-29 Thread Jonathan Ellis
On Wed, May 29, 2013 at 11:32 PM, srmore wrote: > Hello, > I am observing that my performance is drastically decreasing when my data > size grows. I have a 3 node cluster with 64 GB of ram and my data size is > around 400GB on all the nodes. I also see that when I re-start Cassandra the > performance goes back to n

Cassandra performance decreases drastically with increase in data size.

2013-05-29 Thread srmore
Hello, I am observing that my performance is drastically decreasing when my data size grows. I have a 3 node cluster with 64 GB of ram and my data size is around 400GB on all the nodes. I also see that when I re-start Cassandra the performance goes back to normal and then again starts decreasing
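The fact that a restart restores performance often points at heap pressure or cache churn rather than raw data volume; a first-pass check with standard nodetool subcommands:

    # heap usage and key cache hit rate
    nodetool info
    # dropped or backed-up thread pool stages
    nodetool tpstats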

Re: data size difference between supercolumn and regular column

2012-04-06 Thread Yiming Sun
Thanks for the advice, Maki, especially on the ulimit! Yes, we will play with the configuration and figure out some optimal sstable size. -- Y. On Wed, Apr 4, 2012 at 9:49 AM, Watanabe Maki wrote: > LeveledCompaction will use less disk space (load), but needs more IO. > If your traffic is too hig

Re: data size difference between supercolumn and regular column

2012-04-04 Thread Watanabe Maki
LeveledCompaction will use less disk space (load), but needs more IO. If your traffic is too high for your disk, you will have many pending compaction tasks, and a large number of sstables waiting to be compacted. Also the default sstable_size_in_mb (5MB) will be too small for a large data set. You

Re: data size difference between supercolumn and regular column

2012-04-04 Thread Yiming Sun
Cool, I will look into this new leveled compaction strategy and give it a try. BTW, Aaron, I think the last word of your message meant to say "compression", correct? -- Y. On Mon, Apr 2, 2012 at 9:37 PM, aaron morton wrote: > If you have a workload with overwrites you will end up with some data

Re: data size difference between supercolumn and regular column

2012-04-03 Thread Tamar Fraenkel
Do you have a good reference for maintenance scripts for Cassandra ring? Thanks, *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Tue, Apr 3, 2012 at 4:37 AM, aaron morton wr

Re: data size difference between supercolumn and regular column

2012-04-02 Thread aaron morton
If you have a workload with overwrites you will end up with some data needing compaction. Running a nightly manual compaction would remove this, but it will also soak up some IO so it may not be the best solution. I do not know if Leveled compaction would result in a smaller disk load for the

Re: data size difference between supercolumn and regular column

2012-04-02 Thread Yiming Sun
Yup Jeremiah, I learned a hard lesson on how Cassandra behaves when it runs out of disk space :-S. I didn't try the compression, but when it ran out of disk space, or came near to running out, compaction would fail because it needs space to create some tmp data files. I shall get a tattoo that says keep

Re: data size difference between supercolumn and regular column

2012-04-01 Thread Jeremiah Jordan
Is that 80% with compression? If not, the first thing to do is turn on compression. Cassandra doesn't behave well when it runs out of disk space. You really want to try and stay around 50%, 60-70% works, but only if it is spread across multiple column families, and even then you can run into
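Enabling compression in that era was a table-level option; a sketch using the same option name that appears in the CQL fragments elsewhere on this page (keyspace and table names are placeholders):

    cqlsh -e "ALTER TABLE mykeyspace.mytable
        WITH compression = {'sstable_compression': 'SnappyCompressor'};"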

Re: data size difference between supercolumn and regular column

2012-04-01 Thread Yiming Sun
Thanks Aaron. Well, I guess it is possible the data files from supercolumns could've been reduced in size after compaction. This brings up yet another question. Say I am on a shoestring budget and can only put together a cluster with very limited storage space. The first iteration of pushing data in

Re: data size difference between supercolumn and regular column

2012-03-31 Thread aaron morton
> does cassandra 1.0 perform some default compression? No. The on-disk size depends to some degree on the workload. If there are a lot of overwrites or deletes you may have rows/columns that need to be compacted. You may have some big old SSTables that have not been compacted for a while.

Re: data size difference between supercolumn and regular column

2012-03-28 Thread Yiming Sun
Actually, after I read an article on cassandra 1.0 compression just now ( http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression), I am more puzzled. In our schema, we didn't specify any compression options -- does cassandra 1.0 perform some default compression? or is the data red

data size difference between supercolumn and regular column

2012-03-28 Thread Yiming Sun
Hi, We are trying to estimate the amount of storage we need for a production cassandra cluster. While I was doing the calculation, I noticed a very dramatic difference in terms of storage space used by cassandra data files. Our previous setup consists of a single-node cassandra 0.8.x with no rep

Re: "Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-03-07 Thread Jonathan Ellis
ing out on the affected >> > node. >> > I gave up waiting for the init script to stop Cassandra and killed it >> > myself >> > after about 3 minutes, restarted it and it has been fine since. Anyone >> > seen >> > this before? >> >

Re: "Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-03-07 Thread Thomas van Neerijnen
after about 3 minutes, restarted it and it has been fine since. Anyone > seen > > this before? > > > > Here is the error in the output.log: > > > > ERROR 10:51:44,282 Fatal exception in thread > > Thread[COMMIT-LOG-WRITER,5,main] > > java.lang.AssertionEr

Re: "Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-02-24 Thread Jonathan Ellis
java.lang.AssertionError: Final buffer length 4690 to accomodate data size > of 2347 (predicted 2344) for RowMutation(keyspace='Player', > key='36336138643338652d366162302d343334392d383466302d356166643863353133356465', > modifications=[ColumnFamily(PlayerCity [SuperColumn

Re: "Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-02-20 Thread aaron morton
Cassandra and killed it myself > after about 3 minutes, restarted it and it has been fine since. Anyone seen > this before? > > Here is the error in the output.log: > > ERROR 10:51:44,282 Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main] > java.lang.AssertionError: Fina

"Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-02-20 Thread Thomas van Neerijnen
java.lang.AssertionError: Final buffer length 4690 to accomodate data size of 2347 (predicted 2344) for RowMutation(keyspace='Player', key='36336138643338652d366162302d343334392d383466302d356166643863353133356465', modifications=[ColumnFamily(PlayerCity [SuperColumn(owneditem_1019 []),SuperC

Re: Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread Ching-Cheng Chen
From OrderPreservingPartitioner.java:

    public StringToken getToken(ByteBuffer key)
    {
        String skey;
        try
        {
            skey = ByteBufferUtil.string(key, Charsets.UTF_8);
        }
        catch (CharacterCodingException e)
        {
            throw new RuntimeException(

Re: Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread Ali Ahsan
Can anyone please comment on this? On 03/17/2011 07:02 PM, Ali Ahsan wrote: Dear Aaron, We are a little confused about OPP tokens. How do we calculate an OPP token? A few of our column families have UUIDs as keys and others have integers as keys.

Re: Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread Ali Ahsan
Dear Aaron, We are a little confused about OPP tokens. How do we calculate an OPP token? A few of our column families have UUIDs as keys and others have integers as keys. On 03/17/2011 04:22 PM, Ali Ahsan wrote: Below is the output of nodetool ring: Address Status Load Range

Re: Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread Ali Ahsan
Below is the output of nodetool ring:

Address         Status   Load       Range              Ring
                                    TuL8jLqs7uxLipP6
192.168.100.3   Up       89.91 GB   JDtVOU0YVQ6MtBYA   |<--|
192.168.100.4   Up       48

Re: Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread aaron morton
With the Order Preserving Partitioner you are responsible for balancing the rows around the cluster, http://wiki.apache.org/cassandra/Operations?highlight=%28partitioner%29#Token_selection Was there a reason for using the ordered partitioner rather than the random one? What does the output
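Under the ordered partitioner, rebalancing is done by assigning tokens manually; a sketch using the hosts from the ring output above (the token value is purely illustrative — with OPP, tokens are key strings, so pick one that splits the overloaded node's range):

    # move the lightly loaded node so it takes over part of the hot range
    nodetool -h 192.168.100.4 move '<token splitting the hot range>'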

Cassandra 0.6.3 ring not balance in terms of data size

2011-03-17 Thread Ali Ahsan
Hi All, We are running Cassandra 0.6.3. We have two nodes with replication factor one and ordered partitioning. The problem we are facing at the moment is that all data is being sent to one Cassandra node, which is filling up quite rapidly, and we are short of disk space. Unfortunately we have hardware constr

Re: Obscured question about data size in a Column Family

2010-12-09 Thread Jonathan Ellis
In <= 0.6 (but not 0.7) a row could not be larger than 2GB. 2GB is still the largest possible column value. On Thu, Dec 9, 2010 at 5:38 PM, Joshua Partogi wrote: > Hi there, > > Quoting an information in the wiki about Cassandra limitations > (http://wiki.apache.org/cassandra/CassandraLimitation

Obscured question about data size in a Column Family

2010-12-09 Thread Joshua Partogi
Hi there, Quoting information in the wiki about Cassandra limitations (http://wiki.apache.org/cassandra/CassandraLimitations): ... So all the data from a given columnfamily/key pair had to fit in memory, or 2GB ... Does this mean: 1. a ColumnFamily can only hold 2GB of data, or 2. a Column (key/pair

Re: data size

2010-11-22 Thread aaron morton
Is this from a clean install? Have you been deleting data? Could this be your problem? http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives If not, you'll need to provide some more details: which version, what the files on disk are, what data you loaded, etc. Hope that helps. Aar
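A quick way to see where the extra space lives, assuming a default-style data directory (the path is a placeholder):

    # snapshots and obsolete post-compaction files often explain the gap
    du -sh /var/lib/cassandra/data/*
    # force a major compaction to drop data that is merely awaiting cleanup
    nodetool compact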

data size

2010-11-20 Thread 魏金仙
When I load 17GB (as nodetool ring shows) of data to a Cassandra node that was clean before loading, the files in the data directory can add up to more than 100GB. Is this normal?