RE: Questions on the count and multiple index behaviour in cassandra

2022-09-29 Thread Durity, Sean R via user
Aggregate queries (like count(*) ) are fine *within* a reasonably sized partition (under 100 MB in size). However, Cassandra is not the right tool if you want to do aggregate queries *across* partitions (unless you break up the work with something like Spark). Choosing the right partition key

Re: Questions on the count and multiple index behaviour in cassandra

2022-09-28 Thread Bowen Song via user
It sounds like you are misusing/abusing Cassandra. I've noticed the following Cassandra anti-patterns in your post: 1. Large or uneven partitions All rows in a table in a single partition is definitely an anti-pattern unless you only have a very small number of rows. 2. "SE

Re: Questions on the count and multiple index behaviour in cassandra

2022-09-28 Thread Stéphane Alleaume
e 9lakh records were part of a single partition key. > > When we tried a select count(*) query with that partition key, the query > was timing out. > > However, we were able to retrieve counts through multiple calls by > fetching only > 1 lakh records in each call. The only disa

Questions on the count and multiple index behaviour in cassandra

2022-09-28 Thread Karthik K
datatype. All these 9 lakh (900,000) records were part of a single partition key. When we tried a select count(*) query with that partition key, the query timed out. However, we were able to retrieve counts through multiple calls by fetching only 1 lakh (100,000) records in each call. The only disadvantage here is the
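
The "multiple calls" workaround described in this post can be sketched roughly as below. This is only an illustration under stated assumptions: `fetch_page` is a hypothetical callback standing in for whatever driver call returns one page of rows plus a cursor for the next page, and `make_fake_partition` is an in-memory stand-in for a real partition, not a Cassandra API.

```python
from typing import Callable, List, Optional, Tuple

# (rows in this page, cursor for the next page or None when exhausted)
Page = Tuple[List[int], Optional[int]]


def count_in_pages(fetch_page: Callable[[Optional[int], int], Page],
                   page_size: int) -> int:
    """Count rows by summing page lengths instead of one big count(*)."""
    total = 0
    cursor: Optional[int] = None
    while True:
        rows, cursor = fetch_page(cursor, page_size)
        total += len(rows)
        if cursor is None:  # hypothetical convention: None means no more pages
            return total


def make_fake_partition(n_rows: int) -> Callable[[Optional[int], int], Page]:
    """In-memory stand-in for one partition (assumption, for illustration)."""
    data = list(range(n_rows))

    def fetch_page(cursor: Optional[int], page_size: int) -> Page:
        start = 0 if cursor is None else cursor
        rows = data[start:start + page_size]
        nxt = start + page_size
        return rows, (nxt if nxt < len(data) else None)

    return fetch_page


if __name__ == "__main__":
    # 9 lakh rows counted 1 lakh at a time, as in the post.
    print(count_in_pages(make_fake_partition(900_000), 100_000))
```

Each call touches a bounded number of rows, so no single request has to scan the whole partition before the timeout expires.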

Fwd: Re: using zstd cause high memtable switch count

2021-02-28 Thread onmstester onmstester
No, I didn't backport that one. Thank you. Forwarded message From: Kane Wilson To: Date: Mon, 01 Mar 2021 03:18:33 +0330 Subject: Re: using zstd cause high memtable switch count Forwarded me

Re: using zstd cause high memtable switch count

2021-02-28 Thread Kane Wilson
ore), only the memtable switch count was changed dramatically; > with lz4 it was less than 100 for a week, but with zstd it was more than > 1000. I don't understand how its related. > > P.S: Thank you guys for bringing zstd to Cassandra, it had a huge impact > on my use-case by r

using zstd cause high memtable switch count

2021-02-28 Thread onmstester onmstester
Hi, I'm using 3.11.2, just added the patch for zstd and changed table compression from default (LZ4) to zstd with level 1 and chunk 64kb, everything is fine (disk usage decreased by 40% and CPU usage is almost the same as before), only the memtable switch count changed dramatically;

Re: Does repair count as read/writes

2020-10-25 Thread Ayub M
Thanks Erick. On Sun, Oct 25, 2020 at 6:45 AM Erick Ramirez wrote: > Not quite. Cassandra does a validation compaction for the merkle tree > calculation. And it streams SSTables instead of individual mutations from > one node to another to synchronise data between replicas. Cheers! > > -- Rega

Re: Does repair count as read/writes

2020-10-25 Thread Erick Ramirez
Not quite. Cassandra does a validation compaction for the merkle tree calculation. And it streams SSTables instead of individual mutations from one node to another to synchronise data between replicas. Cheers!

Does repair count as read/writes

2020-10-25 Thread Ayub M
Hello, when repairs are run in Cassandra, do the reads and writes done for repair count in the read/write metrics? Repair has to read the table to build the merkle tree; similarly, when it has to do repair it has to write to the table, so logically I feel it should. If so, is there any way to identify

Re: Some nodes has excessive GC Count compared to others

2020-09-02 Thread Oleksandr Shulgin
On Wed, Sep 2, 2020 at 7:54 PM Tobias Eriksson wrote: > Hi > > I am monitoring a 20+ node cluster, and 5 of them has an excessive GC > Count > > Typically the nodes has 40-60,000 GC runs, but a handful has 4,000,000 GC > runs > > And it is not temporary this is

Re: Some nodes has excessive GC Count compared to others

2020-09-02 Thread Shalom Sagges
I agree with Erick and believe it's most likely a hot partitions issue. I'd check "Compacted partition maximum bytes" in nodetool tablestats on those "affected" nodes and compare the result with the other nodes. I'd also check how the cpu_load is affected. From my experience, during excessive GC t

Re: Some nodes has excessive GC Count compared to others

2020-09-02 Thread Erick Ramirez
That would have been my first response too -- hot partitions. If you know the partition keys, you can quickly confirm it with nodetool getendpoints. Cheers! >

Re: Write count vs Local Write Count

2019-07-04 Thread Alain RODRIGUEZ
: $ for i in $(seq 1 3); do echo "Node $i"; ccm node$i nodetool tablestats tlp_labs | grep -i -e Table: -e Keyspace -e 'write count'; done Node 1 Keyspace : tlp_labs Write Count: 4 Table: products Local write count: 2 Table: services Local write count: 2 Node 2 Key

Write count vs Local Write Count

2019-06-27 Thread raja k
Hello, can anyone tell me the difference between Write Count and Local write count in the nodetool tablestats output? Below is what I see for one of my tables: Write Count: 248214002 Write Latency: 0.07470789510093795 ms. Local write count: 1183420 Local write latency: NaN ms Thanks,

varying results count query after alter keyspace system_distributed to NetworkTopologyStrategy

2019-04-17 Thread nheer...@hetnet.nl
n we did nodetool repair -full on each node. Also went fine. But now, when we do a count query on the system_distributed keyspace parent_repair_history or repair_history tables, we get varying results every time we do this query, querying immediately after each other. Sometimes count is a b

RE: SSTable count in Nodetool tablestats(LevelCompactionStrategy)

2018-04-20 Thread Vishal1.Sharma
the compaction is complete, the count becomes equal. Regards, Vishal Sharma From: kurt greaves [mailto:k...@instaclustr.com] Sent: Friday, April 20, 2018 12:27 PM To: User Subject: Re: SSTable count in Nodetool tablestats(LevelCompactionStrategy) I'm currently investigating this issue on o

Re: SSTable count in Nodetool tablestats(LevelCompactionStrategy)

2018-04-19 Thread kurt greaves
he tables in my keyspace is using LevelCompactionStrategy and when > I used the nodetool tablestats keyspace.table_name command, I found some > mismatch in the count of SSTables displayed at 2 different places. Please > refer the attached image. > > > > The command is giv

SSTable count in Nodetool tablestats(LevelCompactionStrategy)

2018-04-17 Thread Vishal1.Sharma
Dear Community, One of the tables in my keyspace is using LevelCompactionStrategy and when I used the nodetool tablestats keyspace.table_name command, I found some mismatch in the count of SSTables displayed at 2 different places. Please refer to the attached image. The command is giving SSTable

Re: Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-09-01 Thread Chris Lohfink
.com> wrote: > okay, let me try it out > > On Thu, Aug 31, 2017 at 8:30 PM, Christophe Schmitz < > christo...@instaclustr.com> wrote: > >> Hi Jai, >> >> The ReadLatency MBean expose a few metrics, including the count one, >> which is the to

Re: Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-08-31 Thread Jai Bheemsen Rao Dhanwada
okay, let me try it out On Thu, Aug 31, 2017 at 8:30 PM, Christophe Schmitz < christo...@instaclustr.com> wrote: > Hi Jai, > > The ReadLatency MBean expose a few metrics, including the count one, which > is the total read requests you are after. > See attached

Re: Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-08-31 Thread Christophe Schmitz
Hi Jai, The ReadLatency MBean exposes a few metrics, including the count one, which is the total read requests you are after. See attached screenshot Cheers, Christophe On 1 September 2017 at 09:21, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > I did look at the doc

Re: Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-08-31 Thread Jai Bheemsen Rao Dhanwada
Table keyspace= scope= > name= > With MetricName set to ReadLatency and WriteLatency > > Cheers, > > Christophe > > > > On 1 September 2017 at 09:08, Jai Bheemsen Rao Dhanwada < > jaibheem...@gmail.com> wrote: > >> Hello All, >> >> I a

Re: Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-08-31 Thread Christophe Schmitz
, Christophe On 1 September 2017 at 09:08, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > Hello All, > > I am looking to capture the CF level Read, Write count and Latency. As of > now I am using Telegraf plugin to capture the JMX metrics. > > What is the MBeans,

Cassandra CF Level Metrics (Read, Write Count and Latency)

2017-08-31 Thread Jai Bheemsen Rao Dhanwada
Hello All, I am looking to capture the CF-level Read and Write count and Latency. As of now I am using the Telegraf plugin to capture the JMX metrics. What are the MBeans, scope and metric to look at for the CF-level metrics?

Cassandra-count gives wrong results

2017-08-18 Thread Alain Rastoul
Hi, I use cassandra-count (github https://github.com/brianmhess/cassandra-count) to count records in a table, but I have wrong results. When I export data with cqlsh /copy to csv, I have 1M records in my test table, when I use cassandra-count I have different results for each node : build

Re: Incorrect quorum count in driver error logs

2017-06-26 Thread Rutvij Bhatt
not being able to reach a quorum of 4. This, to me, > was mysterious as none of my keyspaces have an RF > 3. That quorum count in > the error implied an RF of 6 or 7. > > I eventually forced that node out of the ring with "nodetool removenode > force". This seemed to most

Re: Incorrect quorum count in driver error logs

2017-06-26 Thread Hannu Kröger
y guess is that this is related > to https://issues.apache.org/jira/browse/CASSANDRA-6542. We are using 2.1.11. > > While this was happening, the driver in the application started logging error > messages about not being able to reach a quorum of 4. This, to me, was > mysterious a

Incorrect quorum count in driver error logs

2017-06-26 Thread Rutvij Bhatt
eing able to reach a quorum of 4. This, to me, was mysterious as none of my keyspaces have an RF > 3. That quorum count in the error implied an RF of 6 or 7. I eventually forced that node out of the ring with "nodetool removenode force". This seemed to mostly fix the issue, though there se

Re: Count limit

2017-06-21 Thread Vladimir Yudovin
Hi, >Some body told because the count return 1 row result He is right Best regards, Vladimir Yudovin, Winguzone - Cloud Cassandra Hosting On Wed, 21 Jun 2017 02:43:32 -0400 web master <socketman2...@gmail.com> wrote According to http://www.maigfrga.ntweb.co

RE: COUNT

2017-06-21 Thread ZAIDI, ASAD A
master [mailto:socketman2...@gmail.com] Sent: Wednesday, June 21, 2017 1:44 AM To: user@cassandra.apache.org Subject: COUNT I have this schema CREATE TABLE IF NOT EXISTS "inbox" ( "groupId" BIGINT, "createTime" TIMEUUID, "mailId"

COUNT

2017-06-20 Thread web master
,"mailId") )WITH CLUSTERING ORDER BY ("createTime" DESC); This table is frequently updated (250K per second), and between 10 and 1000 new records are inserted in each "groupId" per day. The problem is I want to count `Unread mails` based on a TIMEUUID co

Count limit

2017-06-20 Thread web master
According to http://www.maigfrga.ntweb.co/counting-indexing-and-ordering-cassandra SELECT COUNT(*) FROM product limit 5000; must return no more than 5000, but why doesn't it work, and why does it count the whole number? Somebody said it is because the count returns a 1-row result, and somebody said that it is

Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Jeff Jirsa
On 2017-04-03 12:42 (-0700), Voytek Jarnot wrote: > Continuing to grasp at straws... > > Is it possible that indexing is modifying the read path such that the > tablestats/tablehistograms output is no longer trustworthy? I notice more > realistic "local read count" n

Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Voytek Jarnot
Continuing to grasp at straws... Is it possible that indexing is modifying the read path such that the tablestats/tablehistograms output is no longer trustworthy? I notice more realistic "local read count" numbers on tables which do not utilize SASI. Would greatly appreciate an

Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Voytek Jarnot
, Voytek Jarnot wrote: > Cassandra 3.9 > > Have a keyspace with 5 tables, one of which is exhibiting rather poor read > performance. In starting an attempt to get to the bottom of the issues, I > noticed that, when running nodetool tablestats against the keyspace, that > particula

nodetool tablestats reporting local read count of 0, incorrectly

2017-03-31 Thread Voytek Jarnot
Cassandra 3.9 Have a keyspace with 5 tables, one of which is exhibiting rather poor read performance. In starting an attempt to get to the bottom of the issues, I noticed that, when running nodetool tablestats against the keyspace, that particular table reports "Local read count: 0" on

Re: Count(*) is not working

2017-02-20 Thread Sylvain Lebresne
I guess I misspoke, sorry. It is true that count() as any other query is still governed by the read timeout and any count that has to process a lot of data will take a long time and will require a high timeout set to not timeout (true of every aggregation query as it happens). I guess I responded

Re: Count(*) is not working

2017-02-20 Thread Benjamin Roth
+1 I also encountered timeouts many many times (using DS DevCenter). Roughly this occurred when count(*) > 1.000.000 2017-02-20 14:42 GMT+01:00 Edward Capriolo : > Seems worth it to file a bug since some here are under the impression it > almost always works and others are under the impr

Re: Count(*) is not working

2017-02-20 Thread Edward Capriolo
guess every time I've seen it it must have timed out due to tombstones. > > On 17 Feb. 2017 22:06, "Sylvain Lebresne" > wrote: > > On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves > wrote: > >> if you want a reliable count, you should use spark. performing a c

Re: Count(*) is not working

2017-02-17 Thread kurt greaves
really... well that's good to know. it still almost never works though. i guess every time I've seen it it must have timed out due to tombstones. On 17 Feb. 2017 22:06, "Sylvain Lebresne" wrote: On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves wrote: > if you want a rel

Re: Count(*) is not working

2017-02-17 Thread Sagar Jambhulkar
+1 for using spark for counts. On Feb 17, 2017 4:25 PM, "kurt greaves" wrote: > if you want a reliable count, you should use spark. performing a count (*) > will inevitably fail unless you make your server read timeouts and > tombstone fail thresholds ridiculous > > On

Re: Count(*) is not working

2017-02-17 Thread siddharth verma
This is a work around. For this warning, do review your data model once. Regards On Fri, Feb 17, 2017 at 4:36 PM, Sylvain Lebresne wrote: > On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves > wrote: > >> if you want a reliable count, you should use spark. performing a count >>

Re: Count(*) is not working

2017-02-17 Thread Sylvain Lebresne
On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves wrote: > if you want a reliable count, you should use spark. performing a count (*) > will inevitably fail unless you make your server read timeouts and > tombstone fail thresholds ridiculous > That's just not true. count(*) is pa

Re: Count(*) is not working

2017-02-17 Thread kurt greaves
if you want a reliable count, you should use spark. performing a count (*) will inevitably fail unless you make your server read timeouts and tombstone fail thresholds ridiculous On 17 Feb. 2017 04:34, "Jan" wrote: > Hi, > > could you post the output of nodetool cf

Re: Count(*) is not working

2017-02-16 Thread Jan
Hi, could you post the output of nodetool cfstats for the table? Cheers, Jan Am 16.02.2017 um 17:00 schrieb Selvam Raman: > I am not getting count as result. Where i keep on getting n number of > results below. > > Read 100 live rows and 1423 tombstone cells for query S

Re: Count(*) is not working

2017-02-16 Thread Selvam Raman
I am not getting the count as a result. Instead I keep getting n results like the one below. Read 100 live rows and 1423 tombstone cells for query SELECT * FROM keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-052) LIMIT 100 (see tombstone_warn_threshold) On Thu, Feb 16, 2017 at

Re: Count(*) is not working

2017-02-16 Thread Cogumelos Maravilha
With C* 3.10 cqlsh ip --request-timeout=60 Connected to x at 10.10.10.10:9042. [cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4] Use HELP for help. cqlsh> USE ; cqlsh:> SELECT count(*) from table; count - 3572579 On 02/16/2017 12:27 PM,

Re: Count(*) is not working

2017-02-16 Thread Jan Kesten
because per 100 rows that count c* had to read about 15 times rows that were deleted already. Apart from that, count(*) is almost always slow - and there is a default limit of 10.000 rows in a result. Do you really need the actual live count? To get an idea you can always look at nodetool

Re: Count(*) is not working

2017-02-16 Thread Selvam Raman
I am using cassandra 3.9. Primary Key: id text; On Thu, Feb 16, 2017 at 12:25 PM, Cogumelos Maravilha < cogumelosmaravi...@sapo.pt> wrote: > C* version please and partition key. > > On 02/16/2017 12:18 PM, Selvam Raman wrote: > > Hi, > > I want to know the total re

Re: Count(*) is not working

2017-02-16 Thread Cogumelos Maravilha
C* version please and partition key. On 02/16/2017 12:18 PM, Selvam Raman wrote: > Hi, > > I want to know the total records count in table. > > I fired the below query: > select count(*) from tablename; > > and i have got the below output > > Read 100 live

Count(*) is not working

2017-02-16 Thread Selvam Raman
Hi, I want to know the total records count in table. I fired the below query: select count(*) from tablename; and i have got the below output Read 100 live rows and 1423 tombstone cells for query SELECT * FROM keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Alexander Dejanovski
Shalom, you may have a high trace probability which could explain what you're observing : https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsSetTraceProbability.html On Thu, Nov 10, 2016 at 3:37 PM Chris Lohfink wrote: > count(*) actually pages through all the data. So

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Alexander Dejanovski
Could you check the write count on a per table basis in order to check which specific table is actually receiving writes ? Check the OneMinuteRate metric in org.apache.cassandra.metrics:type=ColumnFamily,keyspace=*keyspace1*,scope= *standard1*,name=WriteLatency (Make sure you replace keyspace and
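
The per-table MBean name quoted in this message follows a fixed pattern, which can be sketched as a small helper. This is an illustration only: `table_metric_mbean` is a hypothetical function name, and the `type=Table` alternative for newer releases is an assumption based on the "Table keyspace= scope= name=" pattern quoted in another thread on this page; check the actual names in your own JMX browser.

```python
def table_metric_mbean(keyspace: str, table: str, metric: str,
                       mbean_type: str = "ColumnFamily") -> str:
    """Build the JMX object name for a per-table metric.

    mbean_type defaults to the "ColumnFamily" form quoted in the message;
    "Table" is assumed (not confirmed here) for newer Cassandra versions.
    """
    return (f"org.apache.cassandra.metrics:type={mbean_type},"
            f"keyspace={keyspace},scope={table},name={metric}")


if __name__ == "__main__":
    # keyspace1 / standard1 are the placeholder names used in the message.
    print(table_metric_mbean("keyspace1", "standard1", "WriteLatency"))
```

Generating one name per table and reading its OneMinuteRate attribute would show which table is actually receiving the writes.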

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Chris Lohfink
count(*) actually pages through all the data. So a select count(*) without a limit would be expected to cause a lot of load on the system. The hit is more than just IO load and CPU, it also creates a lot of garbage that can cause pauses slowing down the entire JVM. Some details here: http

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Hi Alexander, I'm referring to Writes Count generated from JMX: [image: Inline image 1] The higher curve shows the total write count per second for all nodes in the cluster and the lower curve is the average write count per second per node. The drop in the end is the result of shutting dow

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Alexander Dejanovski
way for a select count(*) to increase your write count (if you are indeed talking about actual Cassandra writes, and not I/O operations). Cheers, On Thu, Nov 10, 2016 at 1:21 PM Shalom Sagges wrote: > Yes, I know it's obsolete, but unfortunately this takes time. > We're i

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
> Hi Shalom, > > so not sure, but probably excessive memory consumption by this SELECT > causes C* to flush tables to free memory. > > Best regards, Vladimir Yudovin, > >

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Vladimir Yudovin
, 10 Nov 2016 03:36:59 -0500Shalom Sagges <shal...@liveperson.com> wrote Hi There! I'm using C* 2.0.14. I experienced a scenario where a "select count(*)" that ran every minute on a table with practically no results limit (yes, this should definitely be avoided),

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
03:36:59 -0500*Shalom Sagges > >* wrote > > Hi There! > > I'm using C* 2.0.14. > I experienced a scenario where a "select count(*)" that ran every minute > on a table with practically no results limit (yes, this should definitely > be avoided), caused a hu

Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Vladimir Yudovin
hal...@liveperson.com> wrote Hi There! I'm using C* 2.0.14. I experienced a scenario where a "select count(*)" that ran every minute on a table with practically no results limit (yes, this should definitely be avoided), caused a huge increase in Cassandra writes t

Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Hi There! I'm using C* 2.0.14. I experienced a scenario where a "select count(*)" that ran every minute on a table with practically no results limit (yes, this should definitely be avoided), caused a huge increase in Cassandra writes to around 150 thousand writes per second for

Re: Difference in token range count

2016-10-03 Thread techpyaasa .
repair keyspace_name1 >>> columnfamily_1) on one of data center I saw following print >>> >>> "Starting repair command #3, repairing *2647 ranges* for keyspace >>> keyspace_name1" >>> >>> The count of ranges , it is supposed to

Re: Difference in token range count

2016-09-30 Thread laxmikanth sadula
. Each data center has 9 nodes. >> vnodes enabled in all nodes. >> >> When I ran -local repair(./nodetool -local repair keyspace_name1 >> columnfamily_1) on one of data center I saw following print >> >> "Starting repair command #3, repairing *2647 ran

Re: Difference in token range count

2016-09-30 Thread Eric Stevens
air keyspace_name1 > columnfamily_1) on one of data center I saw following print > > "Starting repair command #3, repairing *2647 ranges* for keyspace > keyspace_name1" > > The count of ranges , it is supposed to be *2304*(256*9) as we have 9 nodes > in one data center right but wh

Difference in token range count

2016-09-30 Thread techpyaasa .
space keyspace_name1" The count of ranges is supposed to be *2304* (256*9), as we have 9 nodes in one data center, right? But why is it showing 2647 ranges? Can someone please clarify why there is this difference in the token range count? Thanks techpyaasa
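
The expectation in this post can be written out as a short calculation. It only reproduces the poster's arithmetic (256 vnodes per node, 9 nodes in the data center) and the gap to the number nodetool printed; it does not explain the gap, and `expected_ranges` is a hypothetical helper name.

```python
def expected_ranges(num_tokens: int, nodes: int) -> int:
    """Ranges expected if each node contributes num_tokens vnodes."""
    return num_tokens * nodes


if __name__ == "__main__":
    expected = expected_ranges(256, 9)
    reported = 2647  # from the "repairing 2647 ranges" log line in the post
    print(expected)             # 2304
    print(reported - expected)  # 343
```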

Re: Approximate row count

2016-07-27 Thread Luke Jolly
t contain many of your rows. > > Chris Lohfink > > On Wed, Jul 27, 2016 at 1:44 PM, Luke Jolly wrote: > >> I have a table that I'm storing ad impression data in with every row >> being an impression. I want to get a count of total rows / impressions. I >>

Re: Approximate row count

2016-07-27 Thread Chris Lohfink
I'm storing ad impression data in with every row being > an impression. I want to get a count of total rows / impressions. I know > that there is in the ball park of 200-400 million rows in this table and > from my reading "Number of keys" in the output of cfstats should

Approximate row count

2016-07-27 Thread Luke Jolly
I have a table that I'm storing ad impression data in with every row being an impression. I want to get a count of total rows / impressions. I know that there is in the ball park of 200-400 million rows in this table and from my reading "Number of keys" in the output of cfst

Re: DTCS SSTable count issue

2016-07-11 Thread Alain RODRIGUEZ
ever and set it to something very close to 99%, the > estimated tombstone ratio isn’t that accurate) > > > > - Jeff > > > > > > *From: *Alain RODRIGUEZ > *Reply-To: *"user@cassandra.apache.org" > *Date: *Monday, July 11, 2016 at 1:05 PM

Re: DTCS SSTable count issue

2016-07-11 Thread Jason J. W. Williams
’t that accurate) > > > > - Jeff > > > > > > *From: *Alain RODRIGUEZ > *Reply-To: *"user@cassandra.apache.org" > *Date: *Monday, July 11, 2016 at 1:05 PM > *To: *"user@cassandra.apache.org" > *Subject: *Re: DTCS SSTable count issue >

Re: DTCS SSTable count issue

2016-07-11 Thread Riccardo Ferrari
.7 > https://github.com/apache/cassandra/blob/cassandra-3.0/CHANGES.txt#L28. > Anyway, you can use it in any recent version as compactions strategies are > pluggable. > > What concerns me is that I have an high tombstone read count despite those >> are insert only tables. Compacti

Re: DTCS SSTable count issue

2016-07-11 Thread Jeff Jirsa
o isn’t that accurate) - Jeff From: Alain RODRIGUEZ Reply-To: "user@cassandra.apache.org" Date: Monday, July 11, 2016 at 1:05 PM To: "user@cassandra.apache.org" Subject: Re: DTCS SSTable count issue @Jeff Rather than being an alternative, isn't your compac

Re: DTCS SSTable count issue

2016-07-11 Thread Alain RODRIGUEZ
https://github.com/apache/cassandra/blob/cassandra-3.0/CHANGES.txt#L28. Anyway, you can use it in any recent version as compactions strategies are pluggable. What concerns me is that I have an high tombstone read count despite those > are insert only tables. Compacting the table make the tombsto

Divergence from Cassandra partition_count and partition keys count

2016-07-07 Thread Alexandre Santana
Hello, I'm trying to use spark with cassandra and it was oddly generating several spark jobs because spark follows the guidelines generated by partitions_count and mean_partition_size. The problem is that I have a very small table (300MB) with only 16 distinct partition keys running on a single C* n

Re: DTCS SSTable count issue

2016-07-07 Thread Jeff Jirsa
almost certainly help address the growing sstable count.   From: Riccardo Ferrari Reply-To: "user@cassandra.apache.org" Date: Thursday, July 7, 2016 at 6:49 AM To: "user@cassandra.apache.org" Subject: DTCS SSTable count issue Hi everyone, This is my first quest

DTCS SSTable count issue

2016-07-07 Thread Riccardo Ferrari
DateTieredCompactionStrategy and are suffering from a constantly growing SSTable count. I have the feeling this has something to do with the upgrade; however, I need some hints on how to debug this issue. Tables are created like: CREATE TABLE ( ... PRIMARY KEY (...) ) WITH CLUSTERING ORDER BY (...) AND

Re: SSTable count at 10K during repair (won't decrease)

2016-05-20 Thread Fabrice Facorat
hub.com/spotify/cassandra-reaper > > Problem : When we run a repair job sometimes the SSTable count goes to 10K > on one of nodes (not always the same node). The Reaper is smart enough to > postpone the repair on this node since the number of pending compactions is > > 20 but numbe

Re: SSTable count at 10K during repair (won't decrease)

2016-05-20 Thread Alain RODRIGUEZ
the node at once through https://github.com/arodrime/cassandra-tools/tree/master/rolling-ssh) Even If I set the compactionthroughput 0 (disable throttling) the SSTable > count stays around 10K. > Using SSD, if nodes are still happy this way, feel free to keep this throttle disabled. I saw many peo

SSTable count at 10K during repair (won't decrease)

2016-05-06 Thread Jean-Francois Gosselin
- Cassandra 2.1.13 - SSDs - LeveledCompactionStrategy - Range repair (not incremental) with Spotify's Reaper https://github.com/spotify/cassandra-reaper Problem: When we run a repair job, sometimes the SSTable count goes to 10K on one of the nodes (not always the same node). The Reaper is

Re: Unable to reliably count keys on a thrift CF

2016-04-25 Thread Anuj Wadehra
Hi Carlos, Please check if the JIRA: https://issues.apache.org/jira/browse/CASSANDRA-11467 fixes your problem. We had been facing a row count issue with thrift cf / compact storage and this fixed it. The above is fixed in the latest 2.1.14. It's a two-line fix. So, you can also prepare a custom jar and

Re: Unable to reliably count keys on a thrift CF

2016-04-25 Thread Carlos Alonso
Hi Jens. Thanks for your response but my idea is to count different keys, so, if I understood correctly selecting WHERE key = #{key} won't give me any new key, right? Thanks! Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso> On 25 April 2016 at 09:22,

Re: Unable to reliably count keys on a thrift CF

2016-04-25 Thread Jens Rantil
> > I've been struggling for the last days to find a reliable and stable way > to count keys in a thrift column family. > > My idea is to basically iterate the whole ring using the token function, > as documented here: > https://docs.datastax.com/en/cql/3.1/cql/cql_using/paging

Unable to reliably count keys on a thrift CF

2016-04-21 Thread Carlos Alonso
Hi guys. I've been struggling for the last days to find a reliable and stable way to count keys in a thrift column family. My idea is to basically iterate the whole ring using the token function, as documented here: https://docs.datastax.com/en/cql/3.1/cql/cql_using/paging_c.html in batch
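
The idea of walking the whole ring with the token function can be sketched as follows. This is a rough illustration under assumptions, not the approach from the linked documentation: it assumes Murmur3Partitioner, whose tokens are signed 64-bit integers (a thrift-era cluster may use a different partitioner with a different token range), and `split_ring`, `count_queries` and the keyspace/table/column names are hypothetical.

```python
from typing import List, Tuple

# Assumption: Murmur3Partitioner token bounds (signed 64-bit).
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_ring(n: int) -> List[Tuple[int, int]]:
    """Split the full token range into n contiguous inclusive sub-ranges."""
    if n < 1:
        raise ValueError("n must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1
    step, extra = divmod(total, n)
    ranges = []
    lo = MIN_TOKEN
    for i in range(n):
        size = step + (1 if i < extra else 0)  # spread the remainder evenly
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def count_queries(keyspace: str, table: str, key_column: str,
                  n: int) -> List[str]:
    """Build one bounded count query per sub-range (names are placeholders)."""
    return [
        f"SELECT count(*) FROM {keyspace}.{table} "
        f"WHERE token({key_column}) >= {lo} AND token({key_column}) <= {hi}"
        for lo, hi in split_ring(n)
    ]


if __name__ == "__main__":
    for query in count_queries("my_keyspace", "my_cf", "key", 4):
        print(query)
```

Summing the results of the per-range queries gives the total; smaller ranges keep each query well under the read timeout at the cost of more round trips.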

Re: Proper use of COUNT

2016-04-19 Thread DuyHai Doan
t; > On Tue, Apr 19, 2016 at 4:56 PM, Jack Krupansky > wrote: > >> Sylvain & Tyler, this Jira is for a user reporting a timeout for SELECT >> COUNT(*) using 3.3: >> https://issues.apache.org/jira/browse/CASSANDRA-11566 >> >> I'll let one of you guys fo

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
for cqlsh? -- Jack Krupansky On Tue, Apr 19, 2016 at 4:56 PM, Jack Krupansky wrote: > Sylvain & Tyler, this Jira is for a user reporting a timeout for SELECT > COUNT(*) using 3.3: > https://issues.apache.org/jira/browse/CASSANDRA-11566 > > I'll let one of you guys foll

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
Sylvain & Tyler, this Jira is for a user reporting a timeout for SELECT COUNT(*) using 3.3: https://issues.apache.org/jira/browse/CASSANDRA-11566 I'll let one of you guys follow up on that. I mean, I thought it was timing out due to the amount of data, but you guys are saying that pagi

Re: Proper use of COUNT

2016-04-19 Thread Tyler Hobbs
erstand all of this so far, this means that for 3.x COUNT (and > other aggregate functions) are "safe but may be slow" (paraphrasing > Sylvain.) Is this for 3.0 and later or some other 3.x (or even some 2.x)? > I think count(*) started using paging internally in 2.1, but I'

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
t node responds with a row from token t, then by definition there will be no further rows returned from that node with a token less than t? And if I understand all of this so far, this means that for 3.x COUNT (and other aggregate functions) are "safe but may be slow" (paraphrasing Sylv

Re: Proper use of COUNT

2016-04-19 Thread Tyler Hobbs
On Tue, Apr 19, 2016 at 9:51 AM, Jack Krupansky wrote: > > 1. Another clarification: All of the aggregate functions, AVG, SUM, MIN, > MAX are in exactly the same boat as COUNT, right? > Yes. > > 2. Is the paging for COUNT, et al, done within the coordinator node? > Yes.

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
Thanks for that clarification, Sylvain. 1. Another clarification: All of the aggregate functions, AVG, SUM, MIN, MAX are in exactly the same boat as COUNT, right? 2. Is the paging for COUNT, et al, done within the coordinator node? 3. Does dedupe on the coordinator node consume memory

Re: Proper use of COUNT

2016-04-19 Thread Sylvain Lebresne
> > > Except for relatively small or narrow queries, it seems to have a > propensity for timing out. > For recent enough versions of C*, it shouldn't, since it pages internally (it will be slow, as it always will be, but it shouldn't time out if some decent page size is used, which should be the default)

Re: Unable to perform COUNT(*) on CASSANDRA table

2016-04-18 Thread Steve Robenalt
The Kafka-Storm is running and the logs says, > it processed 500,000 records. > > > > But the table count(*) fails – timeout -- Do you know why ?? > > > > > > cqlsh:sandbox> select count(*) from events; > > OperationTimedOut: errors={}, last_host=10.226.6

Unable to perform COUNT(*) on CASSANDRA table

2016-04-18 Thread Lokesh Ceeba - Vendor
Hello Team The Kafka-Storm is running and the logs say it processed 500,000 records. But the table count(*) fails – timeout -- Do you know why ?? cqlsh:sandbox> select count(*) from events; OperationTimedOut: errors={}, last_host=10.226.68.248 DDL –sett

Proper use of COUNT

2016-04-18 Thread Jack Krupansky
Based on a recent inquiry and a recent thread of my own, and the coming support for wide rows, I'll focus in on this question that I feel needs better documentation of recommended best practice: When can the COUNT(*) aggregate row-counting function be used? Except for relatively small or n

Re: Can't select count(*)

2016-02-01 Thread Stefania Alborghetti
Regarding select count(*), the timeout is probably client side. Try changing the default connect timeout in cqlsh via --request-timeout. By default it is 10 seconds. Refer to "cqlsh --help" for more details but basically "cqlsh --request-timeout=30" should work. Regarding

Can't select count(*)

2016-01-31 Thread Ivan Zelensky
Hi all! I have a table with a simple primary key (only one field in the primary key), and ~1 million records. The table is stored on single-node C* 2.2.4. Problem: when I'm trying to execute "SELECT count(*) FROM my_table;", the operation times out. As I understand, 1 mln rows is not so big

Re: Estimated key count from nodetool tablestats

2016-01-24 Thread Chris Lohfink
It will give you an estimate of the number of partition keys. In newer versions it will merge a sketch of the keys and using HyperLogLog++ (p=13, sp=25) it will come up with an es

Estimated key count from nodetool tablestats

2016-01-24 Thread Jack Krupansky
Does the nodetool tablestats output line for "Number of keys (estimate)" indicate partition keys or CQL row primary keys (PK)? We currently don't have doc on this and I couldn't get a solid answer from a quick examination of the code. Since it is an estimate, roughly what is the nature of the est

Re: Why I can not do a "count(*) ... allow filtering " without facing operation timeout?

2015-09-04 Thread Sebastian Estevez
e the timeout > in cqlsh. > > /Tommy > > > On 2015-09-04 10:31, shahab wrote: > > Hi, > > This is probably a silly problem , but it is really serious for me. I have > a cluster of 3 nodes, with replication factor 2. But still I can not do a > simple "select
