Re: Partition size, limits, recommendations for tables where all columns are part of the primary key

2020-06-09 Thread Alex Ott
) > > Obviously the total amount of disk space for this table must be more than > 32 bytes. In this situation, how should I be reasoning about partition > sizes (in terms of the 2B cell limit, and 100MB-400MB partition size > limit)? Additionally, are there othe

Partition size, limits, recommendations for tables where all columns are part of the primary key

2020-06-09 Thread Benjamin Christenson
100MB-400MB partition size limit)? Additionally, are there other limits / potential performance issues I should be concerned about? Ben Christenson Developer Kinetic Data, Inc. Your business. Your process. 651-556-0937 | ben.christen...@kineticdata.com www.kineticdata.com

Re: how to check C* partition size

2018-01-10 Thread Alain RODRIGUEZ
Hello, You can also graph metrics using Datadog / Grafana or any other monitoring tool. Look at the max / mean partition size I would say, see: http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics. There is also a metric called 'EstimatedPartitionSizeHistogram' y

Re: how to check C* partition size

2018-01-08 Thread Ahmed Eljami
>Nodetool tablestats gives you a general idea. Since C* 3.X :)

RE: how to check C* partition size

2018-01-08 Thread Meg Mara
Nodetool tablestats gives you a general idea. Meg Mara From: Peng Xiao [mailto:2535...@qq.com] Sent: Sunday, January 07, 2018 9:26 AM To: user Subject: how to check C* partition size Hi guys, Could anyone please help on this simple question? How to check C* partition size and related

Re: how to check C* partition size

2018-01-07 Thread Jeff Jirsa
nodetool cfstats nodetool cfhistograms -- Jeff Jirsa > On Jan 7, 2018, at 7:26 AM, Peng Xiao <2535...@qq.com> wrote: > > Hi guys, > > Could anyone please help on this simple question? > How to check C* partition size and related information. > looks nodetoo

how to check C* partition size

2018-01-07 Thread Peng Xiao
Hi guys, Could anyone please help on this simple question? How to check C* partition size and related information. looks nodetool ring only shows the token distribution. Thanks

Re: effect of partition size

2017-12-11 Thread Jeff Jirsa
Yes, that's LIKELY "better". On Mon, Dec 11, 2017 at 8:10 AM, Micha wrote: > ok, thanks for the answer. > > So the better approach here is to adjust the table schema to get the > partition size to around 100MB max. > This means using a partition key with

Re: effect of partition size

2017-12-11 Thread Micha
ok, thanks for the answer. So the better approach here is to adjust the table schema to get the partition size to around 100MB max. This means using a partition key with multiple parts and making more selects instead of one when querying the data (which may increase parallelism). Michael

Re: effect of partition size

2017-12-11 Thread Jeff Jirsa
There's a few, and there have been various proposals (some in progress) to deal with them. The two most obvious problems are: The primary problem for most people is that wide partitions cause JVM heap pressure on reads (CASSANDRA-11206, CASSANDRA-9754). This is because we break the wide partitions

effect of partition size

2017-12-11 Thread Micha
Hi, What are the effects of large partitions? I have a few tables which have partitions sizes as: 95% 24000 98% 42000 99% 85000 Max 82000 So, should I redesign the schema to get this max smaller or doesn't it matter much, since 99% of the partitions are <= 85000 ? Thanks for answerin

Re: How to obtain partition size

2017-03-13 Thread Oskar Kjellin
How about this tool? https://github.com/instaclustr/cassandra-sstable-tools > On 13 Mar 2017, at 17:56, Artur R wrote: > > Hello! > > I can't find where C* stores information about partitions size (if stores it > at all). > So, the questions; > > 1. How to obtain the size (in rows or in byt

How to obtain partition size

2017-03-13 Thread Artur R
Hello! I can't find where C* stores information about partitions size (if stores it at all). So, the questions; 1. How to obtain the size (in rows or in bytes - doesn't matter) of some particular partition? I know that there is *system.size_estimates* table with *mean_partition_size*, but it's o

Re: Metric to monitor partition size

2017-01-13 Thread Bryan Cheng
We're on 2.X so this information may not apply to your version, but you should see: 1) A log statement upon compaction, like "Writing large partition", including the primary partition key (see https://issues.apache.org/jira/browse/CASSANDRA-9643). Configurable threshold in cassandra.yaml 2) Probl

Metric to monitor partition size

2017-01-12 Thread Saumitra S
Is there any metric or way to find out if any partition has grown beyond a certain size or certain row count? If a partition reaches a certain size or limit, I want to stop sending further write requests to it. Is it possible?

Partition size estimation formula in 3.0

2016-09-19 Thread Jérôme Mainaud
Hello, Until 3.0, we had a nice formula to estimate partition size : sizeof(partition keys) + sizeof(static columns) + countof(rows) * sizeof(regular columns) + countof(rows) * countof(regular columns) * sizeof(clustering columns) + 8 * count(values in partition) With the
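
The pre-3.0 formula quoted in this message can be sketched in Python as follows. This is only an illustration of the arithmetic as written in the snippet, not an official tool: the function and parameter names are made up here, all sizes are in bytes, and the value count is assumed to be supplied by the caller rather than derived from the schema.

```python
def estimate_partition_size(
    partition_key_bytes: int,
    static_bytes: int,
    row_count: int,
    regular_bytes_per_row: int,
    regular_column_count: int,
    clustering_bytes_per_row: int,
    value_count: int,
) -> int:
    """Apply the pre-3.0 estimate term by term, as quoted in the message."""
    return (
        partition_key_bytes                                   # sizeof(partition keys)
        + static_bytes                                        # sizeof(static columns)
        + row_count * regular_bytes_per_row                   # countof(rows) * sizeof(regular columns)
        + row_count * regular_column_count * clustering_bytes_per_row
        + 8 * value_count                                     # 8 * count(values in partition)
    )


# Hypothetical table: 16-byte partition key, no static columns, 1000 rows,
# 2 regular columns totalling 20 bytes per row, 8 bytes of clustering columns,
# and 2000 stored values.
estimate = estimate_partition_size(16, 0, 1000, 20, 2, 8, 2000)
print(estimate)
```

The clustering term is multiplied by the number of regular columns because the formula, as quoted, counts the clustering values once per cell; the message notes that this applied only until 3.0.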

Re: Partition size

2016-09-12 Thread Jeremy Hanna
om Cassandra side. > > On 12 Sep 2016 9:50 p.m., "Jeff Jirsa" <mailto:jji...@apache.org>> wrote: > On 2016-09-08 18:53 (-0700), Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com>> wrote: > > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-12 Thread Jeff Jirsa
think there should be a way to put > restriction for it from Cassandra side. Perhaps not surprisingly, folks active in the other ticket (for determining partition size) also have a ticket to blacklist large partitions: https://issues.apache.org/jira/browse/CASSANDRA-12106 Again, not comple

Re: Partition size

2016-09-12 Thread Anshu Vajpayee
12 Sep 2016 9:50 p.m., "Jeff Jirsa" wrote: > On 2016-09-08 18:53 (-0700), Anshu Vajpayee > wrote: > > Is there any way to get partition size for a partition key ? > > > > Anshu, > > The simple answer to your question is that it is not currently possible to

Re: Partition size

2016-09-12 Thread San Luoji
tastax.com/en/cassandra/2.1/cassandra/tools/toolsCFstats.html> > Folks, > It is *Apache* Cassandra. If you are going to point to docs, please point to the official Apache docs unless there is a very good reason not to. > In this case: http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb looks to the place. > Mark > Thanks > Mark > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com> wrote: > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-12 Thread Jeff Jirsa
On 2016-09-08 18:53 (-0700), Anshu Vajpayee wrote: > Is there any way to get partition size for a partition key ? > Anshu, The simple answer to your question is that it is not currently possible to get a partition size for an arbitrary key without quite a lot of work (basically you

Re: Partition size

2016-09-12 Thread Edward Capriolo
Mark Curtis wrote: > If your partition sizes are over 100MB iirc then you'll normally see warnings in your system.log, this will outline the partition key, at least in Cassandra 2.0 and 2.1 as I recall. > Your best friend here is nodetool cfstats which shows you the min/mean/max partition sizes for your table. It's quite often used to pinpoint large partitions on nodes in a cluster. > More info here: https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCFstats.html > Folks, > It is *Apache* Cassandra. If you are going to point to docs, please point to the official Apache docs unless there is a very good reason not to. > In this case: http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb looks to the place. > Mark > Thanks > Mark > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com> wrote: > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-12 Thread Benedict Elliott Smith
.1/cassandra/tools/toolsCFstats.html > Folks, > It is *Apache* Cassandra. If you are going to point to docs, please point to the official Apache docs unless there is a very good reason not to. > In this case: http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb looks to the place. > Mark > Thanks > Mark > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com> wrote: > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-12 Thread Mark Thomas
on not to. > In this case: http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb looks to the place. > Mark > Thanks > Mark > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com> wrote: > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-12 Thread Benedict Elliott Smith
<https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/ > toolsCFstats.html> > > > > Folks, > > > > It is *Apache* Cassandra. If you are going to point to docs, please > > point to the official Apache docs unless there is a very good reason >

Re: Partition size

2016-09-12 Thread Mark Thomas
ction_large_partition_warning_threshold_mb> > > looks to the place. > > Mark > > > > > > Thanks > > > > Mark > > > > > > On 9 September 2016 at 02:53, Anshu Vajpayee > <mailto:anshu.vajpa...@gmail.com>> wrote: > > > > Is there any way to get partition size for a partition key ? > > > > >

Re: Partition size

2016-09-09 Thread Jonathan Haddad
int to the official Apache docs unless there is a very good reason not >> to. >> >> In this case: >> >> >> http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb >> >> looks to the place. >> >> Mark >> >> >> > >> > Thanks >> > >> > Mark >> > >> > >> > On 9 September 2016 at 02:53, Anshu Vajpayee > > <mailto:anshu.vajpa...@gmail.com>> wrote: >> > >> > Is there any way to get partition size for a partition key ? >> > >> > >> >>

Re: Partition size

2016-09-09 Thread Benedict Elliott Smith
/doc/latest/configuration/cassandra_config_file.html#compaction_large_partition_warning_threshold_mb looks to the place. > Mark > Thanks > Mark > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com> wrote: > Is there any way to get partition size for a partition key ?

Re: Partition size

2016-09-09 Thread Jeff Jirsa
On 9/9/16, 12:14 PM, "Mark Thomas" wrote: > If you are going to point to docs, please >point to the official Apache docs unless there is a very good reason not to. > (And if the good reason is that there’s a deficiency in the apache Cassandra docs, please make it known on the list or in a jir

Re: Partition size

2016-09-09 Thread Jeff Jirsa
ed in C* 3.x. What is considered a good partition size in C* 3.x In modern versions (2.1 and newer), the “real” risk of large partitions is that they generate a lot of garbage on read – it’s not a 1:1 equivalence, but it’s linear, and a partition that’s 10x as large generates 10x as much garbage. You

Re: Partition size

2016-09-09 Thread Mark Thomas
> Thanks > > Mark > > > On 9 September 2016 at 02:53, Anshu Vajpayee <mailto:anshu.vajpa...@gmail.com>> wrote: > > Is there any way to get partition size for a partition key ? > >

Re: Partition size

2016-09-09 Thread Mark Curtis
in Cassandra 2.0 and 2.1 as I recall. > > Has it improved in C* 3.x. What is considered a good partition size in C* > 3.x > The 100MB is just a default setting; you can set this up or down as you need it: https://docs.datastax.com/en/cassandra/3.0/cassa

Re: Partition size

2016-09-09 Thread Rakesh Kumar
sidered a good partition size in C* 3.x

Re: Partition size

2016-09-09 Thread Mark Curtis
It's quite often used to pinpoint large partitions on nodes in a cluster. More info here: https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCFstats.html Thanks Mark On 9 September 2016 at 02:53, Anshu Vajpayee wrote: > Is there any way to get partition size for a partition key ? >

Partition size

2016-09-08 Thread Anshu Vajpayee
Is there any way to get partition size for a partition key ?

Estimating partition size for C*2.X and C*3.X and Time Series Data Modelling.

2016-06-20 Thread G P
f the same tables in MSSQL to C* is not recommended due to the way C*2.X stores its data. I took the DS220: Data Modelling Course, that showcases two formulas for estimating a partition size based on the Table design. Not

Re: on-disk size vs partition-size in cfhistograms

2016-05-20 Thread Alain RODRIGUEZ
tp://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html . > What would be the typical use/interpretation of the "partition size" > metric. I guess people use that to spot wide rows mainly, but if you are happy summing those, it should be good as

on-disk size vs partition-size in cfhistograms

2016-05-06 Thread Joseph Tech
directory size/count of rows). Can this be considered a valid approach to extrapolate for future growth of data ? Related to this, is there any information we can gather from partition-size of cfhistograms (snipped output for my table below) : Partition Size (bytes) 642 bytes: 221 770 b
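
One way to use the partition-size histogram for a rough total is to multiply each bucket size by its partition count and sum the products. The Python sketch below assumes the histogram lines look like the "642 bytes: 221" fragment in this snippet; the helper name is hypothetical, and the second bucket's count in the sample is invented because the original output is cut off.

```python
import re

# Matches histogram lines of the assumed form "<size> bytes: <count>".
BUCKET_LINE = re.compile(r"(\d+)\s+bytes:\s+(\d+)")


def estimated_total_bytes(histogram_text: str) -> int:
    """Sum bucket_size * partition_count over every bucket found in the text."""
    return sum(
        int(size) * int(count)
        for size, count in BUCKET_LINE.findall(histogram_text)
    )


# Sample input: the first bucket is from the post, the second count is made up.
sample = """Partition Size (bytes)
642 bytes: 221
770 bytes: 100
"""
print(estimated_total_bytes(sample))
```

Because each bucket is a range rather than an exact size, the result is only an approximation of the data held in the table on that node.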

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-06 Thread Jim Ancona
On Tue, Jan 5, 2016 at 5:52 PM, Jonathan Haddad wrote: > You could keep a "num_buckets" value associated with the client's account, > which can be adjusted accordingly as usage increases. > Yes, but the adjustment problem is tricky when there are multiple concurrent writers. What happens when yo

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jonathan Haddad
You could keep a "num_buckets" value associated with the client's account, which can be adjusted accordingly as usage increases. On Tue, Jan 5, 2016 at 2:17 PM Jim Ancona wrote: > On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin < > clintlmar...@coolfiretechnologies.com> wrote: > >> What sort of dat
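
A minimal Python sketch of the per-account "num_buckets" idea, assuming the clustering key is a UUID object identifier as described elsewhere in this thread. The function names, the use of SHA-256, and the tuple layout of the partition key are illustrative assumptions, not something specified in the messages.

```python
import hashlib
import uuid


def bucket_for(object_id: uuid.UUID, num_buckets: int) -> int:
    """Map an object id to a stable bucket number in [0, num_buckets)."""
    digest = hashlib.sha256(object_id.bytes).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def partition_key(customer_id: str, object_id: uuid.UUID, num_buckets: int) -> tuple:
    """Composite partition key: (customer, bucket). Hypothetical layout."""
    return (customer_id, bucket_for(object_id, num_buckets))


def all_partition_keys(customer_id: str, num_buckets: int) -> list:
    """Every partition a multi-row query for this customer has to read."""
    return [(customer_id, b) for b in range(num_buckets)]


oid = uuid.UUID("12345678-1234-5678-1234-567812345678")
print(partition_key("small-customer", oid, 1))   # a single partition
print(len(all_partition_keys("big-customer", 50)))
```

A small customer with num_buckets set to 1 is read with one query, while a large one fans out over all of its buckets. Changing num_buckets for an existing customer changes the bucket that each object id maps to, which is the adjustment problem raised in the replies.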

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jim Ancona
On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin < clintlmar...@coolfiretechnologies.com> wrote: > What sort of data is your clustering key composed of? That might help some > in determining a way to achieve what you're looking for. > Just a UUID that acts as an object identifier. > > Clint > On Jan

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Clint Martin
What sort of data is your clustering key composed of? That might help some in determining a way to achieve what you're looking for. Clint On Jan 5, 2016 2:28 PM, "Jim Ancona" wrote: > Hi Nate, > > Yes, I've been thinking about treating customers as either small or big, > where "small" ones have

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jim Ancona
Hi Nate, Yes, I've been thinking about treating customers as either small or big, where "small" ones have a single partition and big ones have 50 (or whatever number I need to keep sizes reasonable). There's still the problem of how to handle a small customer who becomes too big, but that will hap

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jim Ancona
tch to a new bucket. Do > a couple of extra queries when a key is not in that cache to determine what > the partition size and count to initialize the cache entry for a key. If > necessary, keep a separate table that tracks the partition size or maybe > just the (rough) row count to u

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Nate McCall
> > > In this case, 99% of my data could fit in a single 50 MB partition. But if > I use the standard approach, I have to split my partitions into 50 pieces > to accommodate the largest data. That means that to query the 700 rows for > my median case, I have to read 50 partitions instead of one. >

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jack Krupansky
hen a key is not in that cache to determine what the partition size and count to initialize the cache entry for a key. If necessary, keep a separate table that tracks the partition size or maybe just the (rough) row count to use to determine when a new partition is needed. -- Jack Krupansky On Tue

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-05 Thread Jim Ancona
nds on your requirements. > > Clint > On Jan 4, 2016 10:13 AM, "Jim Ancona" wrote: > >> A problem that I have run into repeatedly when doing schema design is how >> to control partition size while still allowing for efficient multi-row >> queries.

Re: Data Modeling: Partition Size and Query Efficiency

2016-01-04 Thread Clint Martin
ntees. It all just depends on your requirements. Clint On Jan 4, 2016 10:13 AM, "Jim Ancona" wrote: > A problem that I have run into repeatedly when doing schema design is how > to control partition size while still allowing for efficient multi-row > queries. > > We w

Data Modeling: Partition Size and Query Efficiency

2016-01-04 Thread Jim Ancona
A problem that I have run into repeatedly when doing schema design is how to control partition size while still allowing for efficient multi-row queries. We want to limit partition size to some number between 10 and 100 megabytes to avoid operational issues. The standard way to do that is to