This is not the right data model for Cassandra. I strongly encourage you to
watch one of Patrick McFadin's data modeling videos on YouTube.

You very much want to always query with a WHERE clause, which usually
means knowing a partition key (or set of partition keys) likely to contain
your data, and using sorting within those keys to make data easy to access.
This may mean a partition key that's a date (so all writes for a given date
land on one replica set, and you only query within that set), or a
(date, bucket) tuple (where bucket is an int from 0-1000, for example, which
avoids hotspotting), so that you can read (date, 7) and (date, 996) and
everything else in concurrent async queries. Or it may mean something else
that gives you deterministic partitioning, so you're not walking past all of
those dead tombstones.
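
To make the (date, bucket) idea concrete, here is a rough sketch in Python.
Everything named in it is an assumption for illustration, not something from
your schema: the table doc.dedup_by_day, the choice of 1000 buckets, the CRC32
bucketing, and the `execute` callable that stands in for however you run a
statement.

```python
import datetime
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_BUCKETS = 1000  # assumption: bucket is an int in [0, NUM_BUCKETS)

# Hypothetical table, not taken from the thread:
#   CREATE TABLE doc.dedup_by_day (
#       day date, bucket int, uuid text,
#       PRIMARY KEY ((day, bucket), uuid));
SELECT_CQL = "SELECT uuid FROM doc.dedup_by_day WHERE day = %s AND bucket = %s"


def bucket_for(uuid: str) -> int:
    """Map an id to a stable bucket so writers and readers agree on the partition."""
    return zlib.crc32(uuid.encode("utf-8")) % NUM_BUCKETS


def partition_keys(day: datetime.date):
    """List every (day, bucket) partition that can hold rows for the given day."""
    return [(day, bucket) for bucket in range(NUM_BUCKETS)]


def read_day(execute, day, max_workers=32):
    """Query each partition of one day concurrently and merge the rows.

    `execute(cql, params)` is a stand-in for whatever runs a statement and
    returns an iterable of rows; it is not a real driver API.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps results in partition order: bucket 0 first, 999 last.
        results = pool.map(lambda key: execute(SELECT_CQL, key), partition_keys(day))
        return [row for rows in results for row in rows]
```

On the write side you would store each uuid under (day, bucket_for(uuid)), so
every read names exactly one partition and only sees that partition's
tombstones. With the Python driver you could probably drop the thread pool and
fan out with the session's own async execution instead; the thread pool is
only here to keep the sketch free of driver dependencies.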

Reading with a naive SELECT with no WHERE is going to be perhaps the least
performant way to do this in Cassandra, and you are probably unhappy with
both the response time and the failure rate, because this is not how
Cassandra is designed to work.



On Mon, Oct 25, 2021 at 3:56 PM Joe Obernberger <
joseph.obernber...@gmail.com> wrote:

> Hi Jeff - yes, I'm doing a select without where - specifically:  select
> uuid from table limit 1000;
> Not inserting nulls, and nothing is TTL'd.
> At this point with zero rows, the above select fails.
>
> Sounds like my application needs a redesign as doing 1 billion inserts,
> and 100 million deletes results in an unusable table.  I'm using Cassandra
> to de-duplicate data and that's not a good use case for it.
>
> -Joe
> On 10/25/2021 6:51 PM, Jeff Jirsa wrote:
>
> The tombstone threshold is "how many tombstones are encountered within a
> single read command", and the default is something like 100,000 (
> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1293-L1294
> )
>
> Deletes are not forbidden, but you have to read in such a way that you
> touch less than 100,000 deletes per read.
>
> Are you doing full table scans or SELECT without WHERE?
> Are you inserting nulls in some columns?
> Are you TTL'ing everything ?
>
>
>
> On Mon, Oct 25, 2021 at 3:28 PM Joe Obernberger <
> joseph.obernber...@gmail.com> wrote:
>
>> Update - after 10 days, I'm able to use the table again; prior to that
>> all selects timed out.
>> Are deletes basically forbidden with Cassandra?  If you have a table
>> where you want to do lots of inserts and deletes, is there an option that
>> works in Cassandra?  Even though the table now has zero rows, after
>> deleting them, I can no longer do a select from the table as it times out.
>> Thank you!
>>
>> -Joe
>> On 10/14/2021 3:38 PM, Joe Obernberger wrote:
>>
>> I'm not sure if tombstones are the issue; are they?  Grace is set to 10 days,
>> that time has not passed yet.
>>
>> -Joe
>> On 10/14/2021 1:37 PM, James Brown wrote:
>>
>> What is gc_grace_seconds set to on the table? Once that passes, you can
>> do `nodetool scrub` to more emphatically remove tombstones...
>>
>> On Thu, Oct 14, 2021 at 8:49 AM Joe Obernberger <
>> joseph.obernber...@gmail.com> wrote:
>>
>>> Hi all - I have a table where I've needed to delete a number of rows.
>>> I've run repair, but I still can't select from the table.
>>>
>>> select * from doc.indexorganize limit 10;
>>> OperationTimedOut: errors={'172.16.100.37:9042': 'Client request
>>> timeout. See Session.execute[_async](timeout)'},
>>> last_host=172.16.100.37:9042
>>>
>>> Info on the table:
>>>
>>> nodetool tablestats doc.indexorganize
>>> Total number of tables: 97
>>> ----------------
>>> Keyspace : doc
>>>          Read Count: 170275408
>>>          Read Latency: 1.6486837044783356 ms
>>>          Write Count: 6821769404
>>>          Write Latency: 0.08147347268570909 ms
>>>          Pending Flushes: 0
>>>                  Table: indexorganize
>>>                  SSTable count: 21
>>>                  Old SSTable count: 0
>>>                  Space used (live): 1536557040
>>>                  Space used (total): 1536557040
>>>                  Space used by snapshots (total): 1728378992
>>>                  Off heap memory used (total): 46251932
>>>                  SSTable Compression Ratio: 0.5218383898575761
>>>                  Number of partitions (estimate): 17365415
>>>                  Memtable cell count: 0
>>>                  Memtable data size: 0
>>>                  Memtable off heap memory used: 0
>>>                  Memtable switch count: 12
>>>                  Local read count: 17346304
>>>                  Local read latency: NaN ms
>>>                  Local write count: 31340451
>>>                  Local write latency: NaN ms
>>>                  Pending flushes: 0
>>>                  Percent repaired: 100.0
>>>                  Bytes repaired: 1.084GiB
>>>                  Bytes unrepaired: 0.000KiB
>>>                  Bytes pending repair: 0.000KiB
>>>                  Bloom filter false positives: 0
>>>                  Bloom filter false ratio: 0.00000
>>>                  Bloom filter space used: 38030728
>>>                  Bloom filter off heap memory used: 38030560
>>>                  Index summary off heap memory used: 7653060
>>>                  Compression metadata off heap memory used: 568312
>>>                  Compacted partition minimum bytes: 51
>>>                  Compacted partition maximum bytes: 86
>>>                  Compacted partition mean bytes: 67
>>>                  Average live cells per slice (last five minutes): 73.53164556962025
>>>                  Maximum live cells per slice (last five minutes): 5722
>>>                  Average tombstones per slice (last five minutes): 1.0
>>>                  Maximum tombstones per slice (last five minutes): 1
>>>                  Dropped Mutations: 0
>>>
>>> nodetool tablehistograms doc.indexorganize
>>> doc/indexorganize histograms
>>> Percentile   Read Latency   Write Latency   SSTables   Partition Size   Cell Count
>>>                  (micros)        (micros)                     (bytes)
>>> 50%                  0.00            0.00       0.00               60            1
>>> 75%                  0.00            0.00       0.00               86            2
>>> 95%                  0.00            0.00       0.00               86            2
>>> 98%                  0.00            0.00       0.00               86            2
>>> 99%                  0.00            0.00       0.00               86            2
>>> Min                  0.00            0.00       0.00               51            0
>>> Max                  0.00            0.00       0.00               86            2
>>>
>>> Any ideas on what I can do?  Thank you!
>>>
>>> -Joe
>>>
>>>
>>
>> --
>> James Brown
>> Engineer
>>
>>
>>
>>
