Re: question when using SASI indexing

DuyHai Doan Fri, 05 Aug 2016 01:38:14 -0700

OK, the fact that you see some rows and then, after a while, 0 rows means
that those rows have been deleted.


Since SASI only indexes INSERT and UPDATE but not DELETE, the management of
tombstones is left to Cassandra.

It means that if you do an INSERT, you'll have an entry in the SASI index
file, but when you do a DELETE, SASI does not remove the entry from its
index file.

When reading, SASI gives the partition offset to Cassandra, and Cassandra
fetches the data from the SSTables, realises that there is a tombstone, and
thus returns 0 rows.

The only moment those entries are removed from the SASI index file is when
your SSTable gets compacted and the data is purged.
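
To make the sequence above concrete, here is a minimal toy model in Python.
It is only a sketch: ToyNode and its methods are hypothetical names, not
Cassandra or SASI classes, the index is a plain prefix match rather than an
analyzed one, and compaction here purges tombstones immediately (in real
Cassandra they are only purged once gc_grace_seconds has elapsed).

```python
# Toy model only: ToyNode is a hypothetical name, not a Cassandra class.

TOMBSTONE = object()  # marker standing in for a deletion tombstone


class ToyNode:
    def __init__(self):
        self.data = {}   # partition key -> indexed value, or TOMBSTONE
        self.index = {}  # indexed term -> set of partition keys

    def insert(self, key, mime):
        # INSERT/UPDATE writes the row and adds an index entry.
        self.data[key] = mime
        self.index.setdefault(mime, set()).add(key)

    def delete(self, key):
        # DELETE only writes a tombstone; the index entry is left in place.
        self.data[key] = TOMBSTONE

    def index_entries(self, prefix):
        # What the index alone would return, stale entries included.
        return sorted(
            k
            for term, keys in self.index.items()
            if term.startswith(prefix)
            for k in keys
        )

    def query_prefix(self, prefix):
        # The read path fetches each candidate row and drops tombstoned ones.
        return [
            k
            for k in self.index_entries(prefix)
            if k in self.data and self.data[k] is not TOMBSTONE
        ]

    def compact(self):
        # Purge tombstoned rows, then rebuild the index from surviving data.
        self.data = {k: v for k, v in self.data.items() if v is not TOMBSTONE}
        self.index = {}
        for key, mime in self.data.items():
            self.index.setdefault(mime, set()).add(key)


node = ToyNode()
node.insert("a", "ELF 64-bit")
node.insert("b", "ELF 32-bit")
node.delete("a")
print(node.index_entries("ELF"))  # index still lists the deleted row
print(node.query_prefix("ELF"))   # read path filters it out
node.compact()
print(node.index_entries("ELF"))  # stale entry is gone after compaction
```

In this model the stale index entry costs an extra row fetch on every query
until compact() runs, but it never changes the query result.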

The fact that you can see some rows and then 0 rows means that some of your
replicas have missed the tombstones.

"However, after about 20 attempts, all servers started to only return 0
results." --> Read repair kicked in, so the tombstones were propagated and
you now see 0 rows.
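
As a rough illustration of that last point, the sketch below models three
replicas (RF = 3) where some of them missed the deletes. Everything in it is
an assumption made for illustration: the function names are hypothetical,
cells are reduced to (timestamp, value) pairs, and read repair is simplified
to "copy the newest cell to every replica". A read at CL = ONE sees only one
replica's view, which is why the row count depends on which replica answers.

```python
# Toy model only: live_rows and read_repair are hypothetical helpers.

TOMBSTONE = "<tombstone>"


def live_rows(replica):
    # Rows a single replica would return on its own (what a CL=ONE read sees).
    return sorted(k for k, (_, value) in replica.items() if value != TOMBSTONE)


def read_repair(replicas):
    # For every key, pick the cell with the newest timestamp and
    # write it back to all replicas, tombstones included.
    all_keys = set().union(*replicas)
    for key in all_keys:
        newest = max((r[key] for r in replicas if key in r), key=lambda c: c[0])
        for replica in replicas:
            replica[key] = newest


replicas = [
    {"a": (1, "ELF"), "b": (1, "ELF")},              # missed both deletes
    {"a": (2, TOMBSTONE), "b": (1, "ELF")},          # missed one delete
    {"a": (2, TOMBSTONE), "b": (2, TOMBSTONE)},      # received both deletes
]

print([live_rows(r) for r in replicas])  # [['a', 'b'], ['b'], []]
read_repair(replicas)
print([live_rows(r) for r in replicas])  # [[], [], []]
```

Before the repair, the three replicas answer with 2, 1 and 0 rows; afterwards
every replica holds the tombstones and answers with 0 rows.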



On Tue, Aug 2, 2016 at 10:52 PM, George Webster <webste...@gmail.com> wrote:

> The indexes were written about 1-2 months ago. No data has been added to
> the servers since the indexes were created. Additionally, the indexes
> appeared to be stable until I noticed the issue today, which occurred
> after I made a large query without setting a LIMIT.
>
> I set the consistency level and moved the select statement between
> different nodes. The results remained inconsistent, returning a random
> number between 0 and 8. It did not appear to make much difference between
> the different nodes or consistency level. However, after about 20 attempts,
> all servers started to only return 0 results.
>
>
> Lastly, this appeared in the logs during that time:
>
> INFO  [IndexSummaryManager:1] 2016-08-02 22:11:43,245
> IndexSummaryRedistribution.java:74 - Redistributing index summaries
>
> INFO  [OptionalTasks:1] 2016-08-02 22:25:06,508 NoSpamLogger.java:91 -
> Maximum memory usage reached (536870912 bytes), cannot allocate chunk of
> 1048576 bytes
>
> On Tue, Aug 2, 2016 at 6:58 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> One possible explanation is that you're querying data while the index
>> files are being built, so the results differ.
>> The second possible explanation is the consistency level.
>>
>> Try the query again using CL = QUORUM, and try it on several nodes to see
>> if the results are different.
>>
>> On Tue, Aug 2, 2016 at 6:32 PM, George Webster <webste...@gmail.com>
>> wrote:
>>
>>> Hey DuyHai,
>>> Thank you for your help.
>>>
>>> 1) Cassandra version
>>> [cqlsh 5.0.1 | Cassandra 3.5 | CQL spec 3.4.0 | Native protocol v4]
>>>
>>>
>>> 2) CREATE CUSTOM INDEX statement for your index
>>>
>>> CREATE CUSTOM INDEX objects_mime_idx ON test.objects (mime)
>>> USING 'org.apache.cassandra.index.sasi.SASIIndex'
>>> WITH OPTIONS = {
>>>     'analyzed' : 'true',
>>>     'analyzer_class' : 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>>>     'tokenization_enable_stemming' : 'false',
>>>     'tokenization_locale' : 'en',
>>>     'tokenization_normalize_lowercase' : 'true',
>>>     'tokenization_skip_stop_words' : 'true'
>>> };
>>>
>>>
>>> 3) Consistency level used for your SELECT
>>> I am using the default consistency
>>> cassandra@cqlsh> CONSISTENCY
>>> Current consistency level is ONE.
>>>
>>>
>>> 4) Replication factor
>>>
>>> CREATE KEYSPACE system_distributed WITH REPLICATION = {
>>>     'class' : 'org.apache.cassandra.locator.SimpleStrategy',
>>>     'replication_factor': '3' }
>>> AND DURABLE_WRITES = true;
>>>
>>>
>>> 5) Are you creating the index when the table is EMPTY or have you
>>> created the index when the table already contains some data ?
>>> I created the indexes after the tables contained data.
>>>
>>>
>>> On Tue, Aug 2, 2016 at 5:22 PM, DuyHai Doan <doanduy...@gmail.com>
>>> wrote:
>>>
>>>> Hello George
>>>>
>>>> Can you provide more details ?
>>>>
>>>> 1) Cassandra version
>>>> 2) CREATE CUSTOM INDEX statement for your index
>>>> 3) Consistency level used for your SELECT
>>>> 4) Replication factor
>>>> 5) Are you creating the index when the table is EMPTY or have you
>>>> created the index when the table already contains some data ?
>>>>
>>>> On Tue, Aug 2, 2016 at 4:05 PM, George Webster <webste...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hey guys and gals,
>>>>>
>>>>> I am having a strange issue with Cassandra SASI and I was hoping you
>>>>> could help solve the mystery. My issue is inconsistency between returned
>>>>> results and strange log errors.
>>>>>
>>>>> The biggest issue is that when I perform a query I am getting back
>>>>> inconsistent results. The first few times I received between 3 and 7
>>>>> results, and then I finally received 187 results. At no point in time did
>>>>> I change the query statement. However, after I received the 187 results,
>>>>> any follow-on queries returned zero results.
>>>>>
>>>>> my query:
>>>>> SELECT *
>>>>>     FROM test.objects
>>>>>     WHERE mime LIKE 'ELF%';
>>>>>
>>>>> When I look in the system.log file I see the following:
>>>>> WARN  [SharedPool-Worker-1] 2016-08-02 15:58:53,256
>>>>> SelectStatement.java:351 - Aggregation query used without partition key
>>>>> WARN  [SharedPool-Worker-1] 2016-08-02 15:59:02,978
>>>>> SelectStatement.java:351 - Aggregation query used without partition key
>>>>>
>>>>>
>>>>> When I look in the debug.log file I see the following when zero
>>>>> results are returned:
>>>>> WARN  [SharedPool-Worker-1] 2016-08-02 15:58:53,256
>>>>> SelectStatement.java:351 - Aggregation query used without partition key
>>>>> WARN  [SharedPool-Worker-1] 2016-08-02 15:59:02,978
>>>>> SelectStatement.java:351 - Aggregation query used without partition key
>>>>>
>>>>> Additionally, I see a lot of errors in the log that state:
>>>>> INFO  [OptionalTasks:1] 2016-08-02 15:40:04,310 NoSpamLogger.java:91 -
>>>>> Maximum memory usage reached (536870912 bytes), cannot allocate chunk of
>>>>> 1048576 bytes
>>>>> INFO  [OptionalTasks:1] 2016-08-02 15:55:04,387 NoSpamLogger.java:91 -
>>>>> Maximum memory usage reached (536870912 bytes), cannot allocate chunk of
>>>>> 1048576 bytes
>>>>>
>>>>>
>>>>> Any ideas?
>>>>>
>>>>>
>>>>
>>>
>>
>
