Joel, I'd rather thank you for naming 11513 earlier in the thread; I would have been lost in the code for much longer otherwise.
Repeating what Tianshi mentioned in 11513 - "*Cassandra community is awesome! Should buy you a beer, Joel.*" :)

On Wed, Jun 15, 2016 at 6:01 AM, Joel Knighton <joel.knigh...@datastax.com> wrote:

> Great work, Bhuvan - I sat down after work to look at this more carefully.
>
> For a short summary, you are correct.
>
> For a longer summary, I initially thought the reproduction you provided
> would not run into the issue from 3.4/3.5 because it didn't select any
> static columns, which meant that it wouldn't have statics in its
> ColumnFilter (basically, the filter we apply when deciding if we need to
> look for the requested data in more SSTables). This was an incorrect
> understanding - in order to preserve the CQL semantic (see CASSANDRA-6588
> for details), we are including all columns, including the static columns,
> in the fetched columns, which means they are part of the ColumnFilter. I
> believe there may be an opportunity for an optimization here, but that's a
> whole different discussion. I now agree that these are the same issue.
>
> You are correct in your analysis that 3.4/3.5 are the only affected
> versions. It has been patched in release 3.6 forward and was not introduced
> until 3.4.
>
> Thanks for sticking with me on this - I'm going to resolve CASSANDRA-12003
> as a duplicate of CASSANDRA-11513.
>
> On Tue, Jun 14, 2016 at 4:21 PM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>
>> Joel,
>>
>> Thanks for your reply. I have checked and found that the behavior is the
>> same in the case of CASSANDRA-11513
>> <https://issues.apache.org/jira/browse/CASSANDRA-11513>. I have verified
>> that this behavior (for both 11513 & 12003) occurs in 3.4 & 3.5. Neither
>> occurs in 3.0.4, 3.6 & 3.7.
>>
>> Please find below the results of selecting only pk and clustering key
>> from 11513. It has also been verified that both issues occur whether
>> selecting all or filtered rows, so the selection criteria is not the
>> issue; filtering by WHERE is:
>>
>> cqlsh:ks> select pk,a from test0 where pk=0 and a=2;
>>
>>  pk | a
>> ----+---
>>   0 | 1
>>   0 | 2
>>   0 | 3
>>
>> We can verify this claim by applying the 11513 patch to the 3.5 tag, then
>> building & testing for 12003. If it is fixed then we can guarantee the
>> claim. Let me know if any further input is required here.
>>
>> On Wed, Jun 15, 2016 at 2:23 AM, Joel Knighton <
>> joel.knigh...@datastax.com> wrote:
>>
>>> The important part of that query is that it's selecting a static column
>>> (with select *), not whether it is filtering on one. In CASSANDRA-12003 and
>>> this thread, it looks like you're only selecting the primary and clustering
>>> columns. I'd be cautious about concluding that CASSANDRA-12003 and
>>> CASSANDRA-11513 are the same issue and that CASSANDRA-12003 is fixed.
>>>
>>> If you have a reproduction path for CASSANDRA-12003, I'd recommend
>>> attaching it to a ticket, and someone can investigate internals to see if
>>> CASSANDRA-11513 (or something else entirely) fixed the issue.
>>>
>>> On Tue, Jun 14, 2016 at 2:13 PM, Bhuvan Rawal <bhu1ra...@gmail.com>
>>> wrote:
>>>
>>>> Joel,
>>>>
>>>> If we look at the schema carefully:
>>>>
>>>> CREATE TABLE test0 (
>>>>     pk int,
>>>>     a int,
>>>>     b text,
>>>>     s text static,
>>>>     PRIMARY KEY (pk, a)
>>>> );
>>>>
>>>> and filtering is performed on clustering column a, and it's not a static
>>>> column:
>>>>
>>>> select * from test0 where pk=0 and a=2;
>>>>
>>>> On Wed, Jun 15, 2016 at 12:39 AM, Joel Knighton <
>>>> joel.knigh...@datastax.com> wrote:
>>>>
>>>>> It doesn't seem to be an exact duplicate - CASSANDRA-11513 relies on
>>>>> you selecting a static column, which you weren't doing in the reported
>>>>> issue. That said, I haven't looked too closely.
>>>>>
>>>>> On Tue, Jun 14, 2016 at 2:07 PM, Bhuvan Rawal <bhu1ra...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I can reproduce CASSANDRA-11513
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-11513> locally on
>>>>>> 3.5, possible duplicate.
>>>>>>
>>>>>> On Wed, Jun 15, 2016 at 12:29 AM, Joel Knighton <
>>>>>> joel.knigh...@datastax.com> wrote:
>>>>>>
>>>>>>> There's some precedent for similar issues with static columns in 3.5
>>>>>>> with https://issues.apache.org/jira/browse/CASSANDRA-11513 - a
>>>>>>> deterministic (or somewhat deterministic) path for reproduction would
>>>>>>> help narrow the issue down further. I've played around locally with
>>>>>>> similar schemas (sans the stratio indices) and couldn't reproduce the
>>>>>>> issue.
>>>>>>>
>>>>>>> On Tue, Jun 14, 2016 at 1:41 PM, Bhuvan Rawal <bhu1ra...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Jira CASSANDRA-12003
>>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-12003> has been
>>>>>>>> created for the same.
>>>>>>>>
>>>>>>>> On Tue, Jun 14, 2016 at 11:54 PM, Atul Saroha <
>>>>>>>> atul.sar...@snapdeal.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Tyler,
>>>>>>>>>
>>>>>>>>> This issue is mainly visible for tables having static columns,
>>>>>>>>> still investigating.
>>>>>>>>> We will try to test after removing the lucene index, but I don't
>>>>>>>>> think this plug-in could lead to a change in the behaviour of
>>>>>>>>> cassandra writes to the table's memtable.
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------------------------------------------------------
>>>>>>>>> Atul Saroha
>>>>>>>>> *Lead Software Engineer*
>>>>>>>>> *M*: +91 8447784271 *T*: +91 124-415-6069 *EXT*: 12369
>>>>>>>>> Plot # 362, ASF Centre - Tower A, Udyog Vihar,
>>>>>>>>> Phase -4, Sector 18, Gurgaon, Haryana 122016, INDIA
>>>>>>>>>
>>>>>>>>> On Tue, Jun 14, 2016 at 9:54 PM, Tyler Hobbs <ty...@datastax.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Is 'id' your partition key? I'm not familiar with the stratio
>>>>>>>>>> indexes, but it looks like the primary key columns are both indexed.
>>>>>>>>>> Perhaps this is related?
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 14, 2016 at 1:25 AM, Atul Saroha <
>>>>>>>>>> atul.sar...@snapdeal.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> After further debugging, this issue is found in the in-memory
>>>>>>>>>>> memtable, as doing nodetool flush + compact resolves the issue.
>>>>>>>>>>> And there is no batch write used for the table which is showing
>>>>>>>>>>> the issue.
>>>>>>>>>>> Table properties:
>>>>>>>>>>>
>>>>>>>>>>> WITH CLUSTERING ORDER BY (f_name ASC)
>>>>>>>>>>>> AND bloom_filter_fp_chance = 0.01
>>>>>>>>>>>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>>>>>>>>>> AND comment = ''
>>>>>>>>>>>> AND compaction = {'class':
>>>>>>>>>>>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>>>>>>>>>>> 'max_threshold': '32', 'min_threshold': '4'}
>>>>>>>>>>>> AND compression = {'chunk_length_in_kb': '64', 'class':
>>>>>>>>>>>> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>>>>>>>>>> AND crc_check_chance = 1.0
>>>>>>>>>>>> AND dclocal_read_repair_chance = 0.1
>>>>>>>>>>>> AND default_time_to_live = 0
>>>>>>>>>>>> AND gc_grace_seconds = 864000
>>>>>>>>>>>> AND max_index_interval = 2048
>>>>>>>>>>>> AND memtable_flush_period_in_ms = 0
>>>>>>>>>>>> AND min_index_interval = 128
>>>>>>>>>>>> AND read_repair_chance = 0.0
>>>>>>>>>>>> AND speculative_retry = '99PERCENTILE';
>>>>>>>>>>>> CREATE CUSTOM INDEX nbf_index ON nbf () USING
>>>>>>>>>>>> 'com.stratio.cassandra.lucene.Index' WITH OPTIONS =
>>>>>>>>>>>> {'refresh_seconds': '1', 'schema': '{
>>>>>>>>>>>>     fields : {
>>>>>>>>>>>>         id : {type : "bigint"},
>>>>>>>>>>>>         f_d_name : {
>>>>>>>>>>>>             type : "string",
>>>>>>>>>>>>             indexed : true,
>>>>>>>>>>>>             sorted : false,
>>>>>>>>>>>>             validated : true,
>>>>>>>>>>>>             case_sensitive : false
>>>>>>>>>>>>         }
>>>>>>>>>>>>     }
>>>>>>>>>>>> }'};
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 13, 2016 at 11:11 PM, Siddharth Verma <
>>>>>>>>>>> verma.siddha...@snapdeal.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> No, all rows were not the same.
>>>>>>>>>>>> Querying only on the partition key gives 20 rows.
>>>>>>>>>>>> In the erroneous result, while querying on partition key and
>>>>>>>>>>>> clustering key, we got 16 of those 20 rows.
>>>>>>>>>>>>
>>>>>>>>>>>> And for "tombstone_threshold" there isn't any entry at
>>>>>>>>>>>> column family level.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Siddharth Verma
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Tyler Hobbs
>>>>>>>>>> DataStax <http://datastax.com/>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> <http://www.datastax.com/>
>>>>>>>
>>>>>>> Joel Knighton
>>>>>>> Cassandra Developer | joel.knigh...@datastax.com
>>>>>>>
>>>>>>> <https://www.linkedin.com/company/datastax>
>>>>>>> <https://www.facebook.com/datastax> <https://twitter.com/datastax>
>>>>>>> <https://plus.google.com/+Datastax/about>
>>>>>>> <http://feeds.feedburner.com/datastax>
>>>>>>> <https://github.com/datastax/>
>>>>>>>
>>>>>>> <http://cassandrasummit.org/Email_Signature>
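
To make the expected behaviour concrete, here is a minimal Python sketch (not Cassandra code) of what `select pk,a from test0 where pk=0 and a=2` should return for the three rows shown in the thread, i.e. a single row rather than all three. The `rows` list, the values of `b` and `s`, and the `select` and `is_affected` helpers are assumptions made up for illustration; `is_affected` only restates the finding above that 3.4 and 3.5 are the affected releases.

```python
# Illustrative model only: rows are plain dicts, not Cassandra storage.
# The values of b and s are hypothetical; the thread does not show them.
rows = [
    {"pk": 0, "a": 1, "b": "b1", "s": "static"},
    {"pk": 0, "a": 2, "b": "b2", "s": "static"},
    {"pk": 0, "a": 3, "b": "b3", "s": "static"},
]


def select(rows, columns, **where):
    """Return the requested columns of every row matching all WHERE equalities."""
    return [
        {col: row[col] for col in columns}
        for row in rows
        if all(row[key] == value for key, value in where.items())
    ]


def is_affected(version):
    """True for the 3.4.x and 3.5.x releases that the thread found to be affected."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor in {(3, 4), (3, 5)}


expected = select(rows, ["pk", "a"], pk=0, a=2)
print(expected)  # [{'pk': 0, 'a': 2}]
```

On an affected release the query in the thread returned the rows with a = 1, 2 and 3; comparing the actual cqlsh output against `expected` is one simple way to check whether a given build still shows the problem.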