Sylvain, there's a bug in CHANGES.TXT for this issue. It says: "Duplicate rows returned when in clause has repeated values (CASSANDRA-6707)", but the issue number is really 6706.
-- Jack Krupansky

On Thu, Feb 4, 2016 at 9:54 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

> That behavior has been changed in 2.2 and upwards. If you don't like it,
> upgrade. In the meantime, it's probably not hard to avoid passing duplicate
> keys in IN.
>
> On Thu, Feb 4, 2016 at 3:48 PM, Edouard COLE <edouard.c...@rgsystem.com>
> wrote:
>
>> Hello,
>>
>> When running that kind of query with TRACING ON, I noticed that the
>> coordinator also performs the same query multiple times.
>>
>> Because the elements in the IN statement can involve many nodes, it makes
>> sense to map/reduce the query, but running the same sub-query multiple
>> times should not happen. What if the result set changes? Let’s imagine this
>> query: SELECT * FROM t WHERE key IN (123, 123, …. X1000, 123), and while
>> this query runs, the data for 123 changes?
>>
>>  key | value
>> -----+-------
>>  123 |   456
>>  123 |   456
>>  123 |   456
>>  123 |   789 <-- Change here :(
>>  123 |   789
>>
>> There’s also something very important: when your table defines a tuple as
>> unique for a specific key, it is a real problem to get a result set that
>> contains the same key, which should be unique, multiple times.
>> This is why this does not happen in any SQL implementation.
>>
>> I think this is a bug.
>>
>> Edouard COLE
>>
>> *From:* Alain RODRIGUEZ [mailto:arodr...@gmail.com]
>> *Sent:* Thursday, February 04, 2016 11:55 AM
>> *To:* Edouard COLE
>> *Cc:* user@cassandra.apache.org
>> *Subject:* Re: Duplicated key with an IN statement
>>
>> Hi,
>>
>> This is interesting.
>>
>> It seems rational that if you are looking at 2 keys and both exist (which
>> is the case), it returns you 2 keys. Yet, I just checked this kind of
>> command on MySQL and it gives a one-line result. So here CQL differs from
>> SQL (at least MySQL). I know we are trying to fit as much as possible with
>> SQL to avoid losing people, so we might want to change this.
>>
>> Not sure if this behavior is intentional / known. Not even sure someone
>> ever tried to do this kind of query actually :).
>>
>> Does anyone know about that? Should we raise a ticket?
>>
>> -----------------
>> Alain Rodriguez
>> France
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>> 2016-02-04 8:36 GMT+00:00 Edouard COLE <edouard.c...@rgsystem.com>:
>>
>> Hello,
>>
>> I just discovered this, and I think this is weird:
>>
>> ed@debian:~$ cqlsh 192.168.10.8
>> Connected to _CLUSTER_ at 192.168.10.8:9160.
>> [cqlsh 4.0.1 | Cassandra 2.0.14.459 | CQL spec 3.1.1 | Thrift protocol
>> 19.39.0]
>> Use HELP for help.
>> cqlsh> USE ks-test ;
>> cqlsh:ks-test> CREATE TABLE t (
>>            ... key int,
>>            ... value int,
>>            ... PRIMARY KEY (key)
>>            ... );
>> cqlsh:ks-test> INSERT INTO t (key, value) VALUES (123, 456) ;
>> cqlsh:ks-test> SELECT * FROM t ;
>>
>>  key | value
>> -----+-------
>>  123 |   456
>>
>> (1 rows)
>>
>> cqlsh:ks-test> SELECT * FROM t WHERE key IN (123, 123);
>>
>>  key | value
>> -----+-------
>>  123 |   456
>>  123 |   456 <----- WTF?
>>
>> (2 rows)
>>
>> Adding the same key multiple times to an IN statement makes the query
>> return the tuple multiple times.
>>
>> This looks weird to me; can anyone give me some feedback on such a
>> behavior?
>>
>> Edouard COLE
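
For anyone stuck on a pre-2.2 version, the workaround Sylvain mentions (not passing duplicate keys in IN) can probably be done client-side before the query is built. Below is a minimal sketch in Python, not taken from the thread: the helper names `dedupe_keys` and `build_in_query` are hypothetical, and the `%s` placeholder style is an assumption about the client driver, so adjust it to whatever your driver expects.

```python
def dedupe_keys(keys):
    """Return keys with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order, so fromkeys() drops repeats in place.
    return list(dict.fromkeys(keys))


def build_in_query(table, column, keys):
    """Build a parameterized SELECT ... IN query over the unique keys.

    Hypothetical helper: table and column are interpolated as-is, so they
    must come from trusted code, never from user input.
    """
    unique = dedupe_keys(keys)
    if not unique:
        raise ValueError("at least one key is required")
    # One placeholder per unique key; '%s' is an assumed placeholder style.
    placeholders = ", ".join(["%s"] * len(unique))
    query = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return query, unique


if __name__ == "__main__":
    query, params = build_in_query("t", "key", [123, 123, 456, 123])
    print(query)
    print(params)
```

With the query from the original report, IN (123, 123) collapses to a single placeholder, so each key reaches the coordinator only once.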