secondary index on static column
Hi, I'm quite new to Cassandra and although there is much info on the net, sometimes I cannot find the solution to a problem. In this case, I have a secondary index on a static column and I don't understand the answer I get from my select.

A cut-down version of the table is:

create table demo (id text, id2 bigint static, added timestamp, source text static, dest text, primary key (id, added));

create index on demo (id2);

id and id2 match one to one. I make one insert:

insert into demo (id, id2, added, source, dest) values ('id1', 22, '2017-01-28', 'src1', 'dst1');

"select * from demo;" gives the expected answer of the one inserted row, but "select * from demo where id2=22" gives 70 rows as a result (all identical). Why?

I have read https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive but I don't get it...

thanks for answering,
Michael
Re: secondary index on static column
Hi, it's a 3.9, installed on a Debian jessie system. For me it's like this: I have a three-node cluster. When creating the keyspace with replication factor 3 it works. When creating the keyspace with replication factor 2 it doesn't work and shows the weird behaviour. This is a fresh install; I have also tried it multiple times and the result is the same. As SASI indices work, I use those. But I would like to solve this.

Cheers,
Michael

On 02.02.2017 15:06, Romain Hardouin wrote:
> Hi,
>
> What's your C* 3.x version?
> I've just tested it on 3.9 and it works:
>
> cqlsh> SELECT * FROM test.idx_static WHERE id2=22;
>
>  id  | added                           | id2 | source | dest
> -----+---------------------------------+-----+--------+------
>  id1 | 2017-01-27 23:00:00.000000+0000 |  22 | src1   | dst1
>
> (1 rows)
>
> Maybe your dataset is incorrect (try on a new table) or you hit a bug.
>
> Best,
> Romain
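For anyone trying to reproduce this, here is a minimal sketch of the setup described in the thread. The keyspace name and replication settings are assumptions (the original messages do not show them); the table, index, and insert are taken from the first mail.

```cql
-- Hypothetical keyspace; the thread reports the duplicate rows only with RF 2
CREATE KEYSPACE IF NOT EXISTS test
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

CREATE TABLE test.demo (
  id     text,
  id2    bigint static,
  added  timestamp,
  source text static,
  dest   text,
  PRIMARY KEY (id, added)
);

CREATE INDEX ON test.demo (id2);

INSERT INTO test.demo (id, id2, added, source, dest)
VALUES ('id1', 22, '2017-01-28', 'src1', 'dst1');

-- Expected: 1 row. The reporter sees 70 identical rows with RF 2,
-- and the correct single row with RF 3.
SELECT * FROM test.demo WHERE id2 = 22;
```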
ByteOrdered partitioner when using sha-1 as partition key
Hi, my table has a SHA-1 sum as partition key. Would the ByteOrderedPartitioner be a better choice than the Murmur3Partitioner in this case, since the keys are already quite random?

cheers,
Michael
Re: ByteOrdered partitioner when using sha-1 as partition key
I think I was not clear enough... I have *one* table whose row data contains (among other values) a SHA-1 sum. There are no collisions. I thought computing a Murmur hash of a SHA-1 sum is just wasted time, as the Murmur hash doesn't make the data any more random than it already is. So it's just one table where this matters.

Michael

On 11.02.2017 at 16:54, Jonathan Haddad wrote:
> The odds of only using a sha1 as your partition key for every table you ever create is low. You will regret BOP until the end of time.
>
> On Sat, Feb 11, 2017 at 5:53 AM Edward Capriolo <mailto:edlinuxg...@gmail.com> wrote:
>
>> Probably best to avoid BOP even if you are already hashing keys yourself. What do you do when checksums collide? It is possible, right?
Re: ByteOrdered partitioner when using sha-1 as partition key
On 11.02.2017 at 23:56, Jonathan Haddad wrote:
> The time it takes to calculate the hash is so insignificant that it doesn't even remotely come close to justifying all the drawbacks.

Yes, most tasks (at least for me) are not CPU bound but IO and network bound.

> You can, of course, benchmark it. I wouldn't bother though. BOP is basically dead.

No thanks :-)

Thanks for answering,
Michael
sasi index question (read timeout on many selects)
Hi, my table has (among others) three columns which are unique blobs. So I made the first column the partition key and created two SASI indices for the two other columns. After inserting about 90m records I'm not able to query a bunch of rows (sending a batch of selects to the cluster) using only a SASI index: after a few seconds I get timeouts.

I have read the documents about the SASI index but I don't get why this happens. Is it because I don't include the partition key in the query? I thought the SASI index was global, in contrast to the normal secondary index...

thanks for helping,
Michael
Re: sasi index question (read timeout on many selects)
On 16.02.2017 14:30, DuyHai Doan wrote:
> Why index BLOB data? It does not make any sense.

My partition key is a secure hash sum; I don't index a blob.
Re: sasi index question (read timeout on many selects)
It's like having a table (sha256 blob primary key, id timeuuid, data1 text, ...). Both sha256 and id are unique. I would like to query *either* by sha256 *or* by id. I thought this could be done with a SASI index, but it has to be done with a second table (the manual way) or with a materialized view with id as partition key.

On 16.02.2017 15:11, Benjamin Roth wrote:
> No matter what has to be indexed here, the preferable way is most probably denormalization instead of another index.

So it's either manually inserting the data with another partition key or creating a materialized view with the other key.
Re: sasi index question (read timeout on many selects)
On 16.02.2017 16:33, Jonathan Haddad wrote:
> I agree w/ DuyHai regarding the index. The use case described here is a terrible one for SASI indexes.
>
> Regarding MVs, do not use the ones that shipped with 3.x. They're not ready for production. Manage it yourself by using a second table and inserting a second record there.

Yes, thanks for pointing this out.
Michael
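The "second table" approach suggested above can be sketched as follows. Table and column names are illustrative assumptions based on the schema Michael describes; the logged batch is one common way to keep the two copies in step, at the cost of batch overhead.

```cql
-- Main table, keyed by the sha256 checksum
CREATE TABLE data_by_sha (
  sha256 blob PRIMARY KEY,
  id     timeuuid,
  data1  text
);

-- Second table for the other access path, keyed by id
CREATE TABLE data_by_id (
  id     timeuuid PRIMARY KEY,
  sha256 blob,
  data1  text
);

-- The application writes every record to both tables; a logged batch
-- ensures both inserts eventually apply together.
BEGIN BATCH
  INSERT INTO data_by_sha (sha256, id, data1)
    VALUES (0x0a0b0c, 123e4567-e89b-12d3-a456-426655440000, 'payload');
  INSERT INTO data_by_id (id, sha256, data1)
    VALUES (123e4567-e89b-12d3-a456-426655440000, 0x0a0b0c, 'payload');
APPLY BATCH;
```

Reads then always hit a table by its full partition key, so no index is involved in either direction.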
Re: sasi index question (read timeout on many selects)
On 16.02.2017 16:33, Jonathan Haddad wrote:
> Regarding MVs, do not use the ones that shipped with 3.x. They're not ready for production. Manage it yourself by using a second table and inserting a second record there.

Out of interest... there is a slight discrepancy between the advice not to use materialized views and the documentation of the feature on the DataStax site. Or do I have to use another Cassandra version (instead of 3.9)?
recovering from failed repair , cassandra 3.10
Hi, after a failed repair on a three-node cluster all nodes were down. They cannot start, since they find a mismatch in an mc_txn_anticompactionafterrepair log file:

got "ADD ..."
expected "ADD: ..."

The two log files are different: one has "ADD, ADD, REMOVE, REMOVE, COMMIT", the other is missing an "ADD". Each of the nodes gives this error. sstableutil -c also gives this error.

How do I deal with this?

thanks,
Michael

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
Re: recovering from failed repair , cassandra 3.10
The error which keeps it from starting is below. Files like "mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log" are on both disks of a node but are different. Of course, just renaming (or deleting) the two files, or making them equal, makes Cassandra start again. But I would like to know the right way to handle this. I'll start the repair again with an increased log level.

thanks for answering,
Michael

ERROR 08:26:20 Mismatched line in file mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log: got 'ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4348-big,0,8][1394849421]' expected 'ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]', giving up
ERROR 08:26:20 Failed to read records for transaction log [mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log in /data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0, /data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0]
ERROR 08:26:20 Unexpected disk state: failed to read transaction log [mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log in /data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0, /data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0]

Files and contents follow:

/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4241-big,1495845618000,8][2443235315]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4249-big,1495856254000,8][681858089]
COMMIT:[,0,0][2613697770]
/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4348-big,0,8][1394849421]   ***Does not match in first replica file
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4241-big,1495845618000,8][2443235315]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4249-big,1495856254000,8][681858089]
COMMIT:[,0,0][2613697770]

On 31.05.2017 11:10, Oleksandr Shulgin wrote:
> On Wed, May 31, 2017 at 9:11 AM, Micha <mailto:mich...@fantasymail.de> wrote:
>> Hi, after failed repair on a three node cluster all nodes were down.
>
> To clarify, was it the failed repair that brought the nodes down so that you had to start them back? Do you see any error messages or stack traces in the logs?
>
>> It cannot start, since it finds a mismatch in a mc_txn_anticompactionafterrepair log file: got "ADD", expected "ADD:...". The two log files are different: one has "ADD, ADD, REMOVE, REMOVE, COMMIT", the other is missing an "ADD".
>
> I assume this is about the commit log. There doesn't seem to be a separate log file named "mc_txn_anticompactionafterrepair" in your Cassandra version.
>
>> Each of the nodes gives this error. sstableutil -c also gives this error. How to deal with this?
>
> I would try removing the faulty commit log file(s) and try to start the node again, until it works. This might mean that you'll have to remove all commit logs, but it's better than being completely down, I assume.
>
> --
> Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707
jbod disk usage unequal
Hi, I use a JBOD setup (2 * 1TB) and the distribution is a little unequal on my three nodes:

270MB and 540MB
150 and 580
290 and 500

SSTable sizes vary between 2GB and 130GB.

Is it possible to move SSTables from one disk to another to balance the disk usage? Otherwise, is a RAID-0 setup the only option for balanced disk usage?

Thanks,
Michael
Re: jbod disk usage unequal
thanks for answering,

On 03.07.2017 20:01, Jeff Jirsa wrote:
> Is there a reason you feel it's required, other than being bothered by the fact that they're not equal?

Just out of interest. I'm not sure whether it would spread the IO better between the disks if the files were spread more evenly on both disks.

>> Otherwise is a raid-0 setup the only option for a balanced disk usage?
>
> it doesn't REALLY matter how full it is, right?

Don't know, that's the question. I think there may be IO hotspots if mostly one disk is used until it's full.
error 1300 from csv export
Hi, I got some errors from a csv export of a table. They are of the form:

"Error for (number-1, number-2): ReadFailure Error from server: code=1300 ..."

At the end: "Exported 650 ranges out of 658 total, some records might be missing".

Is there a way to start the export again only for the failed ranges?

Thanks,
Michael
Re: error 1300 from csv export
Sorry for the noise, somehow I overlooked the COPY options BEGINTOKEN and ENDTOKEN.

Michael
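For reference, the cqlsh COPY TO options mentioned here restrict the export to a token range, so a failed range can be re-exported on its own. Keyspace, table, file name, and token values below are made up; the tokens would be the ones reported in the ReadFailure errors.

```cql
-- Re-export only one token range of the table (hypothetical values)
COPY ks.tbl TO 'failed_range.csv'
  WITH BEGINTOKEN = '-9223372036854775808'
   AND ENDTOKEN   = '-3074457345618258603';
```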
adding nodes to a cluster and changing rf
Hi, I want to extend my cluster (C* 3.9) from three nodes with RF 2 to seven nodes with RF 3. Is there a preferable way to do this? For example:

- setting "auto_bootstrap: true" and bootstrapping one new node at a time?
- setting "auto_bootstrap: false", starting all new nodes at once and then running "nodetool rebuild"?

Should I alter the RF after adding the new nodes? I could add two nodes, then change the RF, then add the remaining two nodes. Should all new nodes have "allocate_tokens_for_local_replication_factor" set to 3, since the RF will be changed to 3 later?

There is quite some data stored, so I would prefer a method which does fewer reorganisations of the data.

Thanks for your advice,
Michael
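Whatever ordering is chosen, the RF change itself is only a schema change; data is not moved until a repair runs. A sketch, assuming a keyspace named my_ks and a single datacenter named dc1 (both assumptions, not taken from the mail):

```cql
-- Raise the replication factor; this statement moves NO data by itself
ALTER KEYSPACE my_ks
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

-- Afterwards, run on each node so the new replicas actually receive
-- their data:
--   nodetool repair -full my_ks
```

Until the repair completes, reads at consistency level ONE may hit a replica that does not yet hold the data.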
secondary index use case
Hi, even after reading a lot about secondary index usage I'm not sure if I have the correct use case for it.

My table will contain about 150'000'000 records (each about 2KB of data). There are two uuids used to identify a row: one uuid is unique for each row, the other is something like a group id, which mostly yields about 20 records when queried. So, if I define my primary key as (groupuuid, uuid), then:

"select * ... where groupuuid = X" gives me 0 - 20 rows
"select * ... where groupuuid = X and uuid = Y" gives me 0 or 1 rows

Now, sometimes I want to query with uuid only, to get exactly one row (without using groupuuid):

"select * ... where uuid = X"

Is this a good use case for a secondary index on uuid?

Thanks for helping,
Michael
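If the per-uuid lookup turns out to be frequent, the usual alternative to a secondary index is a small lookup table mapping uuid back to groupuuid, at the cost of one extra write and a two-step read. A sketch with assumed table and column names (not from the original mail):

```cql
-- Main table, as described in the question
CREATE TABLE records (
  groupuuid uuid,
  uuid      uuid,
  data      blob,
  PRIMARY KEY (groupuuid, uuid)
);

-- Reverse lookup: uuid -> groupuuid, written alongside every insert
CREATE TABLE record_group (
  uuid      uuid PRIMARY KEY,
  groupuuid uuid
);

-- Reading by uuid alone becomes two partition-key reads:
SELECT groupuuid FROM records_group_placeholder WHERE uuid = ?;  -- hypothetical bind
SELECT * FROM records WHERE groupuuid = ? AND uuid = ?;
```

Whether this beats a secondary index depends on how often the uuid-only query runs; for rare lookups the index may be acceptable, for hot paths the lookup table keeps every read on a single partition.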
UndeclaredThrowableException, C* 3.11
Hi, has someone experienced this? I added a fourth node to my cluster; after the bootstrap I changed the RF from 2 to 3 and ran nodetool repair on the new node. A few hours later the repair command exited with an UndeclaredThrowableException and the node was down. In the logs I don't see a reason for the exception or the shutdown.

How can I know whether the repair was successful? There are messages with the repair id and "Session with ... is complete", "All sessions completed" and "Sync completed using session ...". Are these an indication of a completed repair?

The output of nodetool info shows "Percent Repaired 26.18%". Should this be 100% after a completed repair?

Thanks,
Michael
Re: UndeclaredThrowableException, C* 3.11
ok, thanks, so I'll just start it again...

On 02.08.2017 11:51, kurt greaves wrote:
> If the repair command failed, repair also failed. Regarding % repaired, no, it's unlikely you will see 100% repaired after a single repair. Maybe after a few consecutive repairs with no data load you might get it to 100%.
rebuild constantly fails, 3.11
Hi, it seems I'm not able to add a 3-node DC to a 3-node DC. After starting the rebuild on a new node, nodetool netstats shows it will receive 1200 files from node-1 and 5000 from node-2. The stream from node-1 completes, but the stream from node-2 always fails after sending about 4000 files. After restarting, the rebuild again starts to send the 5000 files.

The whole cluster is connected via one switch only, no firewall in between, and the network shows no errors. The machines have 8 cores, 32GB RAM and two 1TB discs as RAID-0. The logs show no errors. The size of the data is about 1TB.

Any help is really welcome,
cheers
Michael

The error is:

Cassandra has shutdown.
error: null
-- StackTrace --
java.io.EOFException
	at java.io.DataInputStream.readByte(DataInputStream.java:267)
	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
	at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
	at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
	at com.sun.proxy.$Proxy7.rebuild(Unknown Source)
	at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:1190)
	at org.apache.cassandra.tools.nodetool.Rebuild.execute(Rebuild.java:58)
	at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:254)
	at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:168)
Re: rebuild constantly fails, 3.11
No, I have left it at the default value of 24 hours.

I've read about adjusting phi_convict_threshold, but I haven't done this yet as the network is stable. Maybe I'll set it to 10.

On 08.08.2017 15:24, ZAIDI, ASAD A wrote:
> Is there any chance you've set the streaming_socket_timeout_in_ms parameter too low on the failing node?
Re: rebuild constantly fails, 3.11
The logs didn't show an error. I have started it again with a higher log level, although errors should be logged regardless of the log level. If it breaks again I'll share the log with the possible error in it. The only error output I got was on the console:

Cassandra has shutdown.
error: null
-- StackTrace --
java.io.EOFException
	at java.io.DataInputStream.readByte(DataInputStream.java:267)
	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
	at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
	at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
	at com.sun.proxy.$Proxy7.rebuild(Unknown Source)
	at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:1190)
	at org.apache.cassandra.tools.nodetool.Rebuild.execute(Rebuild.java:58)
	at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:254)
	at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:168)

On 08.08.2017 17:03, ZAIDI, ASAD A wrote:
> Without the exact failure text it is really hard to guess what may be going on - can you please share a logfile excerpt detailing the failure, so we can have a better idea of the nature of the failure? Adjusting phi_convict_threshold may yet be another shot in the dark when we don't know what is causing the failure and the network is supposedly stable.
>
> ~Asad
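The two knobs discussed in this thread live in cassandra.yaml. The values below are only the ones floated in the discussion (the 24-hour default the poster reports, and the phi value he considers), not recommendations:

```yaml
# cassandra.yaml, on the nodes involved in the stream
streaming_socket_timeout_in_ms: 86400000   # 24 h, the default mentioned above
phi_convict_threshold: 10                  # raised from the default of 8, in case
                                           # nodes are wrongly convicted mid-stream
```

Both settings require a node restart to take effect.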
effect of partition size
Hi, what are the effects of large partitions? I have a few tables which have partition sizes as follows:

95%  24000
98%  42000
99%  85000
Max  82000

So, should I redesign the schema to make the max smaller, or doesn't it matter much, since 99% of the partitions are <= 85000?

Thanks for answering,
Michael
Re: effect of partition size
ok, thanks for the answer. So the better approach here is to adjust the table schema to keep the partition size to around 100MB max. This means using a partition key with multiple parts and issuing several selects instead of one when querying the data (which may increase parallelism).

Michael
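The multi-part partition key mentioned above is usually implemented by adding a bucket component, so one logical key fans out over several bounded partitions. A sketch with assumed names; the bucket count and the hashing rule are application choices, not from the thread:

```cql
-- Split each logical partition into N buckets; the application picks
-- the bucket at write time, e.g. bucket = hash(added) % N.
CREATE TABLE events_bucketed (
  id     text,
  bucket int,
  added  timestamp,
  data   text,
  PRIMARY KEY ((id, bucket), added)
);

-- Reading a whole logical partition becomes N selects, which the
-- client can issue in parallel:
SELECT * FROM events_bucketed WHERE id = 'id1' AND bucket = 0;
SELECT * FROM events_bucketed WHERE id = 'id1' AND bucket = 1;
-- ... one select per bucket
```

The trade-off is exactly the one noted above: smaller, bounded partitions in exchange for more (parallelizable) reads per logical key.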