Re: Multi-column range scans
Hello Matthew,

Since Cassandra 2.0.6 it is possible to query over composites:
https://issues.apache.org/jira/browse/CASSANDRA-4851

For your example:

select * from skill_count where skill='Complaints' and
(interval_id,skill_level) >= (140235930,5) and interval_id < 140235990;

On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen wrote:
> Hi,
>
> We have a roll-up table as follows.
>
> CREATE TABLE SKILL_COUNT (
>   skill text,
>   interval_id bigint,
>   skill_level int,
>   skill_count int,
>   PRIMARY KEY (skill, interval_id, skill_level));
>
> Essentially,
> skill = a named skill, i.e. "Complaints"
> interval_id = a rounded epoch time (15 minute intervals)
> skill_level = a number/rating from 1-10
> skill_count = the number of people with the specified skill, with the
> specified skill level, logged in at the interval_id
>
> We'd like to run the following query against it
>
> select * from skill_count where skill='Complaints' and interval_id >=
> 140235930 and interval_id < 140235990 and skill_level >= 5;
>
> to get a count of people with the relevant skill and level at the
> appropriate time. However I am getting the following message.
>
> Bad Request: PRIMARY KEY part skill_level cannot be restricted (preceding
> part interval_id is either not restricted or by a non-EQ relation)
>
> Looking at how the data is stored ...
>
> ---
> RowKey: Complaints
> => (name=140235930:2:, value=, timestamp=1405308260403000)
> => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
> => (name=140235930:5:, value=, timestamp=1405308260403001)
> => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
> => (name=140235930:8:, value=, timestamp=1405308260419000)
> => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
> => (name=140235930:10:, value=, timestamp=1405308260419001)
> => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)
>
> Should Cassandra be able to allow for an extra level of filtering? Or is
> this something that should be performed from within the application?
>
> We have a solution working in Oracle, but would like to store this data in
> Cassandra, as all the other data that this solution relies on already sits
> within Cassandra.
>
> Appreciate any guidance on this matter.
>
> Matt
[RELEASE] Achilles 3.0.4
Hello all,

We are happy to announce the release of Achilles 3.0.4. Among the biggest
changes:

- support for static columns: http://goo.gl/o7D5yo
- dynamic statements logging & tracing at runtime: http://goo.gl/w4jlqZ
- SchemaBuilder, the mirror of QueryBuilder for creating schema
  programmatically: http://goo.gl/DspJQq

Link to the changelog: http://goo.gl/tKqpFT

Regards,
Duy Hai DOAN
Re: keyspace with hundreds of columnfamilies
Tommaso,

Looking at your description of the architecture, an idea came to mind: you
could shard on the Cassandra client side and write to different Cassandra
clusters to keep the number of column families per cluster reasonable.

With best regards,
Ilya

On Thu, Jul 3, 2014 at 10:55 PM, tommaso barbugli wrote:
> thank you for the replies; I am rethinking the schema design, one possible
> solution is to "implode" one dimension and get N times fewer CFs.
> With this approach I would come up with (cql) tables with up to 100
> columns; would that be a problem?
>
> Thank You,
> Tommaso
>
> 2014-07-02 23:43 GMT+02:00 Jack Krupansky:
>
>> The official answer, engraved in stone tablets, and carried down from
>> the mountain: “Although having more than dozens or hundreds of tables
>> defined is almost certainly a Bad Idea (just as it is a design smell in a
>> relational database), it's relatively straightforward to allow disabling
>> the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.”
>>
>> See:
>> https://issues.apache.org/jira/browse/CASSANDRA-5935
>> “Allow disabling slab allocation”
>>
>> IOW, this is considered an anti-pattern, but...
>>
>> -- Jack Krupansky
>>
>> *From:* tommaso barbugli
>> *Sent:* Wednesday, July 2, 2014 2:16 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: keyspace with hundreds of columnfamilies
>>
>> Hi,
>> thank you for your replies on this; regarding the arena memory, is this a
>> fixed memory allocation or some sort of in-memory caching? I ask because
>> I think that a substantial portion of the column families created will not
>> be queried that frequently (and some will become inactive and stay like
>> that for a really long time)
>>
>> Thank you,
>> Tommaso
>>
>> 2014-07-02 18:35 GMT+02:00 Romain HARDOUIN:
>>
>>> Arena allocation is an improvement feature, not a limitation.
>>> It was introduced in Cassandra 1.0 in order to lower memory
>>> fragmentation (and therefore promotion failure).
>>> AFAIK it's not intended to be tweaked, so it might not be a good idea to
>>> change it.
>>>
>>> Best,
>>> Romain
>>>
>>> tommaso barbugli wrote on 02/07/2014 17:40:18:
>>>
>>> > From: tommaso barbugli
>>> > To: user@cassandra.apache.org
>>> > Date: 02/07/2014 17:40
>>> > Subject: Re: keyspace with hundreds of columnfamilies
>>> >
>>> > 1MB per column family sounds pretty bad to me; is this something I
>>> > can tweak / work around somehow?
>>> >
>>> > Thanks
>>> > Tommaso
>>> >
>>> > 2014-07-02 17:21 GMT+02:00 Romain HARDOUIN:
>>> > The trap is that each CF will consume 1 MB of memory due to arena
>>> > allocation.
>>> > This might seem harmless but if you plan thousands of CF it means
>>> > thousands of megabytes...
>>> > Up to 1,000 CF I think it could be doable, but not 10,000.
>>> >
>>> > Best,
>>> >
>>> > Romain
>>> >
>>> > tommaso barbugli wrote on 02/07/2014 10:13:41:
>>> >
>>> > > From: tommaso barbugli
>>> > > To: user@cassandra.apache.org
>>> > > Date: 02/07/2014 10:14
>>> > > Subject: keyspace with hundreds of columnfamilies
>>> > >
>>> > > Hi,
>>> > > Are there any known issues or shortcomings with organising data in
>>> > > hundreds of column families?
>>> > > At present I am running with 300 column families but I expect
>>> > > that to get to a couple of thousand.
>>> > > Is this something discouraged / unsupported? (I am using Cassandra 2.0.)
>>> > >
>>> > > Thanks
>>> > > Tommaso
Re: Multi-column range scans
Sorry, I've just checked, the correct query should be: select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id,skill_level) < (140235990,11) On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan wrote: > Hello Mathew > > Since Cassandra 2.0.6 it is possible to query over composites: > https://issues.apache.org/jira/browse/CASSANDRA-4851 > > For your example: > > select * from skill_count where skill='Complaints' and > (interval_id,skill_level) >= (140235930,5) and interval_id < > 140235990; > > > On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen > wrote: > >> Hi, >> >> We have a roll-up table that as follows. >> >> CREATE TABLE SKILL_COUNT ( >> skill text, >> interval_id bigint, >> skill_level int, >> skill_count int, >> PRIMARY KEY (skill, interval_id, skill_level)); >> >> Essentially, >> skill = a names skill i.e. "Complaints" >> interval_id = a rounded epoch time (15 minute intervals) >> skill_level = a number/rating from 1-10 >> skill_count = the number of people with the specified skill, with the >> specified skill level, logged in at the interval_id >> >> We'd like to run the following query against it >> >> select * from skill_count where skill='Complaints' and interval_id >= >> 140235930 and interval_id < 140235990 and skill_level >= 5; >> >> to get a count of people with the relevant skill and level at the >> appropriate time. However I am getting the following message. >> >> Bad Request: PRIMARY KEY part skill_level cannot be restricted (preceding >> part interval_id is either not restricted or by a non-EQ relation) >> >> Looking at how the data is stored ... >> >> --- >> RowKey: Complaints >> => (name=140235930:2:, value=, timestamp=1405308260403000) >> => (name=140235930:2:skill_count, value=000a, >> timestamp=1405308260403000) >> => (name=140235930:5:, value=, timestamp=1405308260403001) >> => (name=140235930:5:skill_count, value=0014, >> timestamp=1405308260403001) >> => (name=140235930:8:, value=, timestamp=1405308260419000) >> => (name=140235930:8:skill_count, value=001e, >> timestamp=1405308260419000) >> => (name=140235930:10:, value=, timestamp=1405308260419001) >> => (name=140235930:10:skill_count, value=0001, >> timestamp=1405308260419001) >> >> Should cassandra be able to allow for an extra level of filtering ? or is >> this something that should be performed from within the application. >> >> We have a solution working in Oracle, but would like to store this data >> in Cassandra, as all the other data that this solution relies on already >> sits within Cassandra. >> >> Appreciate any guidance on this matter. >> >> Matt >> > >
Re: Multi-column range scans
Or:

select * from skill_count where skill='Complaints' and
(interval_id,skill_level) >= (140235930,5) and (interval_id) < (140235990)

Strangely enough, once you start using tuple notation you need to stick to
it, even if there is only one element in the tuple.

On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan wrote:
> Sorry, I've just checked, the correct query should be:
>
> select * from skill_count where skill='Complaints' and
> (interval_id,skill_level) >= (140235930,5) and
> (interval_id,skill_level) < (140235990,11)
>
> On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan wrote:
>
>> Hello Matthew
>>
>> Since Cassandra 2.0.6 it is possible to query over composites:
>> https://issues.apache.org/jira/browse/CASSANDRA-4851
>>
>> For your example:
>>
>> select * from skill_count where skill='Complaints' and
>> (interval_id,skill_level) >= (140235930,5) and interval_id <
>> 140235990;
>>
>> On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen wrote:
>>
>>> Hi,
>>>
>>> We have a roll-up table as follows.
>>>
>>> CREATE TABLE SKILL_COUNT (
>>>   skill text,
>>>   interval_id bigint,
>>>   skill_level int,
>>>   skill_count int,
>>>   PRIMARY KEY (skill, interval_id, skill_level));
>>>
>>> Essentially,
>>> skill = a named skill, i.e. "Complaints"
>>> interval_id = a rounded epoch time (15 minute intervals)
>>> skill_level = a number/rating from 1-10
>>> skill_count = the number of people with the specified skill, with the
>>> specified skill level, logged in at the interval_id
>>>
>>> We'd like to run the following query against it
>>>
>>> select * from skill_count where skill='Complaints' and interval_id >=
>>> 140235930 and interval_id < 140235990 and skill_level >= 5;
>>>
>>> to get a count of people with the relevant skill and level at the
>>> appropriate time. However I am getting the following message.
>>>
>>> Bad Request: PRIMARY KEY part skill_level cannot be restricted
>>> (preceding part interval_id is either not restricted or by a non-EQ
>>> relation)
>>>
>>> Looking at how the data is stored ...
>>>
>>> ---
>>> RowKey: Complaints
>>> => (name=140235930:2:, value=, timestamp=1405308260403000)
>>> => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
>>> => (name=140235930:5:, value=, timestamp=1405308260403001)
>>> => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
>>> => (name=140235930:8:, value=, timestamp=1405308260419000)
>>> => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
>>> => (name=140235930:10:, value=, timestamp=1405308260419001)
>>> => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)
>>>
>>> Should Cassandra be able to allow for an extra level of filtering? Or is
>>> this something that should be performed from within the application?
>>>
>>> We have a solution working in Oracle, but would like to store this data
>>> in Cassandra, as all the other data that this solution relies on already
>>> sits within Cassandra.
>>>
>>> Appreciate any guidance on this matter.
>>>
>>> Matt
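To make the tuple-notation point above concrete, here is a minimal sketch
against the SKILL_COUNT table from this thread (behaviour as reported in
this thread for Cassandra 2.0.6+; not verified against other versions):

-- accepted: both bounds use tuple notation, even a one-element tuple
select * from skill_count
where skill = 'Complaints'
  and (interval_id, skill_level) >= (140235930, 5)
  and (interval_id) < (140235990);

-- the mixed form from the first reply in this thread; per the follow-ups,
-- once tuple notation is used, a plain restriction on the same clustering
-- column cannot be combined with it:
select * from skill_count
where skill = 'Complaints'
  and (interval_id, skill_level) >= (140235930, 5)
  and interval_id < 140235990;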
Re: Multi-column range scans
I don't think your query is doing what he wants. Your query will correctly
set the starting point, but will also return larger interval_ids with
lower skill_levels:

cqlsh:test> select * from skill_count where skill='Complaints' and
(interval_id, skill_level) >= (140235930, 5);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20
 Complaints |   140235930 |           8 |          30
 Complaints |   140235930 |          10 |           1
 Complaints |   140235940 |           2 |          10
 Complaints |   140235940 |           8 |          30

(5 rows)

cqlsh:test> select * from skill_count where skill='Complaints' and
(interval_id, skill_level) >= (140235930, 5) and (interval_id) < (140235990);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20   <- desired
 Complaints |   140235930 |           8 |          30   <- desired
 Complaints |   140235930 |          10 |           1   <- desired
 Complaints |   140235940 |           2 |          10   <- SKIP
 Complaints |   140235940 |           8 |          30   <- desired

The query he wants would result in a discontinuous range slice, so it
isn't supported. Essentially, the client will have to read the entire
range and perform client-side filtering. Whether this is efficient depends
on the cardinality of skill_level.

I tried playing with the "allow filtering" CQL clause, but it would appear
from the documentation that it's very restrictive...

On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan wrote:
> or:
>
> select * from skill_count where skill='Complaints'
> and (interval_id,skill_level) >= (140235930,5)
> and (interval_id) < (140235990)
>
> Strangely enough, once you start using tuple notation you need to stick
> to it, even if there is only one element in the tuple.
>
> On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan wrote:
>
>> Sorry, I've just checked, the correct query should be:
>>
>> select * from skill_count where skill='Complaints' and
>> (interval_id,skill_level) >= (140235930,5) and
>> (interval_id,skill_level) < (140235990,11)
>>
>> On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan wrote:
>>
>>> Hello Matthew
>>>
>>> Since Cassandra 2.0.6 it is possible to query over composites:
>>> https://issues.apache.org/jira/browse/CASSANDRA-4851
>>>
>>> For your example:
>>>
>>> select * from skill_count where skill='Complaints' and
>>> (interval_id,skill_level) >= (140235930,5) and interval_id <
>>> 140235990;
>>>
>>> On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen <
>>> matthew.j.al...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We have a roll-up table as follows.
>>>>
>>>> CREATE TABLE SKILL_COUNT (
>>>>   skill text,
>>>>   interval_id bigint,
>>>>   skill_level int,
>>>>   skill_count int,
>>>>   PRIMARY KEY (skill, interval_id, skill_level));
>>>>
>>>> Essentially,
>>>> skill = a named skill, i.e. "Complaints"
>>>> interval_id = a rounded epoch time (15 minute intervals)
>>>> skill_level = a number/rating from 1-10
>>>> skill_count = the number of people with the specified skill, with the
>>>> specified skill level, logged in at the interval_id
>>>>
>>>> We'd like to run the following query against it
>>>>
>>>> select * from skill_count where skill='Complaints' and interval_id >=
>>>> 140235930 and interval_id < 140235990 and skill_level >= 5;
>>>>
>>>> to get a count of people with the relevant skill and level at the
>>>> appropriate time. However I am getting the following message.
>>>>
>>>> Bad Request: PRIMARY KEY part skill_level cannot be restricted
>>>> (preceding part interval_id is either not restricted or by a non-EQ
>>>> relation)
>>>>
>>>> Looking at how the data is stored ...
>>>>
>>>> ---
>>>> RowKey: Complaints
>>>> => (name=140235930:2:, value=, timestamp=1405308260403000)
>>>> => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
>>>> => (name=140235930:5:, value=, timestamp=1405308260403001)
>>>> => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
>>>> => (name=140235930:8:, value=, timestamp=1405308260419000)
>>>> => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
>>>> => (name=140235930:10:, value=, timestamp=1405308260419001)
>>>> => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)
>>>>
>>>> Should Cassandra be able to allow for an extra level of filtering? Or
>>>> is this something that should be performed from within the application?
>>>>
>>>> We have a solution working in Oracle, but would like to store this data
>>>> in Cassandra, as all the other data that this solution relies on already
>>>> sits within Cassandra.
>>>>
>>>> Appreciate any guidance on this matter.
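For anyone who wants to reproduce Ken's result, a sketch of the inserts
behind his output can be reconstructed from the rows his cqlsh listing
shows (values taken from the listing above, assuming no other rows exist
in the partition):

INSERT INTO skill_count (skill, interval_id, skill_level, skill_count)
VALUES ('Complaints', 140235930, 5, 20);
INSERT INTO skill_count (skill, interval_id, skill_level, skill_count)
VALUES ('Complaints', 140235930, 8, 30);
INSERT INTO skill_count (skill, interval_id, skill_level, skill_count)
VALUES ('Complaints', 140235930, 10, 1);
INSERT INTO skill_count (skill, interval_id, skill_level, skill_count)
VALUES ('Complaints', 140235940, 2, 10);
INSERT INTO skill_count (skill, interval_id, skill_level, skill_count)
VALUES ('Complaints', 140235940, 8, 30);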
Re: Multi-column range scans
Exactly, Ken. I got bitten again by the semantics of composite tuples.

This kind of query won't be possible until something like a wide-row end
slice predicate is available
(https://issues.apache.org/jira/browse/CASSANDRA-6167), if it ever is.

On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock wrote:
> I don't think your query is doing what he wants. Your query will correctly
> set the starting point, but will also return larger interval_ids with
> lower skill_levels:
>
> cqlsh:test> select * from skill_count where skill='Complaints' and
> (interval_id, skill_level) >= (140235930, 5);
>
>  skill      | interval_id | skill_level | skill_count
> ------------+-------------+-------------+-------------
>  Complaints |   140235930 |           5 |          20
>  Complaints |   140235930 |           8 |          30
>  Complaints |   140235930 |          10 |           1
>  Complaints |   140235940 |           2 |          10
>  Complaints |   140235940 |           8 |          30
>
> (5 rows)
>
> cqlsh:test> select * from skill_count where skill='Complaints' and
> (interval_id, skill_level) >= (140235930, 5) and (interval_id) <
> (140235990);
>
>  skill      | interval_id | skill_level | skill_count
> ------------+-------------+-------------+-------------
>  Complaints |   140235930 |           5 |          20   <- desired
>  Complaints |   140235930 |           8 |          30   <- desired
>  Complaints |   140235930 |          10 |           1   <- desired
>  Complaints |   140235940 |           2 |          10   <- SKIP
>  Complaints |   140235940 |           8 |          30   <- desired
>
> The query he wants would result in a discontinuous range slice, so it
> isn't supported. Essentially, the client will have to read the entire
> range and perform client-side filtering. Whether this is efficient
> depends on the cardinality of skill_level.
>
> I tried playing with the "allow filtering" CQL clause, but it would
> appear from the documentation that it's very restrictive...
>
> On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan wrote:
>
>> or:
>>
>> select * from skill_count where skill='Complaints'
>> and (interval_id,skill_level) >= (140235930,5)
>> and (interval_id) < (140235990)
>>
>> Strangely enough, once you start using tuple notation you need to stick
>> to it, even if there is only one element in the tuple.
>>
>> On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan wrote:
>>
>>> Sorry, I've just checked, the correct query should be:
>>>
>>> select * from skill_count where skill='Complaints' and
>>> (interval_id,skill_level) >= (140235930,5) and
>>> (interval_id,skill_level) < (140235990,11)
>>>
>>> On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan wrote:
>>>
>>>> Hello Matthew
>>>>
>>>> Since Cassandra 2.0.6 it is possible to query over composites:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-4851
>>>>
>>>> For your example:
>>>>
>>>> select * from skill_count where skill='Complaints' and
>>>> (interval_id,skill_level) >= (140235930,5) and interval_id <
>>>> 140235990;
>>>>
>>>> On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen <
>>>> matthew.j.al...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have a roll-up table as follows.
>>>>>
>>>>> CREATE TABLE SKILL_COUNT (
>>>>>   skill text,
>>>>>   interval_id bigint,
>>>>>   skill_level int,
>>>>>   skill_count int,
>>>>>   PRIMARY KEY (skill, interval_id, skill_level));
>>>>>
>>>>> Essentially,
>>>>> skill = a named skill, i.e. "Complaints"
>>>>> interval_id = a rounded epoch time (15 minute intervals)
>>>>> skill_level = a number/rating from 1-10
>>>>> skill_count = the number of people with the specified skill, with
>>>>> the specified skill level, logged in at the interval_id
>>>>>
>>>>> We'd like to run the following query against it
>>>>>
>>>>> select * from skill_count where skill='Complaints' and interval_id >=
>>>>> 140235930 and interval_id < 140235990 and skill_level >= 5;
>>>>>
>>>>> to get a count of people with the relevant skill and level at the
>>>>> appropriate time. However I am getting the following message.
>>>>>
>>>>> Bad Request: PRIMARY KEY part skill_level cannot be restricted
>>>>> (preceding part interval_id is either not restricted or by a non-EQ
>>>>> relation)
>>>>>
>>>>> Looking at how the data is stored ...
>>>>>
>>>>> ---
>>>>> RowKey: Complaints
>>>>> => (name=140235930:2:, value=, timestamp=1405308260403000)
>>>>> => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
>>>>> => (name=140235930:5:, value=, timestamp=1405308260403001)
>>>>> => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
>>>>> => (name=140235930:8:, value=, timestamp=1405308260419000)
>>>>> => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
>>>>> => (name=140235930:10:, value=, timestamp=1405308260419001)
>>>>> => (name=140235930
Upgrading from 1.1.9 to 1.2.18
Hello All,

I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18
cluster on ubuntu. Can sstableloader be used to stream from the existing
cluster to the new cluster? If so, what is the suggested method? I keep
getting the following when trying this:

partitioner org.apache.cassandra.dht.RandomPartitioner does not match
system partitioner org.apache.cassandra.dht.Murmur3Partitioner. Note that
the default partitioner starting with Cassandra 1.2 is Murmur3Partitioner,
so you will need to edit that to match your old partitioner if upgrading.

It would appear that 1.1.9 doesn't have Murmur3Partitioner though, so I
changed the partitioner on the new cluster to RandomPartitioner. Even with
that, I get the following error:

CLASSPATH=/etc/cassandra/conf/cassandra.yaml:/root/lib_cass15/apache-cassandra-1.2.18.jar:/root/lib_cass15/guava-13.0.1.jar:/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.1.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/stress.jar

Could not retrieve endpoint ranges:
java.lang.RuntimeException: Could not retrieve endpoint ranges:
    at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:233)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:119)
    at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:67)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1155)
    at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1142)
    at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:212)
    ... 2 more

Is there a way to get sstableloader to work? If not, can someone point me
to documentation explaining other ways to migrate the data/keyspaces? I
haven't been able to find any detailed docs.

Thank you
Re: UnavailableException
Mark,

Here you go:

*NodeTool status:*

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load     Tokens  Owns  Host ID                               Rack
UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1

*Astyanax Connection Settings:*

seeds :12
maxConns :16
maxConnsPerHost :16
connectTimeout :2000
socketTimeout :6
maxTimeoutCount :16
maxBlockedThreadsPerHost :16
maxOperationsPerConnection :16
DiscoveryType: RING_DESCRIBE
ConnectionPoolType: TOKEN_AWARE
DefaultReadConsistencyLevel: CL_QUORUM
DefaultWriteConsistencyLevel: CL_QUORUM

On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy wrote:
> Can you post the output of nodetool status and your Astyanax connection
> settings?
>
> On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha wrote:
>
>> This is how we create our keyspace. We just ran this command once through
>> a cqlsh session on one of the nodes, so don't quite understand what you
>> mean by "check that your DC names match up"
>>
>> CREATE KEYSPACE prod WITH replication = {
>>   'class': 'NetworkTopologyStrategy',
>>   'datacenter1': '3'
>> };
>>
>> On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink wrote:
>>
>>> What replication strategy are you using?
if using NetworkTopolgyStrategy >>> double check that your DC names match up (case sensitive) >>> >>> Chris >>> >>> On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: >>> >>> Here's the complete stack trace: >>> >>> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >>> TokenRangeOfflineException: >>> [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), >>> attempts=3]UnavailableException() >>> at >>> com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165) >>> at >>> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) >>> at >>> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) >>> at >>> com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) >>> at >>> com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69) >>> at >>> com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256) >>> at >>> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485) >>> at >>> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79) >>> at >>> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123) >>> Caused by: UnavailableException() >>> at >>> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841) >>> at >>> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) >>> at >>> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964) >>> at >>> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950) >>> at >>> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129) >>> at >>> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126) >>> at >>> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60) >>> ... 12 more >>> >>> >>> >>> On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav >>> wrote: >>> Please post the full exception. On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha wrote: > We have a 12 node cluster and we are consistently seeing this > exception being thrown during peak write traffic. We have a replication > factor of 3 and a write consistency level of QUORUM. Also note there is no > unusual Or Full GC activity during this time. Appreciate
Re: Cassandra use cases/Strengths/Weakness
We've struggled to get consistent write latency and linear write
scalability with a pretty heavy insert load (thousands of records/second),
and our records are about 1k-2k of data (a mix of integer/string columns
and a blob). Wondering if you have any rough numbers for your "small to
medium write sizes" experience?

On 07/04/2014 01:58 PM, James Horey wrote:
...
* Low write latency with respect to small to medium write sizes (logs,
sensor data, etc.)
* Linear write scalability
* ...
Re: Upgrading from 1.1.9 to 1.2.18
On Mon, Jul 14, 2014 at 9:54 AM, Denning, Michael
<michael.denn...@kavokerrgroup.com> wrote:

> I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18
> cluster on ubuntu. Can sstableloader be used to stream from the existing
> cluster to the new cluster? If so, what is the suggested method? I keep
> getting the following when trying this:

http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not
work between major versions.

If I were you, I would accomplish this task by dividing it in two:

1) Upgrade my 3 node cluster from 1.1.9 to 1.2.18 via rolling
restart/upgradesstables.
2) Expand 3 node cluster to 6 nodes

Is there a reason you are not using this process?

=Rob
RE: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18
3 node cluster is in production. It’d be difficult for me to get sign off
on the change control to upgrade it. The 6 node cluster is already stood
up (in aws). In an ideal scenario I’d just be able to bring the data over
to the new cluster.

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Monday, July 14, 2014 1:53 PM
To: user@cassandra.apache.org
Subject: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18

On Mon, Jul 14, 2014 at 9:54 AM, Denning, Michael
<michael.denn...@kavokerrgroup.com> wrote:

I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18
cluster on ubuntu. Can sstableloader be used to stream from the existing
cluster to the new cluster? If so, what is the suggested method? I keep
getting the following when trying this:

http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not
work between major versions.

If I were you, I would accomplish this task by dividing it in two:

1) Upgrade my 3 node cluster from 1.1.9 to 1.2.18 via rolling
restart/upgradesstables.
2) Expand 3 node cluster to 6 nodes

Is there a reason you are not using this process?

=Rob
Re: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18
On Mon, Jul 14, 2014 at 11:12 AM, Denning, Michael
<michael.denn...@kavokerrgroup.com> wrote:

> 3 node cluster is in production. It’d be difficult for me to get sign off
> on the change control to upgrade it. The 6 node cluster is already stood
> up (in aws). In an ideal scenario I’d just be able to bring the data over
> to the new cluster.

Ok, use the "copy the sstables" method from the previous link?

1) fork writes so all writes go to both clusters
2) nodetool flush on source cluster
3) copy all sstables to all target nodes, being careful to avoid name
collision (use rolling restart, probably; "refresh" is unsafe)
4) run cleanup on target nodes (this will have the same effect as doing an
upgradesstables, as a bonus)
5) turn off writes to old cluster/turn on reads to new cluster

If I were you, I would strongly consider not using vnodes on your new
cluster. Unless you are very confident the cluster will grow above approx.
10 nodes in the near future, you are likely to Just Lose from vnodes.

=Rob
Re: UnavailableException
Is there a line when doing nodetool info/status like: Datacenter: datacenter1 = You need to make sure the Datacenter name matches the name specified in your replication factor Chris On Jul 14, 2014, at 12:04 PM, Ruchir Jha wrote: > Mark, > > Here you go: > > NodeTool status: > > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns Host ID >Rack > UN 10.10.20.15 1.62 TB256 8.1% > 01a01f07-4df2-4c87-98e9-8dd38b3e4aee rack1 > UN 10.10.20.19 1.66 TB256 8.3% > 30ddf003-4d59-4a3e-85fa-e94e4adba1cb rack1 > UN 10.10.20.35 1.62 TB256 9.0% > 17cb8772-2444-46ff-8525-33746514727d rack1 > UN 10.10.20.31 1.64 TB256 8.3% > 1435acf9-c64d-4bcd-b6a4-abcec209815e rack1 > UN 10.10.20.52 1.59 TB256 9.1% > 6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e rack1 > UN 10.10.20.27 1.66 TB256 7.7% > 76023cdd-c42d-4068-8b53-ae94584b8b04 rack1 > UN 10.10.20.22 1.66 TB256 8.9% > 46af9664-8975-4c91-847f-3f7b8f8d5ce2 rack1 > UN 10.10.20.39 1.68 TB256 8.0% > b7d44c26-4d75-4d36-a779-b7e7bdaecbc9 rack1 > UN 10.10.20.45 1.49 TB256 7.7% > 8d6bce33-8179-4660-8443-2cf822074ca4 rack1 > UN 10.10.20.47 1.64 TB256 7.9% > bcd51a92-3150-41ae-9c51-104ea154f6fa rack1 > UN 10.10.20.62 1.59 TB256 8.2% > 84b47313-da75-4519-94f3-3951d554a3e5 rack1 > UN 10.10.20.51 1.66 TB256 8.9% > 0343cd58-3686-465f-8280-56fb72d161e2 rack1 > > > Astyanax Connection Settings: > > seeds :12 > maxConns :16 > maxConnsPerHost:16 > connectTimeout :2000 > socketTimeout :6 > maxTimeoutCount:16 > maxBlockedThreadsPerHost:16 > maxOperationsPerConnection:16 > DiscoveryType: RING_DESCRIBE > ConnectionPoolType: TOKEN_AWARE > DefaultReadConsistencyLevel: CL_QUORUM > DefaultWriteConsistencyLevel: CL_QUORUM > > > > On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy wrote: > Can you post the output of nodetool status and your Astyanax connection > settings? > > > On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha wrote: > This is how we create our keyspace. We just ran this command once through a > cqlsh session on one of the nodes, so don't quite understand what you mean by > "check that your DC names match up" > > CREATE KEYSPACE prod WITH replication = { > 'class': 'NetworkTopologyStrategy', > 'datacenter1': '3' > }; > > > > On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink > wrote: > What replication strategy are you using? 
if using NetworkTopolgyStrategy > double check that your DC names match up (case sensitive) > > Chris > > On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: > >> Here's the complete stack trace: >> >> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >> TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, >> latency=22784(42874), attempts=3]UnavailableException() >> at >> com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) >> at >> com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) >> at >> com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69) >> at >> com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123) >> Caused by: UnavailableException() >> at >> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841) >> at >> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) >> at >> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964) >> at >> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60) >> ... 12 more >> >> >> >> On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav wrote: >> Please post the full exception
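If the names had not matched, a sketch of the fix would be to alter the
keyspace so the replication map uses the exact datacenter name that
nodetool prints (the 'DC1' below is hypothetical, standing in for whatever
your nodetool status reports):

ALTER KEYSPACE prod WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': '3'
};

-- then run a nodetool repair so existing data is re-replicated
-- according to the corrected map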
Re: UnavailableException
If you list all 12 nodes in seeds list, you can try using NodeDiscoveryType.NONE instead of RING_DESCRIBE. Its been recommended that way by some anyway so if you add nodes to cluster your app wont start using it until all bootstrapping and everythings settled down. Chris On Jul 14, 2014, at 12:04 PM, Ruchir Jha wrote: > Mark, > > Here you go: > > NodeTool status: > > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns Host ID >Rack > UN 10.10.20.15 1.62 TB256 8.1% > 01a01f07-4df2-4c87-98e9-8dd38b3e4aee rack1 > UN 10.10.20.19 1.66 TB256 8.3% > 30ddf003-4d59-4a3e-85fa-e94e4adba1cb rack1 > UN 10.10.20.35 1.62 TB256 9.0% > 17cb8772-2444-46ff-8525-33746514727d rack1 > UN 10.10.20.31 1.64 TB256 8.3% > 1435acf9-c64d-4bcd-b6a4-abcec209815e rack1 > UN 10.10.20.52 1.59 TB256 9.1% > 6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e rack1 > UN 10.10.20.27 1.66 TB256 7.7% > 76023cdd-c42d-4068-8b53-ae94584b8b04 rack1 > UN 10.10.20.22 1.66 TB256 8.9% > 46af9664-8975-4c91-847f-3f7b8f8d5ce2 rack1 > UN 10.10.20.39 1.68 TB256 8.0% > b7d44c26-4d75-4d36-a779-b7e7bdaecbc9 rack1 > UN 10.10.20.45 1.49 TB256 7.7% > 8d6bce33-8179-4660-8443-2cf822074ca4 rack1 > UN 10.10.20.47 1.64 TB256 7.9% > bcd51a92-3150-41ae-9c51-104ea154f6fa rack1 > UN 10.10.20.62 1.59 TB256 8.2% > 84b47313-da75-4519-94f3-3951d554a3e5 rack1 > UN 10.10.20.51 1.66 TB256 8.9% > 0343cd58-3686-465f-8280-56fb72d161e2 rack1 > > > Astyanax Connection Settings: > > seeds :12 > maxConns :16 > maxConnsPerHost:16 > connectTimeout :2000 > socketTimeout :6 > maxTimeoutCount:16 > maxBlockedThreadsPerHost:16 > maxOperationsPerConnection:16 > DiscoveryType: RING_DESCRIBE > ConnectionPoolType: TOKEN_AWARE > DefaultReadConsistencyLevel: CL_QUORUM > DefaultWriteConsistencyLevel: CL_QUORUM > > > > On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy wrote: > Can you post the output of nodetool status and your Astyanax connection > settings? > > > On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha wrote: > This is how we create our keyspace. We just ran this command once through a > cqlsh session on one of the nodes, so don't quite understand what you mean by > "check that your DC names match up" > > CREATE KEYSPACE prod WITH replication = { > 'class': 'NetworkTopologyStrategy', > 'datacenter1': '3' > }; > > > > On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink > wrote: > What replication strategy are you using? 
if using NetworkTopolgyStrategy > double check that your DC names match up (case sensitive) > > Chris > > On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: > >> Here's the complete stack trace: >> >> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >> TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, >> latency=22784(42874), attempts=3]UnavailableException() >> at >> com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) >> at >> com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) >> at >> com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69) >> at >> com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123) >> Caused by: UnavailableException() >> at >> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841) >> at >> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) >> at >> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964) >> at >> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129) >> at >> com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126) >> at >> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60) >> ... 12 more >> >> >> >> On Fri, Jul 1
Re: UnavailableException
Yes the line is : Datacenter: datacenter1 which matches with my create keyspace command. As for the NodeDiscoveryType, we will follow it but I don't believe it to be the root of my issue here because the nodes start up atleast 6 hours before the UnavailableException and as far as adding nodes is concerned we would only do it after hours. On Mon, Jul 14, 2014 at 2:34 PM, Chris Lohfink wrote: > If you list all 12 nodes in seeds list, you can try using > NodeDiscoveryType.NONE instead of RING_DESCRIBE. > > Its been recommended that way by some anyway so if you add nodes to > cluster your app wont start using it until all bootstrapping and > everythings settled down. > > Chris > > On Jul 14, 2014, at 12:04 PM, Ruchir Jha wrote: > > Mark, > > Here you go: > > *NodeTool status:* > > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns Host ID > Rack > UN 10.10.20.15 1.62 TB256 8.1% > 01a01f07-4df2-4c87-98e9-8dd38b3e4aee rack1 > UN 10.10.20.19 1.66 TB256 8.3% > 30ddf003-4d59-4a3e-85fa-e94e4adba1cb rack1 > UN 10.10.20.35 1.62 TB256 9.0% > 17cb8772-2444-46ff-8525-33746514727d rack1 > UN 10.10.20.31 1.64 TB256 8.3% > 1435acf9-c64d-4bcd-b6a4-abcec209815e rack1 > UN 10.10.20.52 1.59 TB256 9.1% > 6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e rack1 > UN 10.10.20.27 1.66 TB256 7.7% > 76023cdd-c42d-4068-8b53-ae94584b8b04 rack1 > UN 10.10.20.22 1.66 TB256 8.9% > 46af9664-8975-4c91-847f-3f7b8f8d5ce2 rack1 > UN 10.10.20.39 1.68 TB256 8.0% > b7d44c26-4d75-4d36-a779-b7e7bdaecbc9 rack1 > UN 10.10.20.45 1.49 TB256 7.7% > 8d6bce33-8179-4660-8443-2cf822074ca4 rack1 > UN 10.10.20.47 1.64 TB256 7.9% > bcd51a92-3150-41ae-9c51-104ea154f6fa rack1 > UN 10.10.20.62 1.59 TB256 8.2% > 84b47313-da75-4519-94f3-3951d554a3e5 rack1 > UN 10.10.20.51 1.66 TB256 8.9% > 0343cd58-3686-465f-8280-56fb72d161e2 rack1 > > > *Astyanax Connection Settings:* > > seeds :12 > maxConns :16 > maxConnsPerHost:16 > connectTimeout :2000 > socketTimeout :6 > maxTimeoutCount:16 > maxBlockedThreadsPerHost:16 > maxOperationsPerConnection:16 > DiscoveryType: RING_DESCRIBE > ConnectionPoolType: TOKEN_AWARE > DefaultReadConsistencyLevel: CL_QUORUM > DefaultWriteConsistencyLevel: CL_QUORUM > > > > On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy > wrote: > >> Can you post the output of nodetool status and your Astyanax connection >> settings? >> >> >> On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha wrote: >> >>> This is how we create our keyspace. We just ran this command once >>> through a cqlsh session on one of the nodes, so don't quite understand what >>> you mean by "check that your DC names match up" >>> >>> CREATE KEYSPACE prod WITH replication = { >>> 'class': 'NetworkTopologyStrategy', >>> 'datacenter1': '3' >>> }; >>> >>> >>> >>> On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink >> > wrote: >>> What replication strategy are you using? 
if using NetworkTopolgyStrategy double check that your DC names match up (case sensitive) Chris On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: Here's the complete stack trace: com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException() at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165) at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69) at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256) at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485) at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79) at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123) Caused by: UnavailableException() at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964) at org.apache.c
Re: Multi-column range scans
Thanks for both your help, greatly appreciated. We'll proceed down the path of putting the filtering into the application logic for the time being. Matt. On Tue, Jul 15, 2014 at 1:20 AM, DuyHai Doan wrote: > Exact Ken, I get bitten again by the semantics of composite tuples. > > This kind of query won't be possible until something like wide row end > slice predicate is available ( > https://issues.apache.org/jira/browse/CASSANDRA-6167), if it will one day > > > > > On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock > wrote: > >> I don't think your query is doing what he wants. Your query will >> correctly set the starting point, but will also return larger interval_id's >> but with lower skill_levels: >> >> cqlsh:test> select * from skill_count where skill='Complaints' and >> (interval_id, skill_level) >= (140235930, 5); >> >> skill | interval_id | skill_level | skill_count >> +---+-+- >> Complaints | 140235930 | 5 | 20 >> Complaints | 140235930 | 8 | 30 >> Complaints | 140235930 | 10 | 1 >> Complaints | 140235940 | 2 | 10 >> Complaints | 140235940 | 8 | 30 >> >> (5 rows) >> >> cqlsh:test> select * from skill_count where skill='Complaints' and >> (interval_id, skill_level) >= (140235930, 5) and (interval_id) < >> (140235990); >> >> skill | interval_id | skill_level | skill_count >> +---+-+- >> Complaints | 140235930 | 5 | 20 <- desired >> Complaints | 140235930 | 8 | 30 <- desired >> Complaints | 140235930 | 10 | 1 <- desired >> Complaints | 140235940 | 2 | 10 <- SKIP >> Complaints | 140235940 | 8 | 30 <- desired >> >> The query results in a discontinuous range slice so isn't supported -- >> Essentially, the client will have to read the entire range and perform >> client-side filtering. Whether this is efficient depends on the >> cardinality of skill_level. >> >> I tried playing with the "allow filtering" cql clause, but it would >> appear from the documentation it's very restrictive... >> >> >> >> >> >> On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan >> wrote: >> >>> or : >>> >>> >>> select * from skill_count where skill='Complaints' >>> and (interval_id,skill_level) >= (140235930,5) >>> and (interval_id) < (140235990) >>> >>> Strange enough, when starting using tuple notation you'll need to stick >>> to it even if there is only one element in the tuple >>> >>> >>> On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan >>> wrote: >>> Sorry, I've just checked, the correct query should be: select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id,skill_level) < (140235990,11) On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan wrote: > Hello Mathew > > Since Cassandra 2.0.6 it is possible to query over composites: > https://issues.apache.org/jira/browse/CASSANDRA-4851 > > For your example: > > select * from skill_count where skill='Complaints' and > (interval_id,skill_level) >= (140235930,5) and interval_id < > 140235990; > > > On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen < > matthew.j.al...@gmail.com> wrote: > >> Hi, >> >> We have a roll-up table that as follows. >> >> CREATE TABLE SKILL_COUNT ( >> skill text, >> interval_id bigint, >> skill_level int, >> skill_count int, >> PRIMARY KEY (skill, interval_id, skill_level)); >> >> Essentially, >> skill = a names skill i.e. 
"Complaints" >> interval_id = a rounded epoch time (15 minute intervals) >> skill_level = a number/rating from 1-10 >> skill_count = the number of people with the specified skill, with >> the specified skill level, logged in at the interval_id >> >> We'd like to run the following query against it >> >> select * from skill_count where skill='Complaints' and interval_id >= >> 140235930 and interval_id < 140235990 and skill_level >= 5; >> >> to get a count of people with the relevant skill and level at the >> appropriate time. However I am getting the following message. >> >> Bad Request: PRIMARY KEY part skill_level cannot be restricted >> (preceding part interval_id is either not restricted or by a non-EQ >> relation) >> >> Looking at how the data is stored ... >> >> --- >> RowKey: Complaints >> => (name=140235930:2:, value=, timestamp=1405308260403000) >> => (name=140235930:2:skill_count, value=000a, >> timestamp=1405308260403000) >> => (name=140235930:5:, value=, timestamp=1405308260403001) >>
Re: high pending compactions
I'm looking into creation of monitoring thresholds for cassandra to report on its health. Does it make sense to set an alert threshold on compaction stats? If so, would setting it to a value equal to or greater than concurrent compactions make sense? Thanks, Greg On Mon, Jun 9, 2014 at 2:14 PM, S C wrote: > Thank you all for quick responses. > -- > From: clohf...@blackbirdit.com > Subject: Re: high pending compactions > Date: Mon, 9 Jun 2014 14:11:36 -0500 > To: user@cassandra.apache.org > > Bean: org.apache.cassandra.db.CompactionManager > > also nodetool compactionstats gives you how many are in the queue + > estimate of how many will be needed. > > in 1.1 you will OOM *far* before you hit the limit,. In theory though, > the compaction executor is a little special cased and will actually throw > an exception (normally it will block) > > Chris > > On Jun 9, 2014, at 7:49 AM, S C wrote: > > Thank you all for valuable suggestions. Couple more questions, > > How to check the compaction queue? MBean/C* system log ? > What happens if the queue is full? > > -- > From: colinkuo...@gmail.com > Date: Mon, 9 Jun 2014 18:53:41 +0800 > Subject: Re: high pending compactions > To: user@cassandra.apache.org > > As Jake suggested, you could firstly increase > "compaction_throughput_mb_per_sec" and "concurrent_compactions" to suitable > values if system resource is allowed. From my understanding, major > compaction will internally acquire lock before running compaction. In your > case, there might be a major compaction blocking the pending following > compaction tasks. You could check the result of "nodetool compactionstats" > and C* system log for double confirm. > > If the running compaction is compacting wide row for a long time, you > could try to tune "in_memory_compaction_limit_in_mb" value. > > Thanks, > > > > On Sun, Jun 8, 2014 at 11:27 PM, S C wrote: > > I am using Cassandra 1.1 (sorry bit old) and I am seeing high pending > compaction count. "pending tasks: 67" while active compaction tasks are > not more than 5. I have a 24CPU machine. Shouldn't I be seeing more > compactions? Is this a pattern of high writes and compactions backing up? > How can I improve this? Here are my thoughts. > > >1. Increase memtable_total_space_in_mb >2. Increase compaction_throughput_mb_per_sec >3. Increase concurrent_compactions > > > Sorry if this was discussed already. Any pointers is much appreciated. > > Thanks, > Kumar > > >
Index creation sometimes fails
Hi everyone,

I have some code that I've been fiddling with today that uses the DataStax
Java driver to create a table and then create a secondary index on a
column in that table. I've tested this code fairly thoroughly on a
single-node Cassandra instance on my laptop and in unit tests (using the
CassandraDaemon).

When running on a three-node cluster, however, I see strange behavior.
Although my table always gets created, the secondary index often does not!
If I delete the table and then create it again (through the same code that
I've written), I've never seen the index fail to appear the second time.

Does anyone have any idea what to look for here? I have no experience
working on a Cassandra cluster and I wonder if maybe I am doing something
dumb (I basically just installed DSE and started up the three nodes and
that was it). I don't see anything that looks unusual in OpsCenter for DSE.

The only thing I've noticed is that the presence of output like the
following from my program after executing the command to create the index
is perfectly correlated with successful creation of the index:

14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received event
EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
[Control connection] Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing
schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[b309518a-35d2-3790-bb66-ea39bb0d188c]

If anyone can give me a hand, I would really appreciate it. I am out of
ideas!

Best regards,
Clint
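For context, a minimal sketch of the pattern of statements being issued
here (only the table name kiji_retail2.t_model_repo appears in the logs
above; the column names are hypothetical, since the real schema is not
shown in this thread):

CREATE TABLE kiji_retail2.t_model_repo (
  model_name text PRIMARY KEY,  -- hypothetical column
  model_data blob,              -- hypothetical column
  family text                   -- hypothetical indexed column
);

CREATE INDEX model_repo_family_idx ON kiji_retail2.t_model_repo (family);

The schema-agreement lines in the log suggest the index statement is being
applied while the three nodes still disagree on the schema version, which
would fit the intermittent failure described.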
Re: Index creation sometimes fails
BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver on a three-node cluster with DSE 4.5. On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly wrote: > Hi everyone, > > I have some code that I've been fiddling with today that uses the > DataStax Java driver to create a table and then create a secondary > index on a column in that table. I've testing this code fairly > thoroughly on a single-node Cassandra instance on my laptop and in > unit test (using the CassandraDaemon). > > When running on a three-node cluster, however, I see strange behavior. > Although my table always gets created, the secondary index often does > not! If I delete the table and then create it again (through the same > code that I've written), I've never seen the index fail to appear the > second time. > > Does anyone have any idea what to look for here? I have no experience > working on a Cassandra cluster and I wonder if maybe I am doing > something dumb (I basically just installed DSE and started up the > three nodes and that was it). I don't see anything that looks unusual > in OpsCenter for DSE. > > The only thing I've noticed is that the presence of output like the > following from my program after executing the command to create the > index is perfectly correlated with successful creation of the index: > > 14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received > event EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery > 14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: > [Control connection] Refreshing schema for kiji_retail2 > 14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing > schema for kiji_retail2 > 14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: > Checking for schema agreement: versions are > [9a8d72f9-e384-3aa8-bc85-185e2c303ade, > b309518a-35d2-3790-bb66-ea39bb0d188c] > 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: > Checking for schema agreement: versions are > [9a8d72f9-e384-3aa8-bc85-185e2c303ade, > b309518a-35d2-3790-bb66-ea39bb0d188c] > 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: > Checking for schema agreement: versions are > [9a8d72f9-e384-3aa8-bc85-185e2c303ade, > b309518a-35d2-3790-bb66-ea39bb0d188c] > 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: > Checking for schema agreement: versions are > [9a8d72f9-e384-3aa8-bc85-185e2c303ade, > b309518a-35d2-3790-bb66-ea39bb0d188c] > 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: > Checking for schema agreement: versions are > [b309518a-35d2-3790-bb66-ea39bb0d188c] > > If anyone can give me a hand, I would really appreciate it. I am out of > ideas! > > Best regards, > Clint