Re: CAS operation does not return value on failure

2016-05-05 Thread Jack Krupansky
the exception trace class names indicate that the error is detected in the Java driver, not Cassandra. -- Jack Krupansky On Thu, May 5, 2016 at 6:45 PM, horschi wrote: > Hi Jack, > > I thought that it is Cassandra that fills the value on CAS failures. So > the question if it is to

Re: CAS operation does not return value on failure

2016-05-04 Thread Jack Krupansky
Probably better to ask this on the Java driver user list. -- Jack Krupansky On Wed, May 4, 2016 at 11:46 AM, horschi wrote: > Hi, > > I am doing some testing on CAS operations and I am frequently having the > issue that my resultset says wasApplied()==false, but it does not c

Re: Security assessment of Cassandra

2016-04-26 Thread Jack Krupansky
Just following up... Oleg, have you gotten a satisfactory level of feedback from the community on the security assessment issues? And if there is any sort of final assessment that can be publicly accessed, that would be great. -- Jack Krupansky On Thu, Feb 11, 2016 at 3:29 PM, oleg yusim wrote

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
for cqlsh? -- Jack Krupansky On Tue, Apr 19, 2016 at 4:56 PM, Jack Krupansky wrote: > Sylvain & Tyler, this Jira is for a user reporting a timeout for SELECT > COUNT(*) using 3.3: > https://issues.apache.org/jira/browse/CASSANDRA-11566 > > I'll let one of you guys foll

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
ng should make that not a problem. Or is there a timeout in cqlsh simply because the operation is slow - as opposed to the server reporting an internal timeout? Thanks. -- Jack Krupansky On Tue, Apr 19, 2016 at 12:45 PM, Tyler Hobbs wrote: > > On Tue, Apr 19, 2016 at 11:32 AM, Jack Kru

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
be treated more as a batch-style OLAP operation rather than a real-time OLTP operation... I think. Thanks. -- Jack Krupansky On Tue, Apr 19, 2016 at 12:04 PM, Tyler Hobbs wrote: > > On Tue, Apr 19, 2016 at 9:51 AM, Jack Krupansky > wrote: > >> >> 1. Another clarification

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
proportional to the number of rows on all nodes? I mean, you can't dedupe using only partition keys of the coordinator node, right? What I'm wondering is if the usability of COUNT (et al) is memory limited as well as time. Thanks. -- Jack Krupansky On Tue, Apr 19, 2016 at 5:36 AM, Sylvain Lebre

Proper use of COUNT

2016-04-18 Thread Jack Krupansky
? A companion question is whether COUNT(column_name) has the same limitations and recommendations. It does have to actually fetch the column values as opposed to simply determining the existence of the row, but how consequential that additional processing is, I couldn't say. -- Jack Krupansky

Re: Experience with Kubernetes

2016-04-15 Thread Jack Krupansky
data or state is relatively inconsequential. How that model applies to a database server that works best with fairly large amounts of ultra-fast local data storage is not so obvious. Maybe that simply wasn't a design goal? -- Jack Krupansky On Fri, Apr 15, 2016 at 3:48 PM, David Aronchick

Re: Most stable version?

2016-04-14 Thread Jack Krupansky
keep your chosen release in production for longer than the older 3.0 releases will be in production. Ultimately, this is a personality test: Are you adventuresome or conservative? To be clear, with the new tick-tock release scheme, 3.5 is designed to be a stable release. -- Jack Krupansky On Thu

Re: Cassandra 2.1.12 Node size

2016-04-14 Thread Jack Krupansky
uring times of stress. -- Jack Krupansky On Thu, Apr 14, 2016 at 10:14 AM, Alain RODRIGUEZ wrote: > Would adding nodes be the right way to start if I want to get the data per >> node down > > > Yes, if everything else is fine, the last and always available option to > red

Experience with Kubernetes

2016-04-14 Thread Jack Krupansky
aster/examples/cassandra Is there a better approach to deploying a Cassandra/DSE cluster than Kubernetes? Thanks. -- Jack Krupansky

Re: performance question

2016-04-12 Thread Jack Krupansky
Facets can be used, and grouping of results as well, in DSE Search (Solr), but there are a lot of different approaches that can be used, depending on the specific user experience you require. -- Jack Krupansky On Tue, Apr 12, 2016 at 9:32 PM, Gross, Daniel wrote: > Hi Jack, > > >

Re: performance question

2016-04-12 Thread Jack Krupansky
full Solr searches, including faceting. The new SASI secondary index feature in Cassandra 3.4 can be used for some more sophisticated searches as well, but it's not quite up to what Stratio and DSE Search can do. -- Jack Krupansky On Tue, Apr 12, 2016 at 8:07 PM, Gross, Daniel wrote:

Re: Latency overhead on Cassandra cluster deployed on multiple AZs (AWS)

2016-04-12 Thread Jack Krupansky
Which instance type are you using? Some may be throttled for EBS access, so you could bump into a rate limit, and who knows what AWS will do at that point. -- Jack Krupansky On Tue, Apr 12, 2016 at 6:02 AM, Alessandro Pieri wrote: > Thanks Chris for your reply. > > I ran the tests 3

Re: DSE Search : NPE when executing Solr CQL queries using solr_query

2016-04-12 Thread Jack Krupansky
e the FOT is setting an output column value to NULL. Also, see if there is a "Caused By" entry elsewhere in the Java stack trace. -- Jack Krupansky On Tue, Apr 12, 2016 at 6:07 AM, Joseph Tech wrote: > hi, > > I am facing an issue where Solr queries executed from cqlsh using t

Re: Migrating to CQL and Non Compact Storage

2016-04-11 Thread Jack Krupansky
. -- Jack Krupansky On Mon, Apr 11, 2016 at 6:15 PM, Jim Ancona wrote: > > On Mon, Apr 11, 2016 at 4:19 PM, Jack Krupansky > wrote: > >> Some of this may depend on exactly how you are using so-called COMPACT >> STORAGE. I mean, if your tables really are modeled as all b

Re: Large primary keys

2016-04-11 Thread Jack Krupansky
document given the document text. -- Jack Krupansky On Mon, Apr 11, 2016 at 7:12 PM, James Carman wrote: > S3 maybe? > > On Mon, Apr 11, 2016 at 7:05 PM Robert Wille wrote: > >> I do realize its kind of a weird use case, but it is legitimate. I have a >> collection of docume

Re: Migrating to CQL and Non Compact Storage

2016-04-11 Thread Jack Krupansky
So, where are we? Is it just the complaint that migration is slow and re-modeling is difficult, or are there specific questions about how to do the re-modeling? -- Jack Krupansky On Mon, Apr 11, 2016 at 1:30 PM, Anuj Wadehra wrote: > Thanks Jim. I think you understand the pain of migrat

Re: 1, 2, 3...

2016-04-11 Thread Jack Krupansky
tition being treated as a single row. -- Jack Krupansky On Mon, Apr 11, 2016 at 11:46 AM, Emīls Šolmanis wrote: > Wouldn't the "number of keys" part of *nodetool cfstats* run on every > node, summed and divided by replication factor give you a decent > approximation?

Re: Migrating to CQL and Non Compact Storage

2016-04-11 Thread Jack Krupansky
bite the bullet and re-model your data to exploit the features of CQL rather than fight CQL trying to mimic Thrift per se. In any case, take another shot at framing the problem and then maybe people here can help you out. -- Jack Krupansky On Mon, Apr 11, 2016 at 10:39 AM, Anuj Wadehra wrote

Re: 1, 2, 3...

2016-04-11 Thread Jack Krupansky
for q=*:* and that will very quickly return the total row count. I presume that Stratio will handle this fine as well. -- Jack Krupansky On Mon, Apr 11, 2016 at 11:10 AM, wrote: > Cassandra is not good for table scan type queries (which count(*) > typically is). While there are some attem

Re: DataStax OpsCenter with Apache Cassandra

2016-04-11 Thread Jack Krupansky
(And what's the cost of a DSE license for DSE with Cassandra 3.x/3.5? No fair telling people they have to wait for DSE 5.0! Or 5.x, whenever Cassandra 3.4/3.5 will be supported.) -- Jack Krupansky On Mon, Apr 11, 2016 at 11:04 AM, wrote: > For 2.2 and earlier, there are no license fe

1, 2, 3...

2016-04-08 Thread Jack Krupansky
/preferred technique? For example, is it more efficient to query the row count one node at a time? And for bonus points: How do you count (CQL) rows for each node? Again, excluding replication. -- Jack Krupansky

Re: Cassandra Single Node Setup Questions

2016-04-07 Thread Jack Krupansky
Not that we aren't enthusiastic about you moving to Cassandra, but it needs to be for the right reasons, and for Cassandra the right reasons are scaling and HA. In case it's not obvious, I would make a really lousy used-car or real-estate/time-share salesman! -- Jack Krupansky On

Re: Cassandra Single Node Setup Questions

2016-04-06 Thread Jack Krupansky
andra (properly.) -- Jack Krupansky On Wed, Apr 6, 2016 at 10:30 AM, Paco Trujillo wrote: > The fact that there is one single DC does not mean that you do not need > multiples nodes. Without multiples nodes you do not have redundancy (the > nodes fail and you lose the database) and you

Re: Cassandra Single Node Setup Questions

2016-04-06 Thread Jack Krupansky
applications which have a lot of data and the need for high availability (redundancy, meaning at least three copies of the data.) Neither of which seems to be your requirement. How much data do you have? What led you to believe that you only need a single node? -- Jack Krupansky On Wed, Apr 6

Re: Cassandra table limitation

2016-04-06 Thread Jack Krupansky
cluster as the other tenants. Were there any other specific reasons for choosing Cassandra other than pursuing multi-tenancy? Out of curiosity, what source of information pointed you in the direction of multi-tenancy? -- Jack Krupansky On Wed, Apr 6, 2016 at 1:17 AM, Kai Wang wrote: > With sm

Re: Cassandra table limitation

2016-04-05 Thread Jack Krupansky
collection of applications which share the same data. If there are multiple applications that don't share the same data, then they absolutely should be on separate clusters. -- Jack Krupansky On Tue, Apr 5, 2016 at 5:40 PM, Kai Wang wrote: > Once a while the question about table count rises

Re: How many nodes do we require

2016-03-31 Thread Jack Krupansky
Maybe that's a great definition of a modern distributed cluster: each person (node) has a different notion of priority. I'll wait for the next user email in which they complain that their data is "too stable" (missing updates.) -- Jack Krupansky On Thu, Mar 31, 2016 at 12:

Acceptable repair time

2016-03-28 Thread Jack Krupansky
acceptable full repair times for nodes and what the resulting node data size is. What impact vnodes has on these numbers is a bonus question. Thanks! -- Jack Krupansky

Solr and vnodes anyone?

2016-03-28 Thread Jack Krupansky
whether 64 or even 32 would deliver acceptable query performance? Anybody here have any practical experience on this issue, either testing or even better, in production? Absent any further input, my advice would be to limit DSE Search/Solr to a token count of 64 per node. -- Jack Krupansky

Re: Does saveToCassandra work with Cassandra Lucene plugin ?

2016-03-28 Thread Jack Krupansky
The exception message has an empty column name. Odd. Not sure if that is a bug in the exception code or whether you actually have an empty column name somewhere. Did you use the absolutely exact same commands to create the keyspace, table, and custom index as in the Stratio readme? -- Jack

Re: *** What is the best way to model this JSON *** ??

2016-03-28 Thread Jack Krupansky
range of potential queries? Which are the most common and need to be the fastest? -- Jack Krupansky On Mon, Mar 28, 2016 at 12:10 PM, Lokesh Ceeba - Vendor < lokesh.ce...@walmart.com> wrote: > Hello Team, > >How to design/develop the best data model for this JSON ?

Re: Consistency Level (QUORUM vs LOCAL_QUORUM)

2016-03-27 Thread Jack Krupansky
com/en/cassandra/3.x/cassandra/dml/dmlConfigConsistency.html . In short, Cassandra does indeed guarantee the degree of immediate consistency that you specify (and presumably want.) -- Jack Krupansky On Sun, Mar 27, 2016 at 6:36 PM, Harikrishnan A wrote: > Hello, > > I have a question re

Re: apache cassandra for trading system

2016-03-25 Thread Jack Krupansky
patterns will drive the data modeling, and also impact how much data you can realistically place on each node. What are your HA (High Availability) requirements? -- Jack Krupansky On Fri, Mar 25, 2016 at 2:40 PM, Jonathan Haddad wrote: > You can use keyspaces with multiple data centers to

Re: Understanding Cassandra tuning

2016-03-25 Thread Jack Krupansky
= 93.75 MB/sec, which is fairly close to your numbers, so just a little write amplification or spiking or fuzzy math on AWS end might trigger some AWS throttling. -- Jack Krupansky On Fri, Mar 25, 2016 at 11:42 AM, Giampaolo Trapasso < giampaolo.trapa...@radicalbit.io> wrote: >

Re: How many nodes do we require

2016-03-25 Thread Jack Krupansky
It depends on how much data you have. A single node can store a lot of data, but the more data you have the longer a repair or node replacement will take. How long can you tolerate for a full repair or node replacement? Generally, RF=3 is both sufficient and recommended. -- Jack Krupansky On

Re: What is the best way to model my time series?

2016-03-25 Thread Jack Krupansky
attern for Cassandra. But... you can probably get it to work with enough care and sufficient provisioning of the cluster. The big problem is that rapid, large-scale removal from the queue generates tons of tombstones that need to be removed. The DateTieredCompactionStrategy may help as well. -- Jack

Re: Counter values become under-counted when running repair.

2016-03-24 Thread Jack Krupansky
, but normally apps try to issue requests to a local data center for performance. Having to ping all data centers on all requests to achieve a quorum seems a bit excessive. Can you advise us on your thinking when you selected RF=2? -- Jack Krupansky On Thu, Mar 24, 2016 at 2:17 AM, Dikang Gu

Re: Rack aware question.

2016-03-23 Thread Jack Krupansky
ata can be retrieved even when a rack-level failure occurs. In short, if CL=ALL is acceptable, then you might as well dump the rack-aware approach, which was how you got into this situation in the first place. -- Jack Krupansky On Wed, Mar 23, 2016 at 7:31 PM, Anubhav Kale wrote: > I ran into

Re: com.datastax.driver.core.Connection "This should not happen and is likely a bug, please report."

2016-03-23 Thread Jack Krupansky
iling list is always a good idea, but as we continually see, people have difficulty even discerning the distinction between the Cassandra user list and the driver user lists. In truth, to some (a lot) of us, this distinction between "user" and "driver" is quite baffling. Sor

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Jack Krupansky
executedby is the ID assigned to an employee. I'm presuming that JSON is to be used for objectbefore/after. This suggests no ability to query by individual object fields. I didn't sense any other columns that would be JSON. -- Jack Krupansky On Wed, Mar 16, 2016 at 3:48 PM, Tom van

Re: Single node Solr FTs not working

2016-03-19 Thread Jack Krupansky
highlight what the difference is that causes the problem. Doc: http://docs.datastax.com/en/latest-dse/datastax_enterprise/srch/srchTrnsFrm.html -- Jack Krupansky On Fri, Mar 18, 2016 at 4:30 AM, Joseph Tech wrote: > Hi, > > I had setup a single-node DSE 4.8.x to start in Search mode t

Re: Questions about Datastax support

2016-03-19 Thread Jack Krupansky
lease. -- Jack Krupansky On Thu, Mar 17, 2016 at 10:39 AM, Rakesh Kumar wrote: > > 1. They have a published support policy: > > http://www.datastax.com/support-policy/supported-software > > Why is the version number so different from the cassandra community > editio

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Jack Krupansky
that an MV PK can only include one non-PK data column - CASSANDRA-9928 <https://issues.apache.org/jira/browse/CASSANDRA-9928>.) -- Jack Krupansky On Wed, Mar 16, 2016 at 4:40 PM, I PVP wrote: > Jack/Tom > Thanks for answering. > > Here is the table definition so far: > >

Re: Questions about Datastax support

2016-03-19 Thread Jack Krupansky
1. They have a published support policy: http://www.datastax.com/support-policy/supported-software -- Jack Krupansky On Thu, Mar 17, 2016 at 10:09 AM, Rakesh Kumar wrote: > Few questions: > > 1 - Has there been an announcement as to when Datastax will stop > supporting 2.x v

Re: Question about SELECT command

2016-03-19 Thread Jack Krupansky
value which can directly be mapped to a node (or multiple nodes with replication.) Ad hoc, complex, and expensive queries are anti-patterns in Cassandra (very discouraged if not outright not supported.) -- Jack Krupansky On Thu, Mar 17, 2016 at 12:25 PM, Thouraya TH wrote: > Yes, i have tes

Re:

2016-03-15 Thread Jack Krupansky
Be sure to post your final (working) insert for others to learn from! -- Jack Krupansky On Tue, Mar 15, 2016 at 11:56 AM, Rami Badran wrote: > thanks got it > > On Tue, Mar 15, 2016 at 5:54 PM, Jack Krupansky > wrote: > >> There's a UDT example in the doc, showing

Re:

2016-03-15 Thread Jack Krupansky
There's a UDT example in the doc, showing that you don't put quotes around the UDT key names: https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertUDT.html -- Jack Krupansky On Tue, Mar 15, 2016 at 11:52 AM, Jack Krupansky wrote: > In any case, please post any diagn

Re:

2016-03-15 Thread Jack Krupansky
In any case, please post any diagnostic message/exception that you may be getting. -- Jack Krupansky On Tue, Mar 15, 2016 at 11:13 AM, Rami Badran wrote: > sorry like this > insert into users (uid,loginIds) values ('111','{ 'emails' : '{'

Re:

2016-03-15 Thread Jack Krupansky
No quotes around the UDT key names. (Or use double quotes.) -- Jack Krupansky On Tue, Mar 15, 2016 at 10:56 AM, Rami Badran wrote: > here is the CQL > > insert into users (uid,loginIds) values ('111',{ 'emails' : {' > f...@baggins.com', '

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jack Krupansky
still needs to be centered on point queries and narrow contiguous slices. Even with Spark and analytics that may indeed need to do a full scan of a large amount of data, the model needs to be that the big scan is done in small chunks. -- Jack Krupansky On Sat, Mar 12, 2016 at 10:23 AM, Jason

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
Thanks, that level of query detail gives us a better picture to focus on. I'll think through this some more over the weekend. Also, these queries focus on raw, bulk retrieval of sensor data readings, but do you have reading-based queries, such as range of an actual sensor reading? -- Jack Krupansky

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
ber of rows) without hitting a bulk size issue for the partition. But... I don't want to jump to solutions until we have a firmer handle on the query side of the fence. -- Jack Krupansky On Fri, Mar 11, 2016 at 5:37 PM, Jason Kania wrote: > Jack, > > Thanks for the response. > &

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
standing all of this other stuff upfront. -- Jack Krupansky On Thu, Mar 10, 2016 at 12:39 PM, Jason Kania wrote: > Jack, > > Thanks for the response. I don't think I provided enough information and > used the wrong terminology as your response is more the canned advice is > re

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

2016-03-11 Thread Jack Krupansky
What is your schema and data like - in particular, how wide are your partitions (number of rows and typical row size)? Maybe you just need (a lot) more heap for rows during the repair process. -- Jack Krupansky On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb wrote: > These are brand new bo

Re: What is wrong in this token function

2016-03-10 Thread Jack Krupansky
(for 2.2 and 3.x) -- Jack Krupansky On Thu, Mar 10, 2016 at 5:14 PM, Rakesh Kumar wrote: > I am using default Murmur3. So are you saying in case of Murmur3 the > following two queries > > select count*) > where customer_id = '289' > and event_time >= '201

Re: Exception about too long clustering key

2016-03-10 Thread Jack Krupansky
ons of the repo. Interesting. I mean, I wanted to search through the code as of the tag for 2.2.4. You would have to actually check out the code from that tag and then search in an IDE. -- Jack Krupansky On Thu, Mar 10, 2016 at 3:53 PM, Emīls Šolmanis wrote: > > Jack > > Yeah, I tra

Re: What is wrong in this token function

2016-03-10 Thread Jack Krupansky
ou can use RDBMS-like WHERE conditions to select a slice of the partition. -- Jack Krupansky On Thu, Mar 10, 2016 at 4:45 PM, Rakesh Kumar wrote: > > typo: the primary key was (customer_id + event_time ) > > > -Original Message- > From: Rakesh Kumar > To: user &g

Re: How to measure the write amplification of C*?

2016-03-10 Thread Jack Krupansky
commit log on a separate SSD device. That should probably be mentioned. -- Jack Krupansky On Thu, Mar 10, 2016 at 12:52 PM, Matt Kennedy wrote: > It isn't really the data written by the host that you're concerned with, > it's the data written by your application. I'd start b

Re: Exception about too long clustering key

2016-03-10 Thread Jack Krupansky
Did you ever find the source of the message? I couldn't find it in github either, either in the driver or Cassandra proper. -- Jack Krupansky On Thu, Mar 10, 2016 at 12:39 PM, Emīls Šolmanis wrote: > In case someone stumbles upon this same thing later. > > Ended up being a collec

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jack Krupansky
ditional tables. As a general proposition, Cassandra should not be used for heavy filtering - query tables with the filtering criteria baked into the PK is the way to go. -- Jack Krupansky On Thu, Mar 10, 2016 at 8:54 AM, Jason Kania wrote: > Hi, > > We have sensor input that cr

Re: How can I make Cassandra stable in a 2GB RAM node environment ?

2016-03-09 Thread Jack Krupansky
ich trades off performance for storage capacity. But... that would be an enhancement, not something that is "supported" out of the box today. What use cases would this satisfy? I mean, who is it that can get away with sacrificing performance these days? -- Jack Krupansky On Mon, Mar 7, 201

Re: ntpd clock sync

2016-03-09 Thread Jack Krupansky
How far out of sync are the nodes? A few minutes or less? Many hours? Worst case, you could simply take the entire cluster down until that future time has passed and then bring it back up. -- Jack Krupansky On Wed, Mar 9, 2016 at 11:27 AM, Jeff Jirsa wrote: > If you don’t overwrite or del

Re: moving keyspaces to another disk while Cassandra is running

2016-03-07 Thread Jack Krupansky
small is small? Six nodes? -- Jack Krupansky On Mon, Mar 7, 2016 at 5:57 AM, Krzysztof Księżyk wrote: > Hi, > > I have small Cassandra cluster running on boxes with 256GB SSD and 2TB HDD. > Originally SSD was for system and commit log and HDD for data. But > unfortunately becau

Re: How to create an additional cluster in Cassandra exclusively for Analytics Purpose

2016-03-06 Thread Jack Krupansky
ports keyword and prefix/suffix search. But it doesn't support multi-column ad hoc queries, which is what people tend to use Lucene and Solr for. So, again, it all depends on your queries and your data cardinality. -- Jack Krupansky On Sun, Mar 6, 2016 at 1:29 AM, Bhuvan Rawal wrote: > Yes

Re: How to create an additional cluster in Cassandra exclusively for Analytics Purpose

2016-03-05 Thread Jack Krupansky
You haven't been clear about how you intend to add Solr. You can also use Stratio or Stargate for basic Lucene search if you don't need full Solr support and want to stick to open source rather than go with DSE Search for Solr. -- Jack Krupansky On Sun, Mar 6, 2016 at 12:25 AM, Bh

Re: How can I make Cassandra stable in a 2GB RAM node environment ?

2016-03-04 Thread Jack Krupansky
are absolute requirements. -- Jack Krupansky On Fri, Mar 4, 2016 at 9:04 PM, Hiroyuki Yamada wrote: > Hi, > > I'm working on some POCs for Cassandra with single 2GB RAM node > environment and > some issues came up with me, so let me ask here. > > I have tried to insert

Re: Updating secondary index options

2016-03-04 Thread Jack Krupansky
Is this a secondary indexer of your own design so that you know that changing the options will be safe for existing index entries? It might be worth a Jira. Otherwise, you may just have to manually go in and hack the information under the hood. -- Jack Krupansky On Fri, Mar 4, 2016 at 12:14 PM

Re: Removing Node causes bunch of HostUnavailableException

2016-03-03 Thread Jack Krupansky
? When you say that the failures don't last for more than a few minutes, you mean from the moment you perform the nodetool removenode? And is operation completely normal after those few minutes? -- Jack Krupansky On Thu, Mar 3, 2016 at 4:40 PM, Peddi, Praveen wrote: > Hi Jack, > >

Re: Removing Node causes bunch of HostUnavailableException

2016-03-03 Thread Jack Krupansky
of nodes that were removed? How many seed nodes does each node typically have? -- Jack Krupansky On Thu, Mar 3, 2016 at 4:16 PM, Peddi, Praveen wrote: > Thanks Alain for quick and detailed response. My answers inline. One thing > I want to clarify is, the nodes got recycled due to some aut

Re: Practical limit on number of column families

2016-03-01 Thread Jack Krupansky
It is the total table count, across all key spaces. Memory is memory. -- Jack Krupansky On Tue, Mar 1, 2016 at 6:26 PM, Brian Sam-Bodden wrote: > Eric, > Is the keyspace as a multitenancy solution as bad as the many tables > pattern? Is the memory overhead of keyspaces as heavy a

Re: List of List

2016-03-01 Thread Jack Krupansky
Thrift? Hah! Sorry, I can't help you if you are going that route. I recommend CQL - only. -- Jack Krupansky On Tue, Mar 1, 2016 at 4:47 PM, Sandeep Kalra wrote: > The way I was planning is to give a restful interface to lookup details of > a question, and then user must get compl

Re: Commit log size vs memtable total size

2016-03-01 Thread Jack Krupansky
It would be nice to get this info into the doc or at least a blog post. -- Jack Krupansky On Tue, Mar 1, 2016 at 4:37 PM, Tyler Hobbs wrote: > > On Tue, Mar 1, 2016 at 6:13 AM, Vlad wrote: > >> So commit log can't keep more than memtable size, why is difference in >>

Re: List of List

2016-03-01 Thread Jack Krupansky
Okay, so a very large number of questions, each with a very modest number of answers (generally under 5), each with a modest number of comments (generally under 5). Now we're back to the issue of how you wish to query and access the data. -- Jack Krupansky On Tue, Mar 1, 2016 at 12:

Re: Cassandra Ussages

2016-03-01 Thread Jack Krupansky
I would spin it as Cassandra being the right choice where your primary need is OLTP, with a secondary need for analytics. IOW, where you would otherwise need to use two separate databases for the same data. -- Jack Krupansky On Tue, Mar 1, 2016 at 12:40 PM, Jonathan Haddad wrote: > Sp

Re: List of List

2016-03-01 Thread Jack Krupansky
Clustering columns are your friends. But the first question is how you need to query the data. Queries drive data models in Cassandra. What is the cardinality of this data - how many answers per question and how many comments per answer? -- Jack Krupansky On Tue, Mar 1, 2016 at 12:23 PM

Re: Cassandra Ussages

2016-03-01 Thread Jack Krupansky
with string key values effectively gives you extensible columns. -- Jack Krupansky On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi wrote: > Jonathan thanks for the link, > I believe that maybe is good as Data Store part, because is fast for I/o > and handles Time Series, for analytics

Re: Practical limit on number of column families

2016-03-01 Thread Jack Krupansky
mate use case that can't easily be handled by a single table, that could get the discussion started. -- Jack Krupansky On Tue, Mar 1, 2016 at 9:11 AM, Fernando Jimenez < fernando.jime...@wealth-port.com> wrote: > Hi Jack > > Being purposefully developed to only handle up to “a few h

Re: Practical limit on number of column families

2016-03-01 Thread Jack Krupansky
se is strongly not recommended. As the Jira notes, "having more than dozens or hundreds of tables defined is almost certainly a Bad Idea." "Bad Idea" means not good. As in don't go there. And if you do, don't expect such a mis-adventure to be supported by the community

Re: Practical limit on number of column families

2016-03-01 Thread Jack Krupansky
, your specific access patterns, and your specific load. And it also depends on your own personal tolerance for degradation of latency and throughput - some people might find a given set of performance metrics acceptable while others might not. -- Jack Krupansky On Tue, Mar 1, 2016 at 3:54 AM, F

Re: Practical limit on number of column families

2016-02-29 Thread Jack Krupansky
asically have two choices: an additional cluster column to distinguish categories of table, or separate clusters for every few hundred tables. -- Jack Krupansky On Mon, Feb 29, 2016 at 12:30 PM, Fernando Jimenez < fernando.jime...@wealth-port.com> wrote: > Hi all > > I have a u

Re: Cassandra Data Audit

2016-02-25 Thread Jack Krupansky
There is an open Jira on this exact topic - Change Data Capture (CDC): https://issues.apache.org/jira/browse/CASSANDRA-8844 Unfortunately, open means not yet done. -- Jack Krupansky On Thu, Feb 25, 2016 at 2:13 AM, Charulata Sharma (charshar) < chars...@cisco.com> wrote: > Thank

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-24 Thread Jack Krupansky
s commence immediately, fairly soon, or only after about as long as they take from a clean fresh start? -- Jack Krupansky On Wed, Feb 24, 2016 at 7:04 PM, Mike Heffner wrote: > Nate, > > So we have run several install tests, bisecting the 2.1.x release line, > and we believe t

Re: JBOD device space allocation?

2016-02-24 Thread Jack Krupansky
o under which the user would hit this? I mean, why would the code care either way with respect to JBOD strategy for the case where no local data is stored? -- Jack Krupansky On Wed, Feb 24, 2016 at 2:15 AM, Marcus Eriksson wrote: > It is mentioned here btw: http://www.datastax.com/dev/blog/im

JBOD device space allocation?

2016-02-23 Thread Jack Krupansky
are seeing for device space utilization. Thanks! -- Jack Krupansky

Re: Nodes go down periodically

2016-02-23 Thread Jack Krupansky
most technical problems on a node would be clearly logged on that node. If you see a lapse of connectivity no more than once or twice a day, consider yourselves lucky. Is it only one node at a time that goes down, and at widely dispersed times? How many nodes? -- Jack Krupansky On Tue, Feb 23

Re: Gossip Protocol

2016-02-21 Thread Jack Krupansky
. What type of info did you wish to pass around? -- Jack Krupansky On Sun, Feb 21, 2016 at 8:56 AM, Thouraya TH wrote: > Hi all; > > Please, where can i find what are the details saved by gossip protocol ? > > Is it possible to add other informations to informations exchanged

Re: Forming a cluster of embedded Cassandra instances

2016-02-15 Thread Jack Krupansky
But again, you could also simply spawn a process running Cassandra as-is in its intended form which would eliminate the potential for conflict between the app heap and Cassandra's JVM heap. -- Jack Krupansky On Mon, Feb 15, 2016 at 12:56 AM, Jan Kesten wrote: > Hi, > > the embedde

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
What does your query actually look like today? Is your non-EQ on timestamp selecting a single row, a few rows, or many rows (dozens, hundreds, thousands)? -- Jack Krupansky On Sun, Feb 14, 2016 at 7:40 PM, Gianluca Borello wrote: > Thanks again. > > One clarification about "readi

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
You can definitely read all of the columns in a single SELECT. And the n-INSERTS can be batched and will insert fewer cells in the storage engine than the previous approach. -- Jack Krupansky On Sun, Feb 14, 2016 at 7:31 PM, Gianluca Borello wrote: > Thank you for your reply. > > Your

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
of them. -- Jack Krupansky On Sun, Feb 14, 2016 at 5:22 PM, Gianluca Borello wrote: > Hi > > I've just painfully discovered a "little" detail in Cassandra: Cassandra > touches all columns on a CQL select (related issues > https://issues.apache.org/jira/

Re: Forming a cluster of embedded Cassandra instances

2016-02-14 Thread Jack Krupansky
What motivated the use of an embedded instance for development - as opposed to simply spawning a process for Cassandra? -- Jack Krupansky On Sun, Feb 14, 2016 at 2:05 PM, John Sanda wrote: > The project I work on day to day uses an embedded instance of Cassandra, > but it is intend

Re: Forming a cluster of embedded Cassandra instances

2016-02-13 Thread Jack Krupansky
des. That said, if any of the senior Cassandra developers wish to personally support your efforts towards embedded clusters, they are certainly free to do so. We'll see if any of them step forward. -- Jack Krupansky On Sat, Feb 13, 2016 at 3:47 PM, Binil Thomas wrote: > Hi all, > &

Re: Cassandra eats all cpu cores, high load average

2016-02-12 Thread Jack Krupansky
problem recurs for that node? -- Jack Krupansky On Fri, Feb 12, 2016 at 4:06 AM, Skvazh Roman wrote: > Hello! > We have a cluster of 25 c3.4xlarge nodes (16 cores, 32 GiB) with attached > 1.5 TB 4000 PIOPS EBS drive. > Sometimes one or two nodes user cpu spikes to 100%, load aver

Re: Rows with same key

2016-02-11 Thread Jack Krupansky
(Note to self... check docs to see if they give this troubleshooting tip. I didn't see it at first glance.) -- Jack Krupansky On Thu, Feb 11, 2016 at 2:45 PM, Kai Wang wrote: > Are you supplying timestamps from the client side? Are clocks in sync > cross your nodes? > > >

Re: Session timeout

2016-02-11 Thread Jack Krupansky
mitigation efforts if their infrastructure does not implicitly effect mitigation for various security exposures. -- Jack Krupansky On Thu, Feb 11, 2016 at 3:21 PM, oleg yusim wrote: > Robert, Jack, Bryan, > > As you suggested, I put together document, titled > Cassandra_Security_Topics

Re: Security labels

2016-02-11 Thread Jack Krupansky
document or is it strictly internal for your employer? I know there is a database of these assessments, but I don't know who controls what becomes public and when. -- Jack Krupansky On Thu, Feb 11, 2016 at 3:23 PM, oleg yusim wrote: > Hi Dani, > > As promised, I sort of put all my q

Re: Cassandra Collections performance issue

2016-02-11 Thread Jack Krupansky
are you indexing map columns, keys or values? -- Jack Krupansky On Thu, Feb 11, 2016 at 10:44 AM, Clint Martin < clintlmar...@coolfiretechnologies.com> wrote: > I have experienced excessive performance issues while using collections as > well. Mostly my issue was due to the excessi
