Re: Repair completes successfully but data is still inconsistent

2014-11-27 Thread André Cruz
On 26 Nov 2014, at 19:07, Robert Coli wrote: > > Yes. Do you know if 5748 was created as a result of compaction or via a flush > from a memtable? It was the result of a compaction: INFO [CompactionExecutor:22422] 2014-11-13 13:08:41,926 CompactionTask.java (line 262) Compacted 2 sstables to

Re: Repair completes successfully but data is still inconsistent

2014-11-26 Thread André Cruz
On 24 Nov 2014, at 18:54, Robert Coli wrote: > > But for any given value on any given node, you can verify the value it has in > 100% of SStables... that's what both the normal read path and repair should > do when reconciling row fragments into the materialized row? Hard to > understand a cas

Re: Repair completes successfully but data is still inconsistent

2014-11-24 Thread André Cruz
On 21 Nov 2014, at 19:01, Robert Coli wrote: > > 2- Why won’t repair propagate this column value to the other nodes? Repairs > have run everyday and the value is still missing on the other nodes. > > No idea. Are you sure it's not expired via TTL or masked in some other way? > When you ask tha

Re: Repair completes successfully but data is still inconsistent

2014-11-21 Thread André Cruz
On 19 Nov 2014, at 19:53, Robert Coli wrote: > > My hunch is that you originally triggered this by picking up some obsolete > SSTables during the 1.2 era. Probably if you clean up the existing zombies > you will not encounter them again, unless you encounter another "obsolete > sstables marked

Re: Repair completes successfully but data is still inconsistent

2014-11-19 Thread André Cruz
On 19 Nov 2014, at 11:37, André Cruz wrote: > > All the nodes were restarted on 21-23 October, for the upgrade (1.2.16 -> > 1.2.19) I mentioned. The delete happened after. I should also point out that > we were experiencing problems related to CASSANDRA-4206 and CASSANDRA

Re: Repair completes successfully but data is still inconsistent

2014-11-19 Thread André Cruz
ems now? Maybe with the upgrade the new Cassandra correctly compacted this row and all hell broke loose? If so, is there an easy way to fix this? Shouldn’t repair also propagate this zombie column to the other nodes? Thank you and best regards, André Cruz

Re: Repair completes successfully but data is still inconsistent

2014-11-18 Thread André Cruz
On 18 Nov 2014, at 01:08, Michael Shuler wrote: > > André, does `nodetool gossipinfo` show all the nodes in schema agreement? > Yes: $ nodetool -h XXX.XXX.XXX.XXX gossipinfo |grep -i schema SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e SCHEMA:8ef

Re: Repair completes successfully but data is still inconsistent

2014-11-17 Thread André Cruz
On 14 Nov 2014, at 18:44, André Cruz wrote: > > On 14 Nov 2014, at 18:29, Michael Shuler wrote: >> >> On 11/14/2014 12:12 PM, André Cruz wrote: >>> Some extra info. I checked the backups and on the 8th of November, all 3 >>> replicas had the tombstone o

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
On 14 Nov 2014, at 18:29, Michael Shuler wrote: > > On 11/14/2014 12:12 PM, André Cruz wrote: >> Some extra info. I checked the backups and on the 8th of November, all 3 >> replicas had the tombstone of the deleted column. So: >> >> 1 November - column is delete

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
replica has the original value (!), with the original timestamp… Is there a logical explanation for this behaviour? Thank you, André Cruz

Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
r this? On another note, why haven't repairs propagated this zombie column to the other nodes? Any help or pointers where to go next would be appreciated. Best regards, André Cruz

Extracting throughput-heavy Cassandra operations to different process

2013-09-25 Thread André Cruz
Hello. I have been looking into the memory patterns of Cassandra (1.1.5), and I have a suggestion that I haven't seen discussed here. Cassandra is configured by default with a GC tailored for pauseless operation instead of throughput, and this makes sense since it needs to answer client querie

Re: Frequent Full GC that take > 30s

2013-09-24 Thread André Cruz
On Sep 24, 2013, at 5:18 AM, Mohit Anchlia wrote: > Your ParNew size is way too small. Generally 4GB ParNew (-Xmn) works out best > for 16GB heap I was afraid that a 4GB ParNew would cause Young GCs to take too long. I'm going to test higher ParNew values. Thanks, André

Re: Frequent Full GC that take > 30s

2013-09-24 Thread André Cruz
On Sep 24, 2013, at 5:05 AM, 谢良 wrote: > it looks to me that "MaxTenuringThreshold" is too small, do you have any > chance to try with a bigger one, like 4 or 8 or sth else? MaxTenuringThreshold=1 seems a bit odd, yes. But it is the Cassandra default, maybe there is a reason for this? Perhaps

Frequent Full GC that take > 30s

2013-09-23 Thread André Cruz
ich application threads were stopped: 0.0408350 seconds Total time for which application threads were stopped: 0.0264510 seconds Thanks for the help, André Cruz

Repair needed on all nodes if RF == number of nodes?

2013-07-16 Thread André Cruz
Hello. I have a cluster with 3 nodes and RF is 3. I've noticed that when I run a repair on a node (I don't use -pr), all nodes are involved. So, does this mean the other nodes are "repaired" as well? Do I still need to run repair on the other 2 nodes inside the gc_grace_period? Thanks, André

Re: Compacted data returns with repair?

2013-06-04 Thread André Cruz
On Jun 4, 2013, at 4:54 PM, horschi wrote: > this sounds like the following issue: > > https://issues.apache.org/jira/browse/CASSANDRA-4905 Thanks! Another good reason to upgrade. André

Compacted data returns with repair?

2013-06-04 Thread André Cruz
Hello. I deleted a lot of data from one of my CFs, waited the gc_grace_period, and as the compactions were deleting the data slowly, ran a major compaction on that CF. It reduced the size to what I expected. I did not run a major compaction on the other 2 nodes (RF = 3) before repairs took plac

Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread André Cruz
On May 10, 2013, at 7:24 PM, Robert Coli wrote: > 1) What version of Cassandra do you run, on what hardware? 1.1.5 - 6 nodes, 32GB RAM, 300GB data per node, 900GB 10k RAID1, Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz. > 2) What consistency level do you write at? Do you do DELETEs? QUORUM. Yes,

Re: Quorum read after quorum write guarantee

2013-03-12 Thread André Cruz
On Mar 12, 2013, at 6:04 AM, aaron morton wrote: >> by a multiget will not find the just inserted data. > Can you explain how the data is not found. > Does it not find new columns or does it return stale columns ? It does not find new columns, I don't overwrite data. > If the read is run agai

Re: Quorum read after quorum write guarantee

2013-03-11 Thread André Cruz
On Mar 11, 2013, at 5:02 PM, Tyler Hobbs wrote: > What kind of inserts and multiget queries are you running? I use the ColumnFamily objects. The pool is initialised with "write_consistency_level=ConsistencyLevel.QUORUM". The insert is a regular insert, so the QUORUM is used. When fetching I us

Re: Quorum read after quorum write guarantee

2013-03-10 Thread André Cruz
On 10/03/2013, at 16:49, Chuan-Heng Hsiao wrote: > However, my guess is that cassandra only guarantee that > if you successfully write and you successfully read, then quorum will > give you the latest data. That's what I thought, but that's not what I'm seeing all the time. I have no errors read

Re: Quorum read after quorum write guarantee

2013-03-10 Thread André Cruz
Yes, same thread. Cassandra 1.1.5 btw. On 10/03/2013, at 16:47, Dave Brosius wrote: > is the read and write happening on the same thread? > > On 03/10/2013 12:00 PM, André Cruz wrote: >> Hello. >> >> In my application it sometimes happens t

Quorum read after quorum write guarantee

2013-03-10 Thread André Cruz
Hello. In my application it sometimes happens that I execute a multiget (I use pycassa) to fetch data that I have just inserted. I use quorum writes and reads, and my RF is 3. I've noticed that sometimes (1 in 1000 perhaps) an insert followed (300ms after) by a multiget will not find the just
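
For readers following this thread, the expectation behind "quorum read after quorum write" can be sketched as a small overlap calculation. This is only an illustration in plain Python, not Cassandra or pycassa code; the function names `quorum` and `read_overlaps_write` are hypothetical. It shows why a read should see the write with RF = 3, and does not explain the anomaly reported in the thread.

```python
def quorum(rf: int) -> int:
    """Number of replicas that form a quorum for replication factor rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, w: int, r: int) -> bool:
    """True when any r replicas read must share at least one replica
    with the w replicas that acknowledged the write (w + r > rf)."""
    return w + r > rf


rf = 3                 # replication factor mentioned in the thread
w = r = quorum(rf)     # QUORUM for both writes and reads

print("quorum size:", quorum(rf))
print("read overlaps write:", read_overlaps_write(rf, w, r))
# For comparison: CL ONE on both sides gives no such overlap with rf = 3.
print("ONE/ONE overlaps:", read_overlaps_write(rf, 1, 1))
```

With RF = 3 a quorum is 2 replicas, so a successful quorum write and a later quorum read must have at least one replica in common.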

Healthy JVM GC

2013-02-08 Thread André Cruz
nabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly Thanks, André Cruz

Re: Collecting of tombstones columns during read query fills up heap

2013-01-10 Thread André Cruz
On Jan 10, 2013, at 8:01 PM, aaron morton wrote: >> So, one column represents a file in that directory and it has no value. > Just so I understand, the file contents are *not* stored in the column value ? No, on that particular CF the columns are SuperColumns with 5 sub columns (size, is_dir,

Collecting of tombstones columns during read query fills up heap

2013-01-10 Thread André Cruz
d and queries to that row are efficient again. Thanks, André Cruz

Re: Row cache and counters

2013-01-03 Thread André Cruz
Does anyone see anything wrong in these settings? Anything to account for an 8s timeout during a counter increment? Thanks, André On 31/12/2012, at 14:35, André Cruz wrote: > On Dec 29, 2012, at 8:53 PM, Mohit Anchlia wrote: > >> Can you post gc settings? Also check logs and see

Re: Row cache and counters

2012-12-31 Thread André Cruz
Compacted row maximum size: 770 Compacted row mean size: 298 Is there anything wrong with my configuration? Best regards, André Cruz

Re: Row cache and counters

2012-12-29 Thread André Cruz
On 29/12/2012, at 16:59, rohit bhatia wrote: > Reads during a write still occur during a counter increment with CL ONE, but > that latency is not counted in the request latency for the write. Your local > node write latency of 45 microseconds is pretty quick. what is your timeout > and the wri

Row cache and counters

2012-12-29 Thread André Cruz
ted when a counter update occurs or is just invalidated? Best regards, André Cruz

Re: Strange delay in query

2012-11-13 Thread André Cruz
On Nov 13, 2012, at 8:54 AM, aaron morton wrote: >> I don't think that statement is accurate. > Which part ? Probably this part: "After running a major compaction, automatic minor compactions are no longer triggered, frequently requiring you to manually run major compactions on a routine basis

Re: Strange delay in query

2012-11-11 Thread André Cruz
On Nov 11, 2012, at 12:01 AM, Binh Nguyen wrote: > FYI: Repair does not remove tombstones. To remove tombstones you need to run > compaction. > If you have a lot of data then make sure you run compaction on all nodes > before running repair. We had big trouble with our system regarding > tom

Re: Strange delay in query

2012-11-09 Thread André Cruz
ws littered > with column tombstones (you could check with dumping the sstables...) > > Just a thought... > > Josep M. > > On Thu, Nov 8, 2012 at 12:23 PM, André Cruz wrote: > These are the two columns in question: > > => (super_column=13957152-234b-11e2-92bc-e0

Re: Strange delay in query

2012-11-08 Thread André Cruz
I don't think their size is an issue here. André On Nov 8, 2012, at 6:04 PM, Andrey Ilinykh wrote: > What is the size of columns? Probably those two are huge. > > > On Thu, Nov 8, 2012 at 4:01 AM, André Cruz wrote: > On Nov 7, 2012, at 12:15 PM, André Cruz wrote:

Re: Strange delay in query

2012-11-08 Thread André Cruz
On Nov 7, 2012, at 12:15 PM, André Cruz wrote: > This error also happens on my application that uses pycassa, so I don't think > this is the same bug. I have narrowed it down to a slice between two consecutive columns. Observe this behaviour using pycassa: >>> DISCO_CA

Re: Strange delay in query

2012-11-07 Thread André Cruz
On Nov 7, 2012, at 2:12 AM, Chuan-Heng Hsiao wrote: > I assume you are using cassandra-cli and connecting to some specific node. > > You can check the following steps: > > 1. Can you still reproduce this issue? (not -> maybe the system/node issue) Yes. I can reproduce this issue on all 3 nodes

Re: Strange delay in query

2012-11-06 Thread André Cruz
think that fetching the first 34 columns would be fast, and just a little bit slower than 33 columns, but this is a big difference. Thank you and best regards, André Cruz On Nov 6, 2012, at 2:43 PM, André Cruz wrote: > Hello. > > I have a SCF that is acting strange. See these 2 query

Strange delay in query

2012-11-06 Thread André Cruz
ner: org.apache.cassandra.dht.RandomPartitioner Schema versions: a354e01a-d342-3755-9821-c550dcd1caba: [zzz, yyy, xxx] Is there more information that I can provide? Best regards, André Cruz

Re: Query advice to prevent node overload

2012-09-18 Thread André Cruz
On Sep 18, 2012, at 3:06 AM, aaron morton wrote: >> select filename from inode where filename > ‘/tmp’ and filename < ‘/tmq’ and >> sentinel = ‘x’; Wouldn't that return files from directories '/tmp1', '/tmp2', for example? I thought the goal was to return files and subdirectories recursively i
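
The concern raised in this reply can be checked with a small sketch. It assumes the names are compared as plain ASCII strings in code-point order, which is how Python compares `str` values; whether Cassandra orders them the same way depends on the comparator in use. The helper `in_open_range` and the sample paths are hypothetical, and the narrower bounds `'/tmp/'` and `'/tmp0'` are an illustration, not something proposed in the thread.

```python
def in_open_range(name: str, start: str, end: str) -> bool:
    """True when start < name < end under Python's string ordering."""
    return start < name < end


# Hypothetical sample of stored file names.
names = ["/tmp", "/tmp/a", "/tmp/sub/b", "/tmp1/c", "/tmp2", "/tmq"]

# Bounds from the quoted query: also matches sibling directories.
wide = [n for n in names if in_open_range(n, "/tmp", "/tmq")]
print(wide)    # ['/tmp/a', '/tmp/sub/b', '/tmp1/c', '/tmp2']

# Narrower bounds: '0' is the character right after '/' in ASCII,
# so only names that start with '/tmp/' fall inside the range.
narrow = [n for n in names if in_open_range(n, "/tmp/", "/tmp0")]
print(narrow)  # ['/tmp/a', '/tmp/sub/b']
```

Under these assumptions, the range from `'/tmp'` to `'/tmq'` does include entries such as `/tmp1/c` and `/tmp2`, which is the behaviour questioned above.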

Re: Query advice to prevent node overload

2012-09-17 Thread André Cruz
On Sep 17, 2012, at 3:04 AM, aaron morton wrote: >> I have a schema that represents a filesystem and one example of a Super CF >> is: > This may help with some ideas > http://www.datastax.com/dev/blog/cassandra-file-system-design Could you explain the usage of the "sentinel"? Which nodes have i

Re: Query advice to prevent node overload

2012-09-17 Thread André Cruz
On Sep 17, 2012, at 3:04 AM, aaron morton wrote: >> I have a schema that represents a filesystem and one example of a Super CF >> is: > This may help with some ideas > http://www.datastax.com/dev/blog/cassandra-file-system-design > > In general we advise to avoid Super Columns if possible. They

Query advice to prevent node overload

2012-09-14 Thread André Cruz
Hello. I have a schema that represents a filesystem and one example of a Super CF is: CF FilesPerDir: (DIRNAME -> (FILENAME -> (attribute1: value1, attribute2: value2)) And in cases of directory moves, I have to fetch all files of that directory and subdirectories. This implies one cassandra q

Re: [RELEASE] Apache Cassandra 1.1.5 released

2012-09-12 Thread André Cruz
On Sep 12, 2012, at 1:53 AM, Jason Axelson wrote: > That looks like something that I've run into as well on previous > versions of Cassandra. Our workaround was to not drop a keyspace and > then re-use it (which we were doing as part of a test suite). Thanks, I'll keep that in mind. André

Re: [RELEASE] Apache Cassandra 1.1.5 released

2012-09-11 Thread André Cruz
I'm also having "AssertionError"s. ERROR [ReadStage:51687] 2012-09-10 14:33:54,211 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ReadStage:51687,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSlice

Re: unsubscribe

2012-08-27 Thread André Cruz
On Aug 27, 2012, at 4:16 PM, "Nikolaidis, Christos" wrote: > No worries :-) I was replying to the list so whoever manages it can > unsubscribe me. That's not how you unsubscribe. You need to send an email to user-unsubscr...@cassandra.apache.org. André

Re: unsubscribe

2012-08-27 Thread André Cruz
On Aug 27, 2012, at 4:11 PM, Eric Evans wrote: > Since I am not in a position to unsubscribe anyone, I can only assume > that I have received this message in error. As per the frightening > legalese quoted above, I hereby notify you by email, and will now > proceed to destroy the original message

Re: Error deleting column families with 1.1

2012-05-09 Thread André Cruz
eveloper > @aaronmorton > http://www.thelastpickle.com > > On 8/05/2012, at 9:00 AM, André Cruz wrote: > >> Hello. >> >> Since I upgraded to Cassandra 1.1, I get the following error when trying to >> delete a CF. After this happens the CF is not

Error deleting column families with 1.1

2012-05-07 Thread André Cruz
Hello. Since I upgraded to Cassandra 1.1, I get the following error when trying to delete a CF. After this happens the CF is not accessible anymore, but I cannot create another one with the same name until I restart the server. INFO [MigrationStage:1] 2012-05-07 18:10:12,682 ColumnFamilyS

Advice on architecture

2012-03-27 Thread André Cruz
need as much free space for compactions as I would if the SSTables were larger. Am I missing something here? Is this the best way to deal with this (abnormal) use case? Thanks and best regards, André Cruz