Re: Issues running Bulkloader program on AIX server

2013-04-04 Thread praveen.akunuru
Hi All,

Sorry, my environment is as below:


  1.  A 3-node cluster with Cassandra 1.1.9, provided with DSE 3.0, on Linux
  2.  We are trying to run the bulk loader from an AIX 6.1 server. Java version 1.5.

Regards,
Praveen

From: Praveen Akunuru <praveen.akun...@wipro.com>
Date: Thursday, April 4, 2013 12:21 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Issues running Bulkloader program on AIX server

Hi All,

I am facing issues running the Java Bulkloader program from an AIX server. The 
program works fine on a Linux server. I am receiving the error below on AIX. 
Can anyone help me get this working?

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
at org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
at org.apache.cassandra.io.compress.SnappyCompressor.<clinit>(SnappyCompressor.java:37)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at org.apache.cassandra.config.CFMetaData.<clinit>(CFMetaData.java:82)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:80)
at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:93)
at BulkLoadExample.main(BulkLoadExample.java:55)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in java.library.path)
at java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1011)
at java.lang.ClassLoader.loadLibraryWithClassLoader(ClassLoader.java:975)
at java.lang.System.loadLibrary(System.java:469)
at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
... 25 more
log4j:WARN No appenders could be found for logger (org.apache.cassandra.io.compress.SnappyCompressor).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Unhandled exception
Type=Segmentation error vmState=0x
J9Generic_Signal_Number=0004 Signal_Number=000b Error_Value= 
Signal_Code=0032
Handler1=09001000A06FF5A0 Handler2=09001000A06F60F0
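
A possible workaround, as a sketch only: snappy-java does not ship an AIX native binary, so one option (assuming a natively built libsnappyjava.so exists on the AIX box) is to point snappy-java at it via its documented lookup properties. The paths and jar name below are hypothetical:

# Sketch: point snappy-java at a locally built native library instead of the
# bundled binaries (which do not include AIX). Paths and jar name are examples.
java -Dorg.xerial.snappy.lib.path=/opt/snappy/native \
     -Dorg.xerial.snappy.lib.name=libsnappyjava.so \
     -cp bulkload.jar BulkLoadExample /path/to/sstables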

Regards,
Praveen



Cassandra services down frequently [Version 1.1.4]

2013-04-04 Thread adeel . akbar

Hi,

We are running a 4-node Cassandra cluster (1.1.4) with replication factor 2  
(DC1) and replication factor 1 (DC2) in two different data centers with  
network topology. Our machines have 16GB RAM and 8 cores, with two  
hard drives.


# /opt/apache-cassandra-1.1.4/bin/nodetool -h localhost ring
Address      DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                       169417178424467235000914166253263322299
10.0.0.3     DC1  RAC1  Up      Normal  91.93 GB  66.67%               0
10.0.0.4     DC1  RAC1  Up      Normal  84.88 GB  66.67%               56713727820156410577229101238628035242
10.0.0.15    DC1  RAC1  Up      Normal  82.51 GB  66.67%               113427455640312821154458202477256070484
10.40.1.103  DC2  RAC1  Up      Normal  303.2 MB  100.00%              169417178424467235000914166253263322299


# java -version
java version "1.6.0_43"
Java(TM) SE Runtime Environment (build 1.6.0_43-b01)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

After some time (1–2 hours) Cassandra shuts down on one or  
two nodes with the following errors:



 INFO 11:01:25,527 GC for ConcurrentMarkSweep: 1968 ms for 2 collections, 3817667464 used; max is 4093640704
 INFO 11:01:42,838 GC for ConcurrentMarkSweep: 1828 ms for 2 collections, 3850830504 used; max is 4093640704

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid27363.hprof ...
Heap dump file created [4664912349 bytes in 44.731 secs]
ERROR 11:02:41,156 Exception in thread Thread[CompactionExecutor:87,1,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.io.util.FastByteArrayOutputStream.expand(FastByteArrayOutputStream.java:104)
at org.apache.cassandra.io.util.FastByteArrayOutputStream.write(FastByteArrayOutputStream.java:220)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.cassandra.io.util.DataOutputBuffer.write(DataOutputBuffer.java:61)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:62)
at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:366)
at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:339)
at org.apache.cassandra.db.ColumnFamilySerializer.serializeForSSTable(ColumnFamilySerializer.java:89)
at org.apache.cassandra.db.compaction.PrecompactedRow.write(PrecompactedRow.java:138)
at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:156)
at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
 INFO 11:02:41,373 Stop listening to thrift clients
 INFO 11:02:41,376 InetAddress /10.0.0.15 is now dead.
 INFO 11:02:41,376 InetAddress /10.0.0.3 is now dead.
 INFO 11:02:41,377 InetAddress /10.40.1.103 is now dead.
 INFO 11:02:41,397 InetAddress /10.0.0.3 is now UP
 INFO 11:02:41,397 InetAddress /10.0.0.15 is now UP
 INFO 11:02:41,398 InetAddress /10.40.1.103 is now UP
 INFO 11:02:41,398 Started hinted handoff for token: 0 with IP: /10.0.0.3
 INFO 11:02:41,450 Announcing shutdown
 INFO 11:02:48,184 GC for ConcurrentMarkSweep: 1887 ms for 2 collections, 2234362128 used; max is 4093640704

 INFO 11:02:48,206 Waiting for messaging service to quiesce
 INFO 11:02:48,207 MessagingService shutting down server thread.


Our cassandra.yaml configurations are as under;


cluster_name: 'ABC Cluster'
initial_token: 0
hinted_handoff_enabled: true
max_hint_window_in_ms: 2147483647 # one hour
hinted_handoff_throttle_delay_in_ms: 0
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authority: org.apache.cassandra.auth.AllowAllAuthority
partitioner: org.apache.cassandra.dht.RandomPartitioner

data_file_directories:
- /u/cassandra
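
For reference, a minimal sketch (not from this thread) of the usual first mitigation for compaction OOMs on 1.1.x machines with 16GB RAM: raise the JVM heap in conf/cassandra-env.sh. The values are illustrative only and should be tested, not taken as a recommendation for this cluster.

# conf/cassandra-env.sh -- illustrative values only, assuming 16GB machines
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"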

Re: Lost data after expanding cluster c* 1.2.3-1

2013-04-04 Thread Kais Ahmed
Hi aaron,

I ran the command "nodetool rebuild_index host keyspace cf" on all the
nodes; in the log I see:

INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641
ColumnFamilyStore.java (line 558) User Requested secondary index re-build
for ...

but nothing seems to be happening. How can I monitor the progress, and how can I
know when it's finished?
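
(One way to watch it — an assumption rather than something confirmed in this thread — is that secondary index builds run through the compaction manager, so they should show up on the node doing the rebuild with:

nodetool -h <host> compactionstats

and disappear from that output once the build is done.)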

Thanks,


2013/4/2 aaron morton 

> The problem comes from the fact that I didn't set auto_bootstrap to true for the new
> nodes; it's not in this documentation (
> http://www.datastax.com/docs/1.2/install/expand_ami)
>
> auto_bootstrap defaults to True if not specified in the yaml.
>
> Can I do that at any time, or when the cluster is not loaded?
>
> Not sure what the question is.
> Both those operations are online operations you can do while the node is
> processing requests.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 1/04/2013, at 9:26 PM, Kais Ahmed  wrote:
>
> > At this moment the errors started, we see that members and other data
> are gone, at this moment the nodetool status return (in red color the 3 new
> nodes)
> > What errors?
> The errors were on my side in the application, not Cassandra errors
>
> > I put for each of them seeds = A ip, and start each with two minutes
> intervals.
> > When I'm making changes I tend to change a single node first, confirm
> everything is OK and then do a bulk change.
> Thank you for that advice.
>
> >I'm not sure what or why it went wrong, but that should get you to a
> stable place. If you have any problems keep an eye on the logs for errors
> or warnings.
> The problem comes from the fact that I didn't set auto_bootstrap to true for the new
> nodes; it's not in this documentation (
> http://www.datastax.com/docs/1.2/install/expand_ami)
>
> >if you are using secondary indexes use nodetool rebuild_index to rebuild
> those.
> Can I do that at any time, or when the cluster is not loaded?
>
> Thanks aaron,
>
> 2013/4/1 aaron morton 
>
>> Please do not rely on colour in your emails, the best way to get your
>> emails accepted by the Apache mail servers is to use plain text.
>>
>> > At this moment the errors started, we see that members and other data
>> are gone, at this moment the nodetool status return (in red color the 3 new
>> nodes)
>> What errors?
>>
>> > I put for each of them seeds = A ip, and start each with two minutes
>> intervals.
>> When I'm making changes I tend to change a single node first, confirm
>> everything is OK and then do a bulk change.
>>
>> > Now the cluster seems to work normally, but I can't use the secondary indexes for
>> the moment, the query answers are random
>> run nodetool repair -pr on each node, let it finish before starting the
>> next one.
>> if you are using secondary indexes use nodetool rebuild_index to rebuild
>> those.
>> Add one node new node to the cluster and confirm everything is ok, then
>> add the remaining ones.
>>
>> >I'm not sure what or why it went wrong, but that should get you to a
>> stable place. If you have any problems keep an eye on the logs for errors
>> or warnings.
>>
>> Cheers
>>
>> -
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 31/03/2013, at 10:01 PM, Kais Ahmed  wrote:
>>
>> > Hi aaron,
>> >
>> > Thanks for the reply, I will try to explain what happened exactly
>> >
>> > I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the ec2
>> ami (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with
>> > this config: --clustername myDSCcluster --totalnodes 4 --version community
>> >
>> > Two days after this cluster went into production, I saw that the cluster was
>> overloaded, so I wanted to extend it by adding 3 more nodes.
>> >
>> > I created a new cluster with 3 C* nodes [D,E,F] (
>> https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2)
>> >
>> > And follow the documentation (
>> http://www.datastax.com/docs/1.2/install/expand_ami) for adding them in
>> the ring.
>> > I put for each of them seeds = A ip, and start each with two minutes
>> intervals.
>> >
>> > At this moment the errors started, we see that members and other data
>> are gone, at this moment the nodetool status return (in red color the 3 new
>> nodes)
>> >
>> > Datacenter: eu-west
>> > ===
>> > Status=Up/Down
>> > |/ State=Normal/Leaving/Joining/Moving
>> > --  Address        Load      Tokens  Owns   Host ID                               Rack
>> > UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
>> > UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
>> > UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
>> > UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
>> > UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski

Hi Aaron,

At first, before I go with a lot of logs:

I'm considering a problem related to this issue: 
https://issues.apache.org/jira/browse/CASSANDRA-4905


Let's say the tombstone on one of the nodes (X) is gcable and was not 
compacted (purged) so far. After it was created we re-created this row, 
but due to some problems it was written only to the second node (Y), so we 
have "live" data on node Y which is newer than the gcable tombstone on 
replica node X. For some time we did NOT repair our cluster (well, a 
pretty long while), so it's possible that such a situation happened.


My concern is: will AntiEntropy ignore this tombstone only, or basically 
everything related to the row key that this tombstone was created for?


If it's not the case, here are the answers you asked for :-)


What version are you on ?


1.2.1
(plus CASSANDRA-5298 & CASSANDRA-5299 patches to be exact ;-) )


Can you run a repair on the CF and check:
Does the repair detect differences in the CF and stream changes ?
> After the streaming does it run a secondary index rebuild on the new 
sstable ? (Should be in the logs)


I'm attaching a log file (cssa-repair.log).

Just to clarify: the key I use for tests belongs to *:1:7 node and *:2:1 
is a replica for that node (checked with nodetool getendpoints). 
Yesterday I was repairing this CF cluster-wide, but to (hopefully) make 
debugging simpler, what I send you is related only to these two nodes.


So as I understand these logs: no changes have been detected and nothing 
was streamed. Indexes have not been rebuilt, obviously.


However, on the other hand I'd expect to get "Nothing to repair for 
keyspace production" in nodetool output in this case - am I wrong? I'm a 
bit confused with the info I get here ;-)



Can you provide the full query trace ?


I'm attaching two files, as this stack trace is pretty long: 
no-index.log (query by row key) and index.log (query by indexed column).



M.
*** When repairing node 1 (requested key belongs to its primary range):

* node 1 log:
 INFO [Thread-1780798] 2013-04-04 08:17:37,286 StorageService.java (line 2311) Starting repair command #8, repairing 1 ranges for keyspace production
 INFO [AntiEntropySessions:9] 2013-04-04 08:17:37,288 AntiEntropyService.java (line 652) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] new session: will sync cssa01-07/2001:5d19:13:169:0:1:1:7, /2001:5d19:13:169:0:1:2:1 on range (5671372782015641057722910123862803524,11342745564031282115445820247725607048] for production.[Users]
 INFO [AntiEntropySessions:9] 2013-04-04 08:17:37,288 AntiEntropyService.java (line 857) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] requesting merkle trees for Users (to [/2001:5d19:13:169:0:1:2:1, cssa01-07/2001:5d19:13:169:0:1:1:7])
 INFO [AntiEntropyStage:1] 2013-04-04 08:17:37,326 AntiEntropyService.java (line 214) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] Received merkle tree for Users from /2001:5d19:13:169:0:1:1:7
 INFO [AntiEntropyStage:1] 2013-04-04 08:17:37,326 AntiEntropyService.java (line 214) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] Received merkle tree for Users from /2001:5d19:13:169:0:1:2:1
 INFO [AntiEntropyStage:1] 2013-04-04 08:17:37,340 AntiEntropyService.java (line 988) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] Endpoints /2001:5d19:13:169:0:1:1:7 and /2001:5d19:13:169:0:1:2:1 are consistent for Users
 INFO [AntiEntropyStage:1] 2013-04-04 08:17:37,340 AntiEntropyService.java (line 764) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] Users is fully synced
 INFO [AntiEntropySessions:9] 2013-04-04 08:17:37,340 AntiEntropyService.java (line 698) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] session completed successfully

* node 2 log:
  INFO [AntiEntropyStage:1] 2013-04-04 08:17:37,302 AntiEntropyService.java (line 246) [repair #1c2df170-9d00-11e2-938f-11f9b91aba37] Sending completed merkle tree to /2001:5d19:13:169:0:1:1:7 for (production,Users)


*** When repairing node 2 (replica of node 1):

* node 2 log:
 INFO [Thread-1778894] 2013-04-04 08:22:27,727 StorageService.java (line 2311) Starting repair command #8, repairing 1 ranges for keyspace production
 INFO [AntiEntropySessions:9] 2013-04-04 08:22:27,728 AntiEntropyService.java (line 652) [repair #c94b91f0-9d00-11e2-843a-aba1caf27753] new session: will sync cssa02-01/2001:5d19:13:169:0:1:2:1, /2001:5d19:13:169:0:1:1:8 on range (11342745564031282115445820247725607048,17014118346046923173168730371588410572] for production.[Users]
 INFO [AntiEntropySessions:9] 2013-04-04 08:22:27,728 AntiEntropyService.java (line 857) [repair #c94b91f0-9d00-11e2-843a-aba1caf27753] requesting merkle trees for Users (to [/2001:5d19:13:169:0:1:1:8, cssa02-01/2001:5d19:13:169:0:1:2:1])
 INFO [ValidationExecutor:20] 2013-04-04 08:22:27,729 ColumnFamilyStore.java (line 640) Enqueuing flush of Memtable-Users.Users_active_idx@1715128181(28/180 serialized/live bytes, 4 ops)
 INFO [ValidationExecutor:20] 2013-04-04 08:22:27,730 ColumnFamilyStore.java (line 640) Enqu

Re: upgrading 1.1.x to 1.2.x via sstableloader

2013-04-04 Thread Michał Czerwiński
I see, thanks for the reply!

One more question:

I can see that multiple nodes have the same sstable names for a certain
keyspace / cf.
I am moving from 8 nodes to a 6-node cluster, so at some point when putting
sstables in place I would overwrite files from another node. What is the best
way to solve this problem? Is it safe to change sstable file names to avoid
name collisions?



On 4 April 2013 02:54, aaron morton  wrote:

> > java.lang.UnsupportedOperationException: SSTable
> zzz/xxx/yyy-hf-47-Data.db is not compatible with current version ib
> You cannot stream files that have a different on disk format.
>
> 1.2 can read the old files, but cannot accept them as streams. You can
> copy the files to the new machines and use nodetool refresh to load them,
> then upgradesstables to re-write them before running repair.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/04/2013, at 10:53 PM, Michał Czerwiński 
> wrote:
>
> > Does anyone knows what is the best process to put data from cassandra
> 1.1.x (1.1.7 to be more precise) to cassandra 1.2.3 ?
> >
> > I am trying to use sstableloader and stream data to a new cluster but I
> get.
> >
> > ERROR [Thread-125] 2013-04-03 16:37:27,330 IncomingTcpConnection.java
> (line 183) Received stream using protocol version 5 (my version 6).
> Terminating connection
> >
> > ERROR [Thread-141] 2013-04-03 16:38:05,704 CassandraDaemon.java (line
> 164) Exception in thread Thread[Thread-141,5,main]
> >
> > java.lang.UnsupportedOperationException: SSTable
> zzz/xxx/yyy-hf-47-Data.db is not compatible with current version ib
> >
> > at
> org.apache.cassandra.streaming.StreamIn.getContextMapping(StreamIn.java:77)
> >
> > at
> org.apache.cassandra.streaming.IncomingStreamReader.(IncomingStreamReader.java:87)
> >
> > at
> org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
> >
> > at
> org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
> >
> > at
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
> >
> >
> >
> > I've changed Murmur3Partitioner to RandomPartitioner already and I've
> noticed I am not able to use 1.1.7's sstableloader so I copied sstables to
> new nodes and tried doing it locally on cassandra 1.2.3, but it seems
> protocol versions do not match (see error above)
> >
> > The reason why I want to use sstableloader is that I have different
> number of nodes and would like to avoid using rsync and then repair/cleanup
> of excessive data.
> >
> > Thanks!
> >
>
>
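
A sketch of the copy / refresh / upgrade sequence aaron describes above, with hypothetical keyspace/table names and paths (the 1.1 "hf" files are copied into the target node's data directory first, renaming them if generation numbers collide):

# on each 1.2 node, after copying the old sstable components (Data/Index/
# Filter/Statistics/...) into /var/lib/cassandra/data/<keyspace>/<cf>/
nodetool refresh myks mycf           # load the copied 1.1-format ("hf") sstables
nodetool upgradesstables myks mycf   # rewrite them in the current ("ib") format
nodetool repair myks mycf            # repair once the files are upgraded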


Re: Repair does not fix inconsistency

2013-04-04 Thread Sylvain Lebresne
> I'm considering a problem related to this issue:
> https://issues.apache.org/jira/browse/CASSANDRA-4905
>
> Let's say the tombstone on one of the nodes (X) is gcable and was not
> compacted (purged) so far. After it was created we re-created this row, but
> due some problems it was written only to the second node (Y), so we have
> "live" data on node Y which is newer than the gcable tombstone on replica
> node X. Some time ago we did NOT repair our cluster for a  while (well,
> pretty long while), so it's possible that such situation happened.
>

That would be my bet, yes.


My concern is: will AntiEntropy ignore this tombstone only, or basically
> everything related to the row key that this tombstone was created for?
>

It will only ignore the tombstone itself.

In theory, that older than gcgrace tombstone should eventually be reclaimed
by compaction, though it's not guaranteed that it will be by the first
compaction including it (but if you use SizeTieredCompaction, a major
compaction would ensure that you get rid of it; that being said, I'm not
necessarily advising a major compaction, if you can afford to wait for
normal compaction to get rid of it, that's probably simpler).
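
For reference, a sketch of the targeted major compaction mentioned above (the keyspace/CF names are taken from the repair logs earlier in this thread); with size-tiered compaction this rewrites the CF's sstables into one and should purge the gcable tombstone:

nodetool -h <host> compact production Users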

--
Sylvain


>
> If it's not the case, here are the answers you asked for :-)
>
>
>  What version are you on ?
>>
>
> 1.2.1
> (plus CASSANDRA-5298 & CASSANDRA-5299 patches to be exact ;-) )
>
>
>  Can you run a repair on the CF and check:
>> Does the repair detect differences in the CF and stream changes ?
>>
> > After the streaming does it run a secondary index rebuild on the new
> sstable ? (Should be in the logs)
>
> I'm attaching a log file (cssa-repair.log).
>
> Just to clarify: the key I use for tests belongs to *:1:7 node and *:2:1
> is a replica for that node (checked with nodetool getendpoints). Yesterday
> I was repairing this CF cluster-wide, but to (hopefully) make debugging
> simplier, what I send you is related only to these two nodes.
>
> So as I understand these logs: no changes have been detected and nothing
> was streamed. Indexes have not been rebuilt, obviously.
>
> However, on the other hand I'd expect to get "Nothing to repair for
> keyspace production" in nodetool output in this case - am I wrong? I'm a
> bit confused with the info I get here ;-)
>
>
>  Can you provide the full query trace ?
>>
>
> I'm attaching two files, as this stack trace is pretty long: no-index.log
> (query by row key) and index.log (query by indexed column).
>
>
> M.
>


Re: Repair does not fix inconsistency

2013-04-04 Thread horschi
Hi Michal,

Let's say the tombstone on one of the nodes (X) is gcable and was not
> compacted (purged) so far. After it was created we re-created this row, but
> due some problems it was written only to the second node (Y), so we have
> "live" data on node Y which is newer than the gcable tombstone on replica
> node X. Some time ago we did NOT repair our cluster for a  while (well,
> pretty long while), so it's possible that such situation happened.
>
> My concern is: will AntiEntropy ignore this tombstone only, or basically
> everything related to the row key that this tombstone was created for?
>
It will only ignore the tombstone (which should have been repaired in a
previous repair anyway - assuming you do repairs within gc_grace). Any
newer columns (overwriting the tombstone) would still be alive and would
not be ignored.

The only way for CASSANDRA-4905 to make any difference is to not run repair
within gc_grace. With the patch it would not repair these old tombstones
any more. But in that case you should simply increase gc_grace and not undo
the patch :-)



"When I query (cqlsh) some rows by key (CL is default = ONE) I _always_ get
a correct result.  However, when I query it by indexed column, it returns
nothing."
This looks to me more like a secondary index issue. If you say the access
via rowkey is always correct, then the repair works fine. I think there
might be something wrong with your secondary index then.


Cheers,
Christian


Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski

Hi Christian,

About CASSANDRA-4905 - thanks for explaining this :-)


This looks to me more like a secondary index issue. If you say the access
via rowkey is always correct, then the repair works fine. I think there
might be something wrong with your secondary index then.


This was my first thought too, but if you take a look at the logs I 
attached to the previous e-mail, you'll notice that the query "by key" 
(no-index.log) retrieves data from BOTH replicas, while the "by indexed 
column" one (index.log) talks only to one of them (too bad it's the one 
that contains tombstone only - 1:7). In the first case it is possible to 
"resolve" the conflict and return the proper result, while in the second 
case it's impossible because tombstone is the only thing that is 
returned for this key.


Moreover, when I query this CF by indexed column with CL >= TWO it 
_does_ return proper result. If index was broken I'd expect it to be 
broken in this case too. Thus, from my point of view it's rather a 
repair-related thing, than index-related one.


M.




Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski

Hi Sylvain,

Thanks for the explanation :-) However, in this case, I still do not get 
why this (probably) gcable tombstone on 2:1 could cause this mess. As AE 
ignores only the tombstone itself (which means that there are no data 
for this key on 2:1 node from repair's point of view), it should result 
in repairing this inconsistency by streaming missing data from 1:7 (thus 
there'll be both: live data and gcable tombstone on 2:1 after the 
repair), shouldn't it?


Yes, I'm considering running a major compaction (we store about 200KB of 
data in this CF, so it's not a problem at all ;-) ), but before I do I 
want to make sure I understand the problem, so as long as I can live 
with QUORUM read / writes, I'll wait with compacting and play a bit with 
this problem :-)


M.

On 04.04.2013 12:28, Sylvain Lebresne wrote:

I'm considering a problem related to this issue:
https://issues.apache.org/jira/browse/CASSANDRA-4905

Let's say the tombstone on one of the nodes (X) is gcable and was not
compacted (purged) so far. After it was created we re-created this row, but
due some problems it was written only to the second node (Y), so we have
"live" data on node Y which is newer than the gcable tombstone on replica
node X. Some time ago we did NOT repair our cluster for a  while (well,
pretty long while), so it's possible that such situation happened.



That would be my bet, yes.


My concern is: will AntiEntropy ignore this tombstone only, or basically

everything related to the row key that this tombstone was created for?



It will only ignore the tombstone itself.

In theory, that older than gcgrace tombstone should eventually be reclaimed
by compaction, though it's not guaranteed that it will be by the first
compaction including it (but if you use SizeTieredCompaction, a major
compaction would ensure that you get rid of it; that being said, I'm not
necessarily advising a major compaction, if you can afford to wait for
normal compaction to get rid of it, that's probably simpler).

--
Sylvain




If it's not the case, here are the answers you asked for :-)


  What version are you on ?




1.2.1
(plus CASSANDRA-5298 & CASSANDRA-5299 patches to be exact ;-) )


  Can you run a repair on the CF and check:

Does the repair detect differences in the CF and stream changes ?

After the streaming does it run a secondary index rebuild on the new

sstable ? (Should be in the logs)

I'm attaching a log file (cssa-repair.log).

Just to clarify: the key I use for tests belongs to *:1:7 node and *:2:1
is a replica for that node (checked with nodetool getendpoints). Yesterday
I was repairing this CF cluster-wide, but to (hopefully) make debugging
simplier, what I send you is related only to these two nodes.

So as I understand these logs: no changes have been detected and nothing
was streamed. Indexes have not been rebuilt, obviously.

However, on the other hand I'd expect to get "Nothing to repair for
keyspace production" in nodetool output in this case - am I wrong? I'm a
bit confused with the info I get here ;-)


  Can you provide the full query trace ?




I'm attaching two files, as this stack trace is pretty long: no-index.log
(query by row key) and index.log (query by indexed column).


M.







Re: Repair does not fix inconsistency

2013-04-04 Thread horschi
Hi,

This was my first thought too, but if you take a look at the logs I
> attached to previous e-mail, you'll notice that query "by key"
> (no-index.log) retrieves data from BOTH replicas, while the "by indexed
> column" one (index.log) talks only to one of them (too bad it's the one
> that contains tombstone only - 1:7). In the first case it is possible to
> "resolve" the conflict and return the proper result, while in the second
> case it's impossible because tombstone is the only thing that is returned
> for this key.
>
Sorry, I did not look into the logs. That's the first time I'm seeing the
trace, btw. :-)

Does CQL not allow CL=ONE queries? Why does it ask two nodes for the key,
when you say that you are using CL=default=1? I'm a bit confused here (I'm
a thrift user).

But thinking about your theory some more: I think CASSANDRA-4905 might make
reappearing columns more common (only if you do not run repair within
gc_grace of course). Before CASSANDRA-4905 the tombstones would be repaired
even after gc_grace, so it was a bit more forgiving. It was never
guaranteed that the inconsistency would be repaired though.

I think you should have increased gc-grace or run repair within the 10 days.


The repair bit makes sense now in my head, unlike the CQL CL :-)

cheers,
Christian


Re: nodetool status inconsistencies, repair performance and system keyspace compactions

2013-04-04 Thread Ondřej Černoš
Hi,

Most of it has been resolved - the "failed to uncompress" error was really a
bug in cassandra (see
https://issues.apache.org/jira/browse/CASSANDRA-5391) and the problem
with different load reporting is a change between 1.2.1 (reports 100%
for 3 replicas/3 nodes/2 DCs setup I have) and 1.2.3 which reports the
fraction. Is this correct?

Anyway, the nodetool repair still takes ages to finish, considering
only megabytes of unchanging data are involved in my test:

[root@host:/etc/puppet] nodetool repair ks
[2013-04-04 13:26:46,618] Starting repair command #1, repairing 1536
ranges for keyspace ks
[2013-04-04 13:47:17,007] Repair session
88ebc700-9d1a-11e2-a0a1-05b94e1385c7 for range
(-2270395505556181001,-2268004533044804266] finished
...
[2013-04-04 13:47:17,063] Repair session
65d31180-9d1d-11e2-a0a1-05b94e1385c7 for range
(1069254279177813908,1070290707448386360] finished
[2013-04-04 13:47:17,063] Repair command #1 finished

This is the status before the repair (by the way, after the datacenter
has been bootstrapped from the remote one):

[root@host:/etc/puppet] nodetool status
Datacenter: us-east
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.1%  06ff8328-32a3-4196-a31f-1e0f608d0638  1d
UN  xxx.xxx.xxx.xxx  5.73 MB  256     15.3%  7a96bf16-e268-433a-9912-a0cf1668184e  1d
UN  xxx.xxx.xxx.xxx  5.72 MB  256     17.5%  67a68a2a-12a8-459d-9d18-221426646e84  1d
Datacenter: na-dev
==
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.4%  eb86aaae-ef0d-40aa-9b74-2b9704c77c0a  cmp02
UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.0%  cd24af74-7f6a-4eaa-814f-62474b4e4df1  cmp01
UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.7%  1a55cfd4-bb30-4250-b868-a9ae13d81ae1  cmp05

Why does it take 20 minutes to finish? Fortunately, the large number of
compactions I reported in the previous email was not triggered.

And is there documentation where I could find the exact semantics of
repair when vnodes are used (and what -pr means in such a setup) and
when run in a multi-datacenter setup? I still don't quite get it.

regards,
Ondřej Černoš


On Thu, Mar 28, 2013 at 3:30 AM, aaron morton  wrote:
> During one of my tests - see this thread in this mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html
>
> That thread has been updated, check the bug ondrej created.
>
> How will this perform in production with much bigger data if repair
> takes 25 minutes on 7MB and 11k compactions were triggered by the
> repair run?
>
> Seems a little odd.
> See what happens the next time you run repair.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 27/03/2013, at 2:36 AM, Ondřej Černoš  wrote:
>
> Hi all,
>
> I have 2 DCs, 3 nodes each, RF:3, I use local quorum for both reads and
> writes.
>
> Currently I test various operational qualities of the setup.
>
> During one of my tests - see this thread in this mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html
> - I ran into this situation:
>
> - all nodes have all data and agree on it:
>
> [user@host1-dc1:~] nodetool status
>
> Datacenter: na-prod
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
> UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            0b1f1d79-52af-4d1b-a86d-bf4b65a05c49  cmp17
> UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            039f206e-da22-44b5-83bd-2513f96ddeac  cmp10
> UN  XXX.XXX.XXX.XXX  7.72 MB  256     100.0%            007097e9-17e6-43f7-8dfc-37b082a784c4  cmp11
> Datacenter: us-east
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
> UN  XXX.XXX.XXX.XXX  7.73 MB  256     100.0%            a336efae-8d9c-4562-8e2a-b766b479ecb4  1d
> UN  XXX.XXX.XXX.XXX  7.73 MB  256     100.0%            ab1bbf0a-8ddc-4a12-a925-b119bd2de98e  1d
> UN  XXX.XXX.XXX.XXX  7.73 MB  256     100.0%            f53fd294-16cc-497e-9613-347f07ac3850  1d
>
> - only one node disagrees:
>
> [user@host1-dc2:~] nodetool status
> Datacenter: us-east
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address Load   Tokens   Owns   Host ID
> 

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Well... Strange. We have this problem with 6 users, but there's only ONE 
tombstone (created 8 days ago, so it's not gcable yet) in all the 
SSTables on the 2:1 node - checked using sstable2json.
Moreover, this tombstone DOES NOT belong to the row key I'm using for 
tests, because this user was NOT ever removed / replaced.
So now I have no bloody idea how C* can see a tombstone for this key 
when running a query. Maybe it's an index problem then?


D'oh! :|

...

Oh! This "Read 0 live cells and 1 tombstoned" info for "by indexed 
column" query seems to be about the data that were read from index - it 
tells me about "Key cache hit for  sstable" for SSTables 23 & 24, but I 
don't have such SSTables for Users CF. However, I have SSTables like 
this for index. It's not telling me anything more about why it thinks 
that this row was removed, but... well... good to know it, anyway ;-)


M.


Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski



Does CQL not allow CL=ONE queries? Why does it ask two nodes for the key,
when you say that you are using CL=default=1? I'm a bit confused here (I'm
a thrift user).


Yup, that's another thing I'm curious about too (the default CL is ONE for 
sure), but since for now it helps me investigate my problem, I consider 
it a feature :>




But thinking about your theory some more: I think CASSANDRA-4905 might make
reappearing columns more common (only if you do not run repair within
gc_grace of course). Before CASSANDRA-4905 the tombstones would be repaired
even after gc_grace, so it was a bit more forgiving. It was never
guaranteed that the inconsistency would be repaired though.

I think you should have increased gc-grace or run repair within the 10 days.


See my last e-mail, which I wrote as a reply to my own e-mail replying to 
Sylvain's - things are getting stranger now... ;-)


M.


Re: Repair does not fix inconsistency

2013-04-04 Thread horschi
> Well... Strange. We have such problem with 6 users, but there's only ONE
> tombstone (created 8 days ago, so it's not gcable yet) in all the SSTables
> on 2:1 node - checked using sstable2json.
> Moreover, this tombstone DOES NOT belong to the row key I'm using for
> tests, because this user was NOT ever removed / replaced.
> So now I have no bloody idea how C* can see a tombstone for this key when
> running a query. Maybe it's a index problem then?
>
Yes, maybe there are two issues here: repair not running and maybe really
some index-thing.

Maybe you can try a CL=ONE with cassandra-cli? So that we can see how it
works without index.


it tells me about "Key cache hit for  sstable" for SSTables 23 & 24, but I
> don't have such SSTables for Users CF. However, I have SSTables like this
> for index.
>
I think the index SSTables and the data SSTables are compacted separately,
so their numbers can differ from the data files even though they are flushed
together. (Anybody feel free to correct me on this.)


Repair hangs when merkle tree request is not acknowledged

2013-04-04 Thread Paul Sudol
Hello,

I have a cluster with 4 nodes, 2 nodes in each of 2 data centers. I had a hardware 
failure in one DC and had to replace the nodes. I'm running 1.2.3 on all of the 
nodes now. I was able to run nodetool rebuild on the two replacement nodes, but 
now I cannot finish a repair on any of them. I have 18 column families; if I 
run a repair on a single CF at a time, I can get the node repaired eventually. 
A repair on a certain CF will fail, but if I run it again and again, eventually it 
will succeed.

I've got an RF of 2, 1 copy in each DC, so the repair needs to pull data from 
the other DC to finish its repair.

The problem seems to be that the merkle tree request sometimes is not received 
by the node in the other DC. Usually when the merkle tree request is sent, the 
nodes it was sent to start a compaction/validation. In certain cases this 
does not happen; only the node that I ran the repair on will begin 
compaction/validation and send the merkle tree to itself. Then it's waiting for 
a merkle tree from the other node, which it will never get. After about 24 
hours it will time out and say the node in question died.

Is there a setting I can use to force the merkle tree request to be 
acknowledged or resent if it's not acknowledged? I set up NTPD on all the nodes 
and tried the cross_node_timeout, but that did not help.

Thanks in advance,

Paul

Re: Repair does not fix inconsistency

2013-04-04 Thread horschi
Repair is fine - all the data seem to be in SSTables. I've checked it and
> while index tells me that I have 1 tombstone and 0 live cells for a key, I
> can _see_, thanks to sstable2json, that I have 3 "live cells" (assuming a
> cell is an entry in SSTable) and 0 tombstones. After being confused for the
> most of the day, now I'm almost sure it's a problem with index (re)building.

I'm glad to hear that. I feared my ticket might be responsible for your
data loss. I could not live with the guilt ;-) Seriously: I'm glad we can rule
out the repair change.



> The same: for key-based query it returns correct result no matter if I use
> CL=ONE or TWO (or stronger). When querying by indexed column it works for
> CL=TWO or more, but returns no data for CL=ONE.

Yes, if it works with CL=one, then it must be the index. Check the
mailing-list, I think someone else posted something similar the other day.


cheers,
Christian


Re: Alter table drop column seems not working

2013-04-04 Thread julien Campan
You are right, the documentation says that this action is not supported.

I was surprised because the "auto completion" in cqlsh allows you to try it
and, moreover, you have an example of a drop column when you use "help
alter_drop".

Maybe it would be nice to change at least the documentation and
auto-completion?



2013/4/4 aaron morton 

> I don't think it's supported
> http://www.datastax.com/docs/1.2/cql_cli/cql/ALTER_TABLE#dropping-typed-col
>
> Anyone else know?
>
> Cheers
>
>-
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/04/2013, at 8:11 PM, julien Campan  wrote:
>
> Hi,
>
> I'm working with cassandra 1.2.2.
>
> When I try to drop a column , it's not working.
>
> This is what I tried :
>
> CREATE TABLE cust (
>   ise text PRIMARY KEY,
>   id_avatar_1 uuid,
>   id_avatar_2 uuid,
>   id_avatar_3 uuid,
>   id_avatar_4 uuid
> ) ;
>
>
> cqlsh> ALTER TABLE cust DROP id_avatar_1 ;
>
> ==>Bad Request: line 1:17 no viable alternative at input 'DROP'
> ==>Perhaps you meant to use CQL 2? Try using the -2 option when
> ==>starting cqlsh.
>
> Can someone tell me how to drop a column, or if it is a bug?
>
> Thank
>
>
>


Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski



Yes, maybe there are two issues here: repair not running and maybe really
some index-thing.


Repair is fine - all the data seem to be in the SSTables. I've checked, 
and while the index tells me that I have 1 tombstone and 0 live cells for a 
key, I can _see_, thanks to sstable2json, that I have 3 "live cells" 
(assuming a cell is an entry in an SSTable) and 0 tombstones. After being 
confused for most of the day, I'm now almost sure it's a problem 
with index (re)building.



Maybe you can try a CL=ONE with cassandra-cli? So that we can see how it
works without index.


The same: for a key-based query it returns the correct result no matter if I 
use CL=ONE or TWO (or stronger). When querying by indexed column it 
works for CL=TWO or more, but returns no data for CL=ONE.



it tells me about "Key cache hit for  sstable" for SSTables 23 & 24, but I

don't have such SSTables for Users CF. However, I have SSTables like this
for index.


I think the Index-SSTables and the data SSTables are compacted separately
and the numbers can differ from the data, even though they are flushed
together. So the numbers can differ. (anybody feel free to correct me on
this)


Yes, I agree - they're counted separately, and it's quite clear to me 
that these numbers do not match. I was surprised, because I didn't 
understand this output before and I thought that this "live cell" and 
tombstone info refers somehow to "real" data from "regular" SSTables, 
not the index CF.


M.




Re: Linear scalability problems

2013-04-04 Thread Anand Somani
We are using a single process with multiple threads, will look at client
side delays.

Thanks

On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs  wrote:

> If I had to guess, I would say that your client is the bottleneck, not the
> cluster.  Are you inserting data with multiple threads or processes?
>
>
> On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani  wrote:
>
>> Hi,
>>
>> I am running some tests trying to scale out our application from a
>> 3-node cluster to a 6-node cluster. The thing I observed is that with the
>> 3-node cluster I was able to handle about 41 req/second, so I added 3 more
>> nodes thinking it would close to double, but instead it only goes up to about
>> 47 req/second!! I am doing something wrong and it is not obvious, so I wanted
>> some help on what stats I could/should monitor to tell me things like whether a
>> node gets more requests or the load distribution is not random enough.
>>
>> Note I am using direct thrift (old code base) and cassandra 1.1.6. The
>> data model is for storing blobs (split across columns) and has around 6 CF,
>> RF=3 and all operations are at quorum. Also at the end of the run nodetool
>> ring reports the same data size.
>>
>> Thanks
>> Anand
>>
>
>
>
> --
> Tyler Hobbs
> DataStax 
>


Re: Cassandra freezes

2013-04-04 Thread Hiller, Dean
I am going to throw some info out there for you as it might help.

 1.  RAM usage grows with dataset size on that node (adding more nodes reduces 
the RAM used per node, since each node has fewer rows).  index_interval can be 
upped to reduce RAM usage, but be careful with it.  Switching to LCS and 
raising the bloom filter FP chance to 0.1 can lower RAM usage but again, test, test, test first.
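
A sketch of the LCS / bloom filter change from point 1, assuming CQL3 on Cassandra 1.2 and a hypothetical keyspace/table name (on 1.1 the equivalent change is made through cassandra-cli); index_interval, by contrast, lives in cassandra.yaml. Test the impact before applying it in production:

-- illustrative only; table name is an assumption
ALTER TABLE mykeyspace.mytable
  WITH compaction = { 'class' : 'LeveledCompactionStrategy' }
  AND bloom_filter_fp_chance = 0.1;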

Aaron missed one other possibility (only because usually you increase to 8G RAM 
before doing this one)… add more nodes.  Cassandra performance stays extremely 
consistent up until you hit that per-node limit.  It seems to me you are 
researching what you can do per node, which is a good thing, and we had to go 
through that.  I think at some point most teams do.  To accurately run the 
test though, you should be at 8G RAM, as there are huge swings in RAM for 
compactions (maybe less so with LCS).

Recovering once you hit a RAM issue is not nice.  We had this in production 
once and had to temporarily increase to 12G and work out fixes until we got the 
new nodes.

Not sure if this helps at all, but it's good to know.

Later,
Dean

From: Joel Samuelsson <samuelsson.j...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, April 4, 2013 5:49 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra freezes

Yes, both of those solutions seem fine.

My question still seems valid though; shouldn't the node recover and perform as 
well as it did during the first few tests? If not, what makes the node not have 
the same issues at a smaller load but after a longer period of time? Having 
nodes' performance drop radically over time seems unacceptable and not 
something most people experience.


2013/4/4 aaron morton <aa...@thelastpickle.com>
INFO [ScheduledTasks:1] 2013-04-03 08:47:40,757 GCInspector.java (line 122) GC 
for ParNew: 40370 ms for 3 collections, 565045688 used; max is 
2038693888
This is the JVM pausing the application for 40 seconds to complete GC.

You have two choices: use a bigger heap (4GB to 8GB) or run a lower workload.

 cheers


-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 7:54 PM, Joel Samuelsson <samuelsson.j...@gmail.com> wrote:

Thanks for your suggestions. I'll get back to you with the tests you suggested, 
probably tomorrow. In the meantime though I have a few questions. You say:
- 2GB of JVM heap to be insufficient to run this workload against Cassandra

I realise that the one node cluster has a maximum workload. It did however work 
fine for a few tests and then performance deteriorated. Currently I can't even 
complete a test run since the server won't respond in time - even though I 
haven't run a test since yesterday. Shouldn't the server "recover" sooner or 
later and perform as well as it did during the first few tests? If not 
automatically, what can I do to help it? Tried nodetool flush but with no 
performance improvement.

And just fyi in case it changes anything, I don't immediately read back the 
written rows. There are 100 000 rows being written and 100 000 rows being read 
in parallel. The rows being read were written to the cluster before the tests 
were run.


2013/4/3 Andras Szerdahelyi <andras.szerdahe...@ignitionone.com>
Wrt/ cfhistograms output, you are supposed to consider "Offset" x "column 
values of each column" a separate histogram. Also AFAIK, these counters are 
flushed after each invocation, so you are always looking at data from between 
two invocations of cfhistograms   - With that in mind, to me your cfhistograms 
say :
- you encountered 200k rows with a single column

- most of your write latencies are agreeable but – and I cannot comment on how 
much a commit log write (an append) would affect this, as I have 
durable_writes:false on all my data – that’s a long tail you have there, into 
the hundreds of milliseconds, which cannot be OK.
Depending on how often your memtables are switched ( emptied and queued for 
flushing to disk ) and how valuable your updates received in between two of 
these are, you may want to disable durable writes on the KS with 
"durable_writes=false", or the very least place the commit log folder on its 
own drive. Again, I'm not absolutely sure this could affect write latency

- 38162520 reads served with a single sstable read, that’s great

- a big chunk of these reads are served from the page cache or memtables (the 
latter being more likely since, as I understand, you immediately read back the 
written column and you work with unique row keys), but again you have a long 
drop-off

16410 mutations / sec, with 1k payload, let's say that is 20MB/s into memory 
with overhead; a third of the 2GB heap for memtables = 666MB: a switch every ~30 
seconds.
I'm not sure if your write performance can

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski

On 04.04.2013 15:38, horschi wrote:

I'm glad to hear that. I feared my ticket might be responsible for your
data loss. I could not live the guilt ;-) Seriously: I'm glad we can rule
out the repair change.


Haha, I didn't notice before that it was your ticket! ;-)


Yes, if it works with CL=one, then it must be the index. Check the
mailing-list, I think someone else posted something similar the other day.


That was the first thing I checked yesterday, but, as I was not sure if 
that's the problem, I didn't pay too much attention to this ;-) I'll dig 
a bit more then. And I'll probably drop/recreate the indexes tomorrow, as a 
"last resort" if I don't find anything interesting :-)


Thanks for help :-)

BTW. there's still a question why CQL requests two nodes when using 
CL=ONE ;-) OK, I have read_repair_chance = 1.0 for this CF, so I might 
assume that tracing in cqlsh somehow "hacks" read_repair and also shows 
all "background" digest requests, but - still - if it matters, it should 
matter for index-based queries too, but it doesn't. Well, it's not my 
biggest problem today, so the answer can wait ;-)


M.



Re: Linear scalability problems

2013-04-04 Thread Cem Cayiroglu
What was the RF before adding nodes?

Sent from my iPhone

On 04 Apr 2013, at 15:12, Anand Somani  wrote:

> We are using a single process with multiple threads, will look at client side 
> delays.
> 
> Thanks
> 
> On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs  wrote:
> If I had to guess, I would say that your client is the bottleneck, not the 
> cluster.  Are you inserting data with multiple threads or processes?
> 
> 
> On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani  wrote:
> Hi,
> 
> I am running some tests trying to scale out our application from a 3-node 
> cluster to a 6-node cluster. The thing I observed is that with the 3-node 
> cluster I was able to handle about 41 req/second, so I added 3 more nodes 
> thinking it would close to double, but instead it only goes up to about 47 
> req/second!! I am doing something wrong and it is not obvious, so I wanted some 
> help on what stats I could/should monitor to tell me things like whether a node 
> gets more requests or the load distribution is not random enough.
> 
> Note I am using direct thrift (old code base) and cassandra 1.1.6. The data 
> model is for storing blobs (split across columns) and has around 6 CF, RF=3 
> and all operations are at quorum. Also at the end of the run nodetool ring 
> reports the same data size.
> 
> Thanks
> Anand
> 
> 
> 
> -- 
> Tyler Hobbs
> DataStax
> 


Really have to repair ?

2013-04-04 Thread cscetbon.ext
Hi,

I know that deleted rows can reappear if "nodetool repair" is not run on every node 
within gc_grace_seconds. However, do we really need to obey this rule if 
we run "nodetool repair" on nodes that have been down for more than max_hint_window_in_ms 
milliseconds?

Thanks
--
Cyril SCETBON


_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Is there guidance about compaction thresholds and setting minthreshold to 2?

2013-04-04 Thread Peter Haggerty
The default minthreshold for compactions is 4:
http://www.datastax.com/docs/1.1/references/nodetool#nodetool-setcompactionthreshold

Is there a reason that this value is not "2", the lowest possible value?
If we change this to 2, what should we expect to see? Should we see less
growth in storage load and fewer files to seek through when reading, but at
the cost of higher CPU usage?


Thanks,

Peter


Re: Why do Datastax docs recommend Java 6?

2013-04-04 Thread Shahryar Sedghi
I use IBM JVM 7; it is free, and for VMs over 8 GB it has a garbage
collection policy that makes it almost pause-less. We also use some
security libraries that eliminate the use of other libs that you need for
Oracle.


On Thu, Apr 4, 2013 at 12:59 AM, Edward Capriolo wrote:

> Hey guys, what gives!
> Apparently Cassandra ran on 1.7 like 5 years ago. Was there a regression?
>
> jk
>
> https://github.com/jbellis/helenus
>
> Installation
> 
> * Please use jdk 1.7; Cassandra will run with 1.6 but
>   frequently core dumps on quad-core machines
> * Unpack the tar ball:
>
>
>
>
> On Wed, Feb 6, 2013 at 12:56 PM, Wei Zhu  wrote:
>
>> Does anyone have first-hand experience with the Zing JVM, which is claimed to be
>> pauseless? How do they charge, per CPU?
>>
>> Thanks
>> -Wei
>>   --
>> *From:* Edward Capriolo 
>> *To:* user@cassandra.apache.org
>> *Sent:* Wednesday, February 6, 2013 7:07 AM
>> *Subject:* Re: Why do Datastax docs recommend Java 6?
>>
>> Oracle already did this once, It was called jrockit :)
>> http://www.oracle.com/technetwork/middleware/jrockit/overview/index.html
>>
>> Typically oracle acquires they technology and then the bits are merged
>> with the standard JVM.
>>
>> On Wed, Feb 6, 2013 at 2:13 AM, Viktor Jevdokimov <
>> viktor.jevdoki...@adform.com> wrote:
>>
>>  I would prefer Oracle to own Azul’s Zing JVM over any other (GC) and to
>> provide it for free for anyone :)
>>
>>Best regards / Pagarbiai
>> *Viktor Jevdokimov*
>> Senior Developer
>>
>> Email: viktor.jevdoki...@adform.com
>> Phone: +370 5 212 3063, Fax +370 5 261 0453
>> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>> Follow us on Twitter: @adforminsider
>> Take a ride with Adform's Rich Media Suite
>>
>>   *From:* jef...@gmail.com [mailto:jef...@gmail.com]
>> *Sent:* Wednesday, February 06, 2013 02:23
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Why do Datastax docs recommend Java 6?
>>
>> Oracle now owns the Sun HotSpot team, which is inarguably the highest
>> powered Java VM team in the world. It's still really the epicenter of all
>> Java VM development.
>>  Sent from my Verizon Wireless BlackBerry
>>  --
>>  *From: *"Ilya Grebnov" 
>>  *Date: *Tue, 5 Feb 2013 14:09:33 -0800
>>  *To: *
>>  *ReplyTo: *user@cassandra.apache.org
>>  *Subject: *RE: Why do Datastax docs recommend Java 6?
>>
>>  Also, what is the particular reason to use the Oracle JDK over OpenJDK? Sorry,
>> I could not find this information online.
>>
>> Thanks,
>> Ilya
>>  *From:* Michael Kjellman 
>> [mailto:mkjell...@barracuda.com]
>>
>> *Sent:* Tuesday, February 05, 2013 7:29 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Why do Datastax docs recommend Java 6?
>>
>>  There have been tons of threads/convos on this.
>>
>>  In the early days of Java 7 it was pretty unstable and there was pretty
>> much no convincing reason to use Java 7 over Java 6.
>>
>>  Now that Java 7 has stabilized and Java 6 is EOL it's a reasonable
>> decision to use Java 7 and we do it in production with no issues to speak
>> of.
>>
>>  That being said, there was one potential situation we've seen as a
>> community where bootstrapping a new node was using 3x more CPU and getting
>> significantly less throughput. However, reproducing this consistently never
>> happened AFAIK.
>>
>>  I think Datastax will update their docs once more people use Java 7 in
>> production and prove it doesn't cause any additional bugs/performance
>> issues. Until then I'd say it's a safe bet to use Java 7 with vanilla C*
>> 1.2.1. I hope this helps!
>>
>>  Best,
>>  Michael
>>
>>  *From: *Baron Schwartz 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Tuesday, February 5, 2013 7:21 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *Why do Datastax docs recommend Java 6?
>>
>>   The Datastax docs repeatedly say (e.g.
>> http://www.datastax.com/docs/1.2/install/install_jre) that Java 7 is not
>> recommended, but they don't say why. It would be helpful to know this. Does
>> anyone know?
>>
>>  The same documentation is referenced from the Cassandra wiki, for
>> example, http://wiki.apache.org/cassandra/GettingStarted
>>
>>  - Baron
>>
>>
>>
>>
>>
>



Re: Is there guidance about compaction thresholds and setting minthreshold to 2?

2013-04-04 Thread Edward Capriolo
One would think, but remember that only "like sized" sstables compact. You want
more files of roughly the same size rather than a few big ones in most cases,
but there are no hard and fast rules.


On Thu, Apr 4, 2013 at 11:36 AM, Peter Haggerty
wrote:

> The default minthreshold for compactions is 4:
>
> http://www.datastax.com/docs/1.1/references/nodetool#nodetool-setcompactionthreshold
>
> Is there a reason that this value is not "2", the lowest possible value?
>  If we change this to 2 what should we expect to see? Should we see less
> growth in storage load and fewer files to seek through when reading but at
> the cost of higher CPU usage?
>
>
> Thanks,
>
> Peter
>
>


Re: IndexOutOfBoundsException during repair, streaming

2013-04-04 Thread Dane Miller
On Wed, Apr 3, 2013 at 6:08 PM, aaron morton  wrote:
> We deleted and recreated those CFs before moving into
> production mode.
>
> We have a winner.
>
> The comparator is applying the current schema to the byte value read from
> disk (schema on read) which describes a value with more than 2 components.
> It's then trying to apply the current schema so it can type cast the bytes
> for comparison.
>
> Something must have gone wrong in the "deleted" part of your statement
> above. We do not store schema with data, so this is a problem of changing the
> schema in an incompatible way with existing data.
>
> nodetool scrub is probably your best bet. I've not checked that it handles
> this specific problem, but in general it will drop rows from SSTables that
> cannot be read or have some other problem. Best thing to do is snapshot and
> copy the data from one prod node to a QA box and run some tests.
>
> hope that helps.

I scrubbed the CF and was able to complete the repair.  Thanks Aaron!

Dane


Re: Is there guidance about compaction thresholds and setting minthreshold to 2?

2013-04-04 Thread Sylvain Lebresne
More important than the CPU cost, you'll use more I/O. Say you have 4 like-sized
sstables: compacting them all into one file (which is really what
SizeTieredCompaction will try to do) will require roughly twice as much I/O with
min_compaction_threshold=2 as with 4, because with a threshold of 2 the data is
first rewritten by intermediate compactions (4 -> 2 -> 1 sstables) instead of in
a single pass.


On Thu, Apr 4, 2013 at 7:26 PM, Edward Capriolo wrote:

> One would think, but remember only "like sized" sstables compact. You want
> more files roughlt the same size rather then few big ones in most cases,
> but there are no hard fast rules.
>
>
> On Thu, Apr 4, 2013 at 11:36 AM, Peter Haggerty <
> peter.hagge...@librato.com> wrote:
>
>> The default minthreshold for compactions is 4:
>>
>> http://www.datastax.com/docs/1.1/references/nodetool#nodetool-setcompactionthreshold
>>
>> Is there a reason that this value is not "2", the lowest possible value?
>>  If we change this to 2 what should we expect to see? Should we see less
>> growth in storage load and fewer files to seek through when reading but at
>> the cost of higher CPU usage?
>>
>>
>> Thanks,
>>
>> Peter
>>
>>
>


Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-04 Thread Drew Kutcharian
Hey Guys,

I'm working on a project and one of the requirements is to have a schema-free 
CF where end users can insert arbitrary key/value pairs per row. What would be 
the best way to know all the "keys" that have been inserted (preferably without 
any locking)? For example,

Row1 => key1 -> XXX, key2 -> XXX
Row2 => key1 -> XXX, key3 -> XXX
Row3 => key4 -> XXX, key5 -> XXX
Row4 => key2 -> XXX, key5 -> XXX
…

The query would be "give me all the inserted keys" and the response would be 
{key1, key2, key3, key4, key5}.

Thanks,

Drew



Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-04 Thread Edward Capriolo
You can not get only the column names (which you are calling keys); you can
use get_range_slices, which returns all the columns. When you specify an
empty byte array (new byte[0]) as the slice start and finish you get back all
the columns of each row. From there you can return only the column names to
the user in a format that you like.
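
For illustration, a minimal sketch of that approach against the raw Thrift API
(the keyspace and CF names here are hypothetical, and a real scan would also have
to page by feeding the last key returned back in as the next start_key):

import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DistinctColumnNames {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        client.set_keyspace("MyKeyspace");   // hypothetical keyspace

        // Empty start/finish means "all columns" within each row
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBuffer.wrap(new byte[0]), ByteBuffer.wrap(new byte[0]), false, 1000));

        // Empty start/end key means "start from the beginning"; this only fetches
        // the first 100 rows - a full scan would loop, restarting from the last key
        KeyRange keyRange = new KeyRange(100);
        keyRange.setStart_key(ByteBuffer.wrap(new byte[0]));
        keyRange.setEnd_key(ByteBuffer.wrap(new byte[0]));

        Set<String> names = new HashSet<String>();
        for (KeySlice slice : client.get_range_slices(
                new ColumnParent("MyCF"), predicate, keyRange, ConsistencyLevel.ONE)) {
            for (ColumnOrSuperColumn cosc : slice.getColumns()) {
                names.add(new String(cosc.getColumn().getName(), "UTF-8"));
            }
        }
        System.out.println(names);
        transport.close();
    }
}

Note that this walks every row in the CF, so it is fine as a one-off but would be
expensive to run for every "what keys exist" query.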


On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian  wrote:

> Hey Guys,
>
> I'm working on a project and one of the requirements is to have a schema
> free CF where end users can insert arbitrary key/value pairs per row. What
> would be the best way to know what are all the "keys" that were inserted
> (preferably w/o any locking). For example,
>
> Row1 => key1 -> XXX, key2 -> XXX
> Row2 => key1 -> XXX, key3 -> XXX
> Row3 => key4 -> XXX, key5 -> XXX
> Row4 => key2 -> XXX, key5 -> XXX
> …
>
> The query would be give me all the inserted keys and the response would be
> {key1, key2, key3, key4, key5}
>
> Thanks,
>
> Drew
>
>


Re: Linear scalability problems

2013-04-04 Thread Anand Somani
RF=3.

On Thu, Apr 4, 2013 at 7:08 AM, Cem Cayiroglu  wrote:

> What was the RF before adding nodes?
>
> Sent from my iPhone
>
> On 04 Apr 2013, at 15:12, Anand Somani  wrote:
>
> We are using a single process with multiple threads, will look at client
> side delays.
>
> Thanks
>
> On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs  wrote:
>
>> If I had to guess, I would say that your client is the bottleneck, not
>> the cluster.  Are you inserting data with multiple threads or processes?
>>
>>
>> On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani wrote:
>>
>>> Hi,
>>>
>>> I am running some tests trying to scale out our application from using a
>>> 3 node cluster to 6 node cluster. The thing I observed is that when using a
>>> 3 node cluster I was able to handle abt 41 req/second, so I added 3 more
>>> nodes thinking it should close to double, but instead it only goes upto bat
>>> 47 req/second!! I am doing something wrong and it is not obvious, so wanted
>>> some help in what stats could/should I monitor to tell me things like if a
>>> node has more requests or if the load distribution is not random enough?
>>>
>>> Note I am using direct thrift (old code base) and cassandra 1.1.6. The
>>> data model is for storing blobs (split across columns) and has around 6 CF,
>>> RF=3 and all operations are at quorum. Also at the end of the run nodetool
>>> ring reports the same data size.
>>>
>>> Thanks
>>> Anand
>>>
>>
>>
>>
>> --
>> Tyler Hobbs
>> DataStax 
>>
>
>


Re: how to stop out of control compactions?

2013-04-04 Thread William Oberman
Ah, 0 is the magic?  Odd email thread now: I asked about the best
practice for disabling compactions, Gregg said he set the threshold to 10, you
+1'd, I said I couldn't set > 32, and now we're at 0 ;-)

will

On Wed, Apr 3, 2013 at 8:50 PM, aaron morton wrote:

>  And it appears I can't set min > 32
>
> Why did you want to set it so high ?
> If you want to disable compaction set it to 0.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/04/2013, at 8:43 PM, William Oberman 
> wrote:
>
> I just tried to use this setting (I'm using 1.1.9).  And it appears I
> can't set min > 32, as that's the max max now (using nodetool at least).
>  Not sure if JMX would allow more access, but I don't like bypassing things
> I don't fully understand.  I think I'll just leave my compaction killers
> running instead (not that killing compactions constantly isn't messing with
> things as well).
>
> will
>
>
> On Tue, Apr 2, 2013 at 10:43 AM, William Oberman  > wrote:
>
>> Edward, you make a good point, and I do think I am getting closer to having
>> to increase my cluster size (I'm around ~300GB/node now).
>>
>> In my case, I think it was neither.  I had one node OOM after working on
>> a large compaction but it continued to run in a zombie like state
>> (constantly GC'ing), which I didn't have an alert on.  Then I had the bad
>> luck of a "close token" also starting a large compaction.  I have RF=3 with
>> some of my R/W patterns at quorum, causing that segment of my cluster to
>> get slow (e.g. a % of my traffic started to slow).  I was running 1.1.2
>> (I haven't had to poke anything for quite some time, obviously), so I
>> upgraded before moving on (as I saw a lot of bug fixes to compaction issues
>> in release notes).  But the upgrade caused even more nodes to start
>> compactions.  Which led to my original email... I had a cluster where 80%
>> of my nodes were compacting, and I really needed to boost production
>> traffic and couldn't seem to "tamp cassandra down" temporarily.
>>
>> Thanks for the advice everyone!
>>
>> will
>>
>>
>> On Tue, Apr 2, 2013 at 10:20 AM, Edward Capriolo 
>> wrote:
>>
>>> Settings do not make compactions go away. If your compactions are "out
>>> of control" it usually means one of these things,
>>> 1)  you have a corrupt table that the compaction never finishes on,
>>> the sstable count keeps growing
>>> 2) you do not have enough hardware to handle your write load
>>>
>>>
>>> On Tue, Apr 2, 2013 at 7:50 AM, William Oberman <
>>> ober...@civicscience.com> wrote:
>>>
 Thanks Gregg & Aaron. Missed that setting!

 On Tuesday, April 2, 2013, aaron morton wrote:

> Set the min and max
> compaction thresholds for a given column family
>
> +1 for setting the max_compaction_threshold (as well as the min) on
> the a CF when you are getting behind. It can limit the size of the
> compactions and give things a chance to complete in a reasonable time.
>
> Cheers
>
>-
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/04/2013, at 3:42 AM, Gregg Ulrich  wrote:
>
> You may want to set compaction threshold and not throughput.  If you
> set the min threshold to something very large (10), compactions will
> not start until cassandra finds this many files to compact (which it 
> should
> not).
>
> In the past I have used this to stop compactions on a node, and then
> run an offline major compaction to get though the compaction, then set the
> min threshold back.  Not everyone likes major compactions though.
>
>
>
>   setcompactionthreshold   
>  - Set the min and max
> compaction thresholds for a given column family
>
>
>
> On Mon, Apr 1, 2013 at 12:38 PM, William Oberman <
> ober...@civicscience.com> wrote:
>
>> I'll skip the prelude, but I worked myself into a bit of a jam.  I'm
>> recovering now, but I want to double check if I'm thinking about things
>> correct.
>>
>> Basically, I was in a state where a majority of my servers wanted to
>> do compactions, and rather large ones.  This was impacting my site
>> performance.  I tried nodetool stop COMPACTION.  I tried
>> setcompactionthroughput=1.  I tried restarting servers, but they'd 
>> restart
>> the compactions pretty much immediately on boot.
>>
>> Then I realized that:
>> nodetool stop COMPACTION
>> only stopped running compactions, and then the compactions would
>> re-enqueue themselves rather quickly.
>>
>> So, right now I have:
>> 1.) scripts running on N-1 servers looping on "nodetool stop
>> COMPACTION" in a tight loop
>> 2.) On the "Nth" server I've disabled gossip/thrift and turned up
>> setcompacti

Re: Any plans for read-before-write update operations in CQL3?

2013-04-04 Thread Vitalii Tymchyshyn
Well, a schema just came to my mind that looks interesting, so I want
to share it:
1) Actions are introduced. Each action receives a unique id at the coordinator
node. A client can ask for a block of ids beforehand, to make actions
idempotent.
2) Actions are applied to a given row+column value. It's possible that a
special column family type should be created that supports actions.
3) Actions are stored for the grace period to ensure repair keeps working
well.
4) Along with all the actions for the grace period, the old value, the current
value and an old value hash are stored.
5) The old value is the value without the currently stored actions; the current
value has all currently stored actions applied.
6) The old value hash holds the number of actions applied, the time of the last
action applied and a hash of all the applied action ids (only actions applied
to the old value, of course).
7) The current value is updated on read. So there can be actions that are not
applied yet. On read, if there are unapplied actions, they are applied
and the information about the current value/applied actions is updated.
8) Actions can either rely on ordering or not. If actions rely on order and an
out-of-order action needs to be applied during an update, the value is
recalculated starting from the old value.
9) During repair, the highest old value (based on the number of actions applied,
then the lowest by time) is selected. Then all actions older than, or of the
same time as, the old value are dropped as already applied. Newer ones are
merged into a union set.
10) During compaction, the old value is moved forward to the time (now - grace
period).
The schema looks solid. The minus is that all the values for the grace period
must be stored. Maybe it should be combined with some auto-confirmation
mechanism where the coordinator, after receiving acks for all the writes, does
a second round notifying that the action is fully written. This should work
for hinted handoff too. Then the old value can be propagated up to the last
acked action.
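
To make the storage side of this concrete, here is a rough sketch of the
per-row+column value that points 4-6 describe. This is purely my own
illustration (all names made up), not something that exists in Cassandra:

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** One idempotent action, identified by a coordinator-assigned id (point 1). */
class Action {
    final UUID id;                 // unique action id
    final long timestamp;          // when the action was issued
    final boolean orderDependent;  // point 8: does the result depend on ordering?
    final byte[] payload;          // e.g. "+1", "append 'postfix'", ...

    Action(UUID id, long timestamp, boolean orderDependent, byte[] payload) {
        this.id = id;
        this.timestamp = timestamp;
        this.orderDependent = orderDependent;
        this.payload = payload;
    }
}

/** Summary of the actions already folded into the old value (point 6). */
class OldValueHash {
    final int actionsApplied;      // number of actions applied
    final long lastActionTime;     // time of the last applied action
    final byte[] idsHash;          // hash over the applied action ids

    OldValueHash(int actionsApplied, long lastActionTime, byte[] idsHash) {
        this.actionsApplied = actionsApplied;
        this.lastActionTime = lastActionTime;
        this.idsHash = idsHash;
    }
}

/** What would be stored per row+column for the grace period (points 4-5). */
class ActionLogValue {
    byte[] oldValue;       // value with none of the retained actions applied
    byte[] currentValue;   // value with all retained actions applied (point 7)
    OldValueHash oldValueHash;                 // used during repair (point 9)
    List<Action> retainedActions = new ArrayList<Action>();  // kept for the grace period
}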

On 4 Apr 2013, at 04:59, "aaron morton" wrote:
>
> I would guess not.
>
>> I know this goes against keeping updates idempotent,
>
> There are also issues with consistency. i.e. is the read local or does it
happen at the CL level ?
> And it makes things go slower.
>
>>  We currently do things like this in client code, but it would be great
to be able to this on the server side to minimize the chance of race
conditions.
>
> Sometimes you can write the plus one into a new column and then apply the
changes in the reading client thread.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/04/2013, at 12:48 AM, Drew Kutcharian  wrote:
>
>> Hi Guys,
>>
>> Are there any short/long term plans to support UPDATE operations that
require read-before-write, such as increment on a numeric non-counter
column?
>> i.e.
>>
>> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN +
1;
>>
>> UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix";
>>
>> etc.
>>
>> I know this goes against keeping updates idempotent, but there are times
you need to do these kinds of operations. We currently do things like this
in client code, but it would be great to be able to do this on the server side
to minimize the chance of race conditions.
>>
>> -- Drew
>
>


Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-04 Thread Drew Kutcharian
Hi Edward,

I anticipate that the column names will be reused a lot. For example, key1 will 
be in many rows. So I think the number of distinct column names will be much 
much smaller than the number of rows. Is there a way to have a separate CF that 
keeps track of the column names? 

What I was thinking was to have a separate CF into which I write only the column 
name, with a null value, every time I write a key/value to the main CF. In this 
case, if that column name already exists it will just be overwritten. Then if I 
want all the column names, I can just query that CF. Not sure if that's the best 
approach at high load (100k inserts a second).
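
A rough sketch of that dual write over raw Thrift (the CF names "user_data" and
"column_name_index" are hypothetical, and this is only one possible layout:
every distinct name becomes a column of a single well-known index row):

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.*;

public class KeyValueDao {
    // single, well-known row that accumulates every column name ever written
    private static final ByteBuffer INDEX_ROW =
            ByteBuffer.wrap("all_column_names".getBytes());
    private final Cassandra.Client client;

    public KeyValueDao(Cassandra.Client client) {
        this.client = client;
    }

    /** Write the user's key/value, and record the key name in a tracking CF. */
    public void put(ByteBuffer rowKey, String name, ByteBuffer value) throws Exception {
        long ts = System.currentTimeMillis() * 1000;
        ByteBuffer nameBytes = ByteBuffer.wrap(name.getBytes("UTF-8"));

        // 1. The real data, in the main CF
        Column dataCol = new Column(nameBytes);
        dataCol.setValue(value);
        dataCol.setTimestamp(ts);
        client.insert(rowKey, new ColumnParent("user_data"), dataCol,
                ConsistencyLevel.QUORUM);

        // 2. The column name, written (with an empty value) into one well-known row
        //    of the tracking CF. Re-inserting an existing name just overwrites the
        //    same column, so no read or lock is needed.
        Column indexCol = new Column(nameBytes);
        indexCol.setValue(ByteBuffer.wrap(new byte[0]));
        indexCol.setTimestamp(ts);
        client.insert(INDEX_ROW, new ColumnParent("column_name_index"), indexCol,
                ConsistencyLevel.QUORUM);
    }
}

Reading the single index row back then gives the distinct set of names. The
trade-off is that every insert also hits that one row, which could become a
hotspot at 100k inserts/second, so the index row may need to be sharded.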

-- Drew


On Apr 4, 2013, at 12:02 PM, Edward Capriolo  wrote:

> You can not get only the column name (which you are calling a key) you can 
> use get_range_slice which returns all the columns. When you specify an empty 
> byte array (new byte[0]{}) as the start and finish you get back all the 
> columns. From there you can return only the columns to the user in a format 
> that you like.
> 
> 
> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian  wrote:
> Hey Guys,
> 
> I'm working on a project and one of the requirements is to have a schema free 
> CF where end users can insert arbitrary key/value pairs per row. What would 
> be the best way to know what are all the "keys" that were inserted 
> (preferably w/o any locking). For example,
> 
> Row1 => key1 -> XXX, key2 -> XXX
> Row2 => key1 -> XXX, key3 -> XXX
> Row3 => key4 -> XXX, key5 -> XXX
> Row4 => key2 -> XXX, key5 -> XXX
> …
> 
> The query would be give me all the inserted keys and the response would be 
> {key1, key2, key3, key4, key5}
> 
> Thanks,
> 
> Drew
> 
> 



gossip not working

2013-04-04 Thread S C
I was in the middle of an upgrade to 1.1.9. I brought one node up with 1.1.9 while 
the others were running 1.1.5. Once that node was on 1.1.9 it no longer recognized 
the other nodes in the ring.
On 192.168.56.10 and 11:

192.168.56.10  DC1-Cass  RAC1  Up    Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Up    Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Down  Normal  29.02 GB  25.00%  85070591730234615865843651857942052864

On 192.168.56.12:

192.168.56.10  DC1-Cass  RAC1  Down  Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Down  Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Up    Normal  29.02 GB  25.00%  85070591730234615865843651857942052864

I do not see anything in the logs that tells me that there is a gossip issue.
nodetool info
Token: 85070591730234615865843651857942052864
Gossip active: true
Thrift active: true
Load             : 29.05 GB
Generation No    : 1365114563
Uptime (seconds) : 2127
Heap Memory (MB) : 848.71 / 7945.94
Exceptions       : 0
Key Cache        : size 2208 (bytes), capacity 104857584 (bytes), 1056 hits, 1099 requests, 0.961 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

nodetool info
Token: 42535295865117307932921825928971026432
Gossip active: true
Thrift active: true
Load             : 31.59 GB
Generation No    : 1364413038
Uptime (seconds) : 703904
Heap Memory (MB) : 733.02 / 7945.94
Exceptions       : 1
Key Cache        : size 3693312 (bytes), capacity 104857584 (bytes), 26071678 hits, 26616282 requests, 0.980 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds


There is no firewall between the nodes and they can reach each other on the 
storage port. What else should I be looking at to find the root cause? 
Appreciate your inputs.

Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-04 Thread Edward Capriolo
Your reverse index of "which rows contain a column named X" will have very
wide rows. You could look at cassandra's secondary indexing, or possibly
look at a solandra/solr approach. Another option is you can shift the
problem slightly, "which rows have column X that was added between time y
and time z". Remember with few distinct column names that reverse index of
column to row is going to be a very big list.


On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian  wrote:

> Hi Edward,
>
> I anticipate that the column names will be reused a lot. For example, key1
> will be in many rows. So I think the number of distinct column names will
> be much much smaller than the number of rows. Is there a way to have a
> separate CF that keeps track of the column names?
>
> What I was thinking was to have a separate CF that I write only the column
> name with a null value in there every time I write a key/value to the main
> CF. In this case if that column name exist, then it will just be
> overridden. Now if I wanted to get all the column names, then I can just
> query that CF. Not sure if that's the best approach at high load (100k
> inserts a second).
>
> -- Drew
>
>
> On Apr 4, 2013, at 12:02 PM, Edward Capriolo 
> wrote:
>
> You can not get only the column name (which you are calling a key) you can
> use get_range_slice which returns all the columns. When you specify an
> empty byte array (new byte[0]{}) as the start and finish you get back all
> the columns. From there you can return only the columns to the user in a
> format that you like.
>
>
> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian  wrote:
>
>> Hey Guys,
>>
>> I'm working on a project and one of the requirements is to have a schema
>> free CF where end users can insert arbitrary key/value pairs per row. What
>> would be the best way to know what are all the "keys" that were inserted
>> (preferably w/o any locking). For example,
>>
>> Row1 => key1 -> XXX, key2 -> XXX
>> Row2 => key1 -> XXX, key3 -> XXX
>> Row3 => key4 -> XXX, key5 -> XXX
>> Row4 => key2 -> XXX, key5 -> XXX
>> …
>>
>> The query would be give me all the inserted keys and the response would
>> be {key1, key2, key3, key4, key5}
>>
>> Thanks,
>>
>> Drew
>>
>>
>
>


Re: gossip not working

2013-04-04 Thread Paul Sudol
What errors are you seeing in the log files of the down nodes? Did you run 
upgradesstables? You need to run upgradesstables when moving from < 1.1.7 to 1.1.9.

On Apr 4, 2013, at 6:11 PM, S C  wrote:

> I was in the middle of upgrade to 1.1.9. I brought one node with 1.1.9 while 
> the other were running on 1.1.5. Once one of the node was on 1.1.9 it is no 
> longer recognizing other nodes in the ring.
> 
> On 192.168.56.10 and 11
> 
> 192.168.56.10  DC1-CassRAC1Up Normal  28.06 GB50.00%  
> 0   
> 192.168.56.11  DC1-CassRAC1Up Normal  31.59 GB25.00%  
> 42535295865117307932921825928971026432  
> 192.168.56.12  DC1-CassRAC1Down   Normal  29.02 GB25.00%  
> 85070591730234615865843651857942052864
> 
> 
> On 192.168.56.12
> 
> 192.168.56.10  DC1-CassRAC1Down Normal  28.06 GB
> 50.00%  0   
> 192.168.56.11  DC1-CassRAC1Down Normal  31.59 GB
> 25.00%  42535295865117307932921825928971026432  
> 192.168.56.12  DC1-CassRAC1Up   Normal  29.02 GB25.00%
>   85070591730234615865843651857942052864
> 
> 
> I do not see anything in the logs that tells me that there is a gossip issue.
> 
> nodetool info
> Token: 85070591730234615865843651857942052864
> Gossip active: true
> Thrift active: true
> Load : 29.05 GB
> Generation No: 1365114563
> Uptime (seconds) : 2127
> Heap Memory (MB) : 848.71 / 7945.94
> Exceptions   : 0
> Key Cache: size 2208 (bytes), capacity 104857584 (bytes), 1056 hits, 
> 1099 requests, 0.961 recent hit rate, 14400 save period in seconds
> Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, 
> NaN recent hit rate, 0 save period in seconds
> 
> nodetool info
> Token: 42535295865117307932921825928971026432
> Gossip active: true
> Thrift active: true
> Load : 31.59 GB
> Generation No: 1364413038
> Uptime (seconds) : 703904
> Heap Memory (MB) : 733.02 / 7945.94
> Exceptions   : 1
> Key Cache: size 3693312 (bytes), capacity 104857584 (bytes), 26071678 
> hits, 26616282 requests, 0.980 recent hit rate, 14400 save period in seconds
> Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, 
> NaN recent hit rate, 0 save period in seconds
> 
> 
> 
> There is no firewall between the nodes and I can reach each other on storage 
> port. 
> What else should I be looking at to find root cause? Appreciate your inputs.



Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-04 Thread Drew Kutcharian
I don't really need to answer "what rows contain column named X", so no need 
for a reverse index here. All I want is a distinct set of all the column names, 
so I can answer "what are all the available column names"


On Apr 4, 2013, at 4:20 PM, Edward Capriolo  wrote:

> Your reverse index of "which rows contain a column named X" will have very 
> wide rows. You could look at cassandra's secondary indexing, or possibly look 
> at a solandra/solr approach. Another option is you can shift the problem 
> slightly, "which rows have column X that was added between time y and time 
> z". Remember with few distinct column names that reverse index of column to 
> row is going to be a very big list.
> 
> 
> On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian  wrote:
> Hi Edward,
> 
> I anticipate that the column names will be reused a lot. For example, key1 
> will be in many rows. So I think the number of distinct column names will be 
> much much smaller than the number of rows. Is there a way to have a separate 
> CF that keeps track of the column names? 
> 
> What I was thinking was to have a separate CF that I write only the column 
> name with a null value in there every time I write a key/value to the main 
> CF. In this case if that column name exist, then it will just be overridden. 
> Now if I wanted to get all the column names, then I can just query that CF. 
> Not sure if that's the best approach at high load (100k inserts a second).
> 
> -- Drew
> 
> 
> On Apr 4, 2013, at 12:02 PM, Edward Capriolo  wrote:
> 
>> You can not get only the column name (which you are calling a key) you can 
>> use get_range_slice which returns all the columns. When you specify an empty 
>> byte array (new byte[0]{}) as the start and finish you get back all the 
>> columns. From there you can return only the columns to the user in a format 
>> that you like.
>> 
>> 
>> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian  wrote:
>> Hey Guys,
>> 
>> I'm working on a project and one of the requirements is to have a schema 
>> free CF where end users can insert arbitrary key/value pairs per row. What 
>> would be the best way to know what are all the "keys" that were inserted 
>> (preferably w/o any locking). For example,
>> 
>> Row1 => key1 -> XXX, key2 -> XXX
>> Row2 => key1 -> XXX, key3 -> XXX
>> Row3 => key4 -> XXX, key5 -> XXX
>> Row4 => key2 -> XXX, key5 -> XXX
>> …
>> 
>> The query would be give me all the inserted keys and the response would be 
>> {key1, key2, key3, key4, key5}
>> 
>> Thanks,
>> 
>> Drew
>> 
>> 
> 
> 



RE: gossip not working

2013-04-04 Thread S C
I am not seeing anything in the logs other than "Starting up server gossip" and 
there is no firewall between the nodes.

Re: Cassandra 1.0.10 to 1.2.3 upgrade "post-mortem"

2013-04-04 Thread Rustam Aliyev

On 04/04/2013 02:24, aaron morton wrote:

I just wanted to share our experience of upgrading 1.0.10 to 1.2.3

In general it's dangerous to skip a major release when upgrading.


True. But in that case it was supposed to be fine.
ERROR [MutationStage:33] 2013-03-31 09:00:02,899 CassandraDaemon.java 
(line 164) Exception in thread Thread[MutationStage:33,5,main]

java.lang.AssertionError: Missing host ID for 10.0.1.8

Has 10.0.1.8 been updated?
IIRC not at this stage. 10.0.1.8 was the second seed server (at that moment on 
1.0.10) and this particular error appeared on the first seed server 
after the upgrade to 1.2.3.


Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 4:09 AM, Rustam Aliyev > wrote:



Hi,

I just wanted to share our experience of upgrading 1.0.10 to 1.2.3. 
It happened that first we upgraded both of our two seeds to 1.2.3, 
and basically after that the old nodes couldn't communicate with the 
new ones anymore. The cluster was down until we upgraded all nodes to 
1.2.3. We don't have many nodes and that process didn't take long. 
Yet it caused an outage of ~10 mins.


Here are some logs:

On the new, freshly upgraded seed node (v1.2.3):

ERROR [OptionalTasks:1] 2013-03-31 08:48:19,370 CassandraDaemon.java 
(line 164) Exception in thread Thread[OptionalTasks:1,5,main]

java.lang.NullPointerException
at 
org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:137)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)
WARN [MutationStage:20] 2013-03-31 08:48:23,613 StorageProxy.java 
(line 577) Unable to store hint for host with missing ID, /10.0.1.8 
(old node?)




ERROR [MutationStage:33] 2013-03-31 09:00:02,899 CassandraDaemon.java 
(line 164) Exception in thread Thread[MutationStage:33,5,main]

java.lang.AssertionError: Missing host ID for 10.0.1.8
at 
org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:580)
at 
org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:555)
at 
org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1643)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)



At the same time, old nodes (v1.0.10) were blinded:


ERROR [RequestResponseStage:441] 2013-03-31 09:04:07,955 
AbstractCassandraDaemon.java (line 139) Fatal exception in thread 
Thread[RequestResponseStage:441,5,main]

java.io.IOError: java.io.EOFException
at 
org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
at 
org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:132)
at 
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at 
org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
at 
org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
at 
org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)

... 6 more

.

INFO [GossipStage:3] 2013-03-31 09:06:08,885 Gossiper.java (line 804) 
InetAddress /10.0.1.8 is now UP
ERROR [GossipStage:3] 2013-03-31 09:06:08,885 
AbstractCassandraDaemon.java (line 139) Fatal exception in thread 
Thread[GossipStage:3,5,main]

java.lang.UnsupportedOperationException: Not a time-based UUID
at java.util.UUID.timestamp(UUID.java:308)
at 
org.apache.cassandra.service.MigrationManager.updateHighestKnown(MigrationManager.java:121)
at 
org.apache.cassandra.service.MigrationManager.rectify(MigrationManager.java:99)
at 
org.apache.cassandra

RE: gossip not working

2013-04-04 Thread S C
Is there a way to force gossip among the nodes?


Re: Cassandra services down frequently [Version 1.1.4]

2013-04-04 Thread Bryan Talbot
On Thu, Apr 4, 2013 at 1:27 AM,  wrote:

>
> After some time (1 hour / 2 hours) Cassandra shut down services on one or two
> nodes with the following errors;
>


Wonder what the workload and schema are like ...

We can see from below that you've tweaked and disabled many of the memory
"safety valve" and other memory related settings.  Those could be causing
issues too.



> hinted_handoff_throttle_delay_in_ms: 0
> flush_largest_memtables_at: 1.0
> reduce_cache_sizes_at: 1.0
> reduce_cache_capacity_to: 0.6
> rpc_keepalive: true
> rpc_server_type: sync
> rpc_min_threads: 16
> rpc_max_threads: 2147483647
> in_memory_compaction_limit_in_mb: 256
> compaction_throughput_mb_per_sec: 16
> rpc_timeout_in_ms: 15000
> dynamic_snitch_badness_threshold: 0.0
>