Re: Cassandra won't restart : 7365....6c73 is not defined as a collection

2013-05-06 Thread aaron morton
Do you have the table definitions ? 
Any example data?
Something is confused about a set / map / list type. 

It's failing when replaying the log. If you want to work around it, move the commit 
log files out of the directory. There is a chance of data loss if this row 
mutation is being replayed on all nodes. 
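The hex string in the subject line is the raw bytes of the offending column name. Assuming it is UTF-8 encoded (as CQL3 names are), a full hex string can be decoded to identify the column. The value below is illustrative only, since the real name in the subject is elided:

```python
# Decode a hex-encoded Cassandra column name into a readable string.
# "73657473" is an illustrative example, not the truncated value from
# the actual error message.
hex_name = "73657473"
name = bytes.fromhex(hex_name).decode("utf-8")
print(name)  # -> sets
```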

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/05/2013, at 2:36 PM, Blair Zajac  wrote:

> Hello,
> 
> I'm running a 3-node development cluster on OpenStack VMs and recently 
> updated to DataStax's 1.2.4 debs on Ubuntu Raring after which the cluster was 
> fine.  I shut it down for a few days and after getting back to Cassandra 
> today and booting the VMs, Cassandra is unable to start. Below is the output 
> from output.log from one of the nodes.  None of the Cassandra nodes can start.
> 
> The deployment is pretty simple, two test keyspaces with a few column 
> families in each keyspace.  I am doing a lot of keyspace and column family 
> deletions as I'm testing some db style migration code to auto-setup a schema.
> 
> Any suggestions?
> 
> Blair
> 
> INFO 19:24:09,780 Logging initialized
> INFO 19:24:09,790 JVM vendor/version: Java HotSpot(TM) 64-Bit Server 
> VM/1.7.0_21
> INFO 19:24:09,791 Heap size: 880803840/880803840
> INFO 19:24:09,791 Classpath: 
> /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.6.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-13.0.1.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.1.0.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/netty-3.5.9.Final.jar:/usr/share/cassand!
> ra/lib/ser
> vlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/apache-cassandra-1.2.4.jar:/usr/share/cassandra/apache-cassandra-thrift-1.2.4.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar
> INFO 19:24:09,987 JNA mlockall successful
> INFO 19:24:10,001 Loading settings from file:/etc/cassandra/cassandra.yaml
> INFO 19:24:10,371 Data files directories: [/var/lib/cassandra/data]
> INFO 19:24:10,372 Commit log directory: /var/lib/cassandra/commitlog
> INFO 19:24:10,372 DiskAccessMode 'auto' determined to be mmap, 
> indexAccessMode is mmap
> INFO 19:24:10,372 disk_failure_policy is stop
> INFO 19:24:10,377 Global memtable threshold is enabled at 280MB
> INFO 19:24:10,474 Not using multi-threaded compaction
> INFO 19:24:10,816 Initializing key cache with capacity of 42 MBs.
> INFO 19:24:10,822 Scheduling key cache save to each 14400 seconds (going to 
> save all keys).
> INFO 19:24:10,823 Initializing row cache with capacity of 0 MBs and provider 
> org.apache.cassandra.cache.SerializingCacheProvider
> INFO 19:24:10,827 Scheduling row cache save to each 0 seconds (going to save 
> all keys).
> INFO 19:24:10,958 Opening 
> /var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ib-165
>  (35 bytes)
> INFO 19:24:10,989 Opening 
> /var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ib-166
>  (168 bytes)
> INFO 19:24:10,991 Opening 
> /var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ib-164
>  (346 bytes)
> INFO 19:24:10,999 reading saved cache 
> /var/lib/cassandra/saved_caches/system-schema_keyspaces-KeyCache-b.db
> INFO 19:24:11,018 Opening 
> /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ib-461
>  (6562 bytes)
> INFO 19:24:11,024 reading saved cache 
> /var/lib/cassandra/saved_caches/system-schema_columnfamilies-KeyCache-b.db
> INFO 19:24:11,031 Opening 
> /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ib-394 
> (465 bytes)
> INFO 19:24:11,032 Opening 
> /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ib-395 
> (244 bytes)
> INFO 19:24:11,033 Opening 
> /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ib-393 
> (3025 bytes)
> INFO 19:24:11,035 reading saved cache 
> /var/lib/cassandra/

Re: Repair session failed

2013-05-06 Thread aaron morton
Can you raise a ticket at https://issues.apache.org/jira/browse/CASSANDRA and 
update the thread with the link?

Please include:
* nodetool status
* nodetool ring (so we have all the token assignments)
* The IP you started repair on 
* As much log as you can share; if you can run DEBUG for 
org.apache.cassandra.service.AntiEntropyService it would be handy 
* The nodetool command you used to start the repair

The error means a range selected for the repair is not fully contained by any of 
the ranges the node replicates. 
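A rough sketch of the containment check that trips this error, using simplified non-wrapping token ranges (the range values here are hypothetical, not taken from the cluster in this thread):

```python
def contains(outer, inner):
    """True if token range `inner` lies fully inside `outer` (non-wrapping ranges)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def intersects(a, b):
    """True if the two ranges overlap at all."""
    return a[0] < b[1] and b[0] < a[1]

local_ranges = [(0, 100), (200, 300)]   # ranges this node replicates (hypothetical)
requested = (50, 150)                   # range handed to the repair session

overlap = any(intersects(r, requested) for r in local_ranges)
contained = any(contains(r, requested) for r in local_ranges)
# Overlapping but not contained is exactly the "imprecise repair" condition.
print(overlap, contained)  # -> True False
```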

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/05/2013, at 9:02 PM, Christopher Wirt  wrote:

> Hi Aaron,
>  
> We’re running 1.2.4, so with vNodes
>  
> We ran scrub but saw the issue again when repairing
>  
> nodetool status –
>  
> Datacenter: DC01
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load   Tokens  Owns   Host ID   
> Rack
> UN  10.70.48.23   35.16 GB   256 13.1%  
> 4a7bc489-25af-4c20-80f8-499ffcb18e2d  RAC1
> UN  10.70.6.7930.04 GB   256 12.6%  
> 98a1167f-cf75-4201-a454-695e0f7d2d72  RAC1
> UN  10.70.6.7841.94 GB   256 11.9%  
> 62a418b5-3c38-4f66-874d-8138d6d565e5  RAC1
> UN  10.70.47.66   54.79 GB   256 13.8%  
> ab564d16-4081-4866-b8ba-26461d9a93d7  RAC1
> UN  10.70.6.9146.96 GB   256 12.6%  
> 2e1e7179-82e6-4ae6-b986-383acc9fc8a2  RAC1
> UN  10.70.47.126  38.04 GB   256 11.8%  
> d4bed3b1-ffaf-4c68-b560-d270355c8c4b  RAC1
> Datacenter: DC02
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load   Tokens  Owns   Host ID   
> Rack
> UN  10.56.0.144   31.71 GB   256 12.0%  
> 1860011e-fa7c-4ce1-ad6b-c8a38a5ddd02  RAC1
> UN  10.56.0.140   86.28 GB   256 12.3%  
> f3fa985d-5056-4ddc-b146-d02432c3a86e  RAC1
>  
>  
> Thanks,
>  
> Chris
>  
>  
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: 02 May 2013 19:31
> To: user@cassandra.apache.org
> Subject: Re: Repair session failed
>  
> Hold off on running scrub (but yes it's an online operation). This is an 
> issue with the token ranges. 
>  
> What version are you using ? 
> Are you using vNodes ?
> Can you share the output of nodetool ring (if no vnodes) or nodetool status 
> (if using vnodes) ?
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>  
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 2/05/2013, at 3:08 AM, Haithem Jarraya  wrote:
> 
> 
> Can I run scrub while the node is in the ring and receiving writes?
> Or I should disable thrift before?
>  
> 
> On 1 May 2013 15:52,  wrote:
> Sounds like a job for “nodetool scrub”, which rewrites the SSTable rows in 
> the correct order. After the scrub, nodetool repair should succeed.
>  
> From: Haithem Jarraya [mailto:haithem.jarr...@struq.com] 
> Sent: Wednesday, May 01, 2013 5:46 PM
> To: user@cassandra.apache.org
> Subject: Repair session failed
>  
> Hi, 
>  
> I am seeing this error message during repair,
>  
>  INFO [AntiEntropyStage:1] 2013-05-01 14:30:54,300 AntiEntropyService.java 
> (line 764) [repair #ed104480-b26a-11e2-af9b-05179fa66b76] mycolumnfamily is 
> fully synced (1 remaining column family to sync for this session)
> ERROR [Thread-12725] 2013-05-01 14:30:54,304 StorageService.java (line 2420) 
> Repair session failed:
> java.lang.IllegalArgumentException: Requested range intersects a local range 
> but is not fully contained in one; this would lead to imprecise repair
> at 
> org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:175)
> at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.(AntiEntropyService.java:621)
> at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.(AntiEntropyService.java:610)
> at 
> org.apache.cassandra.service.AntiEntropyService.submitRepairSession(AntiEntropyService.java:127)
> at 
> org.apache.cassandra.service.StorageService.forceTableRepair(StorageService.java:2480)
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2416)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.lang.Thread.run(Thread.java:662)
>  
>  
> What does "imprecise repair" mean?
> Is it maybe because I went over the gc_grace period?
> What do you do if you go over that period?
> Any hint will be valuable, 
> Also I noticed when I run a repair on different node, I see a message like 
> this
>  
> [2013-05-01 14:30:54,305] Starting repair command #5, repairing 1120 ranges 
> for keyspace struqrea

Re: How does a healthy node look like?

2013-05-06 Thread aaron morton
Confirm whether your write timeouts are client-side socket timeouts or 
TimedOutExceptions from the server. 

Typically write latency is related to GC problems, like you are seeing. 

I'm unsure how many CPU resources each Cassandra instance has. Is there one 
node on a machine with 6 cores? 
How many rows are on the node and how wide are the rows? cfstats or 
cfhistograms will help. 
Enable full GC logging, or use something like DataStax OpsCenter, to see how 
low the heap gets after a CMS GC.
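As a sketch, GC logging can be turned on by adding (or uncommenting) flags like these in cassandra-env.sh. The exact set shipped varies by version, and the log path below is an assumption:

```shell
# Hedged sketch of cassandra-env.sh GC logging flags; the stock file
# ships similar lines commented out.
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
echo "$JVM_OPTS"
```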

> The write-timeouts correlate with the hours of high (ca. >450/h) "GC for 
> ParNew". I never saw any read-timeouts. I set all timeouts to 20 seconds in 
> cassandra.yaml.
That'll do it. 

> To do so we iterate over all rows in the three time-line column families and 
> load the value of the column that is most recent given a cut-off timestamp.
…
> Every night we delete all events that are older than 2 days. Again in batches 
> of 100 rows.
Are you deleting rows from the CF's that you then do a range slice on ?
The tombstones may be hurting you on the range scans, can you remove them ? 

Hope that helps. 

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/05/2013, at 9:25 PM, Steppacher Ralf 
 wrote:

> Sure, I can do that.  
> 
> My main concern is write latency and the write timeouts we are experiencing. 
> Read latency is secondary, as long as we do not introduce timeouts on read 
> and do not exceed our sampling intervals (see below).
> 
> We are running Cassandra 1.2.1 on Ubuntu 12.04 with JDK 1.7.0_17 (64bit).
> The hardware is virtual but so far we are the only tenant on the physical 
> host. 
> 
> Hardware:
> - 1x6 cores with 2.3GHz 
> - 30GB RAM 
> - 1 physical disk for both the tx log and the data files
> - 2 x 1GB Ethernet combined into one virtual interface
> 
> Cassandra Config:
> Cassandra runs with 
> - 7.5GB of heap and 
> - 600MB of new gen space
> as calculated by the cassandra-env script.
> I have adjusted all cassandra.yaml settings where clear guidance is given, 
> e.g.  x .
> I have tried to increase and decrease heap (between 6 and 8GB) and new gen 
> size (between 300 and 1.1GB).
> I have tried compaction_throughput_mb_per_sec values between 16 and 48.
> I have disabled key caches.
> 
> Unfortunately Cassandra has to share the host with other Java processes, the 
> most resource demanding being ActiveMQ 5.8.
> 
> Log Output:
> Over the course of a day (08:00 to 22:00) I see in the logs
> - 280 and 760 "GC for ParNew" per hour (most around 300/h)
> - 60 and 180 "Completed flushing" per hour (most around 100/h)
> - 17 and 46 "Compacted N sstables to" per hour (most around 35/h)
> 
> Data Model:
> The data model is made up of 6 column families. 3 are dynamic to capture the 
> time-line of 3 event types; each event creates a new column and the value is 
> the row key of the event. 3 have a static schema and store the event itself.
> The largest event messages has 16 attributes. All are short text identifiers, 
> floating point numbers and timestamps. For storage in Cassandra every 
> attribute is converted to a string and stored with the utf8 validator.
> 
> Timeouts and Memory pressure:
> The write-timeouts correlate with the hours of high (ca. >450/h) "GC for 
> ParNew". I never saw any read-timeouts. I set all timeouts to 20 seconds in 
> cassandra.yaml.
> Cassandra comes under memory pressure ("Flushing CFS X to relieve memory 
> pressure") between 3 and 5 times a day. The tendency is for it to happen in 
> the afternoon and evening. But also sometimes right after 08:00 in the 
> morning. In about 75% of the cases it flushes one of the event column 
> families, in 25% a time-line column family.
> 
> Write Load:
> We collect events for a theoretical universe of 2.2 million items -> there 
> are a max of 2.2 million rows in each of the time-line column families, but 
> I never saw an estimated row count in the cfstats of more than 1 million.
> Roughly 1/3 of the entities receive a maximum of 3 events, one of each event 
> type, in a 15 minutes interval from 08:00 to 22:00. The other 2/3 receive 3 
> events 3 times a day. About 16'000 entities receive only one event type, but 
> about once in 3 minutes. 
> On a typical day the load adds up to about 70 to 80 million messages.
> Not all messages are original though. The sources will re-send an event in 
> every interval if there are no new events. The noise ratio I do not know. I 
> guestimate it to be at least 50%. In case of a repeat the existing time-line 
> column and event row are updated with their previous values.
> 
> Read Load:
> In one hour intervals we sample a time coherent snapshot of the events. To do 
> so we iterate over all rows in the three time-line column families and load 
> the value of the column that is most recent given a cut-off timestamp. The 
> value is the row key of the actual event, which we then load as well. We do 
> that in batches of 100 

Re: Slow retrieval using secondary indexes

2013-05-06 Thread aaron morton
> 
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 
> 610866442877251584;
> 
>  key| mahoutItemid
> +
>  687474703a2f2f6573706f7| 610866442877251584
> 
> unsupported operand type(s) for /: 'NoneType' and 'float'
Can you put together a process to replicate this and run cqlsh with the --debug 
option? 
If so please write a ticket at https://issues.apache.org/jira/browse/CASSANDRA 

Thanks

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 12:11 AM, Francisco Nogueira Calmon Sobral 
 wrote:

> Thanks!
> 
> The creation of the new CF worked pretty well and fast! Unfortunately, I was 
> unable to trace the request made using secondary indexes:
> 
> cqlsh:Sessions> select * from "Items" where key = '687474703a2f2f6573706f7';
> 
>  key| mahoutItemid
> +
>  687474703a2f2f6573706f7| 610866442877251584
> 
> 
> Tracing session: b0240a40-b3e9-11e2-a219-59599925ed5a
> 
>  activity   | timestamp| source   | source_elapsed
> +--+--+
>  execute_cql3_query | 09:05:03,845 | 10.32.63.148 |  0
>   Parsing statement | 09:05:03,845 | 10.32.63.148 | 36
>  Peparing statement | 09:05:03,845 | 10.32.63.148 |232
>   Row cache hit | 09:05:03,845 | 10.32.63.148 |577
>Request complete | 09:05:03,845 | 10.32.63.148 |785
> 
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 
> 610866442877251584;
> 
>  key| mahoutItemid
> +
>  687474703a2f2f6573706f7| 610866442877251584
> 
> unsupported operand type(s) for /: 'NoneType' and 'float'
> 
> 
> Regards,
> Francisco Sobral
> 
> 
> On Apr 28, 2013, at 4:55 PM, aaron morton  wrote:
> 
>> Try the request tracing in 1.2 
>> http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2 it may point to 
>> the difference. 
>> 
>>> In our model the secondary index is also unique, as the primary key is. Is 
>>> it better, in this case, to create another CF mapping the secondary index 
>>> to the key?
>> IMHO if you have a request that is frequently used as part of a hot code 
>> path it is still a good idea to support that with a custom CF. 
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 27/04/2013, at 12:27 AM, Francisco Nogueira Calmon Sobral 
>>  wrote:
>> 
>>> Hi all!
>>> 
>>> We are using Cassandra 1.2.1 with a 8 node cluster running at Amazon. We 
>>> started with 6 nodes and added the 2 later. When performing some reads in 
>>> Cassandra, we observed a high difference between gets using the primary key 
>>> and gets using secondary indexes:
>>> 
>>> 
>>> [default@Sessions] get Users where mahoutUserid = 30127944399716352;
>>> ---
>>> RowKey: STQ0TTNII2LS211YYJI4GEV80M1SE8
>>> => (column=mahoutUserid, value=30127944399716352, 
>>> timestamp=1366820944696000)
>>> 
>>> 1 Row Returned.
>>> Elapsed time: 3508 msec(s).
>>> 
>>> [default@Sessions] get Users['STQ0TTNII2LS211YYJI4GEV80M1SE8'];
>>> => (column=mahoutUserid, value=30127944399716352, 
>>> timestamp=1366820944696000)
>>> Returned 1 results.
>>> 
>>> Elapsed time: 3.06 msec(s).
>>> 
>>> 
>>> In our model the secondary index is also unique, as the primary key is. Is 
>>> it better, in this case, to create another CF mapping the secondary index 
>>> to the key?
>>> 
>>> Best regards,
>>> Francisco Sobral.
>> 
> 
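The dedicated lookup table Aaron recommends above can be sketched in CQL, reusing the values from this thread (the table name "ItemsByMahoutId" is hypothetical):

```sql
-- Hypothetical lookup table: maps the indexed value back to the row key,
-- replacing the secondary index for this hot read path.
CREATE TABLE "ItemsByMahoutId" (
    "mahoutItemid" bigint PRIMARY KEY,
    key text
);

-- Write path: insert into both the main table and the lookup table.
INSERT INTO "ItemsByMahoutId" ("mahoutItemid", key)
VALUES (610866442877251584, '687474703a2f2f6573706f7');

-- Read path: one primary-key lookup instead of a secondary index scan.
SELECT key FROM "ItemsByMahoutId" WHERE "mahoutItemid" = 610866442877251584;
```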



Re: Error on Range queries

2013-05-06 Thread aaron morton
> "Bad Request: No indexed columns present in by-columns clause with Equal 
> operator
> Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh."
> 
> My query is: select * from temp where min_update >10 limit 5;
You have to have at least one indexed column in the WHERE clause that uses the 
Equal operator. 
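For example, assuming a second indexed column exists (country_name is used here purely as an illustration; whether indexing it makes sense depends on its cardinality), a valid query pairs an equality predicate on it with the range predicate:

```sql
-- Hypothetical second index, so an Equal predicate is available.
CREATE INDEX temp_country_idx ON temp (country_name);

SELECT * FROM temp
WHERE country_name = 'NZ'   -- indexed column with the Equal operator
  AND min_update > 10       -- range predicate is now accepted
LIMIT 5;
```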

Cheers
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 1:22 AM, himanshu.joshi  wrote:

> Hi,
> 
> I have created a 2 node test cluster in Cassandra version 1.2.3 with 
> SimpleStrategy, Replication Factor 2 and ByteOrderedPartitioner (so as to get 
> range query functionality).
> 
> When I am using a range query on a secondary index in CQLSH, I am getting 
> the error:
> 
> "Bad Request: No indexed columns present in by-columns clause with Equal 
> operator
> Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh."
> 
> My query is: select * from temp where min_update >10 limit 5;
> 
> 
> 
> My table structure is:
> 
> CREATE TABLE temp (
>   id bigint PRIMARY KEY,
>   archive_name text,
>   country_name text,
>   description text,
>   dt_stamp timestamp,
>   location_id bigint,
>   max_update bigint,
>   min_update bigint
> ) WITH COMPACT STORAGE AND
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.00 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.10 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
> 
> CREATE INDEX temp_min_update_idx ON temp (min_update);
> 
> 
> Range queries are working fine on primary key.
> 
> 
> I am getting the same error on another query of an another table temp2:
> 
> select * from temp2 where reffering_url='Some URL';
> 
> This table also has a secondary index on this field ("reffering_url").
> 
> Any help would be appreciated.
> -- 
> Thanks & Regards,
> Himanshu Joshi
> 



Re: Cassandra multi-datacenter

2013-05-06 Thread aaron morton
The broadcast_address can be set manually without using the 
EC2MultiRegionSnitch. It is the address the node tells other nodes to contact it 
on: 
http://www.datastax.com/docs/1.2/configuration/node_configuration#broadcast-address
 
You may find it easier to run a VPN between the colo nodes and the EC2 nodes.
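A minimal cassandra.yaml sketch for a colo node that EC2 nodes must reach over the public internet (both addresses below are placeholders):

```yaml
# Hypothetical fragment of cassandra.yaml on a colo node.
listen_address: 10.0.1.5        # private address used within the colo DC
broadcast_address: 203.0.113.5  # public address advertised to remote DCs
```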

Cheers
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 5:25 AM, Daning Wang  wrote:

> Thanks Jabbar and Aaron.
> 
> Aaron - for broadcast_address, it looks like it only works with 
> EC2MultiRegionSnitch, but in our case we will have one data center in a colo 
> and one in EC2 (sorry, did not make that clear; we'd like to replicate data 
> from the colo to EC2).
> 
> So can we still use broadcast_address, or are there other solutions? Is it 
> easy to write a new Snitch for this?
> 
> Thanks,
> 
> Daning
> 
> 
> 
> On Thu, May 2, 2013 at 2:31 PM, aaron morton  wrote:
> Look at the broadcast_address in the yaml file
> Cheers
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 3/05/2013, at 9:10 AM, Jabbar Azam  wrote:
> 
>> I'm not sure why you want to use public IPs in the other data centre. 
>> Your Cassandra nodes in the other datacentre will be accessible from the 
>> internet.
>> 
>> Personally I would use private IP addresses in the second data centre, on a 
>> different IP subnet.
>> 
>> A VPN is your only solution if you want to keep your data private and 
>> unhackable, as it's tunnelling its way through the internet.
>> 
>> A slow network connection will mean your data is not in sync in both 
>> datacentres unless you explicitly specify quorum as your consistency level in 
>> your mutation requests, but your database throughput will be affected by this.
>> 
>> Your bandwidth to the second datacentre and the quantity of your mutation 
>> requests will dictate how long it will take the second datacentre to get in 
>> sync with the primary datacentre.
>> 
>> 
>> I've probably missed something but there are plenty of intelligent people in 
>> this mailing list to fill the blanks :)
>> 
>> Thanks
>> 
>> Jabbar Azam
>> 
>> 
>> On 2 May 2013 20:28, Daning Wang  wrote:
>> Hi all,
>> 
>> We are deploying Cassandra on two data centers. there is slower network 
>> connection between data centers. 
>> 
>> Looks casandra should use internal ip to communicate with nodes in the same 
>> data center, and public ip to talk to nodes in other data center. We know 
>> VPN is a solution, but want to know if there is other idea.
>> 
>> Thanks in advance,
>> 
>> Daning
>> 
> 
> 



Re: How much heap does Cassandra 1.1.11 really need ?

2013-05-06 Thread aaron morton
My general "I can haz heap space?" approach. 

* determine total row count for the node from cfstats
* determine if wide (10's of MB) rows are in use
* determine total bloom filter space for the node from cfstats
* enable full GC logging in cassandra-env.sh
* determine tenured heap low point not long after startup and after running for 
a while. 

Consider locking memtable_total_space_in_mb to 2048 rather than 1/3 of the heap 
while tuning. 

Consider changing JVM GC as below to check for premature tenuring (possibility 
with wide rows and wide reads):
HEAP_NEWSIZE = "1200M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4" 
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"

^ Look at the tenuring distribution to see how many objects are making it 
through 4 ParNew passes. You will want to return the settings to something 
closer to the defaults, maybe 1000M, SurvivorRatio 4, MaxTenuringThreshold 2

If > 500 million rows and/or bloom filter size is > 750 MB, consider:
* reduce bloom_filter_fp_chance (per CF) to 0.01 or 0.1 and run nodetool 
upgradesstables
* increase index_interval in the yaml to reduce the number of samples
* watch the key cache hit rate and consider increasing the key cache to 200MB

If you have a high tenured heap that is not decreasing after CMS, the first 
place to look is the bloom filters and index samples. If this is a CF where 
bloom_filter_fp_chance is not specified, the default works out to 0.000744. 
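A quick, hedged sketch for reading raw GC lines like the ones quoted below, extracting the before/after heap occupancy so the post-GC low point can be spotted:

```python
import re

# Matches lines such as:
#   2013-04-29T08:53:44.548-0400: 5.386: [GC 1677824K->11345K(16567552K), 0.0509880 secs]
GC_LINE = re.compile(r"\[GC (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")

def parse(line):
    """Return (before_kb, after_kb, total_kb, pause_secs) or None."""
    m = GC_LINE.search(line)
    if not m:
        return None
    before, after, total, secs = m.groups()
    return int(before), int(after), int(total), float(secs)

line = "2013-04-29T08:53:44.548-0400: 5.386: [GC 1677824K->11345K(16567552K), 0.0509880 secs]"
print(parse(line))  # -> (1677824, 11345, 16567552, 0.050988)
```

Feeding the whole log through this and taking the minimum of the after-GC values gives the occupancy floor Aaron suggests watching.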

Hope that helps. 
  
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 7:20 AM, Oleg Dulin  wrote:

> What constitutes an "extreme write" ?
> 
> On 2013-05-03 15:45:33
>  +, Edward Capriolo said:
> 
> If your writes are so extreme that memtables are flushing all the time, the 
> best you can do is turn off all caches, move bloom filters off heap, and then 
> instruct Cassandra to use large portions of the heap as memtables. 
> 
> 
> On Fri, May 3, 2013 at 11:40 AM, Bryan Talbot  wrote:
> It's true that a 16GB heap is generally not a good idea; however, it's not 
> clear from the data provided what problem you're trying to solve.
> 
> What is it that you don't like about the default settings?
> 
> -Bryan
> 
> 
> 
> On Fri, May 3, 2013 at 4:27 AM, Oleg Dulin  wrote:
> Here is my question. It can't possibly be a good setup to use a 16 gig heap, 
> but this is the best I can do. Setting it to the default never worked well 
> for me, and setting it to 8g doesn't work well either: it can't keep up with 
> flushing memtables. It is possible that someone at some point broke something 
> in the config files. If I were to look for hints there, what should I look 
> at?
> 
> Look at my gc log from Cassandra:
> 
> Starts off like this:
> 
> 2013-04-29T08:53:44.548-0400: 5.386: [GC 1677824K->11345K(16567552K), 
> 0.0509880 secs]
>2 2013-04-29T08:53:47.701-0400: 8.539: [GC 1689169K->42027K(16567552K), 
> 0.1269180 secs]
>3 2013-04-29T08:54:05.361-0400: 26.199: [GC 1719851K->231763K(16567552K), 
> 0.1436070 secs]
>4 2013-04-29T08:55:44.797-0400: 125.635: [GC 
> 1909587K->1480096K(16567552K), 1.2626270 secs]
>5 2013-04-29T08:58:44.367-0400: 305.205: [GC 
> 3157920K->2358588K(16567552K), 1.1198150 secs]
>6 2013-04-29T09:01:12.167-0400: 453.005: [GC 
> 4036412K->3634298K(16567552K), 1.0098650 secs]
>7 2013-04-29T09:03:35.204-0400: 596.042: [GC 
> 5312122K->4339703K(16567552K), 0.4597180 secs]
>8 2013-04-29T09:04:51.562-0400: 672.400: [GC 
> 6017527K->4956381K(16567552K), 0.5361800 secs]
>9 2013-04-29T09:04:59.205-0400: 680.043: [GC 
> 6634205K->5131825K(16567552K), 0.1741690 secs]
>   10 2013-04-29T09:05:06.638-0400: 687.476: [GC 
> 6809649K->5027933K(16567552K), 0.0607470 secs]
>   11 2013-04-29T09:05:13.908-0400: 694.747: [GC 
> 6705757K->5012439K(16567552K), 0.0624410 secs]
>   12 2013-04-29T09:05:20.909-0400: 701.747: [GC 
> 6690263K->5039538K(16567552K), 0.0618750 secs]
>   13 2013-04-29T09:06:35.914-0400: 776.752: [GC 
> 6717362K->5819204K(16567552K), 0.5738550 secs]
>   14 2013-04-29T09:08:05.589-0400: 866.428: [GC 
> 7497028K->6678597K(16567552K), 0.6781900 secs]
>   15 2013-04-29T09:08:12.458-0400: 873.296: [GC 
> 8356421K->6865736K(16567552K), 0.1423040 secs]
>   16 2013-04-29T09:08:18.690-0400: 879.529: [GC 
> 8543560K->6742902K(16567552K), 0.0516470 secs]
>   17 2013-04-29T09:08:24.914-0400: 885.752: [GC 
> 8420726K->6725877K(16567552K), 0.0517290 secs]
>   18 2013-04-29T09:08:31.008-0400: 891.846: [GC 
> 8403701K->6741781K(16567552K), 0.0532540 secs]
>   19 2013-04-29T09:08:37.201-0400: 898.039: [GC 
> 8419605K->6759614K(16567552K), 0.0563290 secs]
>   20 2013-04-29T09:08:43.493-0400: 904.331: [GC 
> 8437438K->6772147K(16567552K), 0.0569580 secs]
>   21 2013-04-29T09:08:49.757-0400: 910.595: [GC 
> 8449971K->6776883K(16567552K), 0.0558070 secs]
>   22 2013-04-29T09:08:55.973-0400: 916.812: [GC 
> 8454707K->6789404K(16567552K), 0.0577230 secs]
> 
> ……
> 
> 
> look what it is today:
> 
> 41536 2013-

Re: multitenant support with key spaces

2013-05-06 Thread Brian O'Neill

You may want to look at using virtual keyspaces:
http://hector-client.github.io/hector/build/html/content/virtual_keyspaces.html

And follow these tickets:
http://wiki.apache.org/cassandra/MultiTenant

-brian


On May 6, 2013, at 2:37 AM, Darren Smythe wrote:

> How many keyspaces can you reasonably have? We have around 500 customers and 
> expect that to double by the end of the year. We're looking into C* and 
> wondering if it makes sense to have a separate KS per customer?
> 
> If we have 1000 customers, so one KS per customer is 1000 keyspaces. Is that 
> something C* can handle efficiently? Each customer has about 10 GB of data 
> (not taking replication into account).
> 
> Or is this symptomatic of a bad design?
> 
> I guess the same question applies to our notion of breaking up the column 
> families into time ranges. We're naively trying to avoid having few large 
> CFs/KSs. Is/should that be a concern?
> 
> What are the tradeoffs of a smaller number of heavyweight KS/CFs vs. manually 
> sharding the data into more granular KSs/CFs?
> 
> Thanks for any info.

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/



Re: hector or astyanax

2013-05-06 Thread Hiller, Dean
I was under the impression that it is multiple requests using a single 
connection in PARALLEL, not serial: requests carry ids and the responses do as 
well, so you can send a request while a previous request has no response just 
yet.

I think you do get a big speed advantage from the asynchronous nature, as you do 
not need to hold up so many threads in your webserver while you have 
outstanding requests being processed. The Thrift async was not exactly async in 
the way I suspect the new Java driver is, but I have not verified that (I hope 
it is).

Dean

From: Aaron Turner <synfina...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, May 5, 2013 5:27 PM
To: cassandra users <user@cassandra.apache.org>
Subject: Re: hector or astyanax



On Sun, May 5, 2013 at 1:09 PM, Derek Williams <de...@fyrie.net> wrote:
The binary protocol is able to multiplex multiple requests over a single 
connection, which can lead to much better performance (similar to HTTP vs 
SPDY). This is without comparing the raw performance of Thrift vs the binary 
protocol; I assume the binary protocol would be faster since it is specialized 
for Cassandra requests.


Curious why you think multiplexing multiple requests over a single connection 
(serial) is faster than multiple requests over multiple connections (parallel)?

And isn't Thrift a binary protocol?


--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"


RE: Node went down and came back up

2013-05-06 Thread Dan Kogan
It seems that we did not have the JMX ports (1024+) opened in our firewall.  
Once we opened ports 1024+ the hinted handoffs completed and it seems that the 
cluster went back to normal.
Does that make sense?

Thanks,
Dan

This is what we saw in the logs after opening the ports:

INFO [HintedHandoff:1] 2013-05-05 14:52:41,925 ColumnFamilyStore.java (line 
659) Enqueuing flush of Memtable-HintsColumnFamily@726541064(33313153/41641441 
serialized/live bytes, 18009 ops)
 INFO [FlushWriter:4] 2013-05-05 14:52:41,926 Memtable.java (line 264) Writing 
Memtable-HintsColumnFamily@726541064(33313153/41641441 serialized/live bytes, 
18009 ops)
 INFO [FlushWriter:4] 2013-05-05 14:52:42,961 Memtable.java (line 305) 
Completed flushing 
/data/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-he-10-Data.db
 (33344642 bytes) for commitlog position 
ReplayPosition(segmentId=1367725930067, position=12449833)
 INFO [CompactionExecutor:16] 2013-05-05 14:52:42,969 CompactionTask.java (line 
109) Compacting 
[SSTableReader(path='/data/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-he-10-Data.db'),
 
SSTableReader(path='/data/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-he-9-Data.db')]
 INFO [HintedHandoff:1] 2013-05-05 14:52:43,419 HintedHandOffManager.java (line 
390) Finished hinted handoff of 7945 rows to endpoint /107.20.45.6


-Original Message-
From: Dan Kogan [mailto:d...@iqtell.com] 
Sent: Sunday, May 05, 2013 8:24 AM
To: user@cassandra.apache.org
Subject: Node went down and came back up

Hello,

Last night one of our nodes froze and the server had to be rebooted.  After it 
came up, the node joined the ring and everything looked normal.
However, this morning there seem to be some inconsistencies in the data (e.g. 
some nodes don't have a given record or have a different version of the record 
than other nodes).

There are also a lot of messages about hinted handoff in the logs that started 
after the node failure.
Like these:

INFO [HintedHandoff:1] 2013-05-05 11:22:23,339 HintedHandOffManager.java (line 294) Started hinted handoff for token: 56713727820156410577229101238628035242 with IP: /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,343 HintedHandOffManager.java (line 372) Timed out replaying hints to /107.20.45.6; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 294) Started hinted handoff for token: 0 with IP: /67.202.15.178
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 372) Timed out replaying hints to /67.202.15.178; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /67.202.15.178

Do we need to run repair on all nodes to get the cluster back to "normal" state?

Thanks for the help.

Dan Kogan


Cleanup the peers columnfamily

2013-05-06 Thread Shahryar Sedghi
I had a 4-node cluster in my dev environment and due to resource
limitations, I had to remove two nodes. Nodetool status shows only two nodes
on both machines, but the peers table on one machine still shows entries for
the removed nodes with a null rpc address. Thrift has no problem with it, but the
new binary protocol client is slow connecting to that node because of the
entries.

Nodetool remove does recognize those removed nodes. Is there a way through
the commands to remove those entries, or do I have to delete the row in the
table?

Thanks  in advance

Shahryar


Re: Cleanup the peers columnfamily

2013-05-06 Thread Sylvain Lebresne
What version of Cassandra are you using? If you're using 1.2.0 (or *were*
using 1.2.0 when the 2 nodes were removed), you might be seeing
https://issues.apache.org/jira/browse/CASSANDRA-5167.

> Or I have to delete the row in the table

That should work.
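For reference, deleting the stale row by hand would look something like the following in cqlsh (the peer address below is a placeholder; check it against the table first):

```sql
-- Inspect the stale entries first (rpc_address will be null for them):
SELECT peer, rpc_address FROM system.peers;

-- Remove the row for a departed node (IP is hypothetical):
DELETE FROM system.peers WHERE peer = '192.168.1.20';
```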


On Mon, May 6, 2013 at 4:22 PM, Shahryar Sedghi  wrote:

> I had a 4 node cluster in my dev environment and due to resource
> limitation, I had to remove two nodes. Nodetool status shows only two nodes
> on both machines , but peers table on one machine still shows entries of
> the nodes with a null  rpc address. Thrift has no problem with it but new
> Binary protocol client is slow connecting to that node because of the
> entries.
>
> Nodetool remove does recognize those removed nodes. Is there a way through
> the commands to remove those entries, or do I have to delete the row in the
> table?
>
> Thanks  in advance
>
> Shahryar
>


Re: Hadoop jobs and data locality

2013-05-06 Thread cscetbon.ext
Unfortunately I've just tried with a new cluster with RandomPartitioner and it 
doesn't work any better:

It may come from hadoop/pig modifications:

18:02:53|elia:hadoop cyril$ git diff --stat cassandra-1.1.5..cassandra-1.2.1 .
 .../apache/cassandra/hadoop/BulkOutputFormat.java  |   27 +--
 .../apache/cassandra/hadoop/BulkRecordWriter.java  |   55 +++---
 .../cassandra/hadoop/ColumnFamilyInputFormat.java  |  102 ++
 .../cassandra/hadoop/ColumnFamilyOutputFormat.java |   31 ++--
 .../cassandra/hadoop/ColumnFamilyRecordReader.java |   76 
 .../cassandra/hadoop/ColumnFamilyRecordWriter.java |   24 +--
 .../apache/cassandra/hadoop/ColumnFamilySplit.java |   32 ++--
 .../org/apache/cassandra/hadoop/ConfigHelper.java  |   73 ++--
 .../cassandra/hadoop/pig/CassandraStorage.java |  214 +---
 9 files changed, 380 insertions(+), 254 deletions(-)

Can anyone help on getting more mappers running? Maybe we should open a bug 
report?
--
Cyril SCETBON

On May 5, 2013, at 8:45 AM, Shamim <sre...@yandex.ru> wrote:

Hello,
  We also came across this issue in our dev environment when we upgraded 
Cassandra from 1.1.5 to 1.2.1. I have mentioned this issue a few times 
in this forum but haven't got any answer yet. As a quick workaround you can set 
pig.splitCombination to false in your Pig script to avoid this issue, but it will 
leave one of your tasks with a very large amount of data. I can't figure out why 
this is happening in the newer version of Cassandra; my strong guess is that 
something goes wrong in Cassandra's implementation of LoadFunc or in 
Murmur3Partitioner.
Here are my earlier posts:
http://www.mail-archive.com/user@cassandra.apache.org/msg28016.html
http://www.mail-archive.com/user@cassandra.apache.org/msg29425.html

Any comment from authors will be highly appreciated
P.S. please keep me in touch with any solution or hints.
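For anyone else hitting this, the workaround described above is a one-line property at the top of the Pig script (the keyspace/column family names below are hypothetical):

```pig
-- Disable Pig's split combining so splits are not merged into one mapper
SET pig.splitCombination false;

rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage();
```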

--
Best regards
  Shamim A.



03.05.2013, 19:25, "cscetbon@orange.com" :
Hi,
I'm using Pig to calculate the sum of a column from a column family (scan of 
all rows) and I've read that input data locality is supported at 
http://wiki.apache.org/cassandra/HadoopSupport
However when I execute my Pig script Hadoop assigns only one mapper to the task 
and not one mapper on each node (replication factor = 1).  FYI, I have 8 mappers 
available (2 per node).
Is there anything that can disable the data locality feature?

Thanks
--
Cyril SCETBON

_
 Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites 
ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez 
le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les 
messages electroniques etant susceptibles d'alteration, France Telecom - Orange 
decline toute responsabilite si ce message a ete altere, deforme ou falsifie. 
Merci. This message and its attachments may contain confidential or privileged 
information that may be protected by law; they should not be distributed, used 
or copied without authorisation. If you have received this email in error, 
please notify the sender and delete this message and its attachments. As emails 
may be altered, France Telecom - Orange is not liable for messages that have 
been modified, changed or falsified. Thank you.


_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Re: hector or astyanax

2013-05-06 Thread Aaron Turner
Just because you can batch queries or have the server process them out of
order doesn't make it fully "parallel".  You're still using a single TCP
connection which is by definition a serial data stream.  Basically, if you
send a bunch of queries which each return a large amount of data you've
effectively limited your query throughput to a single TCP connection.
 Using Thrift, each query result is returned in its own TCP stream in
*parallel*.

Not saying the new API isn't great, doesn't have its place or may have
better performance in certain situations, but generally speaking I would
refrain from making general claims without actual benchmarks to back them
up.   I do completely agree that async interfaces have their place and have
certain advantages over multi-threading models, but it's just another tool
to be used when appropriate.

Just my .02. :)



On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean  wrote:

> I was under the impression that it is multiple requests using a single
> connection, PARALLEL not serial, as they have request ids and the responses do
> as well, so you can send a request while a previous request has no response
> just yet.
>
> I think you do get a big speed advantage from the asynchronous nature as
> you do not need to hold up so many threads in your webserver while you have
> outstanding requests being processed.  The Thrift async was not exactly
> async the way I suspect the new Java driver is, but I have not verified that (I
> hope it is).
>
> Dean
>
> From: Aaron Turner <synfina...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Sunday, May 5, 2013 5:27 PM
> To: cassandra users <user@cassandra.apache.org>
> Subject: Re: hector or astyanax
>
>
>
> On Sun, May 5, 2013 at 1:09 PM, Derek Williams <de...@fyrie.net> wrote:
> The binary protocol is able to multiplex multiple requests using a single
> connection, which can lead to much better performance (similar to HTTP vs
> SPDY). This is without comparing the performance of thrift vs binary
> protocol, which I assume the binary protocol would be faster since it is
> specialized for cassandra requests.
>
>
> Curious why you think multiplexing multiple requests over a single
> connection (serial) is faster than multiple requests over multiple
> connections (parallel)?
>
> And isn't Thrift a binary protocol?
>
>
> --
> Aaron Turner
> http://synfin.net/ Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
> -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
>



-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: Cassandra won't restart : 7365....6c73 is not defined as a collection

2013-05-06 Thread Blair Zajac

Hi Aaron,

The keyspace consists of 3 column families for user management, see below.

I have dropped these tables multiple times since I'm testing a script to 
automatically create the column families if they do not exist.  I have 
also been changing types, e.g. lock_tokens__ from MAP to 
MAP.


I have tar copies of /var/lib/cassandra from all three nodes if somebody 
wants to look.  Since making the tarballs, I blew the cluster away and 
re-initialized it from scratch.


BTW, would a drain before running '/etc/init.d/cassandra stop' have helped?

Regards,
Blair


CREATE TABLE account (
  pk_account UUID PRIMARY KEY,
  last_login_using TEXT,
  first_name TEXT,
  last_name TEXT,
  full_name TEXT,
  created_micros BIGINT,
  modified_micros BIGINT,
  lock_tokens__ MAP
);


CREATE TABLE external_account (
  pk_external_username TEXT PRIMARY KEY,
  pk_account UUID,
  primary_email_address TEXT,
  secondary_email_addresses SET,
  first_name TEXT,
  last_name TEXT,
  full_name TEXT,
  last_login_micros BIGINT,
  created_micros BIGINT,
  modified_micros BIGINT,
  lock_tokens__ MAP
);

CREATE TABLE email_address (
  pk_email_address TEXT PRIMARY KEY,
  pk_account UUID,
  pk_external_username SET,
  lock_tokens__ MAP
);


On 05/06/2013 01:14 AM, aaron morton wrote:

Do you have the table definitions ?
Any example data?
Something is confused about a set / map / list type.

It's failing when replaying the log; if you want to work around it, move the
commit log file out of the directory. There is a chance of data loss if
this row mutation is being replayed on all nodes.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/05/2013, at 2:36 PM, Blair Zajac <bl...@orcaware.com> wrote:


Hello,

I'm running a 3-node development cluster on OpenStack VMs and recently
updated to DataStax's 1.2.4 debs on Ubuntu Raring after which the
cluster was fine.  I shut it down for a few days and after getting
back to Cassandra today and booting the VMs, Cassandra is unable to
start. Below is the output from output.log from one of the nodes.
 None of the Cassandra nodes can start.

The deployment is pretty simple, two test keyspaces with a few column
families in each keyspace.  I am doing a lot of keyspace and column
family deletions as I'm testing some db style migration code to
auto-setup a schema.

Any suggestions?

Blair

INFO 19:24:09,780 Logging initialized
INFO 19:24:09,790 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
VM/1.7.0_21
INFO 19:24:09,791 Heap size: 880803840/880803840
INFO 19:24:09,791 Classpath:
/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.6.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-13.0.1.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.1.0.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/netty-3.5.9.Final.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/apache-cassandra-1.2.4.jar:/usr/share/cassandra/apache-cassandra-thrift-1.2.4.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar
INFO 19:24:09,987 JNA mlockall successful
INFO 19:24:10,001 Loading settings from file:/etc/cassandra/cassandra.yaml
INFO 19:24:10,371 Data files directories: [/var/lib/cassandra/data]
INFO 19:24:10,372 Commit log directory: /var/lib/cassandra/commitlog
INFO 19:24:10,372 DiskAccessMode 'auto' determined to be mmap,
indexAccessMode is mmap
INFO 19:24:10,372 disk_failure_policy is stop
INFO 19:24:10,377 Global memtable threshold is enabled at 280MB
INFO 19:24:10,474 Not using multi-threaded compaction
INFO 19:24:10,816 Initializing key cache with capacity of 42 MBs.
INFO 19:24:10,822 Scheduling key cache save to each 14400 seconds
(going to save all keys).
INFO 19:24:10,823 Initializing row cache with capacity of 0 MBs and
provider org.apache.cassandra.cache.SerializingCacheProvider
INFO 19:24:10,827 Scheduling row cache

Re: multitenant support with key spaces

2013-05-06 Thread Robert Coli
On Sun, May 5, 2013 at 11:37 PM, Darren Smythe  wrote:
> How many keyspaces can you reasonably have?

"Very Low Hundreds", though this relates more to CFs than Ks.

> If we have 1000 customers, so one KS per customer is 1000 keyspaces. Is that
> something C* can handle efficiently?

No.

> I guess the same question applies to our notion of breaking up the column
> families into time ranges. We're naively trying to avoid having few large
> CFs/KSs. Is/should that be a concern?

Very large rows are significantly worse than very large CFs or KS.

=Rob


Re: Node went down and came back up

2013-05-06 Thread Robert Coli
On Mon, May 6, 2013 at 6:20 AM, Dan Kogan  wrote:
> It seems that we did not have the JMX ports (1024+) opened in our firewall.  
> Once we opened ports 1024+ the hinted handoffs completed and it seems that 
> the cluster went back to normal.
> Does that make sense?

No, JMX should not be required for normal operation of Hinted Handoff.

=Rob


Re: SSTables not opened on new cluster

2013-05-06 Thread Robert Coli
On Sat, May 4, 2013 at 5:41 AM, Philippe  wrote:
> After trying every possible combination of parameters, config and the rest,
> I ended up downgrading the new node from 1.1.11 to 1.1.2 to match the
> existing 3 nodes. And that solved the issue immediately : the schema was
> propagated and the node started handling reads & writes.

As you have discovered...

Trying to upgrade Cassandra by :

1) Adding new node at new version
2) Upgrading old nodes

Is far less likely to work than :

1) Add new node at old version
2) Upgrade all nodes

=Rob


Re: Cassandra running High Load with no one using the cluster

2013-05-06 Thread Robert Coli
On Sat, May 4, 2013 at 9:22 PM, Aiman Parvaiz  wrote:
> We are using cassandra 1.1.0 and open-6-jdk

1.1.0 has significant issues, including non-working Hinted Handoff.

Also, OpenJDK is not officially supported.

Upgrade to 1.1.11 and Sun JDK.

=Rob


Re: hector or astyanax

2013-05-06 Thread Hiller, Dean
You have me thinking more.  I wonder in practice if 3 sockets is any faster 
than 1 socket when doing nio.  If your buffer sizes were small, maybe that 
would be the case.  Usually the nic buffers are big so when the selector fires 
it is reading from 3 buffers for 3 sockets or 1 buffer for one socket.  In both 
cases, all 3 requests are there in the buffers.  At any rate, my belief is it 
probably is still basically parallel performance on one socket though I have 
not tested my theory…..My theory being the real bottleneck on performance being 
the work cassandra has to do on the reads and such.

What about 20 sockets then (like someone has a pool)?  Will it be any faster… not 
really sure, as in the end you are still held up by the real bottleneck of 
reading from disk on the cassandra side.  We went to 20 threads in one case 
using 20 sockets with astyanax and received no performance 
improvement (synchronous, but more sockets did not improve our performance).  I.e. 
it may be the case 90% of the time that one socket is just as fast as 10/20….. I 
would love to know the truth/answer to that though.

Later,
Dean


From: Aaron Turner <synfina...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, May 6, 2013 10:57 AM
To: cassandra users <user@cassandra.apache.org>
Subject: Re: hector or astyanax

Just because you can batch queries or have the server process them out of order 
doesn't make it fully "parallel".  You're still using a single TCP connection 
which is by definition a serial data stream.  Basically, if you send a bunch of 
queries which each return a large amount of data you've effectively limited 
your query throughput to a single TCP connection.  Using Thrift, each query 
result is returned in its own TCP stream in *parallel*.

Not saying the new API isn't great, doesn't have its place or may have better 
performance in certain situations, but generally speaking I would refrain from 
making general claims without actual benchmarks to back them up.   I do 
completely agree that async interfaces have their place and have certain 
advantages over multi-threading models, but it's just another tool to be used 
when appropriate.

Just my .02. :)



On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
I was under the impression that it is multiple requests using a single 
connection, PARALLEL not serial, as they have request ids and the responses do as 
well, so you can send a request while a previous request has no response just 
yet.

I think you do get a big speed advantage from the asynchronous nature as you do 
not need to hold up so many threads in your webserver while you have 
outstanding requests being processed.  The Thrift async was not exactly async 
the way I suspect the new Java driver is, but I have not verified that (I hope it is).

Dean

From: Aaron Turner <synfina...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, May 5, 2013 5:27 PM
To: cassandra users <user@cassandra.apache.org>
Subject: Re: hector or astyanax



On Sun, May 5, 2013 at 1:09 PM, Derek Williams <de...@fyrie.net> wrote:
The binary protocol is able to multiplex multiple requests using a single 
connection, which can lead to much better performance (similar to HTTP vs 
SPDY). This is without comparing the performance of thrift vs binary protocol, 
which I assume the binary protocol would be faster since it is specialized for 
cassandra requests.


Curious why you think multiplexing multiple requests over a single connection 
(serial) is faster than multiple requests over multiple connections (parallel)?

And isn't Thrift a binary protocol?


--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"



--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: multitenant support with key spaces

2013-05-06 Thread Hiller, Dean
Another option may be virtual column families with PlayOrm.  We currently
have around 60,000 virtual column families storing data from 60,000 different
sensors that keep feeding us information.

Dean

On 5/6/13 11:18 AM, "Robert Coli"  wrote:

>On Sun, May 5, 2013 at 11:37 PM, Darren Smythe 
>wrote:
>> How many keyspaces can you reasonably have?
>
>"Very Low Hundreds", though this relates more to CFs than Ks.
>
>> If we have 1000 customers, so one KS per customer is 1000 keyspaces. Is
>>that
>> something C* can handle efficiently?
>
>No.
>
>> I guess the same question applies to our notion of breaking up the
>>column
>> families into time ranges. We're naively trying to avoid having few
>>large
>> CFs/KSs. Is/should that be a concern?
>
>Very large rows are significantly worse than very large CFs or KS.
>
>=Rob



index_interval

2013-05-06 Thread Hiller, Dean
I heard a rumor that index_interval is going away?  What is the replacement for 
this?  (We have been having to play with this setting a lot lately: too big 
and it gets slow, yet too small and cassandra uses way too much RAM… we are still 
trying to find the right balance with this setting.)
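For context, here is where the setting lives today; a cassandra.yaml sketch with the 1.2 shipped default (as I understand it, later releases move this to a per-table schema property, which would be the replacement being rumored):

```yaml
# Sampling interval for the SSTable primary index: one of every
# index_interval keys is kept in memory.  Larger values use less RAM
# but make key lookups slower; smaller values do the opposite.
index_interval: 128
```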

Thanks,
Dean


RE: cost estimate about some Cassandra patches

2013-05-06 Thread DE VITO Dominique
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Sunday, 28 April 2013 22:54
> To: user@cassandra.apache.org
> Subject: Re: cost estimate about some Cassandra patches
>
> > Does anyone know enough of the inner working of Cassandra to tell me how 
> > much work is needed to patch Cassandra to enable such communication 
> > vectorization/batch ?
>

> Assuming you mean "have the coordinator send multiple row read/write requests 
> in a single message to replicas"
>
> Pretty sure this has been raised as a ticket before but I cannot find one now.
>
> It would be a significant change and I'm not sure how big the benefit is. To 
> send the messages the coordinator places them in a queue, there is little 
> delay sending. Then it waits on them async. So there may be some saving on 
> networking but from the coordinators point of view I think the impact is 
> minimal.
>
> What is your use case?

Use case = rows with row keys like (folder id, file id),
and operations that read/write multiple rows with the same folder id => so, it 
could make sense to have a partitioner putting rows with the same "folder id" on 
the same replicas.

But so far, Cassandra is not able to exploit this locality, as the batching 
effect ends at the coordinator node.
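For comparison, CQL3's compound primary keys already give same-replica placement for rows sharing a partition key, which covers this folder/file case without a custom partitioner (the schema below is a hypothetical sketch):

```sql
-- All files of one folder live in the same partition, hence on the
-- same replicas; a batch per folder reaches a single replica set.
CREATE TABLE folder_files (
  folder_id uuid,
  file_id   uuid,
  name      text,
  PRIMARY KEY (folder_id, file_id)   -- folder_id = partition key
);
```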

So, my question about the cost estimate for patching Cassandra.

The closest (or perhaps exactly corresponding to my need?) JIRA entries I have 
found so far are:

CASSANDRA-166: Support batch inserts for more than one key at once
https://issues.apache.org/jira/browse/CASSANDRA-166
=> "WON'T FIX" status

CASSANDRA-5034: Refactor to introduce Mutation Container in write path
https://issues.apache.org/jira/browse/CASSANDRA-5034
=> I am not very sure if it's related to my topic

Thanks.

Dominique



>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com

On 27/04/2013, at 4:04 AM, DE VITO Dominique <dominique.dev...@thalesgroup.com> wrote:


Hi,

We have created a new partitioner that groups some rows with **different** row 
keys on the same replicas.

But neither batch_mutate nor multiget_slice is able to take 
advantage of this partitioner-defined placement to vectorize/batch 
communications between the coordinator and the replicas.

Does anyone know enough of the inner workings of Cassandra to tell me how much 
work is needed to patch Cassandra to enable such communication 
vectorization/batching?

Thanks.

Regards,
Dominique





Re: Cassandra running High Load with no one using the cluster

2013-05-06 Thread Aiman Parvaiz
Correction, there was a typo in my original question, we are running cassandra 
1.1.10

Thanks and sorry for the inconvenience.
On May 6, 2013, at 10:23 AM, Robert Coli  wrote:

> including non-working Hinted Handoff



Re: hector or astyanax

2013-05-06 Thread Aaron Turner
From my experience, your NIC buffers generally aren't the problem (or at
least they're easy to tune to fix).  It's TCP.  Simply put, your raw NIC
throughput > single TCP socket throughput on most modern hardware/OS
combinations.  This is especially true as latency increases between the two
hosts.  This is why BitTorrent or "download accelerators" are often faster
than just downloading a large file via your browser or ftp client- they're
running multiple TCP connections in parallel compared to only one.

TCP is great for reliable, bi-directional, stream based communication.  Not
the best solution for high throughput though.  UDP is much better for that,
but then you lose all the features that TCP gives you and so then people
end up re-inventing the wheel (poorly I might add).

So yeah, I think the answer to the question of "which is faster" is "it depends
on your queries".
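As a back-of-the-envelope illustration of the single-connection ceiling (all numbers below are illustrative assumptions, not benchmarks): one TCP connection can't move more than window/RTT, which is why parallel connections mostly help when latency is high.

```python
# Back-of-the-envelope ceiling for one TCP connection: throughput <= window / RTT.
# All numbers below are illustrative assumptions, not measurements.

def max_tcp_throughput(window_bytes, rtt_seconds):
    """Bandwidth-delay-product limit for a single TCP connection, bytes/sec."""
    return window_bytes / rtt_seconds

window = 64 * 1024  # 64 KiB receive window (a common default without scaling)

lan = max_tcp_throughput(window, 0.0005)  # ~0.5 ms RTT, same datacenter
wan = max_tcp_throughput(window, 0.05)    # ~50 ms RTT, cross-region

print(f"LAN ceiling: {lan / 1e6:.1f} MB/s")   # ~131 MB/s
print(f"WAN ceiling: {wan / 1e6:.1f} MB/s")   # ~1.3 MB/s
print(f"20 parallel WAN connections: {20 * wan / 1e6:.1f} MB/s")
```

With sub-millisecond RTTs inside a cluster, that ceiling sits far above what a disk-bound read path delivers, which is consistent with Dean's observation that more sockets didn't help.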



On Mon, May 6, 2013 at 10:24 AM, Hiller, Dean  wrote:

> You have me thinking more.  I wonder in practice if 3 sockets is any
> faster than 1 socket when doing nio.  If your buffer sizes were small,
> maybe that would be the case.  Usually the nic buffers are big so when the
> selector fires it is reading from 3 buffers for 3 sockets or 1 buffer for
> one socket.  In both cases, all 3 requests are there in the buffers.  At
> any rate, my belief is it probably is still basically parallel performance
> on one socket though I have not tested my theory…..My theory being the real
> bottleneck on performance being the work cassandra has to do on the reads
> and such.
>
> What about 20 sockets then(like someone has a pool).  Will it be any
> faster…not really sure as in the end you are still held up by the real
> bottleneck of reading from disk on the cassandra side.  We went to 20
> threads in one case using 20 sockets with astyanax and received no
> performance improvement(synchronous but more sockets did not improve our
> performance).  Ie. It may be the case 90% of the time, one socket is just
> as fast as 10/20…..I would love to know the truth/answer to that though.
>
> Later,
> Dean
>
>
> From: Aaron Turner <synfina...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, May 6, 2013 10:57 AM
> To: cassandra users <user@cassandra.apache.org>
> Subject: Re: hector or astyanax
>
> Just because you can batch queries or have the server process them out of
> order doesn't make it fully "parallel".  You're still using a single TCP
> connection which is by definition a serial data stream.  Basically, if you
> send a bunch of queries which each return a large amount of data you've
> effectively limited your query throughput to a single TCP connection.
>  Using Thrift, each query result is returned in its own TCP stream in
> *parallel*.
>
> Not saying the new API isn't great, doesn't have its place or may have
> better performance in certain situations, but generally speaking I would
> refrain from making general claims without actual benchmarks to back them
> up.   I do completely agree that async interfaces have their place and have
> certain advantages over multi-threading models, but it's just another tool
> to be used when appropriate.
>
> Just my .02. :)
>
>
>
> On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> I was under the impression that it is multiple requests using a single
> connection, PARALLEL not serial, as they have request ids and the responses do
> as well, so you can send a request while a previous request has no response
> just yet.
>
> I think you do get a big speed advantage from the asynchronous nature as
> you do not need to hold up so many threads in your webserver while you have
> outstanding requests being processed.  The Thrift async was not exactly
> async the way I suspect the new Java driver is, but I have not verified that (I
> hope it is).
>
> Dean
>
> From: Aaron Turner <synfina...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Sunday, May 5, 2013 5:27 PM
> To: cassandra users <user@cassandra.apache.org>
> Subject: Re: hector or astyanax
>
>
>
> On Sun, May 5, 2013 at 1:09 PM, Derek Williams <de...@fyrie.net> wrote:
> The binary protocol is able to multiplex multiple requests using a single
> connection, which can lead to much better performance (similar to HTTP vs
> SPDY). This is without comparing the performance of thrift vs binary
> protocol, which I assume the binary protocol would be faster since it is
> specialized for cassandra request

Re: Hadoop jobs and data locality

2013-05-06 Thread Shamim
I think it would be better to open an issue in JIRA.
Best regards
  Shamim A.

> Unfortunately I've just tried with a new cluster with RandomPartitioner and
> it doesn't work better.
>
> It may come from hadoop/pig modifications:
>
> 18:02:53|elia:hadoop cyril$ git diff --stat cassandra-1.1.5..cassandra-1.2.1 .
>  .../apache/cassandra/hadoop/BulkOutputFormat.java  |  27 +--
>  .../apache/cassandra/hadoop/BulkRecordWriter.java  |  55 +++---
>  .../cassandra/hadoop/ColumnFamilyInputFormat.java  | 102 ++
>  .../cassandra/hadoop/ColumnFamilyOutputFormat.java |  31 ++--
>  .../cassandra/hadoop/ColumnFamilyRecordReader.java |  76
>  .../cassandra/hadoop/ColumnFamilyRecordWriter.java |  24 +--
>  .../apache/cassandra/hadoop/ColumnFamilySplit.java |  32 ++--
>  .../org/apache/cassandra/hadoop/ConfigHelper.java  |  73 ++--
>  .../cassandra/hadoop/pig/CassandraStorage.java     | 214 +---
>  9 files changed, 380 insertions(+), 254 deletions(-)
>
> Can anyone help on getting more mappers running? Maybe we should open a bug
> report?
>
> --
> Cyril SCETBON
>
> On May 5, 2013, at 8:45 AM, Shamim wrote:
>
>> Hello,
>> We have also come across this issue in our dev environment, when we
>> upgraded Cassandra from 1.1.5 to 1.2.1. I have mentioned this issue a few
>> times on this forum but haven't got any answer yet. As a quick workaround
>> you can set pig.splitCombination to false in your pig script to avoid this
>> issue, but it will leave one of your tasks with a very big amount of data.
>> I can't figure out why this happens in the newer version of Cassandra; I
>> strongly guess something goes wrong in Cassandra's implementation of
>> LoadFunc or in Murmur3Partitioner (it's my guess).
>> Here are my earlier posts:
>> http://www.mail-archive.com/user@cassandra.apache.org/msg28016.html
>> http://www.mail-archive.com/user@cassandra.apache.org/msg29425.html
>>
>> Any comment from the authors will be highly appreciated.
>> P.S. Please keep me in touch with any solution or hints.
>>
>> --
>> Best regards
>> Shamim A.
>>
>> 03.05.2013, 19:25, "cscetbon@orange.com":
>>
>>> Hi,
>>> I'm using Pig to calculate the sum of a column from a column family (scan
>>> of all rows) and I've read that input data locality is supported at
>>> http://wiki.apache.org/cassandra/HadoopSupport
>>> However, when I execute my Pig script, Hadoop assigns only one mapper to
>>> the task and not one mapper on each node (replication factor = 1). FYI,
>>> I have 8 mappers available (2 per node).
>>> Is there anything that can disable the data locality feature?
>>>
>>> Thanks
>>> --
>>> Cyril SCETBON
>>>
>>> This message and its attachments may contain confidential or privileged
>>> information that may be protected by law; they should not be distributed,
>>> used or copied without authorisation. If you have received this email in
>>> error, please notify the sender and delete this message and its
>>> attachments. As emails may be altered, France Telecom - Orange is not
>>> liable for messages that have been modified, changed or falsified.
>>> Thank you.
-- Best regards
  Shamim A.
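The workaround quoted above can be written as a one-line Pig fragment (a sketch; set it at the top of the script, before the LOAD):

```
-- Disable Pig's input-split combining so each split keeps its own mapper.
SET pig.splitCombination false;
```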

Re: hector or astyanax

2013-05-06 Thread Derek Williams
Also have to keep in mind that it should be rare to only use a single
socket since you are usually making at least 1 connection per node in the
cluster (or local datacenter). There is also nothing preventing a single
client from opening more than one connection to a node. In the end it
should come down to which protocol implementation is faster.
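The request-id multiplexing Dean describes further down the thread is the heart of it: each frame carries an id, so responses can come back in any order over one socket. A minimal sketch, using an invented frame format (4-byte id plus 4-byte length), not the real CQL binary protocol:

```python
import socket
import struct
import threading

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf

def serve(sock):
    # Read two requests, then answer them in REVERSE order to show that
    # responses are paired with requests by id, not by arrival order.
    reqs = []
    for _ in range(2):
        rid, length = struct.unpack(">II", recv_exact(sock, 8))
        reqs.append((rid, recv_exact(sock, length)))
    for rid, body in reversed(reqs):
        reply = body.upper()
        sock.sendall(struct.pack(">II", rid, len(reply)) + reply)

client, server = socket.socketpair()
threading.Thread(target=serve, args=(server,), daemon=True).start()

# Send both requests back-to-back, without waiting for a response.
for rid, body in [(1, b"one"), (2, b"two")]:
    client.sendall(struct.pack(">II", rid, len(body)) + body)

# Responses arrive out of order; the id in each frame pairs them up.
results = {}
for _ in range(2):
    rid, length = struct.unpack(">II", recv_exact(client, 8))
    results[rid] = recv_exact(client, length)

assert results == {1: b"ONE", 2: b"TWO"}
```

The single connection is still one serial byte stream, as Aaron points out below; multiplexing removes head-of-line blocking between small requests, not the bandwidth ceiling of one TCP stream.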


On Mon, May 6, 2013 at 11:58 AM, Aaron Turner  wrote:

> From my experience, your NIC buffers generally aren't the problem (or at
> least it's easy to tune them to fix).  It's TCP.  Simply put, your raw NIC
> throughput > single TCP socket throughput on most modern hardware/OS
> combinations.  This is especially true as latency increases between the two
> hosts.  This is why Bittorrent or "download accelerators" are often faster
> than just downloading a large file via your browser or ftp client- they're
> running multiple TCP connections in parallel compared to only one.
>
> TCP is great for reliable, bi-directional, stream-based communication.
>  Not the best solution for high throughput though.  UDP is much better for
> that, but then you lose all the features that TCP gives you and so then
> people end up re-inventing the wheel (poorly I might add).
>
> So yeah, I think the answer to the question of "which is faster" is "it
> depends on your queries".
>
>
>
> On Mon, May 6, 2013 at 10:24 AM, Hiller, Dean wrote:
>
>> You have me thinking more.  I wonder in practice if 3 sockets is any
>> faster than 1 socket when doing nio.  If your buffer sizes were small,
>> maybe that would be the case.  Usually the nic buffers are big so when the
>> selector fires it is reading from 3 buffers for 3 sockets or 1 buffer for
>> one socket.  In both cases, all 3 requests are there in the buffers.  At
>> any rate, my belief is it probably is still basically parallel performance
>> on one socket though I have not tested my theory…..My theory being the real
>> bottleneck on performance being the work cassandra has to do on the reads
>> and such.
>>
>> What about 20 sockets then(like someone has a pool).  Will it be any
>> faster…not really sure as in the end you are still held up by the real
>> bottleneck of reading from disk on the cassandra side.  We went to 20
>> threads in one case using 20 sockets with astyanax and received no
>> performance improvement(synchronous but more sockets did not improve our
>> performance).  Ie. It may be the case 90% of the time, one socket is just
>> as fast as 10/20…..I would love to know the truth/answer to that though.
>>
>> Later,
>> Dean
>>
>>
>> From: Aaron Turner <synfina...@gmail.com>
>> Reply-To: user@cassandra.apache.org
>> Date: Monday, May 6, 2013 10:57 AM
>> To: cassandra users <user@cassandra.apache.org>
>> Subject: Re: hector or astyanax
>>
>> Just because you can batch queries or have the server process them out of
>> order doesn't make it fully "parallel".  You're still using a single TCP
>> connection which is by definition a serial data stream.  Basically, if you
>> send a bunch of queries which each return a large amount of data you've
>> effectively limited your query throughput to a single TCP connection.
>>  Using Thrift, each query result is returned in its own TCP stream in
>> *parallel*.
>>
>> Not saying the new API isn't great, doesn't have its place or may have
>> better performance in certain situations, but generally speaking I would
>> refrain from making general claims without actual benchmarks to back them
>> up.   I do completely agree that Async interfaces have their place and have
>> certain advantages over multi-threading models, but it's just another tool
>> to be used when appropriate.
>>
>> Just my .02. :)
>>
>>
>>
>> On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean wrote:
>> I was under the impression that it is multiple requests using a single
>> connection, IN PARALLEL not serial, as they have request ids and the
>> responses do as well, so you can send a request while a previous request
>> has no response just yet.
>>
>> I think you do get a big speed advantage from the asynchronous nature, as
>> you do not need to hold up so many threads in your webserver while you have
>> outstanding requests being processed.  The Thrift async was not exactly
>> async in the way I suspect the new java driver is, but I have not verified
>> that (I hope it is).
>>
>> Dean
>>
>> From: Aaron Turner <synfina...@gmail.com>
>> Reply-To: user@cassandra.apache.org
>> Date: Sunday, May 5, 2013 5:27 PM
>> To: cassandra users <user@cassandra.apache.org>

RE: Node went down and came back up

2013-05-06 Thread Dan Kogan
Thanks.  So then, Hinted Handoff should be sent over port 7000 (or 7001 with 
SSL), correct?

-Original Message-
From: Robert Coli [mailto:rc...@eventbrite.com] 
Sent: Monday, May 06, 2013 1:19 PM
To: user@cassandra.apache.org
Subject: Re: Node went down and came back up

On Mon, May 6, 2013 at 6:20 AM, Dan Kogan  wrote:
> It seems that we did not have the JMX ports (1024+) opened in our firewall.  
> Once we opened ports 1024+ the hinted handoffs completed and it seems that 
> the cluster went back to normal.
> Does that make sense?

No, JMX should not be required for normal operation of Hinted Handoff.

=Rob


Re: Node went down and came back up

2013-05-06 Thread Robert Coli
On Mon, May 6, 2013 at 12:31 PM, Dan Kogan  wrote:
> Thanks.  So then, Hinted Handoff should be sent over port 7000 (or 7001 with 
> SSL), correct?

Yes, hinted handoff goes over the "storage protocol" port, which is
shared with the "gossip" port, 7000/1.

=Rob


Re: Cassandra running High Load with no one using the cluster

2013-05-06 Thread Bryan Talbot
On Sat, May 4, 2013 at 9:22 PM, Aiman Parvaiz  wrote:

>
> When starting this cluster we set
> > JVM_OPTS="$JVM_OPTS -Xss1000k"
>
>
>

Why did you increase the stack size to 5.5 times the recommended value?
Since each thread now uses a 1000 KB minimum just for its stack, a large
number of threads will use a large amount of memory.

-Bryan
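Bryan's concern is simple arithmetic: per-thread stack size times thread count. A quick sketch (180 KB stands in for the -Xss value the 1.x startup scripts recommended, which is roughly where the "5.5 times" figure comes from; the thread count is illustrative):

```python
def stack_memory_mb(n_threads, xss_kb):
    """Rough lower bound on memory reserved for thread stacks alone."""
    return n_threads * xss_kb / 1024.0

# 500 threads at the -Xss1000k in question vs. the ~180k recommendation:
high = stack_memory_mb(500, 1000)  # ~488 MB of stack
low = stack_memory_mb(500, 180)    # ~88 MB of stack

assert round(high) == 488
assert round(low) == 88
```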


Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE Cassandra 1.1.9

2013-05-06 Thread aaron morton

> While reading we are planning to use a CL of Quorum. So, we are hoping we
> will not hit any consistency issues before repair is run.
There will be a chance of getting inconsistencies if fewer than QUORUM nodes 
were involved in the load for each row. Assuming RF 3, if you have two adjacent 
nodes down then you have lost QUORUM. Otherwise you should be ok.

One thing I forgot to mention: you may get some value in increasing 
phi_convict_threshold, in the yaml or via the 
org.apache.cassandra.net:type=FailureDetector MBean, to 16. This will make it 
harder for a node to be ejected from the cluster; you may want to turn it back 
to 8 to 12 after the bulk load.
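A sketch of the yaml form of that change (8 is the shipped default):

```yaml
# cassandra.yaml: raise while bulk loading, then drop back to 8-12.
# Higher values make the failure detector slower to convict a node.
phi_convict_threshold: 16
```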

> Do you see any
> better way of doing this?
In the case with 3 nodes and RF 3, put a copy of the SSTables on each node 
and use nodetool refresh. That will almost instantly add them to the nodes. You 
still have to handle the issues with down nodes in the same way as when using 
the bulk loader.

Or run 10 million writes into the system. 
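As a command-line sketch of the refresh path (keyspace name, column family name, and directory layout here are illustrative; adjust for your install):

```
# On each of the 3 nodes (RF 3), drop the sstables into the data
# directory for the column family, then load them without streaming:
cp /staging/MyKeyspace/MyCF/*.db /var/lib/cassandra/data/MyKeyspace/MyCF/
nodetool refresh MyKeyspace MyCF
```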

Cheers



-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 6/05/2013, at 12:34 PM, praveen.akun...@wipro.com wrote:

> Hi Aaron, Rob,
> 
> Thank you for your responses. Sorry about the delay in getting back to
> you. To answer your questions:
> 
> "Is this a once off data load or something you need to do regularly?"
> 
> This will be a regular load. We will have to do a load with 10 million
> records once in every 2 hours on our Production Cluster.
> 
> The 3 node cluster with CL 3 is our development environment. Our production
> will be a 10 node cluster with CL 3. We will not be able to use
> nodetool refresh there. Sorry, I should have been more clear.
> 
> At the moment, we are doing the below in the job shell script to work
> around this:
> 
> 1. At the start of the SSTable load job, the script checks if any of the
> nodes are down.
> 2. If any node is down, it runs SSTableloader with '-I' option. If all
> nodes are up, it runs the load normally.
> 3. In case any node goes down during the load and the load job fails, the
> script restarts the job from the beginning. This time it will use the -I
> option.
> 4. We are planning to schedule nodetool repair once everyday to handle
> these situations.
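The branching in steps 1-3 can be sketched as a small helper (hypothetical code: `down_nodes` would come from parsing `nodetool ring`, and `-I` is the exclude-nodes option as used in this thread; check `sstableloader --help` on your version):

```python
def build_loader_cmd(sstable_dir, down_nodes):
    """Steps 1-2: exclude any nodes that nodetool reports as down."""
    cmd = ["sstableloader"]
    if down_nodes:
        # Exclude-nodes flag, spelled as in this thread.
        cmd += ["-I", ",".join(down_nodes)]
    cmd.append(sstable_dir)
    return cmd

# All nodes up: run the load normally (step 2, first branch).
assert build_loader_cmd("/data/ks/cf", []) == ["sstableloader", "/data/ks/cf"]

# A node is down, or the job is being retried (step 3): exclude it.
assert build_loader_cmd("/data/ks/cf", ["10.0.0.5"]) == [
    "sstableloader", "-I", "10.0.0.5", "/data/ks/cf"]
```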
> 
> While reading we are planning to use a CL of Quorum. So, we are hoping we
> will not hit any consistency issues before repair is run. Do you see any
> better way of doing this?
> 
> Thanks & Best Regards,
> Praveen
> 
> 
> 
> 
> 
> From: aaron morton <aa...@thelastpickle.com>
> Reply-To: user@cassandra.apache.org
> Date: Tuesday, April 30, 2013 1:47 AM
> To: user@cassandra.apache.org
> Subject: Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE
> Cassandra 1.1.9
> 
> One option you have with RF3 and 3 Nodes is to place a copy of all the
> SSTables on each node and use nodetool refresh to directly load the
> sstables into the node without any streaming.
> 
> 1. Please can anyone suggest how we can enforce Write Consistency level
> when using SSTABLELOADER?
> Bulk Loader does not use CL, it's more like a repair / bootstrap.
> If you have to skip a node then use repair.
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 30/04/2013, at 1:05 AM, praveen.akun...@wipro.com wrote:
> 
> Hi All,
> 
> We have a requirement to load approximately 10 million records, each
> record with approximately 100 columns. We are planning to use the
> Bulk-loader program to convert the data into SSTables and then load them
> using SSTABLELOADER.
> 
> Everything is working fine when all nodes are up and running and the
> performance is very good. However, when a node is down, the streaming
> fails and the operation stops. We have to run the SSTABLELOADER with
> option 'I' to exclude the node that is down. I was wondering if we can
> enforce Consistency level of 'ANY' with SSTABLELOADER as well.
> 
> We tried specifying the consistency level 'ANY' at Keyspace level.
> However, this is not being used by the SSTABLELOADER. It is still looking
> for all the nodes to be available.
> 
> 1. Please can anyone suggest how we can enforce Write Consistency level
> when using SSTABLELOADER?
> 
> 2. Will Sqoop be a good option in these scenarios? Do we have any
> performance stats generated while loading data into Cassandra with Sqoop?
> 
> Environment:
> 
> Cassandra 1.1.9 provided as part of DSE 3.0
> 3 Nodes
> Replication Factor - 3
> Consistency Level - ANY
> 
> Regards,
> Praveen
> 
> Wipro Limited (Company Regn No in UK - FC 019088)
> Address: Level 2, West wing, 3 Sheldon Square, London W2 6PS, United
> Kingdom. Tel +44 20 7432 8500 Fax: +44 20 7286 5703
> 
> VAT Number: 5

Re: index_interval

2013-05-06 Thread aaron morton
This is the closest I can find in Jira 
https://issues.apache.org/jira/browse/CASSANDRA-4478

It's a pretty handy tool to have in your toolkit, especially when you start to 
have over 1 billion rows per node.

A
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 5:27 AM, "Hiller, Dean"  wrote:

> I heard a rumor that index_interval is going away?  What is the replacement 
> for this?  (We have been having to play with this setting a lot lately: too 
> big and it gets slow, yet too small and Cassandra uses way too much RAM… we 
> are still trying to find the right balance with this setting.)
> 
> Thanks,
> Dean



Re: Error on Range queries

2013-05-06 Thread himanshu.joshi

Thanks aaron..

--
Regards
Himanshu Joshi


On 05/06/2013 02:22 PM, aaron morton wrote:

"Bad Request: No indexed columns present in by-columns clause with Equal 
operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh."

My query is: select * from temp where min_update >10 limit 5;

You have to have at least one indexed column in the where clause that uses the 
equal operator.
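With the `temp` table from the original message, one hedged way to satisfy that (the index on `country_name` and the 'NZ' value are illustrative additions, not part of the original schema):

```sql
-- min_update is already indexed; the range predicate just needs to be
-- paired with an equality test on an indexed column:
CREATE INDEX temp_country_idx ON temp (country_name);
SELECT * FROM temp WHERE country_name = 'NZ' AND min_update > 10 LIMIT 5;
-- (depending on the version, cqlsh may require appending ALLOW FILTERING)
```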

Cheers
  
-

Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 1:22 AM, himanshu.joshi  wrote:


Hi,

 I have created a 2-node test cluster in Cassandra version 1.2.3 with 
SimpleStrategy, Replication Factor 2 and ByteOrderedPartitioner (so as to get 
range query functionality).

When I am using a range query on a secondary index in cqlsh, I am getting the 
error:

"Bad Request: No indexed columns present in by-columns clause with Equal 
operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh."

My query is: select * from temp where min_update >10 limit 5;



My table structure is:

CREATE TABLE temp (
   id bigint PRIMARY KEY,
   archive_name text,
   country_name text,
   description text,
   dt_stamp timestamp,
   location_id bigint,
   max_update bigint,
   min_update bigint
) WITH COMPACT STORAGE AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX temp_min_update_idx ON temp (min_update);


Range queries are working fine on primary key.


I am getting the same error on another query, of another table temp2:

select * from temp2 where reffering_url='Some URL';

This table also has a secondary index on this field ("reffering_url").

Any help would be appreciated.
--
Thanks & Regards,
Himanshu Joshi






--
Thanks & Regards,
Himanshu Joshi