CQL3 dynamic CF design questions
Hi,

I'm currently designing a backend service that would store user profile information for different applications. Most of the properties in a user profile would be unknown to the service and specified by the applications using it, so the properties would need to be added dynamically. I was planning to use CQL3 and a dynamic column family defined something like this:

CREATE TABLE user (
    id UUID,
    propertyset_key TEXT,
    propertyset_val TEXT,
    PRIMARY KEY (id, propertyset_key)
);

There would be N (assuming < 50) property sets associated with a user. The property set values would be complex object graphs represented as JSON, which would lead to the storage engine storing rows similar to this (AFAIK):

8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d
    - basic_info:propertyset_val = { firstName:"john", lastName:"smith", ...}
    - contact_info:propertyset_val = { address: {streetAddr:"1 infinite loop", postalCode: ""}, ... }
    - meal_prefs:propertyset_val = { ... }
    - ...

Any comments on this design?

Another option would be to use the Cassandra map type for storing property sets, like this:

CREATE TABLE user (
    id UUID,
    property_sets MAP<TEXT, TEXT>,
    PRIMARY KEY (id)
);

Based on the documentation I understood that each map element would internally be stored as a separate column, so are these user table definitions equivalent from the storage engine's perspective? I'm using Astyanax, which seems to support Cassandra collections.

With the second definition, it should be possible to later migrate a dynamic property, e.g. job_title, to a static property, so that I could execute CQL queries like this:

SELECT * FROM user WHERE job_title = 'developer';

but is it possible to accomplish that somehow with the first definition? Or should I create separate column families for static and dynamic properties instead?

marko
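P.S. To make the question more concrete, this is roughly the migration I have in mind for the second (map-based) definition: promote job_title to a regular typed column and index it. The column and index names are made up for illustration and I haven't verified this against 1.2 yet:

ALTER TABLE user ADD job_title TEXT;
CREATE INDEX user_job_title_idx ON user (job_title);

-- after backfilling job_title from the JSON property sets:
SELECT * FROM user WHERE job_title = 'developer';

I don't see how to do the equivalent with the first definition, since everything there lives inside the propertyset_val blobs.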
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
Yes, I know blowing them away would fix it and that is what I did, but I want to understand why this happens in the first place. I was upgrading from 1.1.10 to 1.2.3.

On Fri, Apr 5, 2013 at 2:53 PM, Edward Capriolo wrote:
> This has happened before; the saved cache files were not compatible between
> 0.6 and 0.7. I have run into this a couple of other times before. The good
> news is that the saved key cache is just an optimization: you can blow it
> away and it is not usually a big deal.
>
> On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi wrote:
>
>> Here is a chunk of bloom filter sstable skip messages from the node I
>> enabled DEBUG on:
>>
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39459
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39483
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39332
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39335
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39438
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39478
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39456
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39469
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39334
>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39406
>>
>> This is the last chunk of log before C* gets stuck, right before I stop
>> the process, remove key caches and start again (this is from another node
>> that I upgraded 2 days ago):
>>
>> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 (5273270 bytes)
>> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 (5264359 bytes)
>> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 (5260887 bytes)
>> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 (5262864 bytes)
>> INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line 112) reading saved cache /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache
>>
>> I finally upgraded all 12 nodes in our test environment yesterday. This
>> issue seemed to exist on 7 nodes out of 12. They didn't always get
>> stuck on the same CF loading its saved KeyCache.
>>
>> On Fri, Apr 5, 2013 at 9:56 AM, aaron morton wrote:
>>
>>> skipping sstable due to bloom filter debug messages
>>>
>>> What were these messages?
>>>
>>> Do you have the logs from the startup?
>>>
>>> Cheers
>>>
>>> -
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 4/04/2013, at 6:11 AM, Arya Goudarzi wrote:
>>>
>>> Hi,
>>>
>>> I have upgraded 2 nodes out of a 12-node test cluster from 1.1.10 to
>>> 1.2.3. During startup, while tailing C*'s system.log, I observed a series of
>>> SSTable batch load messages and "skipping sstable due to bloom filter" debug
>>> messages, which is normal for startup, but when it reached loading saved key
>>> caches, it got stuck forever. The I/O wait stays high in the CPU graph and
>>> I/O ops are sent to disk, but C* never passes that step of loading the key
>>> cache file successfully. The saved key cache file was about 75MB on one
>>> node and 125MB on the other node, and they were for different CFs.
>>>
>>> The CPU I/O wait stayed constantly at ~40% while system.log was stuck at
>>> loading one saved key cache file. I have marked that on the graph above.
>>> The workaround was to delete the saved cache files, after which things loaded fine
>>> (see marked Normal Startup).
>>>
>>> These machines are m1.xlarge EC2 instances, and this issue happened on
>>> both nodes upgraded. This did not happen during the upgrade exercise from
>>> 1.1.6 to 1.2.2 using the same snapshot.
>>>
>>> Should I raise a JIRA?
>>>
>>> -Arya
Problems with shuffle
Hi,

After upgrading to vnodes I created and enabled the shuffle operation as suggested. After running it for a couple of hours I had to disable it because the nodes were not catching up with compactions. I repeated this process 3 times (enable/disable).

I have 5 nodes and each of them had ~35GB. After the shuffle operations described above, some nodes are now reaching ~170GB. In the log files I can see the same files transferred 2-4 times to the same host within the same shuffle session. Worst of all, after all of this only 20 vnodes out of 1280 have been transferred. So if it continues at the same speed, it will take about a month or two to complete the shuffle.

I have a few questions to better understand shuffle:

1. Does disabling and re-enabling shuffle start the shuffle process from scratch, or does it resume from the last point?

2. Will vnode relocations speed up as the shuffle proceeds, or will the rate remain the same?

3. Why do I see multiple transfers of the same file to the same host? e.g.:

INFO [Streaming to /10.0.1.8:6] 2013-04-07 14:27:10,038 StreamReplyVerbHandler.java (line 44) Successfully sent /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db to /10.0.1.8
INFO [Streaming to /10.0.1.8:7] 2013-04-07 16:27:07,427 StreamReplyVerbHandler.java (line 44) Successfully sent /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db to /10.0.1.8

4. When I enable/disable shuffle I receive warning messages such as the ones below. Do I need to worry about them?

cassandra-shuffle -h localhost disable
Failed to enable shuffling on 10.0.1.1!
Failed to enable shuffling on 10.0.1.3!

I couldn't find many docs on shuffle; I only read through JIRA and the original proposal by Eric.

BR,
Rustam.
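P.S. In case anyone wants to reproduce the progress numbers above: my understanding is that the shuffle utility has an ls subcommand that lists the relocations still pending, so something like the following should give a rough count. This assumes the ls output really is one pending relocation per line, which I have not verified:

cassandra-shuffle -h localhost ls | wc -l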
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
It was not something you did wrong. The key cache format/classes involved changed; there are a few JIRA issues around this:

https://issues.apache.org/jira/browse/CASSANDRA-4916
https://issues.apache.org/jira/browse/CASSANDRA-5253

Depending on how you moved between versions, you may or may not have been affected.

On Sun, Apr 7, 2013 at 4:56 AM, Arya Goudarzi wrote:
> Yes, I know blowing them away would fix it and that is what I did, but I
> want to understand why this happens in the first place. I was upgrading from
> 1.1.10 to 1.2.3.
>
> On Fri, Apr 5, 2013 at 2:53 PM, Edward Capriolo wrote:
>
>> This has happened before; the saved cache files were not compatible
>> between 0.6 and 0.7. I have run into this a couple of other times before. The
>> good news is that the saved key cache is just an optimization: you can blow it
>> away and it is not usually a big deal.
>>
>> On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi wrote:
>>
>>> Here is a chunk of bloom filter sstable skip messages from the node I
>>> enabled DEBUG on:
>>>
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39459
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39483
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39332
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39335
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39438
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39478
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39456
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39469
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39334
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39406
>>>
>>> This is the last chunk of log before C* gets stuck, right before I stop
>>> the process, remove key caches and start again (this is from another node
>>> that I upgraded 2 days ago):
>>>
>>> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 (5273270 bytes)
>>> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 (5264359 bytes)
>>> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 (5260887 bytes)
>>> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 (5262864 bytes)
>>> INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line 112) reading saved cache /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache
>>>
>>> I finally upgraded all 12 nodes in our test environment yesterday. This
>>> issue seemed to exist on 7 nodes out of 12. They didn't always get
>>> stuck on the same CF loading its saved KeyCache.
>>>
>>> On Fri, Apr 5, 2013 at 9:56 AM, aaron morton wrote:
>>>
>>>> skipping sstable due to bloom filter debug messages
>>>>
>>>> What were these messages?
>>>>
>>>> Do you have the logs from the startup?
>>>>
>>>> Cheers
>>>>
>>>> -
>>>> Aaron Morton
>>>> Freelance Cassandra Consultant
>>>> New Zealand
>>>>
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 4/04/2013, at 6:11 AM, Arya Goudarzi wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have upgraded 2 nodes out of a 12-node test cluster from 1.1.10 to
>>>> 1.2.3. During startup, while tailing C*'s system.log, I observed a series of
>>>> SSTable batch load messages and "skipping sstable due to bloom filter" debug
>>>> messages, which is normal for startup, but when it reached loading saved key
>>>> caches, it got stuck forever. The I/O wait stays high in the CPU graph and
>>>> I/O ops are sent to disk, but C* never passes that step of loading the key
>>>> cache file successfully. The saved key cache file was about 75MB on one
>>>> node and 125MB on the other node, and they were for different CFs.
>>>>
>>>> The CPU I/O wait stayed constantly at ~40% while system.log was stuck at
>>>> loading one saved key cache file. I have marked that on the graph above.
>>>> The workaround was to delete the saved cache files, after which things loaded
>>>> fine (see marked Normal Startup).
>>>>
>>>> These machines are m1.xlarge EC2 instances, and this issue happened on
>>>> both nodes upgraded. This did not happen during the upgrade exercise from
>>>> 1.1.6 to 1.2.2 using the same snapshot.
>>>>
>>>> Should I raise a JIRA?
>>>>
>>>> -Arya
Re: Problems with shuffle
I am not familiar with shuffle, but if you attempt a shuffle and it fails, it would be a good idea to let compaction die down, or even trigger a major compaction on the nodes where the size grew. The reason is that once the data files are on disk, even if they contain duplicates, Cassandra does not know that fact. Thus, if you do a move or shuffle again, Cassandra will try to move all of that duplicated data again.

In other words, if some failed operation grows the size of your data, deal with that first before trying the same operation again. For now your best bet is to run a major compaction on each node and get the data sizes small again.

On Sun, Apr 7, 2013 at 8:43 AM, Rustam Aliyev wrote:
> Hi,
>
> After upgrading to vnodes I created and enabled the shuffle operation as
> suggested. After running it for a couple of hours I had to disable it because
> the nodes were not catching up with compactions. I repeated this process 3
> times (enable/disable).
>
> I have 5 nodes and each of them had ~35GB. After the shuffle operations
> described above, some nodes are now reaching ~170GB. In the log files I can
> see the same files transferred 2-4 times to the same host within the same
> shuffle session. Worst of all, after all of this only 20 vnodes out of 1280
> have been transferred. So if it continues at the same speed, it will take
> about a month or two to complete the shuffle.
>
> I have a few questions to better understand shuffle:
>
> 1. Does disabling and re-enabling shuffle start the shuffle process from
>    scratch, or does it resume from the last point?
>
> 2. Will vnode relocations speed up as the shuffle proceeds, or will the
>    rate remain the same?
>
> 3. Why do I see multiple transfers of the same file to the same host?
>    e.g.:
>
>    INFO [Streaming to /10.0.1.8:6] 2013-04-07 14:27:10,038 StreamReplyVerbHandler.java (line 44) Successfully sent /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db to /10.0.1.8
>    INFO [Streaming to /10.0.1.8:7] 2013-04-07 16:27:07,427 StreamReplyVerbHandler.java (line 44) Successfully sent /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db to /10.0.1.8
>
> 4. When I enable/disable shuffle I receive warning messages such as the
>    ones below. Do I need to worry about them?
>
>    cassandra-shuffle -h localhost disable
>    Failed to enable shuffling on 10.0.1.1!
>    Failed to enable shuffling on 10.0.1.3!
>
> I couldn't find many docs on shuffle; I only read through JIRA and the
> original proposal by Eric.
>
> BR,
> Rustam.
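In case it helps, the major compaction I am suggesting can be kicked off with nodetool. The keyspace and column family names below are just placeholders taken from the log lines above, so substitute your own:

nodetool -h 10.0.1.1 compact Keyspace
# or limit it to a single column family:
nodetool -h 10.0.1.1 compact Keyspace Metadata

Repeat on each node where the data size grew, and let it finish before re-enabling shuffle.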
Re: Cassandra services down frequently [Version 1.1.4]
It also uses off-heap memory outside the JVM; SerializingCacheProvider should be one such case.

Best Regards!

Jian Jin

2013/4/6
> Thank you Aaron and Bryan for your advice.
>
> I have changed the following parameters and now Cassandra is running absolutely
> fine. Please review the settings below and advise whether I am going in the right
> direction.
>
> cassandra-env.sh
> #JVM_OPTS="$JVM_OPTS -ea"
> MAX_HEAP_SIZE="6G"
> HEAP_NEWSIZE="500M"
>
> cassandra.yaml
> # do not persist caches to disk
> key_cache_save_period: 0
> row_cache_save_period: 0
>
> key_cache_size_in_mb: 512
> row_cache_size_in_mb: 14336
> row_cache_provider: SerializingCacheProvider
>
> I have a query: if Cassandra is using the JVM for all operations, then why do we
> need to change the above parameters separately in cassandra.yaml?
>
> Thanks & Regards
>
> Adeel Akbar
>
> Quoting aaron morton:
>
>> We can see from below that you've tweaked and disabled many of the
>> memory "safety valve" and other memory related settings.
>>
>> Agree.
>> Also, you are running with a JVM heap size of 3.81GB, which is not the default.
>> For a 16GB node I would expect 8GB.
>>
>> Try restoring the yaml values to the defaults and allowing the
>> cassandra-env.sh file to determine the memory size.
>>
>> Cheers
>>
>> -
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/04/2013, at 12:36 PM, Bryan Talbot wrote:
>>
>>> On Thu, Apr 4, 2013 at 1:27 AM, wrote:
>>>
>>> After some time (1 hour / 2 hours) Cassandra shut down its services on one or two
>>> nodes with the following errors;
>>>
>>> Wonder what the workload and schema is like ...
>>>
>>> We can see from below that you've tweaked and disabled many of the
>>> memory "safety valve" and other memory related settings. Those could be
>>> causing issues too.
>>>
>>> hinted_handoff_throttle_delay_in_ms: 0
>>> flush_largest_memtables_at: 1.0
>>> reduce_cache_sizes_at: 1.0
>>> reduce_cache_capacity_to: 0.6
>>> rpc_keepalive: true
>>> rpc_server_type: sync
>>> rpc_min_threads: 16
>>> rpc_max_threads: 2147483647
>>> in_memory_compaction_limit_in_mb: 256
>>> compaction_throughput_mb_per_sec: 16
>>> rpc_timeout_in_ms: 15000
>>> dynamic_snitch_badness_threshold: 0.0
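To put rough numbers on that (a back-of-the-envelope sketch, assuming this is the same 16GB node discussed earlier in the thread, and that the key cache lives on the heap while the serializing row cache is allocated off heap):

MAX_HEAP_SIZE = 6G             -> ~6 GB on heap (this budget has to cover the 512 MB key cache, memtables, etc.)
row_cache_size_in_mb = 14336   -> ~14 GB off heap with SerializingCacheProvider
                               -> ~20 GB for the Cassandra process before OS page cache and other overhead

If the machine really has 16GB of RAM, those two settings alone would oversubscribe it. That is also why the cache sizes are tuned separately in cassandra.yaml rather than being bounded by the JVM heap settings in cassandra-env.sh.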