Re: Use of SSD for commitlog

2012-08-08 Thread Terje Marthinussen
Probably you can get an intel 320 160GB or a Samsung 830 for the same price as the 146GB 15k rpm drive. Overprovision the SSD 20% and off you go. It will beat the HDD both sequentially and randomly. Terje On Aug 8, 2012, at 11:41 PM, Amit Kumar wrote: > > There is a really good presentation

Re: quick question about data layout on disk

2012-08-10 Thread Terje Marthinussen
Rowkey is stored only once in any sstable file. That is, in the special case where you get an sstable file per column/value, you are correct, but normally, I guess most of us are storing more per key. Regards, Terje On 11 Aug 2012, at 10:34, Aaron Turner wrote: > Curious, but does cassandra stor

Re: Throughput decreases as latency increases with YCSB

2012-10-30 Thread Terje Marthinussen
Check how many concurrent real requests you have vs size of thread pools. Regards, Terje On 30 Oct 2012, at 13:28, Peter Bailis wrote: >> I'm using YCSB on EC2 with one m1.large instance to drive client load > > To add, I don't believe this is due to YCSB. I've done a fair bit of > client-sid

Re: Disable FS journaling

2014-05-20 Thread Terje Marthinussen
Journaling enabled is faster on almost all operations. Recovery here is more about saving you from waiting half an hour for a traditional full file system check. Feel free to wait if you want though! :) Regards, Terje > On 21 May 2014, at 01:11, Paulo Ricardo Motta Gomes > wrote: > > Thanks for

Re: compaction strategy

2011-05-09 Thread Terje Marthinussen
Yes, agreed. I actually think cassandra has to. And if you do not go down to that single file, how do you avoid getting into a situation where you can very realistically end up with 4-5 big sstables each having its own copy of the same data massively increasing disk requirements? Terje On Mon,

Re: compaction strategy

2011-05-09 Thread Terje Marthinussen
(non-overlapping!) pieces instead. > > > On Mon, May 9, 2011 at 12:46 PM, Terje Marthinussen < > tmarthinus...@gmail.com> wrote: > >> Yes, agreed. >> >> I actually think cassandra has to. >> >> And if you do not go down to that single file, how d

column bloat

2011-05-10 Thread Terje Marthinussen
Hi, If you make a supercolumn today, what you end up with is: - short + "Super Column name" - int (local deletion time) - long (delete time) Byte array of columns each with: - short + "column name" - int (TTL) - int (local deletion time) - long (timestamp) - int + "value of column" Th
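Not part of the original mail: a back-of-the-envelope Java sketch of the per-column overhead implied by the layout listed above. The field widths (2-byte length prefixes, 4-byte ints, 8-byte longs) follow the list in the mail; the example name and value lengths are made up.

```java
// Hedged sketch: estimates the serialized size of one expiring column inside a
// super column, using the field layout described in the mail above.
// The example name/value lengths are arbitrary.
public class ColumnOverhead {
    // short name-length prefix + name bytes
    //  + int TTL + int local deletion time + long timestamp
    //  + int value-length prefix + value bytes
    static long expiringColumnSize(int nameLen, int valueLen) {
        return 2 + nameLen + 4 + 4 + 8 + 4 + valueLen;
    }

    public static void main(String[] args) {
        int nameLen = 8, valueLen = 4;          // a small column, just for illustration
        long size = expiringColumnSize(nameLen, valueLen);
        long overhead = size - nameLen - valueLen;
        System.out.printf("serialized=%d bytes, payload=%d bytes, overhead=%d bytes%n",
                size, nameLen + valueLen, overhead);   // 22 bytes of overhead per column
    }
}
```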

Re: column bloat

2011-05-10 Thread Terje Marthinussen
> Anyway, to sum that up, expiring columns are 1 byte more and > non-expiring ones are 7 bytes > less. Not arguing, it's still fairly verbose, especially with tons of > very small columns. > Yes, you are right, sorry. Trying to do one thing too many at the same time. My brain filtered out part of t

Re: compaction strategy

2011-05-10 Thread Terje Marthinussen
> Everyone may be well aware of that, but I'll still remark that a minor > compaction > will try to merge "as many 20MB sstables as it can" up to the max > compaction > threshold (which is configurable). So if you do accumulate some newly > created > sstable at some point in time, the next minor co

Re: compaction strategy

2011-05-11 Thread Terje Marthinussen
> > > Not sure I follow you. 4 sstables is the minimum compaction look for > (by default). > If there is 30 sstables of ~20MB sitting there because compaction is > behind, you > will compact those 30 sstables together (unless there is not enough space > for > that and considering you haven't change

Re: column bloat

2011-05-11 Thread Terje Marthinussen
On Wed, May 11, 2011 at 8:06 AM, aaron morton wrote: > For a reasonable large amount of use cases (for me, 2 out of 3 at the > moment) supercolumns will be units of data where the columns (attributes) > will never change by themselves or where the data does not change anyway > (archived data). > >

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Terje Marthinussen
Just out of curiosity, is this on the receiver or sender side? I have been wondering a bit if the hint playback could need some adjustment. There are potentially quite big differences in how much is sent per throttle delay, depending on what your data looks like. Early 0.7 releases also built u

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Terje Marthinussen
And if you have 10 nodes, do all of them happen to send hints to the two with GC? Terje On Thu, May 12, 2011 at 6:10 PM, Terje Marthinussen wrote: > Just out of curiosity, is this on the receiver or sender side? > > I have been wondering a bit if the hint playback could need some >

Re: Memory Usage During Read

2011-05-14 Thread Terje Marthinussen
Out of curiosity, could you try to disable mmap as well? I had some problems here some time back and I wanted to see better what was going on and disabled mmap. I actually don't think I have the same problem again, but I have seen Java VM sizes up in 30-40GB with a heap of just 16. Haven't pa

Re: [RELEASE] 0.8.0

2011-06-05 Thread Terje Marthinussen
0.8 under load may turn out to be more stable and well behaved than any release so far. Been doing a few test runs stuffing more than 1 billion records into a 12 node cluster and things look better than ever. VMs stable and nice at 11GB. No data corruptions, dead nodes, full GC's or any of the ot

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
e had the per-CF memtable settings applied?) > > On Mon, Jun 6, 2011 at 12:00 AM, Terje Marthinussen > wrote: > > 0.8 under load may turn out to be more stable and well behaving than any > > release so far > > Been doing a few test runs stuffing more than 1 billion re

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
id, I wonder why the hint has 0 subcolumns in the first place? Is that expected behaviour? Regards, Terje On Mon, Jun 6, 2011 at 10:09 PM, Terje Marthinussen wrote: > Of course I talked too soon. > I saw a corrupted commitlog some days back after killing cassandra and I > just came

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
Yes, I am aware of it but it was not an alternative for this project which will face production soon. The patch I have is fairly non-intrusive (especially vs. 674) so I think it can be interesting depending on how quickly 674 will be integrated into cassandra releases. I plan to take a closer loo

Re: Troubleshooting IO performance ?

2011-06-07 Thread Terje Marthinussen
If you run iostat with output every few seconds, is the I/O stable or do you see very uneven I/O? Regards, Terje On Tue, Jun 7, 2011 at 11:12 AM, aaron morton wrote: > There is a big IO queue and reads are spending a lot of time in the queue. > > Some more questions: > - what version are you

Re: insufficient space to compact even the two smallest files, aborting

2011-06-10 Thread Terje Marthinussen
bug in the 0.8.0 release version. Cassandra splits the sstables depending on size and tries to find (by default) at least 4 files of similar size. If it cannot find 4 files of similar size, it logs that message in 0.8.0. You can try to reduce the minimum required files for compaction and it wil
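For illustration only (not the actual Cassandra source): a minimal Java sketch of the size-bucketing behaviour described above, where sstables of similar size are grouped and a group only becomes eligible for compaction once it reaches the minimum threshold (4 by default).

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of size-tiered bucketing: sstables within roughly 2x of a bucket's
// average size join that bucket; buckets smaller than minThreshold are dropped,
// which is roughly why the "insufficient space to compact" style message appears
// when no bucket reaches 4 files of similar size.
public class BucketSketch {
    static List<List<Long>> bucket(List<Long> sstableSizes, int minThreshold) {
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sstableSizes) {
            boolean placed = false;
            for (List<Long> b : buckets) {
                double avg = b.stream().mapToLong(Long::longValue).average().orElse(0);
                if (size > avg / 2 && size < avg * 2) { b.add(size); placed = true; break; }
            }
            if (!placed) { List<Long> b = new ArrayList<>(); b.add(size); buckets.add(b); }
        }
        buckets.removeIf(b -> b.size() < minThreshold);   // nothing eligible to compact
        return buckets;
    }
}
```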

Re: insufficient space to compact even the two smallest files, aborting

2011-06-10 Thread Terje Marthinussen
12 sounds perfectly fine in this case. 4 buckets, 3 in each bucket, and the minimum default threshold per bucket is 4. Terje 2011/6/10 Héctor Izquierdo Seliva > > > El vie, 10-06-2011 a las 20:21 +0900, Terje Marthinussen escribió: > > bug in the 0.8.0 release version. > > > &g

Re: insufficient space to compact even the two smallest files, aborting

2011-06-10 Thread Terje Marthinussen
nor > compaction frequency, won't it? > > maki > > > 2011/6/10 Terje Marthinussen : > > bug in the 0.8.0 release version. > > Cassandra splits the sstables depending on size and tries to find (by > > default) at least 4 files of similar size. > > If it cannot fin

Re: insufficient space to compact even the two smallest files, aborting

2011-06-13 Thread Terje Marthinussen
That most likely happened just because after scrub you had new files and got over the "4" file minimum limit. https://issues.apache.org/jira/browse/CASSANDRA-2697 Is the bug report. 2011/6/13 Héctor Izquierdo Seliva > Hi All. I found a way to be able to compact. I have to call scrub on > the

repair and amount of transfers

2011-06-14 Thread Terje Marthinussen
Hi, I have been testing repairs a bit in different ways on 0.8.0 and I am curious what to really expect in terms of data transferred. I would expect my data to be fairly consistent in this case from the start. More than a billion supercolumns, but there were no errors in the feed and we have seen m

Re: repair and amount of transfers

2011-06-14 Thread Terje Marthinussen
Ah.. I just found Cassandra-2698 (I thought I had seen something about this)... I guess that means I have to see if I can find time to investigate whether I have a reproducible case? Terje On Tue, Jun 14, 2011 at 4:21 PM, Terje Marthinussen wrote: > Hi, > > I have been testing repairs

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
Even if the gc call cleaned all files, it is not really acceptable on a decent sized cluster due to the impact full gc has on performance. Especially non-needed ones. The delay in file deletion can also at times make it hard to see how much spare disk you actually have. We easily see 100% increas

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen < tmarthinus...@gmail.com> wrote: > Even if the gc call cleaned all files, it is not really acceptable on a > decent sized cluster due to the impact full gc has on performance. > Especially non-needed ones. > > Not accept

What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen
Hi, I was looking quickly at the source code tonight. As far as I could see from a quick code scan, hint delivery is only triggered on a state change, from when a node is down to when it enters the up state? If this is indeed the case, it would potentially explain why we sometimes have hints on machines which

Re: What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen
heartbeats maybe (potentially not all of them, but at a regular interval)? Terje On Thu, Jun 16, 2011 at 2:08 AM, Jonathan Ellis wrote: > On Wed, Jun 15, 2011 at 10:53 AM, Terje Marthinussen > wrote: > > I was looking quickly at source code tonight. > > As far as I could see from

Re: downgrading from cassandra 0.8 to 0.7.3

2011-06-15 Thread Terje Marthinussen
Can't help you with that. You may have to go the json2sstable route and re-import into 0.7.3. But... why would you want to go back to 0.7.3? Terje On Thu, Jun 16, 2011 at 10:30 AM, Anurag Gujral wrote: > Hi All, > I moved to cassandra 0.8.0 from cassandra-0.7.3 when I try to > move ba

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen
Watching this on a node here right now and it sort of shows how bad this can get. This node still has 109GB free disk by the way... INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space INFO [CompactionExecutor:5] 2011-06-16 09:12:23,

snitch & thrift

2011-06-16 Thread Terje Marthinussen
Hi all! Assuming a node ends up in GC land for a while, there is a good chance that even though it performs terribly, the dynamic snitching will help you avoid it on the gossip side; it will not really help you much if thrift still accepts requests and the thrift interface has choppy perform

Re: Cassandra ACID

2011-06-26 Thread Terje Marthinussen
> > That being said, we do not provide isolation, which means in particular > that > reads *can* return a state where only parts of a batch update seems applied > (and it would clearly be cool to have isolation and I'm not even > saying this will > never happen). Out of curiosity, do you see any

Re: RAID or no RAID

2011-06-27 Thread Terje Marthinussen
If you have a quality HW raid controller with proper performance (and far from all have good performance) you can definitely benefit from a battery backed up write cache on it, although the benefits will not be huge on raid 0. Unless you get a really good price on that high performance HW raid

Re: Re : get_range_slices result

2011-06-30 Thread Terje Marthinussen
It should of course be noted that how hard it is to load balance depends a lot on your dataset. Some datasets load balance reasonably well even when ordered, and use of the OPP is not a big problem at all (on the contrary), and in quite a few use cases with current HW, read performance really is

Re: Alternative Row Cache Implementation

2011-06-30 Thread Terje Marthinussen
We had a visitor from Intel a month ago. One question from him was "What could you do if we gave you a server 2 years from now that had 16TB of memory?" I went: Eh... using Java? 2 years is maybe unrealistic, but you can already get some quite acceptable prices even on servers in the 100GB

Re: Repair doesn't work after upgrading to 0.8.1

2011-06-30 Thread Terje Marthinussen
Unless it is a 0.8.1 RC or beta On Fri, Jul 1, 2011 at 12:57 PM, Jonathan Ellis wrote: > This isn't 2818 -- (a) the 0.8.1 protocol is identical to 0.8.0 and > (b) the whole cluster is on the same version. > > On Thu, Jun 30, 2011 at 9:35 PM, aaron morton > wrote: > > This seems to be a known is

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Terje Marthinussen
On Jul 28, 2011, at 9:52 PM, Jonathan Ellis wrote: > This is not advisable in general, since non-mmap'd I/O is substantially > slower. I see this again and again as a claim here, but it is actually close to 10 years since I saw mmap'd I/O have any substantial performance benefits on any real

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Terje Marthinussen
i/o itself (even on ssds). > On Jul 28, 2011 9:04 AM, "Terje Marthinussen" wrote: > > > > On Jul 28, 2011, at 9:52 PM, Jonathan Ellis wrote: > > > >> This is not advisable in general, since non-mmap'd I/O is substantially > >> slower. > >

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-29 Thread Terje Marthinussen
On Fri, Jul 29, 2011 at 6:29 AM, Peter Schuller wrote: > > I would love to understand how people got to this conclusion however and > try to find out why we seem to see differences! > > I won't make any claims with Cassandra because I have never bothered > benchmarking the difference in CPU usage

Re: For multi-tenant, is it good to have a key space for each tenant?

2011-08-25 Thread Terje Marthinussen
Depends of course a lot on how many tenants you have. Hopefully the new off-heap memtables in 1.0 may help as well, as Java GC on large heaps is becoming a much bigger issue than memory cost. Regards, Terje On 25 Aug 2011, at 14:20, Himanshi Sharma wrote: > > I am working on similar sort of st

Re: Using 5-6 bytes for cassandra timestamps vs 8…

2011-08-29 Thread Terje Marthinussen
I have a patch for trunk which I just have to get time to test a bit before I submit. It is for super columns and will use the super column's timestamp as the base and only store varint-encoded offsets in the underlying columns. If the timestamp equals that of the SC, it will store nothing (ju
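Not the actual patch, just a sketch of the encoding idea described above: store the super column timestamp once and encode each subcolumn timestamp as a variable-length (zig-zag varint) offset from it, so identical or nearby timestamps shrink from 8 bytes to a byte or two. Class and method names are made up; a real implementation could, as the mail says, skip the zero offset entirely.

```java
import java.io.ByteArrayOutputStream;

// Hedged sketch of the idea only: encode each subcolumn timestamp as a zig-zag
// varint offset from the super column's base timestamp.
public class TimestampDelta {
    static void writeVarLong(ByteArrayOutputStream out, long v) {
        long zz = (v << 1) ^ (v >> 63);           // zig-zag so negative offsets stay small
        while ((zz & ~0x7FL) != 0) {
            out.write((int) ((zz & 0x7F) | 0x80));
            zz >>>= 7;
        }
        out.write((int) zz);
    }

    public static void main(String[] args) {
        long base = 1_314_000_000_000L;           // super column timestamp (example value)
        long[] columnTimestamps = { base, base + 3, base - 120 };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long ts : columnTimestamps) writeVarLong(out, ts - base);
        System.out.println("encoded bytes: " + out.size()
                + " vs plain longs: " + (8 * columnTimestamps.length));
    }
}
```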

Re: hw requirements

2011-08-31 Thread Terje Marthinussen
SSDs definitely make life simpler as you will get a lot less trouble with impact from things like compactions. Just beware that Cassandra expands data a lot due to storage overhead (for small columns), replication and the space needed for compactions and repairs. It is well worth doing some rea

Re: Hinted handoff bug?

2011-12-01 Thread Terje Marthinussen
Sorry for not checking the source to see if things have changed, but I just remembered an issue I had forgotten to make a Jira for. In the old days, nodes would periodically try to deliver hint queues. However, this was at some stage changed so it only delivers if a node is being marked up. However, you can d
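A hedged illustration of the behaviour being asked for, not Cassandra code: the Gossip and HintStore interfaces are invented for the example. The idea is simply to re-check periodically for live endpoints with pending hints instead of relying solely on the down-to-up state transition.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical periodic hint-delivery check, in addition to the UP-state trigger.
public class PeriodicHintDelivery {
    interface Gossip { boolean isAlive(String endpoint); }
    interface HintStore { Iterable<String> endpointsWithHints(); void deliverTo(String endpoint); }

    static void schedule(Gossip gossip, HintStore hints) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(() -> {
            for (String endpoint : hints.endpointsWithHints()) {
                if (gossip.isAlive(endpoint)) {
                    hints.deliverTo(endpoint);   // same path the UP-state trigger would use
                }
            }
        }, 10, 10, TimeUnit.MINUTES);
    }
}
```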

Re: [RELEASE] Apache Cassandra 1.0.6 released

2011-12-16 Thread Terje Marthinussen
Works if you turn off mmap? We run without mmap and see hardly any difference in performance, but with huge benefits in the form of a memory consumption which can actually be monitored easily, and it just seems like things are more stable this way in general. Just turn it off and see how that work

Re: What is the future of supercolumns ?

2012-01-06 Thread Terje Marthinussen
Please realize that I do not make any decisions here and I am not part of the core Cassandra developer team. What has been said before is that they will most likely go away and at least under the hood be replaced by composite columns. Jonathan has however stated that he would like the supercol

Re: two dimensional slicing

2012-01-29 Thread Terje Marthinussen
On Sun, Jan 29, 2012 at 7:26 PM, aaron morton wrote: > and compare them, but at this point I need to focus on one to get > things working, so I'm trying to make a best initial guess. > > I would go for RP then, BOP may look like less work to start with but it > *will* bite you later. If you use an

Re: Much more native memory used by Cassandra then the configured JVM heap size

2012-06-21 Thread Terje Marthinussen
We run some fairly large and busy Cassandra setups. All of them without mmap. I have yet to see a benchmark which conclusively says mmap is better (or worse for that matter) than standard ways of doing I/O, and we have done many of them over the last 2 years, by different people, with different tools an

Re: Cassandra as storage for cache data

2013-07-02 Thread Terje Marthinussen
If this is a tombstone problem as suggested by some, and it is ok to turn off replication as suggested by others, it may be an idea to do an optimization in cassandra where if replication_factor < 1: do not create tombstones Terje On Jul 2, 2013, at 11:11 PM, Dmitry Olshansky wrote: >
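A sketch of the optimization suggested above, with invented interface names; it reads the suggestion as "when the keyspace is effectively unreplicated (RF 1), skip the tombstone". Whether that is actually safe for every read and compaction path is exactly the kind of thing that would need checking.

```java
// Hypothetical sketch of the proposed optimization; Keyspace and Store are made-up interfaces.
public class DeleteSketch {
    interface Keyspace { int replicationFactor(); }
    interface Store { void deleteLocally(byte[] key); void writeTombstone(byte[] key, long timestamp); }

    static void delete(Keyspace ks, Store store, byte[] key, long timestamp) {
        if (ks.replicationFactor() <= 1) {
            store.deleteLocally(key);              // no replica needs to learn about the delete
        } else {
            store.writeTombstone(key, timestamp);  // normal path: tombstone until gc_grace expires
        }
    }
}
```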

Re: Compression in Cassandra

2011-01-20 Thread Terje Marthinussen
Perfectly normal with a 3-7x increase in data size depending on your data schema. Regards, Terje On 20 Jan 2011, at 23:17, "akshatbakli...@gmail.com" wrote: > I just did a du -h DataDump which showed 40G > and du -h CassandraDataDump which showed 170G > > am i doing something wrong. > have you o

Fill disks more than 50%

2011-02-23 Thread Terje Marthinussen
Hi, Given that you have always increasing key values (timestamps) and never delete and hardly ever overwrite data. If you want to minimize work on rebalancing and statically assign (new) token ranges to new nodes as you add them so they always get the latest data. Let's say you add a new n

Re: Fill disks more than 50%

2011-02-25 Thread Terje Marthinussen
I am suggesting that you probably want to rethink your schema design > since partitioning by year is going to be bad for performance since the > old servers are going to be nothing more than expensive tape drives. > You fail to see the obvious. It is just the fact that most of the data is stale

Re: Fill disks more than 50%

2011-02-25 Thread Terje Marthinussen
> > > @Thibaut Britz > Caveat:Using simple strategy. > This works because cassandra scans data at startup and then serves > what it finds. For a join for example you can rsync all the data from > the node below/to the right of where the new node is joining. Then > join without bootstrap then cleanu

Re: 2x storage

2011-02-25 Thread Terje Marthinussen
Cassandra never compacts more than one column family at a time? Regards, Terje On 26 Feb 2011, at 02:40, Robert Coli wrote: > On Fri, Feb 25, 2011 at 9:22 AM, A J wrote: >> I read in some cassandra notes that each node should be allocated >> twice the storage capacity you wish it to contain.

Re: Argh: Data Corruption (LOST DATA) (0.7.0)

2011-03-04 Thread Terje Marthinussen
Hi, Did you get anywhere on this problem? I am seeing similar errors unfortunately :( I tried to add some quick error checking to the serialization, and it seems like the data is ok there. Some indication that this occurs in compaction and maybe in hinted handoff, but no indication that it occu

Re: Argh: Data Corruption (LOST DATA) (0.7.0)

2011-03-04 Thread Terje Marthinussen
We are seeing various other messages as well related to deserialization, so this seems to be some random corruption somewhere, but so far it may seem to be limited to supercolumns. Terje On Sat, Mar 5, 2011 at 2:26 AM, Terje Marthinussen wrote: > Hi, > > Did you get anywhere on thi

Re: Argh: Data Corruption (LOST DATA) (0.7.0)

2011-03-05 Thread Terje Marthinussen
Mar 4, 2011 at 7:04 PM, Benjamin Coverston < > ben.covers...@datastax.com> wrote: > >> Hi Terje, >> >> Can you attach the portion of your logs that shows the exceptions >> indicating corruption? Which version are you on right now? >> >> Ben >>

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Terje Marthinussen
I had similar errors in late 0.7.3 releases related to testing I did for the mails with subject "Argh: Data Corruption (LOST DATA) (0.7.0)". I do not see these corruptions or the above error anymore with 0.7.3 release as long as the dataset is created from scratch. The patch (2104) mentioned in th

secondary indexes on data imported by json2sstable

2011-03-14 Thread Terje Marthinussen
Hi, Should it be expected that secondary indexes are automatically regenerated when importing data using json2sstable? Or is there some manual procedure that needs to be done to generate them? Regards, Terje

balance between concurrent_[reads|writes] and feeding/reading threads i clients

2011-03-28 Thread Terje Marthinussen
Hi, I was pondering how the concurrent_reads and concurrent_writes settings balance against the max read/write threads in clients. Let's say we have 3 nodes, and concurrent reads/writes set to 8. That is, 8*3=24 threads for reading and writing. Replication factor is 3. Let's say we have clients that in total
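A worked example of the balance question, under assumptions not stated in the mail: at RF=3 every client write occupies a mutation slot on three replicas, so total server-side slots divided by RF gives a rough ceiling on useful in-flight client requests.

```java
// Illustrative arithmetic only; the "slot" model is an assumption, not Cassandra internals.
public class ThreadBalance {
    public static void main(String[] args) {
        int nodes = 3;
        int concurrentWrites = 8;     // concurrent_writes in cassandra.yaml
        int replicationFactor = 3;

        int serverSlots = nodes * concurrentWrites;                 // 24 mutation threads cluster-wide
        int usefulClientThreads = serverSlots / replicationFactor;  // ~8 in-flight client writes

        System.out.printf("cluster write slots=%d, useful concurrent client writes ~%d%n",
                serverSlots, usefulClientThreads);
    }
}
```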

Re: How to repair HintsColumnFamily?

2011-04-01 Thread Terje Marthinussen
Seeing similar errors on another system (0.7.4). Maybe something bogus with the hint columnfamilies. Terje On Mon, Mar 28, 2011 at 7:15 PM, Shotaro Kamio wrote: > I see. Then, I'll remove the HintsColumnFamily. > > Because our cluster has a lot of data, running repair takes much time > (more th

Re: balance between concurrent_[reads|writes] and feeding/reading threads i clients

2011-04-01 Thread Terje Marthinussen
ferenced on this page was > the model for using thread pools to manage access to resources > http://wiki.apache.org/cassandra/ArchitectureInternals > > In summary, don't worry about it unless you see the thread pools backing up > and messages being dropped. > > Hope that hel

Re: Timeout during stress test

2011-04-11 Thread Terje Marthinussen
I notice you have pending hinted handoffs? Look for errors related to that. We have seen occasional corruptions in the hinted handoff sstables. If you are stressing the system to its limits, you may also consider playing more with the number of read/write threads (concurrent_reads/writes)

value of hinted handoff column not really empty...?

2011-04-13 Thread Terje Marthinussen
Hi, we do see occasional row corruptions now and then, especially in hinted handoffs. This may be related to fairly long rows (millions of columns). I was dumping one corrupted hint .db file and I noticed that they do in fact have values. The docs say subcolumn values are always empty; instead

Re: raid 0 and ssd

2011-04-14 Thread Terje Marthinussen
Hm... You should notice that unless you have TRIM, which I don't think any OS supports with any RAID functionality yet, then once you have written once to the whole SSD, it is always full! That is, when you delete a file, you don't "clear" the blocks on the SSD, so as far as the SSD goes, the data

Re: Multi-DC Deployment

2011-04-18 Thread Terje Marthinussen
Hum... Seems like it could be an idea in a case like this to have a mode where a result is always returned (if possible), but with a flag saying whether the consistency level was met, or to what level it was met (the number of nodes answering, for instance)? Terje On Tue, Apr 19, 2011 at 1:13 AM, Jonathan El

Re: Multi-DC Deployment

2011-04-19 Thread Terje Marthinussen
EC2 nodes for > this. > > Adrian > > On Mon, Apr 18, 2011 at 11:16 PM, Terje Marthinussen > wrote: > > Hum... > > Seems like it could be an idea in a case like this with a mode where > result > > is always returned (if possible), but where a flag saying if

Re: Multi-DC Deployment

2011-04-20 Thread Terje Marthinussen
three nodes dead at once you don't lose 1% of the data (3/300) I > think you lose 1/(300*300*300) of the data (someone check my math?). > > If you want to always get a result, then you use "read one", if you > want to get a highly available better quality result use local

Re: Multi-DC Deployment

2011-04-20 Thread Terje Marthinussen
. Our code has some local dependencies, but could be the > basis for a generic solution. > > Adrian > > On Wed, Apr 20, 2011 at 6:08 PM, Terje Marthinussen > wrote: > > Assuming that you generally put an API on top of this, delivering to two > or > > more systems th

Re: Compacting single file forever

2011-04-22 Thread Terje Marthinussen
I think the really interesting part is how this node ended up in this state in the first place. There should be somewhere in the area of 340-500GB of data on it when everything is 100% compacted. The problem now is that it used (we wiped it last night to test some 0.8 stuff) more than 1TB. To me,

multithreaded compaction causes mutation storms?

2011-04-24 Thread Terje Marthinussen
Tested out multithreaded compaction in 0.8 last night. We had first fed some data with compaction disabled so there were 1000+ sstables on the nodes, and I decided to enable multithreaded compaction on one of them to see how it performed vs. nodes that had no compaction at all. Since this was sort

0.8 loosing nodes?

2011-04-24 Thread Terje Marthinussen
World as seen from .81 in the below ring .81 Up Normal 85.55 GB 8.33% Token(bytes[30]) .82 Down Normal 83.23 GB 8.33% Token(bytes[313230]) .83 Up Normal 70.43 GB 8.33% Token(bytes[313437]) .84 Up Normal 81.7 GB 8.33% Token(bytes

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Terje Marthinussen
I have been hunting similar looking corruptions, especially in the hints column family, but I believe they occur somewhere while compacting. I looked in greater detail at one sstable and the row length was longer than the actual data in the row, and as far as I could see, either the length was wro

Re: 0.8 loosing nodes?

2011-04-25 Thread Terje Marthinussen
heartbeats again and other nodes log that they receive the heartbeats, but this will not get it marked as UP again until restarted. So, seems like 2 issues: - Nodes pausing (may be just node overload) - Nodes are not marked as UP unless restarted Regards, Terje On 24 Apr 2011, at 23:24, Terje

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Terje Marthinussen
ome cases. > > On Mon, Apr 25, 2011 at 11:47 AM, Terje Marthinussen > wrote: >> I have been hunting similar looking corruptions, especially in the hints >> column family, but I believe it occurs somewhere while compacting. >> I looked in greater detail on one sstable and

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Terje Marthinussen
First column in the row has offset in the file of 190226525, last valid column is at 380293592, about 181MB from first column to last. in_memory_compaction_limit was 128MB, so almost certainly above the limit. Terje On Tue, Apr 26, 2011 at 8:53 AM, Terje Marthinussen wrote: > In my c

multithreaded compaction

2011-04-26 Thread Terje Marthinussen
Hi, I was testing the multithreaded compactions and with 2x6 cores (24 with HT) it does seem a bit crazy with 24 compactions running concurrently. It is probably not very good in terms of random I/O. As such, I think I agree with the argument in 2191 that there should be a config option for this.

Re: multithreaded compaction

2011-04-26 Thread Terje Marthinussen
11 at 4:35 PM, Sylvain Lebresne wrote: > On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen > wrote: > > Hi, > > I was testing the multithreaded compactions and with 2x6 cores (24 with > HT) > > it does seem a bit crazy with 24 compactions running concurrently. > &g

memtablePostFlusher blocking writes?

2011-04-27 Thread Terje Marthinussen
0.8 trunk: When playing back a fairly large chunk of hints, things basically lock up under load. The hints are never processed successfully. Lots of mutations dropped. One thing is that maybe the default 10k columns per send with 50ms delays is a bit on the aggressive side (10k*20 = 200,000 colum
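Purely to spell out the arithmetic in the mail: 10,000 columns per batch with a 50 ms delay is 20 batches per second, i.e. roughly 200,000 columns per second of hint playback. A throwaway check using only the numbers from the mail:

```java
// Spells out the throttle math from the mail: batch size and delay => columns per second.
public class HintThrottleMath {
    public static void main(String[] args) {
        int columnsPerBatch = 10_000;   // batch size mentioned in the mail
        int delayMillis = 50;           // delay between batches mentioned in the mail
        int batchesPerSecond = 1000 / delayMillis;                  // 20
        int columnsPerSecond = columnsPerBatch * batchesPerSecond;  // 200,000
        System.out.println(columnsPerSecond + " columns/s per hint-playback stream");
    }
}
```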

Re: memtablePostFlusher blocking writes?

2011-04-27 Thread Terje Marthinussen
shes, and probably compaction activity as well. > > (Also, if each of those pending mutations is 10,000 columns, you may > be causing yourself memory pressure as well.) > > On Wed, Apr 27, 2011 at 11:01 AM, Terje Marthinussen > wrote: > > 0.8 trunk: > > > > When playin

MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
Cassandra 0.8 beta trunk from about 1 week ago: Pool Name   Active  Pending  Completed   ReadStage 0 0 5   RequestResponseStage 0 0 87129   MutationStage 0 0 187298   ReadRe

Re: MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
are there any exceptions in the log? > > On Tue, May 3, 2011 at 1:01 PM, Jonathan Ellis wrote: > > Does it resolve down to 0 eventually if you stop doing writes? > > > > On Tue, May 3, 2011 at 12:56 PM, Terje Marthinussen > > wrote: > >> Cassandra

Re: MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
So yes, there is currently some 200GB empty disk. On Wed, May 4, 2011 at 3:20 AM, Terje Marthinussen wrote: > Just some very tiny amount of writes in the background here (some hints > spooled up on another node slowly coming in). > No new data. > > I thought there was no except

Re: MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
rophically fail, its corresponding > post-flush task will be stuck. > > On Tue, May 3, 2011 at 1:20 PM, Terje Marthinussen > wrote: > > Just some very tiny amount of writes in the background here (some hints > > spooled up on another node slowly coming in). > >

Re: MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
debug logging and see if I get lucky and run out of disk again. Terje On Wed, May 4, 2011 at 5:06 AM, Jonathan Ellis wrote: > Compaction does, but flush didn't until > https://issues.apache.org/jira/browse/CASSANDRA-2404 > > On Tue, May 3, 2011 at 2:26 PM, Terje Marthinussen >

Re: MemtablePostFlusher with high number of pending calls?

2011-05-03 Thread Terje Marthinussen
. Terje On Wed, May 4, 2011 at 6:34 AM, Terje Marthinussen wrote: > Hm... peculiar. > > Post flush is not involved in compactions, right? > > May 2nd > 01:06 - Out of disk > 01:51 - Starts a mix of major and minor compactions on different column > families > It then sta

Re: MemtablePostFlusher with high number of pending calls?

2011-05-04 Thread Terje Marthinussen
> (line 2066) requesting GC to free disk space > [lots of sstables deleted] > > After this it starts running again (although not fine it seems). > > So the disk seems to have been full for 35 minutes due to un-deleted > sstables. > > Terje > > On Wed, May 4, 2011 at 6:34

Re: MemtablePostFlusher with high number of pending calls?

2011-05-04 Thread Terje Marthinussen
completely run out of disk space Regards, Terje On Wed, May 4, 2011 at 10:09 PM, Jonathan Ellis wrote: > Or we could "reserve" space when starting a compaction. > > On Wed, May 4, 2011 at 2:32 AM, Terje Marthinussen > wrote: > > Partially, I guess this may be a

compaction strategy

2011-05-07 Thread Terje Marthinussen
Even with the current concurrent compactions, given a high speed datafeed, compactions will obviously start lagging at some stage, and once they do, things can turn bad in terms of disk usage and read performance. I have not read the compaction code well, but if http://wiki.apache.org/cassandra/Me

Re: compaction strategy

2011-05-07 Thread Terje Marthinussen
job. Terje On Sat, May 7, 2011 at 9:54 PM, Jonathan Ellis wrote: > On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen > wrote: > > 1. Would it make sense to make full compactions occur a bit more > aggressive. > > I'd rather reduce the performance impact of being b

Re: Digg 4 Preview on TWiT

2010-07-09 Thread Terje Marthinussen
http://twitter.com/nk/status/17903187277 Another "not using" joke?

Re: column family names

2010-08-30 Thread Terje Marthinussen
atch the \w reg ex class, which includes the underscore > character. > > > Aaron > > > On 30 Aug 2010, at 21:01, Terje Marthinussen > wrote: > > Hi, > > Now that we can make columns families on the fly, it gets interesting to > use > column families more

Re: cassandra disk usage

2010-08-30 Thread Terje Marthinussen
On Mon, Aug 30, 2010 at 10:10 PM, Jonathan Ellis wrote: > column names are stored per cell > > (moving to user@) > I think that is already accommodated for in my numbers? What I listed was measured from the actual SSTable file (using the output from "strings ), so multiples of the supercolumn

Re: column family names

2010-08-30 Thread Terje Marthinussen
Beyond aesthetics, specific reasons? Terje On Tue, Aug 31, 2010 at 11:54 AM, Benjamin Black wrote: > URL encoding. > >

Re: column family names

2010-08-30 Thread Terje Marthinussen
al names which map to pretty incomprehensible sequences that are > laborious to look up). > > So my experience suggests to avoid it for ops reasons, and just go with > simplicity. > > /Janne > > On Aug 31, 2010, at 08:39 , Terje Marthinussen wrote: > > Beyond aesthet

Re: column family names

2010-08-31 Thread Terje Marthinussen
> > b > > On Mon, Aug 30, 2010 at 11:55 PM, Terje Marthinussen > wrote: > > Another option would of course be to store a mapping between > dir/filenames > > and Keyspace/column families together with other info related to > keyspaces > > and column families

Re: column family names

2010-08-31 Thread Terje Marthinussen
23:03, David Boxenhorn wrote: > It's not so hard to implement your mapping suggestion in your application, > rather than in Cassandra, if you really want it. > > On Tue, Aug 31, 2010 at 1:05 PM, Terje Marthinussen > wrote: > No benefit? > Making it easier to use colum

order of mutations in batch_mutate

2010-09-01 Thread Terje Marthinussen
Hi, Just a curiosity. I should probably read some code and write a test to make sure, but it's not important enough right now for that :) - void batch_mutate(string keyspace, map<string, map<string, list<Mutation>>> mutation_map, ConsistencyLevel consistency_level) Will performance of a batch_mutate be affected by the order of
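A sketch of the nested structure behind the (HTML-mangled) signature above. The Mutation class here is a stand-in for the Thrift-generated one, so only the map shape, keyed by row key and then column family, should be taken literally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: mutation_map is keyed by row key, then by column family name,
// with a list of mutations per family. "Mutation" is a placeholder type.
public class BatchMutateShape {
    static class Mutation { String columnName; byte[] value; long timestamp; }

    public static void main(String[] args) {
        Map<String, Map<String, List<Mutation>>> mutationMap = new HashMap<>();

        Mutation m = new Mutation();
        m.columnName = "status"; m.value = "ok".getBytes(); m.timestamp = System.currentTimeMillis();

        mutationMap
            .computeIfAbsent("row-key-1", k -> new HashMap<>())
            .computeIfAbsent("MyColumnFamily", k -> new ArrayList<>())
            .add(m);
        // client.batch_mutate("MyKeyspace", mutationMap, consistencyLevel) would then ship
        // all mutations for the same row key together, which is what the question is about.
    }
}
```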

Re: about insert benchmark

2010-09-02 Thread Terje Marthinussen
1000 and 1 records take too short a time to really benchmark anything. You will use 2 seconds just for stuff like TCP window sizes to adjust to the level where you get throughput. The difference between 100k and 500k is less than 10%. Could be anything. Filesystem caches, sizes of memtables (de

network configurations in medium to large size installations

2010-10-27 Thread Terje Marthinussen
Hi, Just curious if anyone has any best practices/experiences/thoughts to share on network configurations for cassandra setups with tens to hundreds of nodes and high traffic (thousands of requests/sec)? For instance: - Do you just "hook it all together"? - If you have 2 interfaces, do you prefer
