NPE while get_range_slices in 0.8.1

2011-08-26 Thread Evgeniy Ryabitskiy
Hi,

We have a 4-node Cassandra (version 0.8.1) cluster with 2 column families.
While the first CF is working properly (reads and writes), a
get_range_slices query on the second CF returns an NPE.
Any idea why this happens? Maybe it is a known bug that is fixed in 0.8.3?



ERROR [pool-2-thread-51] 2011-08-25 15:02:04,360 Cassandra.java (line 3210) Internal error processing get_range_slices
java.lang.NullPointerException
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:298)
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:406)
at org.apache.cassandra.service.RowRepairResolver.maybeScheduleRepairs(RowRepairResolver.java:103)
at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:120)
at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:85)
at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:715)
at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
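
For reference, a minimal pycassa sketch of the kind of call that exercises
this code path (keyspace, CF, and host names here are hypothetical; pycassa's
get_range() maps to Thrift get_range_slices):

    import pycassa
    from pycassa import ConsistencyLevel

    # Hypothetical names, for illustration only.
    pool = pycassa.ConnectionPool('MyKeyspace', ['node1:9160'])
    cf = pycassa.ColumnFamily(pool, 'SecondCF')

    # At consistency levels above ONE the coordinator compares replica
    # responses and may schedule read repair (RowRepairResolver), which is
    # where the NPE in the trace above is thrown.
    for key, columns in cf.get_range(
            row_count=100,
            read_consistency_level=ConsistencyLevel.QUORUM):
        print key, columns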


Re: Cassandra Node Requirements

2011-08-26 Thread Philippe
>
> Sort of.  There's some fine print, such as that the 50% figure only applies
> if you're manually forcing major compactions (which is not recommended),
> but a bigger thing to know is that 1.0 will introduce "leveled
> compaction" [1], inspired by leveldb.  The free space requirement will
> then be a small number of megabytes.
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-1608

And in the meantime, plan for extra storage: as I and others have reported in
other threads, repairs can cause disks to fill up (in my case, I think
it was because I had multiple repairs running at the same time).
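
As an aside, once a cluster is on 1.0 the compaction strategy can be changed
per column family. A sketch using pycassa's SystemManager, assuming a client
version whose Thrift bindings expose the compaction_strategy attribute
(keyspace and CF names are hypothetical):

    import pycassa

    # Assumes Cassandra 1.0+ and a pycassa release that supports the
    # per-CF compaction_strategy attribute.
    sys_mgr = pycassa.SystemManager('node1:9160')
    sys_mgr.alter_column_family('MyKeyspace', 'MyCF',
                                compaction_strategy='LeveledCompactionStrategy')
    sys_mgr.close()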


Re: Cassandra 0.8.2 - Large swap memory

2011-08-26 Thread Brandon Williams
On Thu, Aug 25, 2011 at 11:42 PM, King JKing  wrote:
> Dear Jonathan,
> The Cassandra process has a virtual size of 63.5 GB.
> I was referring to the RES column in top. RES is 8.3 GB, much larger than
> the 2.5 GB of used memory shown in JConsole.

https://issues.apache.org/jira/browse/CASSANDRA-2868

-Brandon


Re: Is Cassandra suitable for this use case?

2011-08-26 Thread Edward Capriolo
On Fri, Aug 26, 2011 at 12:18 AM, Eric Evans  wrote:

> On Thu, Aug 25, 2011 at 6:31 AM, Ruby Stevenson  wrote:
> > - Although Cassandra (and other decentralized NoSQL data stores) has
> > been reported to handle very large data in total, my preliminary
> > understanding is that individual "column values" are quite limited. I
> > have read some posts saying you shouldn't store files this big in
> > Cassandra; instead, store a path and let the file system handle
> > it. Is this true?
>
> http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>

It is important to note that even distributed storage solutions like
GlusterFS, NFS, and iSCSI are not as good as local disk. The reason is
simple: in the best case, a file on a local file system lives in the VFS
cache and reading a block takes on the order of micro- or nanoseconds. Even
if it is not in the VFS cache, you are bounded only by bus and disk speeds.

Networked disk solutions like iSCSI require dedicated, expensive InfiniBand
or Ethernet networks to work well. Note also that when using something like
iSCSI your system still gets to leverage its local VFS cache, so not all of
the read traffic has to cross the network.

The best case scenario for Cassandra would be a block of data living in the
row cache on a node. That data still has to traverse the network, which is
going to be slower than reading a local file.

However, depending on what you are doing, storing file data in Cassandra
could still be a big win.
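
To make that last point concrete, here is a minimal sketch of the
chunked-blob approach the FAQ recommends, assuming pycassa and a hypothetical
'FileChunks' CF with a UTF8Type comparator (so zero-padded chunk names sort
by offset):

    import pycassa

    CHUNK_SIZE = 1024 * 1024  # keep individual column values small (1 MB here)

    pool = pycassa.ConnectionPool('Files', ['node1:9160'])
    chunks = pycassa.ColumnFamily(pool, 'FileChunks')

    def store_file(path):
        # One row per file; each column holds one chunk, named by its
        # zero-padded byte offset so columns sort in file order.
        with open(path, 'rb') as f:
            offset = 0
            while True:
                data = f.read(CHUNK_SIZE)
                if not data:
                    break
                chunks.insert(path, {'%020d' % offset: data})
                offset += len(data)

    def read_file(path):
        # Columns come back in comparator order, so joining them in
        # iteration order reassembles the file.
        row = chunks.get(path, column_count=100000)
        return ''.join(row[name] for name in row)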


Re: Commit log fills up in less than a minute

2011-08-26 Thread Anand Somani
Sure, I can file the ticket. Here is what I have noticed so far: the count
of hints is not going up, which is good. I think what must have happened is
that after I restarted the cluster, no new hints were added; the old ones
are just still around and not cleaned up. Is that possible? I cannot say for
sure, since I only looked at this JMX bean about 36 hours after the restart.

Can I just clean this up using the JMX call? I do not want to turn off HH,
since it handles intermittent network hiccups well, right?

On Thu, Aug 25, 2011 at 2:47 PM, aaron morton wrote:

> Could you put together some information on this in a ticket and reference
> this one: https://issues.apache.org/jira/browse/CASSANDRA-3071
>
> The short term fix is to disable HH. You will still get consistent reads.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/08/2011, at 3:22 AM, Anand Somani wrote:
>
> So I have looked at the cluster from:
>
>- the Cassandra client: describe cluster => shows correctly, 3 nodes
>- the StorageService JMX bean: UnreachableNodes => shows 0
>
>
> If all of these show the correct ring state, why are hints still being
> maintained? It looks like that is the only way to find out about "phantom"
> nodes.
>
> On Wed, Aug 24, 2011 at 8:01 AM, Anand Somani wrote:
>
>> So I restarted the cluster (not rolling), but it is still maintaining
>> hints for IPs that are no longer part of the ring. nodetool ring shows
>> things correctly (only 3 nodes).
>> When I check through the JMX HintedHandOffManager, it shows it is
>> maintaining hints for those nonexistent IPs. So the questions are:
>>  - How can I remove these IPs permanently, so hints do not get saved?
>>  - Also, not all nodes see the same list of IPs.
>>
>>
>>
>>
>> On Sun, Aug 21, 2011 at 3:10 PM, aaron morton wrote:
>>
>>> Yup, you can check what HH is doing via JMX.
>>>
>>> There is a bug in 0.7 that can result in log files not being deleted:
>>> https://issues.apache.org/jira/browse/CASSANDRA-2829
>>>
>>> Cheers
>>>
>>>  -
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 22/08/2011, at 4:56 AM, Anand Somani wrote:
>>>
>>> We have a lot of space on /data, and judging from the file timestamps it
>>> looks like it was flushing data fine.
>>>
>>> We did have a bit of a goof-up with IPs when bringing up a down node (and
>>> the commit files have been around since then). I wonder if that is what
>>> triggered it and we have a bunch of hinted handoffs backed up.
>>>
>>> For hinted handoff, how do I check whether the nodes are collecting hints
>>> (I do have it turned on)? I noticed the JConsole bean HintedHandOffManager;
>>> is that the only way to find out?
>>>
>>> On Sun, Aug 21, 2011 at 9:20 AM, Peter Schuller <
>>> peter.schul...@infidyne.com> wrote:
>>>
 > When does the actual commit-data file get deleted.
 >
 > The flush interval on all my memtables is 60 minutes

 They *should* be getting deleted when they no longer contain any data
 that has not been flushed to disk. Are flushes definitely still
 happening? Is it possible flushing has started failing (e.g. out of
 disk)?

 The only way I can think of for other nodes to directly affect the commit
 log size on your node would be e.g. hinted handoff resulting in a burst
 of writes.

 --
 / Peter Schuller (@scode on twitter)

>>>
>>>
>>>
>>
>
>
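
An illustrative restatement of Peter's deletion rule in Python (this is not
Cassandra's actual code, just the logic he describes):

    from collections import namedtuple

    Segment = namedtuple('Segment', ['end_position', 'dirty_column_families'])

    def segment_deletable(segment, flushed_past):
        # A commit log segment can be deleted once every column family with
        # dirty data in it has been flushed to disk past the segment's end.
        return all(flushed_past[cf] >= segment.end_position
                   for cf in segment.dirty_column_families)

    # Example: CF 'a' has flushed past the segment, CF 'b' has not,
    # so the segment must be kept.
    seg = Segment(end_position=1000, dirty_column_families=['a', 'b'])
    print segment_deletable(seg, {'a': 1500, 'b': 800})  # prints False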


Re: Occasionally getting old data back with ConsistencyLevel.ALL

2011-08-26 Thread Kyle Gibson
Update:

I scaled my cluster down from 7 nodes to 3 nodes, and kept RF=3. I did
a complete cluster rebuild, so everything was fresh. Kept my reads and
writes at CL.ALL. For a while there it seemed like I had succeeded in
eliminating the problem. Unfortunately about an hour ago a duplicate
came through, and the same IPN was processed twice.

Does anyone have any more suggestions as to what is going on here?
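
For readers following along, the write-then-verify pattern described in the
pastebin further down this thread, sketched in pycassa (keyspace, CF, and
host names are hypothetical):

    import pycassa
    from pycassa import ConsistencyLevel

    pool = pycassa.ConnectionPool('Payments', ['node1:9160'])
    ipn_cf = pycassa.ColumnFamily(pool, 'IPN')

    def mark_processed(key):
        # Write at ALL, read back at ALL, and retry once if the read
        # disagrees with what was just written.
        ipn_cf.insert(key, {'processed': 'yes'},
                      write_consistency_level=ConsistencyLevel.ALL)
        row = ipn_cf.get(key, columns=['processed'],
                         read_consistency_level=ConsistencyLevel.ALL)
        if row.get('processed') != 'yes':
            print '%s: failed to set processed to yes, retrying' % key
            ipn_cf.insert(key, {'processed': 'yes'},
                          write_consistency_level=ConsistencyLevel.ALL)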

On Mon, Aug 22, 2011 at 1:59 PM, Kyle Gibson
 wrote:
> Thanks for the reply.
>
> On Mon, Aug 22, 2011 at 1:11 PM, Dominic Williams
>  wrote:
>> Hi there, here's my tuppence!
>> 1. Something to look at first:
>>
>> If you write two different values to the same column in quick succession
>> and both writes go out with the same timestamp, then it is indeterminate
>> which one wins, i.e. write order doesn't necessarily decide the outcome.
>
> Understood. In the code example I provided, I am writing the same
> value, but I am doing so in quick succession, so perhaps a sleep of a few
> seconds might help. It is also worth noting that the code I provided is
> only step 2 of the process. There is a PHP script that receives the POST
> request from PayPal and inserts the IPN data into the IPN column family.
> Before it does this, it sets the "processed" column to "no".
>
>> 2. Don't trust PayPal (anyone using PayPal should really read this)
>> We are / were relying on IPNs to manage our website's recurring
>> subscriptions list. We experienced this weird thing where the
>> recurring_payment_profile_created IPN was missing, and thought maybe
>> Cassandra was losing it, because PayPal is a financial system and it
>> couldn't possibly fail to generate an IPN, right!!?
>> Anyway, it turns out, after exhaustive discussions with PayPal engineers
>> and having proved this from the PayPal logs, that sometimes IPNs fail to
>> get generated. Yup. Read that again: sometimes they fail to get generated,
>> and in fact this is happening to us quite regularly now.
>> They justify this (while acknowledging the issue should be in their
>> documentation) by saying that because HTTP delivery is unreliable (hmmm,
>> isn't this what the retry queue is for?) we shouldn't be relying entirely
>> on IPNs and should regularly download the logs and run them through scripts
>> to catch problems (this is idiotic, since the angry customer will get on our
>> case immediately when they pay and membership doesn't start).
>> Not sure whether PayPal failing or the database failing is the better
>> option. Looking forward to hearing the resolution.
>
> I have experienced a failure to receive an IPN event before. In that
> case the IPN event is never saved to the IPN column family, and the
> cron script doesn't process it once, or twice, for that matter. The odd
> thing about the failed IPN event is that it didn't even show up in the
> IPN history, so I couldn't "replay" the event.
>
> I am fairly positive that the problem is either with my environment or
> Cassandra, and not PayPal, in this case. I am hoping it is my
> environment, because I suspect that will be easier to fix.
>
> Oddly enough, the second time the IPN is processed, the column write
> succeeds. This always happens 5 minutes after the first one is
> processed.
>
> I neglected to mention an important part of the process: after the IPN
> event is processed (e.g. a new payment), an email is sent out to
> myself and the sender. This is how I know for sure the event is being
> processed twice: not only do I receive two emails (spaced 5
> minutes apart), but so does the individual who paid. This is often
> embarrassing and somewhat difficult to explain; customers get confused
> as to which account they are supposed to use, etc.
>
> Thanks
>
>> Best,
>> Dominic
>> On 22 August 2011 17:49, Kyle Gibson  wrote:
>>>
>>> I made some changes to my code base that uses Cassandra. I went back
>>> to using the "processed" column, but instead of using "0" or "1" I
>>> decided to use "no" and "yes".
>>>
>>> You can view the code here: http://pastebin.com/gRBC16e7
>>>
>>> As you can see from the code, I perform an insert, then a get, and check
>>> the result; if it didn't work, I insert again and check the get once more.
>>> Each time I print out the result. Every operation uses CL.ALL.
>>>
>>> A few successful IPNs did come through before this one was generated:
>>>
>>> IPN-5943a4adc8eab68cdbc9d9eff7fa7dc669fa0bce
>>> OrderedDict([..., (u'processed', u'no'), ...])
>>> Failed to set processed to yes
>>> IPN-5943a4adc8eab68cdbc9d9eff7fa7dc669fa0bce insert 1314012603578714
>>> Failed to set processed to yes
>>> IPN-5943a4adc8eab68cdbc9d9eff7fa7dc669fa0bce insert2 1314012603586201
>>>
>>> As expected, this IPN was processed twice.
>>>
>>> On Sat, Aug 20, 2011 at 5:37 PM, Peter Schuller
>>>  wrote:
>>> >> Do you mean the cassandra log, or just logging in the script itself?
>>> >
>>> > The script itself. I.e., some "independent" verification that the line
>>> > of code after the insert is in fact running, just in case.

Range scan

2011-08-26 Thread Bill Hastings
How do range scans work in Cassandra? Does the read of a key go across all
the SSTables that contain the key and return the merged row, or are the
SSTables processed sequentially? That is, if I have a key k whose columns are
spread across N SSTables, does the read of key k return the row with all the
columns from the N SSTables merged, or does it process each SSTable in turn
and return a row with whatever columns it finds in that SSTable?