Thanks for the reply, Peter. You may have discovered the problem; I'll
explain below.

On Sun, Aug 28, 2011 at 8:51 AM, Peter Schuller
<peter.schul...@infidyne.com> wrote:
>> Understood. In the code example I provided, I am writing the same
>> value, but I am doing so in quick succession, so perhaps a few second
>> sleep might be helpful. It is worth noting also that the code I
>> provided is only the second step in the process. There is a PHP
>> script that receives the post request from Paypal which inserts the
>> IPN data into the IPN column family. Before it does this, it sets the
>> "processed" column to "no"
>
> Is it at all possible that this step happens twice? I have no idea
> what Paypal does or documents, but in general with an HTTP-based
> callback you (in Paypal's position) would either have to accept that
> human intervention is necessary on any transaction where the callback
> fails, or else implement some kind of re-try and keep submitting to
> the customer until the callback is successful. Keep in mind that the
> other end (meaning you in this case) can receive what looks like a
> successful HTTP request and send back a response, even though Paypal
> may perceive an error on their end.
>
> If you haven't, I'd definitely recommend checking logs at this step,
> or adding logging if required, to make sure that the callback is not
> happening twice.

You are correct here. If PayPal fails to get a positive response from
my callback, it will retry the IPN event until it gets a successful
response. When this happens, a new column, "retry_count", appears and
is set to the number of tries attempted. Given that this column has
always shown as 0 and the IPN event log on paypal.com also shows no
retries attempted, I believe the callback is not happening twice.

> How much traffic do you have to this cluster? Is it feasible to run
> Cassandra with full debug enabled (spamming lots of text in your
> logs)? That might be one way to ascertain, once you have one of these
> cases happening, whether Cassandra is mentioning any activity
> pertaining to the row that might explain this, such as it being
> re-written by a client.

Not really a lot of traffic, and if my fix below doesn't work I will
definitely give this a shot.

> Another suggestion: Is it possible you do not have clocks synchronized
> among your clients? Suppose that Paypal *is* submitting twice
> sometimes, and e.g. one of your PHP front-ends (or whoever is talking
> to Cassandra to insert the data) has clock drift. This would render
> the insert from your code snippet obsolete, if there is already a
> value inserted with a timestamp in the future.

This appears to be the case. The clock on the server that hosts the
PHP front-end was 80 seconds ahead. The server that handles IPN
processing was synced with NTP to ntp.ubuntu.com. So if a processing
event occurred within 80 seconds of inserting the IPN event and
updated the 'processed' column, the timestamp on that update would
be earlier than the original insert's, and the update would be
treated as obsolete. That's why it always worked on the second
attempt: by then, enough time had passed for the update's timestamp
to be later than the original insert's despite the drift.
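
To make the reasoning above concrete, here is a minimal Python sketch
of last-write-wins reconciliation as I understand it. The Column
class, reconcile() and simulate() are names I made up for
illustration; none of this is Cassandra's or any client library's
actual API, and tie-breaking on equal timestamps is deliberately left
out.

```python
from dataclasses import dataclass


@dataclass
class Column:
    value: str
    timestamp: float  # client-supplied write time, in seconds


def reconcile(current, incoming):
    """Keep whichever write carries the higher timestamp (simplified)."""
    if current is None or incoming.timestamp > current.timestamp:
        return incoming
    return current


DRIFT = 80.0  # seconds the front-end clock ran ahead (from this thread)


def simulate(seconds_until_processing):
    """Return the stored 'processed' value after insert, then update."""
    true_time = 1000.0  # arbitrary reference instant
    # The front-end inserts the IPN row, stamped by its fast clock.
    stored = reconcile(None, Column("no", true_time + DRIFT))
    # The NTP-synced processing server updates the column later.
    update = Column("yes", true_time + seconds_until_processing)
    stored = reconcile(stored, update)
    return stored.value


print(simulate(30))   # update stamped earlier than the insert
print(simulate(120))  # update stamped later than the insert
```

With an 80 second drift, an update made 30 seconds after the insert
carries the lower timestamp and loses, while one made 120 seconds
after it wins.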

For some reason the PHP front-end server was lacking any time
synchronization. I have corrected this; it now syncs to ntp.ubuntu.com
just like all the others. I will post back on this topic if it appears
to have solved the problem.

>
> --
> / Peter Schuller (@scode on twitter)
>
