That makes sense. My data comes in over the internet via ActiveMQ (using
the stomp package) and is processed in chunks. The log lines arrive in
chunks of 20-1000 lines, depending on how busy the customer sites are, so
there is definitely potential for a lot of parallelism. Some of my data
will likely still be in cache at write time because of the nature of the
work. It's a reputation system: I first get a query from the customer for a
reputation, and then within a minute or so I get feedback from them with
the current event's "score", which feeds back into the system to update the
value. Anyway, lots of parallelism opportunities.
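
As a rough illustration of the per-chunk parallelism described above, here
is a minimal Python sketch. It is not the actual consumer code: handle_line
is a hypothetical stand-in for one Cassandra read or write (it only sleeps
to mimic I/O latency), and the chunk size, delay, and worker count are
arbitrary assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_line(line, delay=0.01):
    """Hypothetical stand-in for one storage call; sleeps to mimic I/O latency."""
    time.sleep(delay)
    return line.upper()


def process_chunk_serial(lines):
    """Handle every line of a chunk one after another."""
    return [handle_line(line) for line in lines]


def process_chunk_parallel(lines, max_workers=16):
    """Handle the lines of a chunk concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_line, lines))


if __name__ == "__main__":
    # One simulated chunk of log lines, as it might arrive from the queue.
    chunk = [f"log line {i}" for i in range(40)]

    start = time.perf_counter()
    process_chunk_serial(chunk)
    serial_seconds = time.perf_counter() - start

    start = time.perf_counter()
    process_chunk_parallel(chunk)
    parallel_seconds = time.perf_counter() - start

    print(f"serial:   {serial_seconds:.3f}s")
    print(f"parallel: {parallel_seconds:.3f}s")
```

Because each call spends its time waiting rather than computing, the
threads overlap their waits even on a single-CPU machine, so one slow call
delays only itself instead of every line queued behind it.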

2010/4/9 Ted Zlatanov <t...@lifelogs.com>

> On Thu, 08 Apr 2010 11:50:38 -0700 Mike Gallamore <
> mike.e.gallam...@googlemail.com> wrote:
>
> MG> Yes, I agree single-threaded is probably not the best. I wonder how
> MG> much of a performance hit it is on a single-CPU machine though? I
> MG> guess I would still be blocking on RAM writes, but it isn't like there
> MG> are multiple CPUs I need to keep busy or anything.
>
> Cassandra may have to load data from disk for a particular query but
> another may already be in memory.  A third may cause a hit on another
> cluster node.  So if you issue queries serially you'll see performance
> drop off with the total number of queries because they are dependent on
> each other's performance, while the distribution of the performance of
> independent parallel queries will have skew and kurtosis much closer to
> a normal distribution.  In other words, your slowest (or unluckiest)
> queries are less damaging when you issue them in parallel.
>
> On the client side you still have slow serialization/deserialization and
> not much can be done about that.
>
> Ted
>
>
