Les,
maybe it's worth looking into Beetle [1], an HA messaging solution built
on RabbitMQ and Redis. It supports multiple brokers and does message
de-duplication using Redis. It's written in Ruby, but it should give you
some inspiration either way on how something like this could be achieved.
I'd like to have fully redundant feeds with no single point of failure,
but avoid the work of indexing the duplicate copy and having it written
to a bitcask even if it would eventually be cleaned up.
On 6/21/2011 4:43 PM, Sylvain Niles wrote:
Why not write to a queue bucket with a timestamp and have a queue
processor move writes to the "final" bucket once they're over a
certain age? It can dedup/validate at that point too.
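As a rough sketch of that queue-bucket idea (using plain dicts as stand-ins for the two Riak buckets, and a made-up age threshold; a real version would use the Riak client against a "queue" bucket and a "final" bucket):

```python
import time

# Hypothetical stand-ins for the Riak buckets in this sketch.
queue_bucket = {}   # key -> (timestamp, value)
final_bucket = {}   # key -> value

MIN_AGE_SECONDS = 30  # arbitrary threshold for this sketch

def enqueue(key, value):
    """Writer side: store the item along with its arrival time."""
    queue_bucket[key] = (time.time(), value)

def drain(now=None):
    """Queue processor: move items older than MIN_AGE_SECONDS to the
    final bucket, deduplicating on key along the way."""
    now = time.time() if now is None else now
    for key in list(queue_bucket):
        ts, value = queue_bucket[key]
        if now - ts >= MIN_AGE_SECONDS:
            if key not in final_bucket:   # dedup/validate at this point
                final_bucket[key] = value
            del queue_bucket[key]

enqueue("tweet:1", {"text": "hello"})
enqueue("tweet:1", {"text": "hello"})     # duplicate write is absorbed
drain(now=time.time() + 60)               # pretend a minute has passed
print(sorted(final_bucket))               # ['tweet:1']
```

Keying the queue entries on the item id means redundant writes collapse before they ever reach the indexed bucket.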
On Tue, Jun 21, 2011 at 2:26 PM, Les Mikesell wrote:
Where can I find the redis hacks that get close to clustering? Would
membase work with synchronous replication on a pair of nodes for a
reliable atomic 'check and set' operation to dedup redundant data before
writing to riak? Conceptually I like the 'smart client' fault
tolerance of memcache/
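For the atomic check-and-set dedup, Redis's SETNX ("set if not exists") is the usual primitive: whichever redundant reader wins the SETNX race does the write to Riak, the other drops the item. A minimal sketch, using an in-memory stand-in for the Redis connection so it runs without a server (swap in redis-py's `redis.Redis(...)`, whose `setnx` has the same shape, for the real thing):

```python
class FakeRedis:
    """In-memory stand-in for a Redis connection, just enough to show
    SETNX semantics; not a real client."""
    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Atomic "set if not exists": only the first caller gets True.
        if key in self._data:
            return False
        self._data[key] = value
        return True

def should_write(conn, item_id):
    """Only the reader that wins the SETNX race writes to Riak."""
    return conn.setnx("seen:%s" % item_id, 1)

conn = FakeRedis()
print(should_write(conn, "tweet:42"))  # True  - first feed copy wins
print(should_write(conn, "tweet:42"))  # False - redundant copy is dropped
```

In production you'd also put a TTL on the "seen:" keys so the dedup set doesn't grow without bound.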
The "real" queues like HornetQ and others can take care of this without a
single point of failure but it's a pain (in my opinion) to set them up that
way, and usually with all the cluster and failover features active they get
quite slow for writes. We use Redis for this because it's simpler and
ligh
Is there a good way to handle something like this with redundancy all the way
through? On simple key/value items you could have two readers write the same
things to riak and let bitcask cleanup eventually discard one, but with indexing
you probably need to use some sort of failover approach up
Why not decouple the twitter stream processing from the indexing? More than
likely you have a single process consuming the spritzer stream, so you can
put the fetched results in a queue (hornetq, beanstalk, or even a simple
Redis queue) and then have workers pull from the queue and insert into Riak
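The decoupling suggested above can be sketched with Python's standard-library `queue.Queue` standing in for the external queue (a Redis list via LPUSH/BRPOP, beanstalkd, etc.), and a plain list standing in for the Riak insert:

```python
import queue
import threading

work = queue.Queue()   # stand-in for the external queue
indexed = []           # stand-in for the Riak indexed bucket

def worker():
    while True:
        tweet = work.get()
        if tweet is None:       # sentinel: shut this worker down
            break
        indexed.append(tweet)   # real code: insert/index into Riak here
        work.task_done()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()

# The single stream consumer just enqueues; the workers absorb the
# slower indexed-insert latency without stalling the fetch loop.
for i in range(5):
    work.put("tweet:%d" % i)
work.join()
for _ in threads:
    work.put(None)
for t in threads:
    t.join()

print(len(indexed))  # 5
```

The point is that the spritzer consumer never blocks on Riak: it only ever pays the cost of an enqueue, and you scale the worker pool to match the indexing throughput.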
Hi Steve,
Thanks for sending over more details.
The pre- vs. post-commit hook question is a good one. The reason we chose a
pre-commit hook over a post-commit hook for Riak Search indexing is that
a post-commit hook doesn't currently provide back-pressure to the Riak KV
side of the system. It
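For reference, the Search pre-commit hook is attached per bucket via the bucket's properties; one way to do that over the HTTP API looks roughly like this (assuming the `riak_search_kv_hook` module name from Riak Search of that era, and a node listening on the default port):

```shell
# Install the Riak Search indexing pre-commit hook on bucket "tweets"
curl -X PUT http://127.0.0.1:8098/riak/tweets \
  -H "Content-Type: application/json" \
  -d '{"props":{"precommit":[{"mod":"riak_search_kv_hook","fun":"precommit"}]}}'
```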
Ok, I've changed my two VMs to each have:
3 CPUs, 1GB RAM, 120GB disk
I'm ingesting the twitter spritzer stream (about 10-20 tweets per second,
approx 2k of data per tweet). One bucket is storing the non-indexed
tweets in full. Another bucket is storing the indexed tweet string, id,
date an
Hi Steve,
Riak does best with a lot of memory and a fast disk. Depending on how much
data you have in the system, putting two nodes into 1GB of memory on a
single VM may be causing the system to overrun available resources and page
out to disk, and depending on how you've set up your virtualized
e
Hey there.
I'm inserting twitter spritzer tweets into a bucket that doesn't have a
precommit index hook, and a few fields from the tweet into a second bucket
that does have the precommit hook.
Speeds on the inserts into the indexed bucket are an order of magnitude
slower than the non-indexed