I'm working towards an implementation on Google App Engine. Google
stresses the importance of sharding counters in their environment:
http://code.google.com/appengine/articles/sharding_counters.html

Why does this matter? Well, Craigslist keeps a counter for their ads.
Every time someone posts an ad, that counter gets incremented. As of a
few weeks ago, they released Sphinx as their search engine, which means
that you can use that 9-digit ad number in any city.

On GAE, writes are slow - you might have to wait a second or more to
write out one record. Other writes to that same record will have to
wait for the first one to finish. With a single counter record, you
couldn't run Craigslist on GAE.

What does sharding counters mean? As I interpret it, it means knowing
"about" how many records you have, not exactly. You keep 10 or 20
sub-counters and write to one at random. If you need to know about
how many records you have, you total the 10 or 20 sub-counters to get
an answer. It's an approximate answer, but if you've got a lot of
data, hey, it's going to be close enough. And you can get 10 or 20
or 40 writes/second, because each write usually lands on a different
counter.
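
To make the idea concrete, here is a minimal in-memory sketch of a sharded
counter. It is not the Google code and does not touch the datastore; the
class name ShardedCounter and the num_shards parameter are made up for
illustration. In a single process like this the total comes out exact; any
approximation would come from caching the total or from reading while writes
are still in flight, neither of which the sketch models.

```python
import random


class ShardedCounter(object):
    """Hypothetical in-memory stand-in for a sharded counter."""

    def __init__(self, num_shards=20):
        # One sub-counter per shard, all starting at zero.
        self.shards = [0] * num_shards

    def increment(self):
        # Pick one shard at random and bump only that one, so that
        # concurrent writers would rarely hit the same shard.
        index = random.randint(0, len(self.shards) - 1)
        self.shards[index] += 1

    def total(self):
        # Reading the counter means summing every shard.
        return sum(self.shards)


counter = ShardedCounter(num_shards=10)
for _ in range(1000):
    counter.increment()
print(counter.total())  # 1000
```

The write path touches one shard; the read path touches all of them, which is
why the Google article caches the total in memcache.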

At least that's the theory. I'm trying to translate the Google
implementation into web2py. I've got increment working, except for the
memcache part: it creates test0, test1, etc. with the proper counts.
get_count is not working, and I'm having trouble figuring out why

    counters = db(db.shards.name==name).select()

is not returning any results. That seems to be the most faithful way to
translate GAE/webapp to web2py, but my suspicion is that it's failing
because test0 is not equal to test. I'm trying to avoid keeping two
versions of the routines (one for web2py without GAE, one with), but it
might be necessary.

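import random
from google.appengine.api import memcache

NUM_SHARDS = 20  # assumed value; use whatever shard count you settled on

def test_it():
    count = get_count('test')
    session.flash = 'count is ' + str(count)
    increment('test')
    return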

def get_count(name):
    """Retrieve the value for a given sharded counter.

    Parameters:
        name - The name of the counter
    """
    total = memcache.get(name)
    if total is None:
        print "none"
        total = 0
        counters = db(db.shards.name==name).select()
        for counter in counters:
            total += counter.count
            print counter.name
        memcache.add(name, str(total), 60)
    return total

def increment(name):
    """Increment the value for a given sharded counter.

    Parameters:
        name - The name of the counter
    """

    index = random.randint(0, NUM_SHARDS - 1)
    shard_name = name + str(index)
    rows = db(db.shards.name==shard_name).select()
    if len(rows):
        # Shard already exists: bump its count.
        rows[0].update_record(count=rows[0].count + 1)
    else:
        # First write to this shard: create it.
        db.shards.insert(name=shard_name, count=1)

#    memcache.incr(name)
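
For what it's worth, here is a minimal sketch of one way around the
test0-versus-test mismatch. It assumes each shard row stores the parent
counter name and the shard index in separate fields, so get_count can match
on the plain counter name. The field names 'counter' and 'index' are
hypothetical, and a plain Python list stands in for the db.shards table, so
this shows only the lookup logic, not web2py or GAE calls.

```python
import random

NUM_SHARDS = 20  # assumed shard count

# Stand-in for the db.shards table: one dict per row.
# 'counter' and 'index' are hypothetical field names.
shards = []


def increment(name):
    """Add 1 to a randomly chosen shard of the named counter."""
    index = random.randint(0, NUM_SHARDS - 1)
    for row in shards:
        # Match on both fields rather than on a concatenated name.
        if row['counter'] == name and row['index'] == index:
            row['count'] += 1
            return
    # First write to this shard: create the row.
    shards.append({'counter': name, 'index': index, 'count': 1})


def get_count(name):
    """Sum every shard whose parent counter is `name`."""
    return sum(row['count'] for row in shards if row['counter'] == name)


for _ in range(25):
    increment('test')
print(get_count('test'))  # 25
```

Keeping the index in its own field also avoids a naming collision: with
concatenated names, shard 10 of 'test' and shard 0 of 'test1' would both be
called 'test10'.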

Eventually I hope to implement the version that allows increasing the
number of shards.

Thanks.
