Hi, Doug. I've written an app that does something similar: it keeps lists of tags on persisted objects. Depending on how many objects are in your datastore and how many tags each list holds on average, you could run into exploding indices and performance problems.
I propose a different design. Keep tags in lists on your objects (or
"documents"), but create a new model, "bucket." Have a bucket store a
list of keys to documents, and on bucket creation, name the bucket's
key after a particular tag.
Consider indexing a document for a particular tag. Do two things:
store that tag on the list of tags in the document object, then insert
that document's key into the document key list stored in the bucket
corresponding to that tag (creating that bucket if it doesn't already
exist).
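Here's a minimal sketch of that indexing step in plain Python, with
dicts standing in for the datastore. The names (documents, buckets,
index_document) are made up for illustration and aren't taken from my
app; in a real app both writes would be datastore puts.

```python
# In-memory stand-ins for the datastore (assumption: real code would use
# model entities and put() them instead of mutating dicts).
documents = {}   # document key -> {"tags": [...]}
buckets = {}     # tag (doubles as the bucket's key name) -> list of document keys


def index_document(doc_key, tag):
    """Index one document under one tag by updating both sides."""
    doc = documents.setdefault(doc_key, {"tags": []})
    # 1. Store the tag on the document's own tag list.
    if tag not in doc["tags"]:
        doc["tags"].append(tag)
    # 2. Add the document's key to the bucket named after the tag,
    #    creating the bucket if it doesn't exist yet.
    bucket = buckets.setdefault(tag, [])
    if doc_key not in bucket:
        bucket.append(doc_key)


index_document("doc1", "moe")
index_document("doc1", "curly")
index_document("doc2", "moe")
```

The membership checks keep the step idempotent, so indexing the same
document under the same tag twice leaves both lists unchanged.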
Consider querying for all documents corresponding to a particular
tag. Simply get the bucket corresponding to that tag by key (super
fast!), then resolve that bucket's list of document keys to document
objects. :-)
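A rough sketch of that lookup, again in plain Python with dicts
standing in for the datastore. The sample data and the
documents_for_tag name are hypothetical; in a real app the first step
would be a get by key name and the second a batch get on the stored
keys.

```python
# Hypothetical pre-populated state; plain dicts stand in for the datastore.
documents = {
    "doc1": {"tags": ["moe", "curly"]},
    "doc2": {"tags": ["moe"]},
}
buckets = {
    "moe": ["doc1", "doc2"],
    "curly": ["doc1"],
}


def documents_for_tag(tag):
    """Return all documents indexed under a tag."""
    # One lookup by key fetches the bucket; no query or index scan is needed.
    doc_keys = buckets.get(tag, [])
    # Resolve the stored keys to document objects, skipping any stale keys.
    return [documents[key] for key in doc_keys if key in documents]


moe_docs = documents_for_tag("moe")
larry_docs = documents_for_tag("larry")   # no such bucket, so an empty list
```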
You're scratching the surface of search engine design. This design
has worked well for me and may work well for you too. For a reference
implementation, see the code for my app:
Example models ("document" corresponds to "Bookmark" and "bucket"
corresponds to "Keychain"):
http://code.google.com/p/grab-it/source/browse/trunk/models.py
Example querying (the _search_bookmarks_generic function):
http://code.google.com/p/grab-it/source/browse/trunk/logic.py#114
Example indexing and unindexing (the _index_bookmark and
_unindex_bookmark methods):
http://code.google.com/p/grab-it/source/browse/trunk/handlers.py#316
Have-a-lot-of-fun-ly yours,
Raj ;-)
On Aug 28, 2009, at 11:56 AM, Doug wrote:
>
> Hi,
>
> I'm very new to GAE and was curious if there is a performance penalty
> with regards to queries and exploding indices, etc. etc... if one
> implements tags (a list of category names for a record if you will) as
> dynamic properties vs. just list<db.Category>.
>
> As a dynamic property I was thinking a possibility would be:
> obj.tag_moe="moe"
> obj.tag_curly="curly"
> (Of course, this assumes I can specify the name of a dynamic tag at
> creation time)
>
> so if I wanted to query a database for tags "xxx" and "yyy", could I
> do:
> SELECT * FROM myModel WHERE tag_moe = "moe" AND tag_curly = "curly"
> ORDER by date
>
>
> Or... as a list of tags a possibility would be:
> obj.tags = ["moe", "curly"]
>
> and the query
> SELECT * FROM myModel WHERE tags = "moe" AND tags = "curly" ORDER by
> date
>
>
> Thanks for any insight
> -d
>
