Re: [web2py] Re: tags and speed

andrej burja Fri, 19 Oct 2012 04:20:14 -0700

hi

Thank you for your explanation. It is really helpful  
The address of the chapter in the book is "Efficient search by tag" and i 
choose this approach because i thought that search would be the most 
important thing of the application (that was right)
The table tag is similar like in your example 2, but only one tag per row, 
 so: 
game_id  tag
1             excitement
1             funny


I learned some lessons:
- do usability testing ass soon as possible (it came as a surprise for me 
that using Datatables is the most useful)
- be prepared for lots of data: my initial assumption was 200 games, and 
now i have more than 1000.

I will try with cache first. 

andrej


On Friday, October 19, 2012 1:21:37 AM UTC+2, Niphlod wrote:
>
> old mean bad thing about normalized vs denormalized model.
> Don't know what you read in the cookbok, but here's the thing.
> You have a game pacman that is tagged as arcade, and a game invaders 
> tagged as horror
>
> First things first: you may want to change "horror" to "needs parent 
> around" later in the future. That's why you may want to create an external 
> tags table and reference it in your games one.
> tags table
> 1 arcade
> 2 horror
>
> game_name tags
> pacman   1
> invaders  2
>
> To display
> pacman arcade
> invaders horror
> you just need a simple join. Changing record 2 of the tags table allows 
> you to have
> pacman arcade
> invaders needs parents around
>
> Next problem on the line: you want multiple tags for a single game 
> (invaders needs both arcade and horror tags). 
> Welcome to normalization hell. Books have been written about it in the 
> last 60-70 years, with terms like 3NF and Cardinality all around ^_^
>
> Let's take this by examples
>
> 1) you change the tags table like this
> id tag_name game_id
> 1 arcade 1
> 1 arcade 2
> 2 horror 2
> getting tags for a single game is fast, change "horror" to "needs parents 
> around" is fast, getting all possible tags is enough fast (depending on the 
> number of rows of the tags table)
> but
> fetch the right tags for every game can be slow
>
> 2) you create a games table like this 
> id game_name tags
> 1 pacman |arcade|
> 2 invaders |arcade|horror|
> getting tags for a single game is fast, updating a tag for a game is quite 
> fast
> but
> changing all "horror" to "needs parents around" can be fast only if done 
> outside web2py with a manual replace, getting all possible tags can be slow
>
> 3) you create a games table like this
> id game_name tags_id
> 1 pacman |1|
> 2 invaders |1|2|
> getting tags for a single game is fast, updating a tag for a game is quite 
> fast, changing "horrors" is fast
> but
> changing all "horror" to "needs parents around" can be fast only if done 
> outside web2py with a manual replace, getting all possible tags can be slow
>
> So, with no need to change tag names from "horror" to "needs parents 
> around", I'd say the right way for displaying your table is 2) 
> (list:string), else 3) (list:reference). 
> If you need to "suggest" previously entered tags (i.e. you don't want to 
> end having "arcade", "arcady", "arcadian", "'rcade" spat all around), I'd 
> still go for 2) + one separate table (or a cached list) holding all 
> previously entered tags, just to speedup the "suggestion" phase (hoping 
> users won't screw up).
>  
> Method 1) is faster only when:
> - tags are more than the games
> - "cleaning normalization" logics are heavy
> - you need to update tag names often
> - the only need is displaying tags for a single game (direct query on a 
> single table, the tags one)
> - you need to compute something like "how many times the arcade tag is 
> applied to all my games" (i.e. for a tag cloud)
>
> There are several other methods, and all of them in the end "do the job". 
> You just need to see what are your requirements and choose carefully. Your 
> agenda is:
> - spend the less time possible with queries you have to invoke often
> - "pay the price" of your model in high computations for other things (or, 
> let's save the results of those computations externally once every hour)
>  
>
>
>

--

Re: [web2py] Re: tags and speed

Reply via email to