On Wednesday, October 8, 2014 1:00:56 PM UTC+2, Narūnas Krasauskas wrote:
>
>
> I have never said anything like you quoted, what I said though was: "users 
> who can get to the search page ideally would be able to search/see all the 
> records". Meaning, that users has access to the 1+m records, however when 
> they define search query, if one is not accurate enough, it can potentially 
> return really large datasets, therefore I would like to limit output to the 
> defined number of rows.
>
> Imagine you type in the google search "web2py" and hit enter. My browser 
> yields ~373 000 matches, however I can only browse through the first 10 
> pages, that is equal to ~ 100 matches accessible to me, despite the fact 
> that there are 372 900 other ones. My query was not accurate enough and it 
> is obvious to me, obvious to google, that I will not click 37 290 more 
> times, to browse among other matches, I must redefine my query. Does it 
> make sense?
>
> What you have suggested is simply incorrect.
>
>
seems we're in different worlds (or that any issue raised slightly changes 
the original requirements). 
This will be my last addition to this thread. 
Your last "issue" (4) became evident because users in grid CAN order the 
result set as they please. Google doesn't let you do that: ordering is 
strictly fixed and there's a hard cap of 1000 results for every query 
(~100 pages).
tl;dr
- if users have access to 1M records, and they can see everything, it 
doesn't matter how sloppy they are with searches. If the returned dataset 
for the query is - potentially - 1000 records, grid will by default FETCH 
EXACTLY 20 records (paginate default). If users click on SORT, grid will by 
default FETCH EXACTLY 20 records.
- if your table has 1M records and for a sloppy search (i.e. "web2py") your 
database takes one hour to fetch 20 records, web2py can't do anything about 
it
- if your table has 1M records and for a sloppy search your database takes 
2 seconds to fetch 20 records, but 40 minutes to fetch the exact count, and 
you don't need the count to be exact, work out your own logic and pass the 
result to cache_count (that's probably what google does, a rough 
estimate). That was the main point of the original posts in this thread.
- if your concern is NOT LETTING users reach page 4, pass cache_count 
= 80 (or inspect request.vars.page)
- if your concern is showing users that you have 1M records while they can 
see only 80 of them, grid is not the right tool (or you can meddle with 
javascript to fix the "total records" display at the top)
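
To make the first and fourth points concrete, here is a minimal sketch in plain Python with sqlite3. It is a stand-in for the idea, not web2py's grid code: the table `item`, the helpers `make_db`, `fetch_page` and `last_page`, and the sample data are all made up for illustration. It shows that a page fetch pulls exactly 20 rows however many rows match, and that the number of pages depends only on the count the pager is told about.

```python
import sqlite3

PAGINATE = 20  # rows per page, same as the default mentioned above


def make_db(n=1000):
    """Build a throwaway in-memory table where every row matches 'web2py'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO item (id, name) VALUES (?, ?)",
        [(i, "web2py item %d" % i) for i in range(1, n + 1)],
    )
    return conn


def fetch_page(conn, term, page, order="id ASC"):
    """Filter, order, then limit: one page of at most PAGINATE rows."""
    allowed = {"id ASC", "id DESC", "name ASC", "name DESC"}
    if order not in allowed:  # order is interpolated, so whitelist it
        raise ValueError("unsupported order: %r" % order)
    offset = (page - 1) * PAGINATE
    sql = (
        "SELECT id, name FROM item WHERE name LIKE ? "
        "ORDER BY %s LIMIT ? OFFSET ?" % order
    )
    return conn.execute(sql, ("%" + term + "%", PAGINATE, offset)).fetchall()


def last_page(count):
    """Number of pages a pager offers for a given (possibly capped) count."""
    return -(-count // PAGINATE)  # ceiling division
```

With a count of 80 handed to the pager, `last_page(80)` gives 4, even though the sloppy search matches all 1000 rows.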

summary of the summary: you simply can't trim a dataset to whatever you 
like and provide real ordering and let the user see the same set of records 
WITHOUT trimming the entire dataset beforehand.
translation of the summary on issue 4) with evidence on the statement "from 
the end of full dataset": 
given a dataset like [1,2,3,4,.....1000]
and your issue of displaying ALWAYS and REGARDLESS of ordering JUST 
[1,2,3,4]
you CAN'T pass [1,2,3,4.....1000] to the grid and expect that for every 
ordering only a combination of [1,2,3,4] (in whatever order) will be shown.
If you want to do like google, that basically "sells" a 
[1,2,3,4,.....373000] while providing a [1,2,3,4.....1000], no matter what, 
AND let users choose an order, it means that "the dataset" is no longer 
373000 records. it's 1000 (you can't order a 373000 dataset, then limit it 
to 1000, and expect that the same 1000 records will be shown). So you 
can approach it like this:
- use cache_count = 373000
- instead of db.table pass db.table.id.belongs([1,2,3,4....1000])
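
A small check of that claim, again with plain sqlite3 rather than the DAL (the table `t` and the variable names are made up for the example; `WHERE id IN (...)` plays the role of an id-membership query). Ordering the full set and then limiting gives a different set of rows for each ordering, while ordering a fixed list of ids only reshuffles the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 1001)])

LIMIT = 4

# Order the FULL dataset, then limit: the rows depend on the ordering.
asc = [r[0] for r in conn.execute(
    "SELECT id FROM t ORDER BY id ASC LIMIT ?", (LIMIT,))]
desc = [r[0] for r in conn.execute(
    "SELECT id FROM t ORDER BY id DESC LIMIT ?", (LIMIT,))]

# Fix the set of ids first, then order only that set.
fixed_ids = asc
placeholders = ",".join("?" * len(fixed_ids))
fixed_desc = [r[0] for r in conn.execute(
    "SELECT id FROM t WHERE id IN (%s) ORDER BY id DESC" % placeholders,
    fixed_ids)]

print(asc)         # first four ids in ascending order
print(desc)        # a completely different set of four ids
print(fixed_desc)  # the same four ids as asc, reversed
```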

summary of the summary of the summary: the order of operations in a 
database (or a dataset) is filtering ("sloppy search") --> ordering --> 
limiting. There's no way around that, it's simple logic.
If you want instead to emulate a "filtering --> limiting --> ordering" 
behaviour, a single database query won't fit the bill, because databases 
follow the previously explained logic ..... and so you need to do it in 2 
passes: database filtering --> fixed dataset rebuild (implicitly limited) 
--> ordering.
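
The two pipelines can be put side by side in a few lines of plain Python. This is only a sketch of the logic on an in-memory list, with hypothetical helper names (`filter_order_limit`, `filter_limit_order`); in a real app the first pass would be a database query and the second pass would work on the fixed set it returned.

```python
from itertools import islice


def filter_order_limit(rows, predicate, limit, key, reverse=False):
    """What a single database query does: filter --> order --> limit."""
    matching = (r for r in rows if predicate(r))
    return sorted(matching, key=key, reverse=reverse)[:limit]


def filter_limit_order(rows, predicate, limit, key, reverse=False):
    """Two passes: filter and cut to a fixed set, then order only that set."""
    fixed = list(islice((r for r in rows if predicate(r)), limit))
    return sorted(fixed, key=key, reverse=reverse)


rows = list(range(1, 1001))
is_even = lambda r: r % 2 == 0  # stands in for the "sloppy search"

one_query = filter_order_limit(rows, is_even, 4, key=lambda r: r, reverse=True)
two_pass = filter_limit_order(rows, is_even, 4, key=lambda r: r, reverse=True)
```

Here `one_query` ends up with the four largest matches from the whole dataset, while `two_pass` always holds the same first four matches, whatever order is requested.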
