Thank you for the feedback everyone.

The main reason I fetch them all first is to make sure I'm not inserting 
duplicate records. We process a lot of files with thousands of records each, 
and some of those records are duplicates. I hash a few columns from each 
record, and if that hash already exists I don't insert the record. If there 
is a more efficient way to do this, please let me know.
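
Concretely, the check is along these lines (a rough sketch only; the table 
and column names below are placeholders, not my real schema):

    import hashlib

    # Placeholder names for illustration: db.spend and the hashed columns.
    def record_hash(rec):
        # Hash the columns used to detect duplicates.
        key = '|'.join(str(rec.get(f, ''))
                       for f in ('supplier', 'amount', 'pay_date'))
        return hashlib.sha1(key.encode('utf-8')).hexdigest()

    # Fetch the existing rows once and build a set of their hashes.
    rows = db(db.spend.id > 0).select(db.spend.supplier, db.spend.amount,
                                      db.spend.pay_date)
    existing = set(record_hash(r) for r in rows)

    # Insert only the incoming records whose hash hasn't been seen yet.
    for rec in incoming_records:   # rows parsed from the uploaded file
        h = record_hash(rec)
        if h not in existing:
            db.spend.insert(**rec)
            existing.add(h)
    db.commit()

I suppose one alternative would be to store that hash in an indexed column 
and test each incoming record with something like 
db(db.spend.rec_hash == h).isempty() instead of selecting everything up 
front, but I don't know whether that's the better approach.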

On Monday, March 16, 2015 at 6:26:50 PM UTC-4, Niphlod wrote:
>
> I have 300m+ records too, but I don't need to fetch all of those in a 
> single query. 
> It's not a matter of indexes or anything else: the "system hang" you're 
> running into is not the time the database takes from the moment you 
> send the query to the first row returned, but the time you take to 
> "accumulate" 1m rows into memory.
>
> On Monday, March 16, 2015 at 10:46:39 PM UTC+1, Dane Wright wrote:
>>
>> I have a table which currently contains 10m+ records (local government 
>> spending in www.appgov.org). A native SQL count(*) will take 3-5 secs, 
>> but reading all the records like this via the DAL will result in the 
>> system hanging. I try not to read too many of these records in any one 
>> online transaction by allowing only selections and calculating subtotals 
>> offline. I also create MySQL indexes outside of Web2py to speed things up 
>> where necessary. Of course I also can't ever say db.laspend.id>0 in the 
>> Database Admin facility either!
>>
