Re: [web2py] Re: self join on postgres

Derek Tue, 08 Jan 2013 09:07:23 -0800

No way to set an index... perhaps you could use caching.

On Tuesday, January 8, 2013 1:27:46 AM UTC-7, Paolo valleri wrote:
>
> Thanks for this explanations! To add more info, I can say that the 
> elements on left column are naturally ordered, I have thousands of 
> records(~5000record/day) for each station but only a few 
> station_id(~5-10stations totally). 
> Instead of making a complex query, a cleaner approach would be to run the 
> query with only the join and then do a loop in python to remove the 
> unnecessary lines.
>
> Moreover, right now, the worst think is that since I switched to postgres 
> the initial query (the one with only the leftjoin) is taking minutes to be 
> executed (almost 4m) against just few seconds on sqlite. I set an index on 
> the mac field through pgAdmin and the results are really better, 
> unfortunately this hack is not portable, is there a way to define indexes 
> from web2py?
>
> Paolo
>
>
> 2013/1/8 Niphlod <nip...@gmail.com <javascript:>>
>
>> after a full day at work I may lack the usual fantasy, but what you're 
>> trying to do can't be even achieved by a "relatively simple" windowed 
>> function....
>> What you're trying to do is recursing..... way out of DAL reach (if you 
>> want to do it in a single query)
>> Trying to explain better....
>>
>> Real data helps.....
>> For every station_id record with the same mac address you want to find 
>> the "min gathered_on" record from the same table (with another station_id) 
>> and "subtract it" for every next possible match.
>>
>> One thing is requiring
>>
>> "2013-01-21 11:23:35";"a";127167;"2013-01-21 11:23:45";"a";127168
>> "2013-01-21 11:23:00";"a";127169;"2013-01-21 11:23:45";"a";127168
>> That can be accomplished by something like this
>>
>> select * from (
>>     select    start_point.gathered_on,start_point.mac,start_point.id,
>> end_point.gathered_on,end_point.mac,end_point.id, 
>> row_number() over (partition by start_point.id order by end_point.
>> gathered_on) as filter_field
>> from record as start_point
>> inner join
>> record as end_point
>> on start_point.mac = end_point.mac
>> and start_point.gathered_on <= end_point.gathered_on
>> where start_point.station_id = 13
>> and end_point.station_id = 14
>> ) as q
>> where q.filter_field = 1
>>
>> because for the record 127167 the next record with another station_id is 
>> 127168, but then for the 127169 you don't want the 127168, you want 127170 
>> because 127168 "has been booked before" by 127169. 
>>
>> Honestly, (beware of the lack of fantasy :P) I'd do a loop in python 
>> instead of using recursing in the db itself unless you have zillions of 
>> "windows" (i.e. you have 1000 station_id = 13 and 1000 station_id = 14, and 
>> 1000 distinct station_id), just because it's more readable than what it 
>> would be needed in raw sql....
>>
>> -- 
>>  
>>  
>>  
>>
>
>
>
> -- 
>  Paolo 
>

--

Re: [web2py] Re: self join on postgres

Reply via email to