[PERFORM] disagreeing query planners
With three databases running the same query, I am getting greatly differing plans from two of the query planners.

- db2 and db3 are Slony replicas of db1
- The servers have identical postgresql.conf files, but the server hardware differs
- All appropriate columns are indexed
- VACUUM ANALYZE is run nightly on all dbs

Here is a simplified version of the query:

    EXPLAIN ANALYZE
    SELECT COUNT(DISTINCT(m_object_paper.id))
    FROM m_object_paper, m_assignment, m_class, r_comment_rubric_user_object
    WHERE m_object_paper.assignment = m_assignment.id
      AND m_assignment.class = m_class.id
      AND m_class.account = 36123
      AND m_object_paper.id = r_comment_rubric_user_object.objectid;

db1 produces a concise query plan of nested loops and index scans, executing in 85 ms. db2's plan, however, consists of sequential scans and takes 3500 ms to complete.

The strange part is this: last week, db1 and db3 were in agreement, both executing the more efficient plan. Now db3 agrees with db2 on the less efficient, slower plan.

Are we missing something? What could cause this disagreement?

Thanks

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
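A sketch of how one might compare what the planners actually know on each server: a plan flip like this often comes down to statistics drift, since ANALYZE samples rows randomly and the resulting estimates can land on either side of a cost threshold. The table and column names below are taken from the query above; the statistics target of 100 is illustrative, not a recommendation.

```sql
-- Compare the planner's size estimates on each server; if these differ
-- between db1, db2, and db3, the planners are working from different data.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('m_object_paper', 'm_assignment', 'm_class',
                  'r_comment_rubric_user_object');

-- Compare the per-column statistics that drive selectivity estimates
-- for the restricted column in the query:
SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'm_class' AND attname = 'account';

-- Raising the sample size and re-analyzing can stabilize the estimates:
ALTER TABLE m_class ALTER COLUMN account SET STATISTICS 100;
ANALYZE m_class;
```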
Re: [PERFORM] How to interpret this explain analyse?
Greg Stark wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
>
> > Also, one has to ask what the consequences are of assuming a value too
> > low versus too high. Which ends up being worse?
>
> This is one of the things the planner really cannot know. Ultimately it's
> the kind of thing for which hints really are necessary. Oracle
> distinguishes between the "minimize total time" versus "minimize startup
> time" with /*+ ALL_ROWS */ and /*+ FIRST_ROWS */ hints, for example.

Well, the planner *can* know the actual value to use in this case, or at least a close approximation, but the system would have to gather some information about cursors during fetches. At the very least, it will know how many rows were actually fetched by the cursor in question, and it will also hopefully know how many rows were returned by the query being executed. Store the ratio of the two in a list, then store the list itself into a table (or something) at backend exit time.

--
Kevin Brown <[EMAIL PROTECTED]>
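For what it's worth, later PostgreSQL releases expose exactly this trade-off as a single tunable rather than per-query hints: cursor_tuple_fraction tells the planner what fraction of a cursor's result the client is expected to fetch. A sketch, where the table and column names are placeholders and the 0.1 value is illustrative:

```sql
BEGIN;
-- Tell the planner to optimize for fetching roughly 10% of the result,
-- favoring fast-start (FIRST_ROWS-style) plans for this cursor.
SET LOCAL cursor_tuple_fraction = 0.1;
DECLARE c CURSOR FOR SELECT * FROM some_table ORDER BY some_column;
FETCH 20 FROM c;
CLOSE c;
COMMIT;
```

This is still a static estimate supplied by the user, though, not the measured fetch ratio proposed above.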
Re: [PERFORM] seq scan cache vs. index cache smackdown
Josh Berkus wrote:
> Now you can see why other DBMSs don't use the OS disk cache. ...
> ...as long as we use the OS disk cache, we can't eliminate checkpoint
> spikes, at least on Linux.

Wouldn't the VM settings, like the ones under /proc/sys/vm, and/or the commit=XXX mount option (if using ext3) be a good place to control this? It seems that if you wanted, by setting /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_expire_centisecs very low, you'd be constantly flushing dirty pages.

Has anyone experimented with these kinds of values?

    /proc/sys/vm/dirty_ratio
        /* the generator of dirty data writes back at this ratio */
    /proc/sys/vm/dirty_background_ratio
        /* start background writeback */
    /proc/sys/vm/dirty_writeback_centisecs
        /* the interval between [some style of] writebacks */
    /proc/sys/vm/dirty_expire_centisecs
        /* the number of centiseconds that data is allowed to remain dirty */

I tried these to work around the opposite kind of problem, on a laptop running Linux under VMware: I wanted to avoid having it write quickly (which makes each individual transaction go faster), accepting instead one big spike in IO that the sales guy would trigger explicitly before talking for a while. Setting each of those values very high and using a commit=600 mount option made the whole demo run with very little IO except for the explicit sync, but I never took the time to understand which setting mattered to me, or why.

It seems inevitable that Postgres will eventually eliminate that redundant layer of buffering. Since mmap is not workable, that means using O_DIRECT to read table and index data.
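For anyone who wants to experiment along these lines, the knobs above can be inspected and set through sysctl as well as /proc. A sketch of the opposite tuning to my laptop demo, i.e. forcing near-continuous background writeback to smooth out checkpoint spikes; the numeric values are illustrative only, the right ones are workload-dependent, and the writes require root:

```shell
# Read the current values (any Linux with procfs):
cat /proc/sys/vm/dirty_ratio
cat /proc/sys/vm/dirty_background_ratio
cat /proc/sys/vm/dirty_writeback_centisecs
cat /proc/sys/vm/dirty_expire_centisecs

# Start background writeback almost immediately, and expire dirty
# pages after 5 seconds (illustrative values, needs root):
sysctl -w vm.dirty_background_ratio=1
sysctl -w vm.dirty_expire_centisecs=500

# Or make it persistent across reboots in /etc/sysctl.conf:
#   vm.dirty_background_ratio = 1
#   vm.dirty_expire_centisecs = 500
```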