Josh Berkus writes:
> Tom, how does our heuristic sampling work? Is it pure random sampling, or
> page sampling?
Manfred probably remembers better than I do, but I think the idea is
to approximate pure random sampling as best we can without actually
examining every page of the table.
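(Purely as an illustration, not the actual ANALYZE code: the idea is roughly "pick some random pages, then pick random rows from those pages," so the sample behaves much like a uniform row sample while reading only a fraction of the table. Everything in the sketch below, table layout, page size and sample sizes, is made up.)

    import random

    def sample_rows(table_pages, target_rows, page_budget):
        """Approximate a uniform row sample without scanning every page.

        table_pages: list of pages, each page a list of rows (hypothetical layout).
        page_budget: how many pages we are willing to read.
        target_rows: how many sample rows we want.
        """
        # Stage 1: read only a random subset of the table's pages.
        pages = random.sample(table_pages, min(page_budget, len(table_pages)))

        # Stage 2: pool the rows on those pages and pick rows uniformly from the pool.
        pool = [row for page in pages for row in page]
        return random.sample(pool, min(target_rows, len(pool)))

    # Hypothetical usage: 1000 pages of 100 rows each, read 300 pages, keep 3000 rows.
    table = [[(p, r) for r in range(100)] for p in range(1000)]
    sample = sample_rows(table, target_rows=3000, page_budget=300)
    print(len(sample))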
Hi Tom,
Thanks! That's exactly what it was. There was a discrepancy in the
data that turned this into an endless loop. Everything has been
running smoothly since I made a change.
Thanks so much,
Richard
On Sat, Apr 23, 2005 at 01:00:40AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> >> Feel free to propose better cost equations.
>
> > Where would I look in code to see what's used now?
>
> All the gold is hidden in src/backend/optimizer/path/costsize.c.
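If you just want a feel for the shape of those equations before diving into the source, here is a deliberately over-simplified sketch in Python. The parameter values are only illustrative defaults, and the real costsize.c has many more terms (correlation, the Mackert-Lohman fetch estimate, and so on):

    # Over-simplified sketch of the seqscan vs. indexscan cost shapes.
    SEQ_PAGE_COST = 1.0        # cost to read one page sequentially
    RANDOM_PAGE_COST = 4.0     # cost to read one page at random
    CPU_TUPLE_COST = 0.01      # cost to process one row

    def seqscan_cost(pages, tuples):
        # Read every page sequentially and examine every tuple.
        return pages * SEQ_PAGE_COST + tuples * CPU_TUPLE_COST

    def indexscan_cost(matching_tuples, pages_fetched):
        # Random heap fetches for the matching rows (index access cost itself omitted).
        return pages_fetched * RANDOM_PAGE_COST + matching_tuples * CPU_TUPLE_COST

    # With a selective predicate the index scan wins; with a broad one it loses.
    print(seqscan_cost(pages=10_000, tuples=1_000_000))
    print(indexscan_cost(matching_tuples=1_000, pages_fetched=1_000))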
Folks,
> I wonder if this paper has anything that might help:
> http://www.stat.washington.edu/www/research/reports/1999/tr355.ps - if I
> were more of a statistician I might be able to answer :-)
Actually, that paper looks *really* promising. Does anyone here have enough
math to solve for D(s
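For anyone who wants to play with it, here is a sketch of one estimator from that family of papers, the Haas-Stokes "Duj1" estimator, which guesses the number of distinct values D from a sample (n = sample size, N = table size, d = distinct values seen, f1 = values seen exactly once). This is only an illustration, not a patch proposal:

    import random
    from collections import Counter

    def estimate_distinct(sample, total_rows):
        """Haas-Stokes "Duj1" style estimate of the number of distinct values.

        sample      -- list of sampled column values
        total_rows  -- N, the total number of rows in the table
        """
        n = len(sample)                                  # sample size
        counts = Counter(sample)
        d = len(counts)                                  # distinct values in the sample
        f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

        # Duj1: D = n*d / (n - f1 + f1*n/N)
        return n * d / (n - f1 + f1 * n / total_rows)

    # Toy example: 1000 sampled values drawn from a table of 1,000,000 rows.
    sample = [random.randint(1, 50_000) for _ in range(1000)]
    print(estimate_distinct(sample, 1_000_000))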
Andrew,
> The math in the paper does not seem to look at very low levels of q (=
> sample to pop ratio).
Yes, I think that's the failing. Mind you, I did more testing and found that for D/N ratios of 0.1 to 0.3 the formula is only accurate to within 5x (which I would consider acceptable).
Here is my opinion; I hope it helps.
Maybe there is no single good formula:
On a boolean type there are at most 3 distinct values (true, false, and NULL).
There is an upper bound on forenames in one country.
There is an upper bound on last names in one country.
There is a fixed number of states and postal codes in one country.
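If no single formula fits every domain, one compromise is to clamp whatever formula is used with a per-column upper bound supplied by the DBA. A small sketch; the column names and caps below are made up:

    # Sketch: clamp a formula-based n_distinct estimate with a per-column
    # upper bound supplied by the DBA.  Column names and bounds are made up.
    DOMAIN_CAPS = {
        "is_active": 3,          # boolean: true, false, NULL
        "forename": 50_000,      # rough upper bound for one country
        "postal_code": 100_000,  # fixed set per country
    }

    def clamped_ndistinct(column, formula_estimate):
        cap = DOMAIN_CAPS.get(column)
        return min(formula_estimate, cap) if cap is not None else formula_estimate

    print(clamped_ndistinct("is_active", 17.4))   # -> 3
    print(clamped_ndistinct("forename", 12_000))  # unchanged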
Here is how you can retrieve all one billion rows in pieces of 2048 rows. This changes the PostgreSQL and ODBC behaviour:
Change the ODBC data source configuration in the following way:
Fetch = 2048
UseDeclareFetch = 1
It does not produce core dumps on 32-bit computers, even with billions of rows!
This is a b
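For what it's worth, the same effect is available from application code by using a server-side cursor rather than fetching the whole result at once. Here is a sketch with psycopg2 (the connection string and table name are made up; for ODBC clients the settings above are still the right fix):

    import psycopg2

    # Sketch: the driver-independent equivalent of Fetch/UseDeclareFetch is a
    # server-side (named) cursor, which streams rows in batches instead of
    # materializing the whole billion-row result set in client memory.
    conn = psycopg2.connect("dbname=test")        # connection string is made up
    cur = conn.cursor(name="big_scan")            # named => server-side cursor
    cur.itersize = 2048                           # rows fetched per round trip

    cur.execute("SELECT * FROM huge_table")       # table name is made up
    for row in cur:                               # batches of 2048 behind the scenes
        pass                                      # process each row here

    cur.close()
    conn.close()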