Another way to look at the problem is: How do I sample a subset of size K efficiently? A query like

   SAMPLE 1000 OF
   (SELECT * FROM mydata WHERE <some condition>)

should return 1000 rows chosen at random from the result of the SELECT statement, so that two consecutive evaluations of the query would be very unlikely to return the same 1000 rows.
(Yes, I know that "SAMPLE 1000 OF" is not valid SQL)
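
To make concrete what I mean by sampling K rows efficiently, here is a minimal sketch of one-pass reservoir sampling (Algorithm R) in Python. It is only an illustration of the idea on the client side, not something the database does for me; the function name sample_k and its parameters are made up for this example, and the rows are assumed to arrive as any iterable, such as a cursor over the SELECT result.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def sample_k(rows: Iterable[T], k: int,
             rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k rows drawn uniformly at random, reading rows once.

    Hypothetical helper for illustration only. Uses O(k) memory and does
    not need to know the total number of rows in advance.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Row i (0-based) replaces a random slot with probability k/(i+1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


if __name__ == "__main__":
    # Stand-in for the rows of the SELECT; any iterable works.
    picked = sample_k(range(1_000_000), 1000)
    print(len(picked))
```

This reads every qualifying row once, so the cost is linear in the size of the SELECT result rather than in K. If the input has fewer than K rows, all of them are returned.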
