2016-09-12 15:16 GMT-03:00 Merlin Moncure <mmonc...@gmail.com>:

> On Mon, Sep 12, 2016 at 9:03 AM, Vinicius Segalin <vinisega...@gmail.com>
> wrote:
> > Hi everyone,
> >
> > I'm trying to find a way to predict query runtime (I don't need to be
> > extremely precise). I've been reading some papers about it, and people are
> > using machine learning to do so. For the feature vector, they use what the
> > DBMS's query planner provide, such as operators and their cost. The thing is
> > that I haven't found any work using PostgreSQL, so I'm struggling to adapt
> > it.
> > My question is if anyone is aware of a work that uses machine learning and
> > PostgreSQL to predict query runtime, or maybe some other method to perform
> > this.
>
> Well, postgres estimates the query runtime in the form of an expected
> 'cost', where the cost is an arbitrary measure based on time
> complexity of query plan. It shouldn't be too difficult to correlate
> estimated cost to runtime cost.

That's what I thought too. At least it makes sense, I guess. But sometimes what seems logical doesn't hold up in practice, so I think only trying it will tell.

> A statistical analysis of that
> correlation would be incredibly useful work although generating sample
> datasets would be a major challenge.
>
> merlin

Indeed. I'm using TPC-B along with pgbench to have some data to test (while I don't have real data), but I'm having a hard time creating queries that give me (very) different performance results so I can train my ML algorithm.
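
For what it's worth, here is a minimal sketch of the cost-versus-runtime correlation discussed above. It assumes the plans are collected as the text output of EXPLAIN ANALYZE, and it reads only the root node's estimated total cost and actual total time. The helper names (parse_root, pearson) are hypothetical, and the plan lines in the demo are synthetic placeholders, not measurements.

```python
import math
import re

# Root node of EXPLAIN ANALYZE text output looks like:
#   Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.010..0.200 rows=2550 loops=1)
PLAN_RE = re.compile(
    r"\(cost=(?P<c_start>\d+\.\d+)\.\.(?P<c_total>\d+\.\d+) rows=\d+ width=\d+\)"
    r" \(actual time=(?P<t_start>\d+\.\d+)\.\.(?P<t_total>\d+\.\d+) rows=\d+ loops=\d+\)"
)


def parse_root(plan_text):
    """Return (estimated total cost, actual total time in ms) for the root node."""
    first_line = plan_text.strip().splitlines()[0]
    match = PLAN_RE.search(first_line)
    if match is None:
        raise ValueError("not EXPLAIN ANALYZE text output: %r" % first_line)
    return float(match.group("c_total")), float(match.group("t_total"))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Synthetic plan lines for illustration only; replace with real output.
    plans = [
        "Index Scan using a_pkey on a  (cost=0.29..8.31 rows=1 width=97) (actual time=0.020..0.021 rows=1 loops=1)",
        "Seq Scan on a  (cost=0.00..2640.00 rows=100000 width=97) (actual time=0.012..9.345 rows=100000 loops=1)",
        "Sort  (cost=15000.00..15250.00 rows=100000 width=97) (actual time=60.100..70.200 rows=100000 loops=1)",
    ]
    costs, times = zip(*(parse_root(p) for p in plans))
    print("Pearson r (cost vs. ms):", pearson(costs, times))
```

Since costs and runtimes tend to span several orders of magnitude, it may be worth correlating their logarithms as well. Note that the root node's actual time excludes planning time and some executor overhead, so it is only an approximation of what the client observes.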