2016-09-12 15:16 GMT-03:00 Merlin Moncure <mmonc...@gmail.com>:

> On Mon, Sep 12, 2016 at 9:03 AM, Vinicius Segalin <vinisega...@gmail.com>
> wrote:
> > Hi everyone,
> >
> > I'm trying to find a way to predict query runtime (I don't need to be
> > extremely precise). I've been reading some papers about it, and people
> > are using machine learning to do so. For the feature vector, they use
> > what the DBMS's query planner provides, such as operators and their
> > cost. The thing is that I haven't found any work using PostgreSQL, so
> > I'm struggling to adapt it.
> > My question is whether anyone is aware of work that uses machine
> > learning and PostgreSQL to predict query runtime, or of some other
> > method to do this.
>
> Well, postgres estimates the query runtime in the form of an expected
> 'cost', where the cost is an arbitrary measure based on the time
> complexity of the query plan. It shouldn't be too difficult to correlate
> estimated cost with actual runtime.


That's what I thought too. At least it makes sense, I guess. But sometimes
logic doesn't work, so I think only giving it a try will tell.
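
A minimal sketch of what giving it a try could look like, assuming the
plans are collected as the text output of EXPLAIN (ANALYZE, FORMAT JSON).
It reads the planner's "Total Cost" and the measured "Execution Time" and
fits a straight line between them. The helper names (cost_and_time,
fit_line) are hypothetical, and the sample values are made up for
illustration only; they are not measurements.

```python
import json
import math

# Made-up EXPLAIN (ANALYZE, FORMAT JSON) outputs, trimmed to the two
# fields used here. Illustrative values only, not real measurements.
SAMPLES = [
    '[{"Plan": {"Node Type": "Seq Scan", "Total Cost": 100.0}, "Execution Time": 2.0}]',
    '[{"Plan": {"Node Type": "Seq Scan", "Total Cost": 200.0}, "Execution Time": 4.0}]',
    '[{"Plan": {"Node Type": "Seq Scan", "Total Cost": 400.0}, "Execution Time": 8.0}]',
]


def cost_and_time(explain_json):
    """Return (estimated total cost, measured execution time in ms)."""
    doc = json.loads(explain_json)[0]
    return doc["Plan"]["Total Cost"], doc["Execution Time"]


def fit_line(xs, ys):
    """Least-squares fit ys ~ a * xs + b; also returns Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


costs, times = zip(*(cost_and_time(s) for s in SAMPLES))
a, b, r = fit_line(costs, times)
print(f"ms per cost unit: {a:.4f}, intercept: {b:.4f}, r: {r:.4f}")
```

With real plans, r would show how far the planner's cost can be trusted
as a stand-in for runtime on a given machine and configuration.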


> A statistical analysis of that
> correlation would be incredibly useful work although generating sample
> datasets would be a major challenge.
>
> merlin
>

Indeed. I'm using TPC-B along with pgbench to have some data to test with
(since I don't have real data yet), but I'm having a hard time creating
queries whose runtimes differ enough to train my ML algorithm.
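
For the training side, here is a rough sketch of one possible feature
vector built from the planner output, assuming plans are captured with
EXPLAIN (FORMAT JSON). It counts the operators in the plan tree and keeps
the root's estimated cost and row count. The function names and the
feature layout are my own assumptions, not something taken from the
papers, and the sample plan is made up.

```python
import json
from collections import Counter

# Made-up EXPLAIN (FORMAT JSON) output, trimmed to the fields used here.
SAMPLE_PLAN = """
[{"Plan": {"Node Type": "Hash Join", "Total Cost": 75.5, "Plan Rows": 1000,
  "Plans": [
    {"Node Type": "Seq Scan", "Total Cost": 30.0, "Plan Rows": 1000},
    {"Node Type": "Hash", "Total Cost": 20.0, "Plan Rows": 100,
     "Plans": [
       {"Node Type": "Seq Scan", "Total Cost": 15.0, "Plan Rows": 100}
     ]}
  ]}}]
"""


def walk(node, depth=1):
    """Yield (node, depth) for a plan node and all of its descendants."""
    yield node, depth
    for child in node.get("Plans", []):
        yield from walk(child, depth + 1)


def plan_features(explain_json):
    """Turn one JSON plan into a flat dict of numeric features."""
    root = json.loads(explain_json)[0]["Plan"]
    counts = Counter()
    max_depth = 0
    for node, depth in walk(root):
        counts[node["Node Type"]] += 1
        max_depth = max(max_depth, depth)
    features = {"count:" + name: n for name, n in counts.items()}
    # A node's total cost already includes its children, so only the
    # root's cost is kept rather than a sum over all nodes.
    features["total_cost"] = root["Total Cost"]
    features["plan_rows"] = root["Plan Rows"]
    features["depth"] = max_depth
    return features


print(plan_features(SAMPLE_PLAN))
```

Each query would yield one such dict, which can be paired with its
measured runtime as the training target.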
