On 05 Jun 00:20, Albert Cervera i Areny wrote:
> 2015-06-04 19:11 GMT+02:00 Jordi Esteve <[email protected]>:
> > I'm analysing postgres performance in a big database (aprox. 1-2 million or
> > records of account.invoice.line and stock.move) and I have figured out that
> > slow queries are executed to get data related to m2m fields (I suppose the
> > same happens for o2m fields). I think the problem is how the search_domain
> > is build in these cases. For example, the m2m field invoice_lines in
> > stock.move model is asked with a domain like that:
> >
> > [('stock_move', 'in', [1826865, 1826864]), ('invoice_line.id', '!=', None),
> > ('invoice_line.company', '=', 1)]
> >
> > The second condition ('invoice_line.id', '!=', None) is added by the get()
> > of the many2many field.
> > The third condition ('invoice_line.company', '=', 1) is added by a company
> > rule.
> >
> > This domain is converted to the following SQL, and if account_invoice_line
> > table has 1 million records it is very slow (10-20 seconds) to execute and
> > consumes a lot of postgres resources because the subqueries returns
> > thousands (nearly million) of records:
> >
> > SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move",
> > "a"."invoice_line" AS "invoice_line", ...
> >   FROM "account_invoice_line-stock_move" AS "a"
> >   LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line")
> >   WHERE (("a"."stock_move" IN (1826865, 1826864))
> >     AND ("a"."invoice_line" IN (SELECT "c"."id" AS "id" FROM
> > "account_invoice_line" AS "c" WHERE ((("c"."id" IS NOT NULL))
> >     AND ("c"."id" IN (SELECT "d"."id" AS "id" FROM "account_invoice_line" AS
> > "d" WHERE (((("d"."company" = 1))) AND true)))))))
> >   ORDER BY "b"."description" ASC;
> >
> >
> > I don't know if it will be room to improve how a domain is converted to an
> > SQL expression. For example, based in the previous example, one like this
> > that applies the related conditions directly to the JOIN table instead of
> > the original table with subqueries.
> >
> > SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move",
> > "a"."invoice_line" AS "invoice_line", ...
> >   FROM "account_invoice_line-stock_move" AS "a"
> >   LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line")
> >   WHERE (("a"."stock_move" IN (1826865, 1826864))
> >     AND ("b"."id" IS NOT NULL)
> >     AND (("b"."company" = 1) AND true))
> >   ORDER BY "b"."description" ASC;
> 
> This query is not equivalent to the generated one, you'd have to make
> something like
> 
> LEFT JOIN (select .....)
> 
> and not use the "b".* in the main where clause, otherwise the LEFT
> nature of the join is lost.

Agree on point of view of purely SQL but for Tryton they will have the
same result because "id" is required and "invoice_line" is a foreign
key.

> Anyway, I don't know how hard it would be to make tryton generate a
> better query.

I think both queries have their advantage indeed it depends on the size
of each tables.
In our case, a join is faster because the database will make a join
between a small set of tuple from "account_invoice_line-stock_move" with
a large table "account_invoice_line". Instead of retrieving almost all
the tuples from "account_invoice_line" to make the IN test.
But if the table size were different then a IN clause could have better
performance.

So I think the improvement should add a parameter on the Many2One
definition to define which kind of "nested" search strategy it should
use.

-- 
Cédric Krier - B2CK SPRL
Email/Jabber: [email protected]
Tel: +32 472 54 46 59
Website: http://www.b2ck.com/

Reply via email to