On 05/06/15 00:20, Albert Cervera i Areny wrote:
2015-06-04 19:11 GMT+02:00 Jordi Esteve <[email protected]>:
I'm analysing postgres performance in a big database (aprox. 1-2 million or
records of account.invoice.line and stock.move) and I have figured out that
slow queries are executed to get data related to m2m fields (I suppose the
same happens for o2m fields). I think the problem is how the search_domain
is build in these cases. For example, the m2m field invoice_lines in
stock.move model is asked with a domain like that:
[('stock_move', 'in', [1826865, 1826864]), ('invoice_line.id', '!=', None),
('invoice_line.company', '=', 1)]
The second condition ('invoice_line.id', '!=', None) is added by the get()
of the many2many field.
The third condition ('invoice_line.company', '=', 1) is added by a company
rule.
This domain is converted to the following SQL, and if account_invoice_line
table has 1 million records it is very slow (10-20 seconds) to execute and
consumes a lot of postgres resources because the subqueries returns
thousands (nearly million) of records:
SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move",
"a"."invoice_line" AS "invoice_line", ...
FROM "account_invoice_line-stock_move" AS "a"
LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line")
WHERE (("a"."stock_move" IN (1826865, 1826864))
AND ("a"."invoice_line" IN (SELECT "c"."id" AS "id" FROM
"account_invoice_line" AS "c" WHERE ((("c"."id" IS NOT NULL))
AND ("c"."id" IN (SELECT "d"."id" AS "id" FROM "account_invoice_line" AS
"d" WHERE (((("d"."company" = 1))) AND true)))))))
ORDER BY "b"."description" ASC;
I don't know if it will be room to improve how a domain is converted to an
SQL expression. For example, based in the previous example, one like this
that applies the related conditions directly to the JOIN table instead of
the original table with subqueries.
SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move",
"a"."invoice_line" AS "invoice_line", ...
FROM "account_invoice_line-stock_move" AS "a"
LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line")
WHERE (("a"."stock_move" IN (1826865, 1826864))
AND ("b"."id" IS NOT NULL)
AND (("b"."company" = 1) AND true))
ORDER BY "b"."description" ASC;
This query is not equivalent to the generated one, you'd have to make
something like
LEFT JOIN (select .....)
and not use the "b".* in the main where clause, otherwise the LEFT
nature of the join is lost.
Anyway, I don't know how hard it would be to make tryton generate a
better query but if the generated query is very slow it may be because
postgres requires a lot of temporary memory and PostgreSQL is not
properly tuned. You could try increasing work_mem which is usually
extremely low by default.
Also we've experience very poor performance with this kind of queries
with older versions of PostgreSQL <= 9.0 I guess. Newer versions
managed that pretty well.
Yes, you are right, with several GB of RAM assigned to postgres and
postgres tunned correctly the time consumed decreases drastically. I was
doing tests in my local PC with lower resources and I was surprised with
this behaviour.
I don't know if it is worth improve this queries with Cedric's
suggestion of defining the kind of "nested" search strategy to use, it
will make code for search_domain() more complex. But solving the problem
by more RAM resources + tunning postgres hides the real problem.
--
Jordi Esteve
Consultor Zikzakmedia SL
[email protected]
Mòbil 679 170 693
Zikzakmedia SL
St. Jaume, 9, baixos, 2a
08720 Vilafranca del Penedès
Tel 93 890 2108