On 05 Jun 00:20, Albert Cervera i Areny wrote: > 2015-06-04 19:11 GMT+02:00 Jordi Esteve <[email protected]>: > > I'm analysing postgres performance in a big database (aprox. 1-2 million or > > records of account.invoice.line and stock.move) and I have figured out that > > slow queries are executed to get data related to m2m fields (I suppose the > > same happens for o2m fields). I think the problem is how the search_domain > > is build in these cases. For example, the m2m field invoice_lines in > > stock.move model is asked with a domain like that: > > > > [('stock_move', 'in', [1826865, 1826864]), ('invoice_line.id', '!=', None), > > ('invoice_line.company', '=', 1)] > > > > The second condition ('invoice_line.id', '!=', None) is added by the get() > > of the many2many field. > > The third condition ('invoice_line.company', '=', 1) is added by a company > > rule. > > > > This domain is converted to the following SQL, and if account_invoice_line > > table has 1 million records it is very slow (10-20 seconds) to execute and > > consumes a lot of postgres resources because the subqueries returns > > thousands (nearly million) of records: > > > > SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move", > > "a"."invoice_line" AS "invoice_line", ... > > FROM "account_invoice_line-stock_move" AS "a" > > LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line") > > WHERE (("a"."stock_move" IN (1826865, 1826864)) > > AND ("a"."invoice_line" IN (SELECT "c"."id" AS "id" FROM > > "account_invoice_line" AS "c" WHERE ((("c"."id" IS NOT NULL)) > > AND ("c"."id" IN (SELECT "d"."id" AS "id" FROM "account_invoice_line" AS > > "d" WHERE (((("d"."company" = 1))) AND true))))))) > > ORDER BY "b"."description" ASC; > > > > > > I don't know if it will be room to improve how a domain is converted to an > > SQL expression. For example, based in the previous example, one like this > > that applies the related conditions directly to the JOIN table instead of > > the original table with subqueries. > > > > SELECT "a"."id" AS "id", "a"."stock_move" AS "stock_move", > > "a"."invoice_line" AS "invoice_line", ... > > FROM "account_invoice_line-stock_move" AS "a" > > LEFT JOIN "account_invoice_line" AS "b" ON ("b"."id" = "a"."invoice_line") > > WHERE (("a"."stock_move" IN (1826865, 1826864)) > > AND ("b"."id" IS NOT NULL) > > AND (("b"."company" = 1) AND true)) > > ORDER BY "b"."description" ASC; > > This query is not equivalent to the generated one, you'd have to make > something like > > LEFT JOIN (select .....) > > and not use the "b".* in the main where clause, otherwise the LEFT > nature of the join is lost.
Agree on point of view of purely SQL but for Tryton they will have the same result because "id" is required and "invoice_line" is a foreign key. > Anyway, I don't know how hard it would be to make tryton generate a > better query. I think both queries have their advantage indeed it depends on the size of each tables. In our case, a join is faster because the database will make a join between a small set of tuple from "account_invoice_line-stock_move" with a large table "account_invoice_line". Instead of retrieving almost all the tuples from "account_invoice_line" to make the IN test. But if the table size were different then a IN clause could have better performance. So I think the improvement should add a parameter on the Many2One definition to define which kind of "nested" search strategy it should use. -- Cédric Krier - B2CK SPRL Email/Jabber: [email protected] Tel: +32 472 54 46 59 Website: http://www.b2ck.com/
