On Thu, 2005-02-10 at 02:37 +0000, Brian B. wrote: > I am loading some spam/ham data/tokens, to be used for the dspam anti-spam > software, into PostgreSQL. After a few hours of inserting and updating the > existing data, the backend crashes with a signal 10 (bus error). I am also > running an ANALYZE command to help keep the indexes up to date every 10 > minutes.
As far as I can tell, ANALYZE is not at issue here. > I have rebuilt PostgreSQL with debugging enabled and have a > resulting core dump. I am not sure who to send this to, however. The > backtrace using the core file is as follows: > > #0 0x80c4b52 in transformExpr (pstate=0x86fa630, expr=0x85ec148) > at parse_expr.c:81 > #1 0x80c5126 in transformExpr (pstate=0x86fa630, expr=0x85ec1d0) > at parse_expr.c:270 > #2 0x80c5126 in transformExpr (pstate=0x86fa630, expr=0x85ec258) > at parse_expr.c:270 > #3 0x80c5126 in transformExpr (pstate=0x86fa630, expr=0x85ec2e0) > at parse_expr.c:270 > > [... repeat above several thousand times ...] It seems what's happening here is that dspam is submitting a query with many thousands of elements in the IN clause. In the parser, we transform "foo IN (a, b, c)" into "foo = a OR foo = b OR foo = c", and then recurse for each element of the OR expression and eventually run out of stack space. (Note that this will actually be worse in HEAD, since a refactoring I applied will mean we consume two stack frames for each expression.) A workaround would be to increase PostgreSQL's stack size. Perhaps it would be worth considering representing IN lists as a distinct expression type, at least in the parser. Then the transformExpr() code would look like: foreach (element of IN list) transformExpr(element); ... do whatever else ... so we wouldn't need to recurse. We could then transform the new expression type into a list of OR clauses at this point. -Neil ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend