On Sun, Aug 7, 2022 at 7:05 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Even on a modern Linux:
>
> $ size src/backend/parser/gram.o
>    text    data     bss     dec     hex filename
>  656568       0       0  656568   a04b8 src/backend/parser/gram.o
> $ size src/interfaces/ecpg/preproc/preproc.o
>    text    data     bss     dec     hex filename
>  912005     188    7348  919541   e07f5 src/interfaces/ecpg/preproc/preproc.o
>
> So there's something pretty bloated there. It doesn't seem like
> ecpg's additional productions should justify a nigh 50% code
> size increase.

Comparing gram.o with preproc.o:

$ objdump -t src/backend/parser/gram.o | grep yy | grep -v UND | awk '{print $5, $6}' | sort -r | head -n3
000000000003a24a yytable
000000000003a24a yycheck
0000000000013672 base_yyparse

$ objdump -t src/interfaces/ecpg/preproc/preproc.o | grep yy | grep -v UND | awk '{print $5, $6}' | sort -r | head -n3
000000000004d8e2 yytable
000000000004d8e2 yycheck
000000000002841e base_yyparse

The largest lookup tables are ~33% bigger (other tables are trivial in
comparison), and the function base_yyparse is about double the size,
most of which is a giant switch statement with 2510 / 3912 cases,
respectively. That difference does seem excessive. I've long wondered
if it would be possible / feasible to have stricter separation between
C, ECPG commands, and SQL. That sounds like a huge amount of work,
though.

Playing around with the compiler flags on preproc.c, I get these
compile times, gcc memory usage as reported by /usr/bin/time -v, and
symbol sizes (non-debug build):

-O2:
time 8.0s
Maximum resident set size (kbytes): 255884

-O1:
time 6.3s
Maximum resident set size (kbytes): 170636
000000000004d8e2 yytable
000000000004d8e2 yycheck
00000000000292de base_yyparse

-O0:
time 2.9s
Maximum resident set size (kbytes): 153148
000000000004d8e2 yytable
000000000004d8e2 yycheck
000000000003585e base_yyparse

Note that -O0 bloats the binary, probably because it's not using a
jump table anymore. -O1 might be worth it just to reduce build times
for slower buildfarm animals, even if Noah reported this didn't help
the issue upthread. I suspect it wouldn't slow down production use
much since the output needs to be compiled anyway.

--
John Naylor
EDB: http://www.enterprisedb.com
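For anyone who wants to check the symbol-size ratios without doing the hex arithmetic by hand, here is a minimal Python sketch. The sizes are copied from the objdump output in this message. The names SIZES, growth_percent and report are hypothetical helpers made up for illustration, not part of any PostgreSQL tooling.

```python
# Symbol sizes in bytes (hex), copied from the objdump output above:
# name -> (size in gram.o, size in preproc.o)
SIZES = {
    "yytable": (0x3A24A, 0x4D8E2),
    "yycheck": (0x3A24A, 0x4D8E2),
    "base_yyparse": (0x13672, 0x2841E),
}


def growth_percent(gram_size: int, preproc_size: int) -> float:
    """Return how much larger the preproc.o symbol is than gram.o's, in percent."""
    return (preproc_size - gram_size) / gram_size * 100.0


def report(sizes=SIZES):
    """Build one formatted line per symbol: name, both sizes, relative growth."""
    lines = []
    for name, (gram, preproc) in sizes.items():
        pct = growth_percent(gram, preproc)
        lines.append(f"{name:<14}{gram:>8}{preproc:>9}{pct:>+8.1f}%")
    return lines


if __name__ == "__main__":
    print(f"{'symbol':<14}{'gram.o':>8}{'preproc.o':>10}{'growth':>8}")
    for line in report():
        print(line)
```

The same helper could be pointed at the -O1 and -O0 sizes of base_yyparse to compare optimization levels.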