On 01/06/11 09:24, Cédric Villemain wrote:
Hello
here is a partial review of your patch; better than leaving it
sleeping in the commitfest queue, I hope.
Submission review
================
* The patch is not in context diff format.
* The patch applies, but contains some extra whitespace.
* Documentation is present but not explicit about temp tables;
perhaps worth adding that this won't limit temporary table size?
* There is no test provided. One would be expected, to check that the
feature works (see the sketch after this list).
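A sketch of what such a test could look like; the kilobyte unit is only
assumed (as discussed for work_mem below), the error text is copied from
the session later in this review, and the query and limit are illustrative:

    SET work_mem = '64kB';            -- force the sort below to spill to disk
    SET max_temp_files_size = 64;     -- assuming kB units, like work_mem
    SELECT count(*) FROM
      (SELECT g FROM generate_series(1, 100000) g ORDER BY g DESC) s;
    -- expected:
    -- ERROR:  aborting due to exceeding max temp files size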
Code review
=========
* In fd.c, I think that "temporary_files_size -=
(double)vfdP->fileSize;" should be done later in the function, once we
have successfully unlinked the file, not before (see the sketch after
this list).
* I am not sure whether it is better to add a fileSize field as you did,
or to use RelationGetNumberOfBlocks() when a file is about to be
truncated or unlinked; that way the seekPos should be enough to increase
the global counter.
* For temporary_files_size, I think it is better to track a number of
pages, à la PostgreSQL, rather than a size in kilobytes.
* For max_temp_files_size, I would prefer an approach like the
shared_buffers GUC: you can use pages, or KB, MB, ...
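To illustrate the fd.c point, here is a rough sketch of the ordering I
have in mind; the error handling and surrounding code are mine, not from
the patch, only temporary_files_size and the vfdP fields come from it:

    /* decrement the per-backend counter only once the unlink has succeeded */
    if (unlink(vfdP->fileName) == 0)
        temporary_files_size -= (double) vfdP->fileSize;
    else
        elog(LOG, "could not unlink temporary file \"%s\": %m",
             vfdP->fileName);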
Simple Feature test
==============
Either EXPLAIN (ANALYZE, BUFFERS) is wrong or the patch is wrong:
cedric=# explain (analyze,buffers) select * from foo order by 1 desc ;
                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Sort  (cost=10260.02..10495.82 rows=94320 width=4) (actual time=364.373..518.940 rows=100000 loops=1)
   Sort Key: generate_series
   Sort Method:  external merge  Disk: 1352kB
   Buffers: local hit=393, temp read=249 written=249
   ->  Seq Scan on foo  (cost=0.00..1336.20 rows=94320 width=4) (actual time=0.025..138.754 rows=100000 loops=1)
         Buffers: local hit=393
 Total runtime: 642.874 ms
(7 rows)
cedric=# set max_temp_files_size to 1900;
SET
cedric=# explain (analyze,buffers) select * from foo order by 1 desc ;
ERROR: aborting due to exceeding max temp files size
STATEMENT: explain (analyze,buffers) select * from foo order by 1 desc ;
ERROR: aborting due to exceeding max temp files size
Do you have a testing method I can use to track this, other than
explain (analyze, buffers), before resorting to low-level monitoring?
Architecture review
==============
max_temp_files_size is used for the total space used per backend.
Based on how work_mem works, I would expect something like "work_disk"
to limit the space per file, and maybe a backend_work_disk (and yes,
maybe a backend_work_mem?!) per backend.
So I propose renaming the current GUC to something like backend_work_disk.
The patch is not large and is easy to read.
I like the idea and it sounds useful.
Hi Cédric,
Thanks for reviewing!
The diff format is odd - I'm sure I told git to do a context diff, yet
running 'file' on the v2 patch results in it saying 'HTML
document'... hmm, I'll check more carefully for the next one I do.
Yeah, I agree about explicitly mentioning that it does not constrain
temp tables.
Hmm, test - good idea - I'll look into it.
Re the code comments - I agree with most of them. However, with respect
to the GUC units, I copied the setup for work_mem as that seemed the
most closely related. Also, I used bytes for the internal variable in
fd.c to limit the number of arithmetic operations when comparing.
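For what it's worth, here is roughly what the two unit choices look like
when written against the GUC machinery. This is only a sketch using the
extension-style API rather than the built-in guc.c table the patch
actually touches, and the description, bounds and default are invented:

    #include "postgres.h"
    #include <limits.h>
    #include "fmgr.h"
    #include "utils/guc.h"

    PG_MODULE_MAGIC;

    static int  max_temp_files_size = -1;  /* in kB; -1 = no limit (assumed) */

    void _PG_init(void);

    void
    _PG_init(void)
    {
        /* work_mem-style: the stored value is interpreted in kilobytes */
        DefineCustomIntVariable("max_temp_files_size",
                                "Limits total temporary file space per backend.",
                                NULL,
                                &max_temp_files_size,
                                -1, -1, INT_MAX,
                                PGC_USERSET,
                                GUC_UNIT_KB,   /* GUC_UNIT_BLOCKS would count pages instead */
                                NULL, NULL, NULL);
    }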
For testing I set 'log_temp_files = 0' in postgresql.conf and the
numbers seemed to agree exactly - I didn't think to use EXPLAIN +
buffers... not sure why they would disagree. Have a go with
log_temp_files set and see what you find!
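Roughly like this (a sketch only: the path in the log line is invented,
and changing log_temp_files needs superuser rights or an edit to
postgresql.conf):

    SET log_temp_files = 0;             -- log every temp file, whatever its size
    SELECT * FROM foo ORDER BY 1 DESC;

after which the server log should contain something like

    LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmpNNNNN.0", size 1384448

(1352kB, matching the Disk figure in your EXPLAIN output), which can be
compared against the limit.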
I like your suggestion for the name. Given that 'work_mem' is per
backend, I'm leaning towards the idea of 'work_disk' being sufficiently
descriptive.
Thanks again, and I'll come up with a v3 patch.
Mark