2011/5/5 Mitsuru IWASAKI <iwas...@jp.freebsd.org>:
> Hi,
>
>> I think that PgFincore (http://pgfoundry.org/projects/pgfincore/)
>> provides similar functionality.  Are you familiar with that?  If so,
>> could you contrast your approach with that one?
>
> I'm not familiar with PgFincore at all sorry, but I got source code
> and documents and read through them just now.
> # and I'm a novice on postgres actually...
> The target both is to reduce physical I/O, but their approaches and
> gains are different.
> My understanding is like this;
>
> +---------------------+     +---------------------+
> | Postgres(backend)   |     | Postgres            |
> | +-----------------+ |     |                     |
> | | DB Buffer Cache | |     |                     |
> | | (shared buffers)| |     |                     |
> | |*my target       | |     |                     |
> | +-----------------+ |     |                     |
> |   ^      ^          |     |                     |
> |   |      |          |     |                     |
> |   v      v          |     |                     |
> | +-----------------+ |     | +-----------------+ |
> | | buffer manager  | |     | |    pgfincore    | |
> | +-----------------+ |     | +-----------------+ |
> +---^------^----------+     +----------^----------+
>     |      |smgrread()                 |posix_fadvise()
>     |read()|                           |                  userland
> ==================================================================
>     |      |                           |                  kernel
>     |      +-------------+-------------+
>     |                    |
>     |                    v
>     |       +------------------------+
>     |       | File System            |
>     |       |  +-----------------+   |
>     +------>|  | FS Buffer Cache |   |
>             |  |*PgFincore target|   |
>             |  +-----------------+   |
>             |    ^       ^           |
>             +----|-------|-----------+
>                  |       |
> ==================================================================
>                  |       |                                hardware
>        +---------|-------|----------------+
>        |         |       v  Physical Disk |
>        |         |   +------------------+ |
>        |         |   | base/16384/24598 | |
>        |         v   +------------------+ |
>        | +------------------------------+ |
>        | |Buffer Cache Hibernation Files| |
>        | +------------------------------+ |
>        +----------------------------------+
>
Little detail: pgfincore stores its data per relation in a file, like
you do. I rewrote that part a bit, and it will store its data directly
in PostgreSQL tables; it will also be able to restore the cache from a
raw bitstring.

> In summary, PgFincore's target is File System Buffer Cache, Buffer
> Cache Hibernation's target is DB Buffer Cache(shared buffers).

Correct. (By the way, I am very happy with your idea and that you found
the time to do it.)

>
> PgFincore is trying to preload database file by posix_fadvise() into
> File System Buffer Cache, not into DB Buffer Cache(shared buffers).
> On query execution, buffer manager will get DB buffer blocks by
> smgrread() from file system unless necessary blocks exist in DB Buffer
> Cache.  At this point, physical reads may not happen because part of
> (or entire) database file is already loaded into FS Buffer Cache.
>
> The gain depends on the file system, especially size of File System
> Buffer Cache.
> Preloading database file is equivalent to following command in short.
> $ cat base/16384/24598 > /dev/null

Not exactly. There are 2 calls:

 * pgfadv_WILLNEED
 * pgfadv_WILLNEED_snapshot

The former asks the kernel to load each segment of a relation, *but*
the kernel can decide not to do that, or to load only part of each
segment (so it is not as brutal as cat file > /dev/null).

The latter reads *exactly* the blocks required in each segment, not all
blocks, unless all of them were in cache when the snapshot was taken.
(This one is part of the snapshot/restore combo.)

>
> I think PgFincore is good for data warehouse in applications.

Pgfincore with bitstring storage in a table allows streaming to
HotStandbys, which gives better response in case of
switch-over/fail-over: some house-keeping on the HotStandby keeps it
really hot ;)
Even web applications have large databases today...
(There is more, but it is not the subject here.)

>
>
> Buffer Cache Hibernation, my approach, is more simple and straight forward.
> It try to save/load the contents of DB Buffer Cache(shared buffers) using
> regular files(called Buffer Cache Hibernation Files).
> At startup, buffer manager will load DB buffer blocks into DB Buffer
> Cache from Buffer Cache Hibernation Files which was saved at the last
> shutdown.  Note that database file will not be read, so it is not
> cached in File System Buffer Cache at all.  Only contents of DB Buffer
> Cache are filled.  Therefore, the DB buffer cache miss penalty would
> be larger than PgFincore's.
>
> The gain depends on the size of shared buffers, and how often the
> similar queries are executed before and after restarting.
>
> Buffer Cache Hibernation is good for OLTP in applications.

It is also very helpful for debugging and analysis purposes, IIUC.

I may prefer a per-relation approach (so you can snapshot and restore
only the interesting tables/indexes). Given what I read in your patch,
it looks easy to do, doesn't it?

I also prefer the idea of keeping a map of the Buffer Cache (yes, like
what I do with pgfincore) rather than storing the data directly and
reading it back directly. This latter part seems a bit dangerous to me,
even if it looks sane for a normal PostgreSQL stop/start process.

>
>
> I think that PgFincore and Buffer Cache Hibernation is not exclusive,
> they can co-work together in different caching levels.

Yes.

>
>
>
> Sorry for my poor english skill, but I'm doing my best :)

Better than mine, and anyway your patch remains very easy to read in
any case.

>
> Thanks
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
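
For illustration only, the difference between hinting a whole segment
and hinting only the blocks recorded in a snapshot can be sketched in
Python. This is not pgfincore's code: the function names, the bitmap
format (one entry per block) and the 8192-byte block size are all
assumptions made for the sketch. Only os.posix_fadvise() and
os.POSIX_FADV_WILLNEED are real (Unix-only) standard-library names,
and in this sketch both variants remain hints that the kernel is free
to ignore.

```python
import os

# Assumed block size for the sketch; the granularity a real tool
# snapshots at may differ (for example the OS page size).
BLOCK_SIZE = 8192


def block_runs(bitmap):
    """Coalesce set entries of a bitmap into (first_block, count) runs."""
    runs = []
    start = None
    for i, bit in enumerate(bitmap):
        if bit and start is None:
            start = i                       # a new run begins here
        elif not bit and start is not None:
            runs.append((start, i - start))  # the run ended at i - 1
            start = None
    if start is not None:                   # run reaching the end
        runs.append((start, len(bitmap) - start))
    return runs


def advise_whole_file(path):
    """Hint that the whole file will be needed (length 0 = to EOF)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)


def advise_from_snapshot(path, bitmap):
    """Hint only the block ranges that were marked in the snapshot."""
    fd = os.open(path, os.O_RDONLY)
    try:
        for first, count in block_runs(bitmap):
            os.posix_fadvise(fd, first * BLOCK_SIZE, count * BLOCK_SIZE,
                             os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
```

With a hypothetical snapshot such as [1, 1, 0, 1], only blocks 0-1 and
block 3 would be hinted, instead of the whole file as with
cat file > /dev/null.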