How to see/calculate size of index in memory?
Hi! I have the following question, if someone could help.

Question 1: How can I see or calculate the size of an index (B-tree, hash index) in memory? I can see the size of an index e.g. with pg_relation_size from pg_class (after REINDEX). Does that tell the size of the index on disk? I would be interested in how big a part of the index is in memory. (The whole index?)

PG10/PG11.

Best Regards, Sam
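For example, the on-disk check I am doing looks roughly like this (idx_example is just a placeholder name):

    SELECT pg_size_pretty(pg_relation_size('idx_example'));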
To keep indexes in memory, is large enough effective_cache_size enough?
Hi! Related to my other email (size of index in memory), some other questions:

Q: To keep _index(es)_ in memory, is a large enough effective_cache_size enough?

Q: Does the size of shared_buffers not matter for keeping an index in memory? Or have I missed something and it does matter (for keeping indexes in memory)?

Background info: I plan to use hash indexes: there is a very large amount of data in the db tables, but the (e.g. hash) indexes could be kept in memory.

I am using PostgreSQL 10. I could start using PostgreSQL 11 after it has been released.

Best Regards, Sam
Re: How to see/calculate size of index in memory?
Hello

You can use the pg_buffercache contrib module: https://www.postgresql.org/docs/current/static/pgbuffercache.html

pg_relation_size - yes, that is the full size on disk, regardless of the buffer cache.

regards, Sergei
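For example, a query like this counts how many of the index's pages are currently in shared_buffers (a sketch, assuming the extension is installed and the index is named idx_example):

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    SELECT count(*) AS buffers,
           pg_size_pretty(count(*) * 8192) AS cached_size   -- 8192 = default page size
    FROM pg_buffercache b
    WHERE b.relfilenode = pg_relation_filenode('idx_example')
      AND b.reldatabase = (SELECT oid FROM pg_database
                           WHERE datname = current_database());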
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Hi

effective_cache_size is not a cache. It is just an approximate value for the query planner: how much data can be found in RAM (both in shared_buffers and the OS page cache).

> Q: Does the size of shared_buffers not matter for keeping an index in memory?

shared_buffers is a cache for both table and index pages. All data in tables and indexes is split into chunks called pages, usually 8 kB each (the size can be redefined during source compilation). The shared buffers cache is fully automatic: actively used pages are kept in memory, less-used pages may be evicted. You cannot pin any table or index to shared buffers.

regards, Sergei
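You can check the current settings and the compiled-in page size like this:

    SHOW shared_buffers;
    SHOW effective_cache_size;
    SHOW block_size;   -- the page size, 8192 bytes unless changed at compile time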
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Hi!

Is it possible to force PostgreSQL to keep an index in memory? The data in the db table columns does not need to be kept in memory, only the index (hash index). That would seem optimal in our scenario.

I think Oracle has the capability to keep an index in memory (in-memory db functionality). But does PostgreSQL have such functionality? (I keep searching.)

I have read: Tuning Your PostgreSQL Server - PostgreSQL wiki (effective_cache_size, shared_buffers). I have seen responses to: PostgreSQL Index Caching.

Should I actually also set shared_buffers to tens of gigabytes, if I want to keep one very big index in memory?

I am also reading a PG book.

Best Regards, Sam
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Sergei wrote:
> You cannot pin any table or index to shared buffers.

Thanks, this is the answer to my other question! In our case, this might be an important feature. (Index in memory, other data / columns not.)

> shared_buffers is a cache for both table and index pages.

OK. So we should also set shared_buffers big.

BR Sam
Re: To keep indexes in memory, is large enough effective_cache_size enough?
On 19 September 2018 at 21:18, Sam R. wrote:
> OK. So we should also set shared_buffers big.

It might not be quite as beneficial as you might think. If your database is larger than RAM, a smaller shared_buffers setting often yields better performance. The reason is that with a very large shared_buffers, the same buffers can end up cached in both the kernel page cache and shared buffers. With a smaller shared buffers setting, the chances of that double buffering are reduced and the chances of finding a page cached somewhere increase.

However, if your database is quite small and you can afford to fit all your data in shared buffers, with enough free RAM for everything else, then you might benefit from a large shared buffers. But it's important to also consider that some operations, such as DROP TABLE, can become slow if shared buffers is very large.

You might get more specific recommendations if you mention how much RAM the server has and how big the data is now and will be in the future.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
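As a rough illustration only, the settings discussed above can be adjusted like this; the values are placeholders, not recommendations from this thread:

    -- Illustrative values only.
    ALTER SYSTEM SET shared_buffers = '8GB';          -- takes effect only after a server restart
    ALTER SYSTEM SET effective_cache_size = '48GB';   -- planner hint; a config reload is enough
    SELECT pg_reload_conf();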
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Does a large shared_buffers impact checkpoint performance negatively? I was under the impression that everything inside shared_buffers must be written during a checkpoint.
Re: To keep indexes in memory, is large enough effective_cache_size enough?
On 19 September 2018 at 22:12, Kaixi Luo wrote:
> Does a large shared_buffers impact checkpoint performance negatively? I was under the impression that everything inside shared_buffers must be written during a checkpoint.

Only the dirty buffers get written.

Also, having too small a shared buffers can mean that buffers must be written more often than they'd otherwise need to be. If a buffer must be evicted from shared buffers to make way for a new buffer, then the chances of having to evict a dirty buffer increase with smaller shared buffers. Obviously, this dirty buffer needs to be written out before the new buffer can be loaded in. In a worst-case scenario, a backend performing a query would have to do this. pg_stat_bgwriter is your friend.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
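pg_stat_bgwriter can be read with a plain query; a sketch (the counters are cumulative since the last stats reset):

    SELECT checkpoints_timed,
           checkpoints_req,
           buffers_checkpoint,   -- buffers written by checkpoints
           buffers_clean,        -- buffers written by the background writer
           buffers_backend       -- buffers written by backends themselves (the worst case above)
    FROM pg_stat_bgwriter;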
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Hi! Thanks for all of the comments!

David wrote:
> if you mention how much RAM the server has and how big the data is now

Let's say, for example: RAM: 64 GB. Data: 500 GB - 1.5 TB.

(RAM: less would of course be better, e.g. 32 GB, but we could maybe go for a somewhat bigger value than 64 GB, if needed.)

BR Sam
Re: To keep indexes in memory, is large enough effective_cache_size enough?
The size of the index of one huge table has been e.g. 16-20 GB (after REINDEX). The size of such an index is quite big.

BR Samuli
Re: To keep indexes in memory, is large enough effective_cache_size enough?
On Wed, Sep 19, 2018 at 5:19 AM Sam R. wrote:
> Hi!
>
> Is it possible to force PostgreSQL to keep an index in memory?

It might be possible to put the indexes in a separate tablespace, then do something at the file-system level to force the OS cache to keep pages for that FS in memory.

> The data in the db table columns does not need to be kept in memory, only the index (hash index).

This sounds like speculation. Do you have hard evidence that this is actually the case?

> That would seem optimal in our scenario.
> I think Oracle has the capability to keep an index in memory (in-memory db functionality). But does PostgreSQL have such functionality? (I keep searching.)

There are a lot of Oracle capabilities which encourage people to micromanage the server in ways that are almost never actually productive.

> Should I actually also set shared_buffers to tens of gigabytes, if I want to keep one very big index in memory?

If your entire database fits in RAM, then it could be useful to set shared_buffers high enough to fit the entire database. If fitting the entire database in RAM is hopeless, tens of gigabytes is probably too much, unless you have hundreds of GB of RAM.

PostgreSQL doesn't do direct IO, but rather uses the OS file cache extensively. This leads to double buffering, where a page is read from disk and stored in the OS file cache, then handed over to PostgreSQL where it is also stored in shared_buffers. That means that 1/2 of RAM is often the worst value for shared_buffers. You would want it to be either something like 1/20 to 1/10 of RAM, or something like 9/10 or 19/20 of RAM, so that you concentrate pages into one of the caches or the other. The low fraction of RAM is the more generally useful option. The high fraction of RAM is useful when you have very high write loads, particularly intensive index updating -- and in that case you probably need someone to intensively monitor and baby-sit the database.

Cheers, Jeff
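A minimal sketch of the tablespace idea above; the path, object names and the OS-level tool are hypothetical, and PostgreSQL itself provides no pinning:

    -- Put the index on its own filesystem via a dedicated tablespace.
    CREATE TABLESPACE index_space LOCATION '/mnt/index_fs';
    CREATE INDEX big_table_id_hash ON big_table USING hash (id) TABLESPACE index_space;
    -- Keeping that filesystem's pages resident would then be done outside
    -- PostgreSQL, e.g. with an OS tool such as vmtouch -l on the tablespace directory.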
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Thanks for the comments!

Sam wrote:
>> The data in the db table columns does not need to be kept in memory, only the index (hash index).

Jeff Janes wrote:
> This sounds like speculation. Do you have hard evidence that this is actually the case?

In our case the "ID" is a randomly generated number (a large ID). It is not a "sequential" number, but random. In the generation phase, it is a very large random number. Our application may not even generate the random ID.

We use a hash index over the ID. At the moment, in "pure theory", we will read randomly through the hash index. So, no one will be able to know what part of the data (from the table) should be kept in memory.

Side note: Of course there may be (even many) use cases where the same data is read again and again. Still, I am thinking now from a very theoretical point of view (which we may still apply in practice).

In general: I am not certain how PostgreSQL or hash indexes work in detail, so my claim / wish of keeping only the index in memory may be faulty. (This is one reason for these discussions.)

BR Sam
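For reference, the setup described is roughly the following (table and column names are placeholders); in PostgreSQL 10 and later, hash indexes are WAL-logged and crash-safe:

    CREATE INDEX big_table_id_hash ON big_table USING hash (id);

    -- Hash indexes support only equality lookups, e.g.:
    SELECT * FROM big_table WHERE id = 123456789012345;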
Re: To keep indexes in memory, is large enough effective_cache_size enough?
I believe you can use pg_prewarm to pin an index or table to the cache.

https://www.postgresql.org/docs/current/static/pgprewarm.html

-- Regards, Ang Wei Shan
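A sketch of how pg_prewarm is called (the index name is a placeholder); note that, as discussed in the next reply, this loads blocks but does not pin them:

    CREATE EXTENSION IF NOT EXISTS pg_prewarm;

    -- Loads the relation's blocks into shared_buffers and returns how many were read.
    SELECT pg_prewarm('big_table_id_hash');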
Re: To keep indexes in memory, is large enough effective_cache_size enough?
On 20 September 2018 at 15:19, Wei Shan wrote:
> I believe you can use pg_prewarm to pin an index or table to the cache.
> https://www.postgresql.org/docs/current/static/pgprewarm.html

I think the key sentence in the document you linked to is:

"Prewarmed data also enjoys no special protection from cache evictions, so it is possible that other system activity may evict the newly prewarmed blocks shortly after they are read"

So this is not pinning. It's merely loading buffers into shared buffers in the hope that they might be around long enough for you to make the most of that effort.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services